Assembly support for GMP on AMD64


Purpose

This is a patch to gmp-4.2.x for the AMD64 architecture. The 4.2.x version comes with basic assembly support for this architecture; this patch gives a substantial speed-up.
Only a few functions have been written:
The assembly code is mostly a 64-bit translation of the k7 assembly code that is available in GMP. The main modifications are:

Changes

There is almost no change compared to the patch for 4.1.4. The multiplication has been slightly improved (around 3.15 cyc/limb), but most of the improvement in the gmpbench score comes from modifications in the C code of GMP between the two versions.
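
For readers unfamiliar with the "cyc/limb" unit: it counts CPU cycles spent per 64-bit limb processed by an inner loop. The sketch below is a portable C version of the kind of loop such figures describe (multiply a limb vector by a single limb). It is not the code from this patch, which is hand-written assembly. The names limb_t and ref_mul_1 are made up for this illustration, and unsigned __int128 is a GCC/Clang extension assumed to be available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for this sketch; GMP's own limb type is mp_limb_t. */
typedef uint64_t limb_t;

/*
 * Reference loop: rp[0..n-1] = up[0..n-1] * v, returning the carry-out limb.
 * Each iteration handles one limb, so a speed quoted in cyc/limb is the
 * average cost of one pass through a loop like this.
 */
limb_t ref_mul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    assert(n >= 1);
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* 64x64 -> 128-bit product plus incoming carry; cannot overflow. */
        unsigned __int128 t = (unsigned __int128)up[i] * v + carry;
        rp[i] = (limb_t)t;          /* low 64 bits go to the result */
        carry = (limb_t)(t >> 64);  /* high 64 bits feed the next limb */
    }
    return carry;
}
```

On AMD64 the 128-bit product maps to a single mul instruction, which is why an assembly version can schedule the loop more tightly than a compiler usually does.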

Disclaimer

The code has been reasonably well tested. I used the program tests/devel/try, which checks for quite a few possible bugs. Nonetheless, since there are fewer users, this code is less likely to be correct than any official GMP code.

Bugs

Please send comments and bug reports to gaudry@lix.polytechnique.fr and *not* to the official GMP developers: they have nothing to do with this code.

Performance

I get a multiply bench score of around 55000 on a 2.4 GHz Opteron (it was 41500 with plain 4.2). The overall gmpbench score is about 10000 (it was 8200 before the patch).
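
To put these scores in relative terms, the small sketch below computes the ratio of patched to unpatched score. The helper name speedup is made up for this illustration. Since the figures above are approximate, the ratios are approximate too: roughly 1.33x for the multiply bench and roughly 1.22x for the overall gmpbench score.

```c
#include <assert.h>

/*
 * Ratio of a benchmark score after the patch to the score before it.
 * gmpbench scores are "higher is better", so a value above 1.0 is a gain.
 */
double speedup(double after, double before)
{
    assert(before > 0.0);
    return after / before;
}
```

For example, speedup(55000.0, 41500.0) gives the multiply ratio and speedup(10000.0, 8200.0) gives the overall one.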

Download / install