Assembly support for GMP on AMD64


This is a patch for gmp-4.2.x on the AMD64 architecture. GMP 4.2.x ships with only basic assembly support for this platform; this patch gives a substantial speed-up.
Only a few functions have been written:
The assembly code is mostly a 64-bit translation of the k7 assembly code already available in GMP. The main modifications are:


There is almost no change compared to the patch for 4.1.4. The multiplication has been slightly improved (to around 3.15 cycles/limb), but most of the improvement in the gmpbench score comes from changes in the C code of GMP between the two versions.
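To make the cycles/limb figure concrete: a limb is one machine word of a multi-precision number (64 bits on AMD64), and the cost is counted per limb handled by the inner loop. The sketch below is a plain C reference for the kind of loop that such assembly replaces, multiplying an n-limb number by a single limb. It is not the code from this patch; the names `limb_t` and `ref_mul_1` are hypothetical, and the GCC/Clang `unsigned __int128` extension is assumed.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for a 64-bit limb type on AMD64. */
typedef uint64_t limb_t;

/* Multiply the n-limb number {up, n} (least significant limb first) by the
   single limb v. The low n limbs of the product go to {rp, n}; the final
   carry-out limb is returned. Assumes the unsigned __int128 extension. */
limb_t ref_mul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* 64x64 -> 128-bit product plus the incoming carry.
           The sum always fits in 128 bits. */
        unsigned __int128 p = (unsigned __int128)up[i] * v + carry;
        rp[i] = (limb_t)p;          /* low 64 bits are the result limb */
        carry = (limb_t)(p >> 64);  /* high 64 bits carry into the next limb */
    }
    return carry;
}
```

Each loop iteration here processes one limb, so a figure such as 3.15 cycles/limb is the average number of clock cycles an optimized loop of this kind spends per iteration.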


The code has been reasonably well tested. I used the program tests/devel/try, which checks for quite a few kinds of bugs. Nonetheless, since it has fewer users, this code is less likely to be correct than any official GMP code.
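One common way to check a hand-optimized routine is to compare it against a simple reference implementation on many inputs. The sketch below illustrates that idea only; it does not describe what tests/devel/try does internally. All names (`ref_add_n`, `fast_add_n`, `count_mismatches`) are hypothetical, `fast_add_n` is a C stand-in for an optimized routine, and the GCC/Clang `unsigned __int128` extension is assumed.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef uint64_t limb_t;   /* hypothetical 64-bit limb type */
enum { MAX_N = 16 };       /* largest operand size tried, in limbs */

/* Reference: n-limb addition using 128-bit arithmetic; returns carry-out. */
limb_t ref_add_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned __int128 s = (unsigned __int128)ap[i] + bp[i] + carry;
        rp[i] = (limb_t)s;
        carry = (limb_t)(s >> 64);
    }
    return carry;
}

/* Candidate under test: same operation, carry detected by comparison. */
limb_t fast_add_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t s = ap[i] + bp[i];
        limb_t c1 = s < ap[i];      /* wrapped on a + b */
        limb_t t = s + carry;
        limb_t c2 = t < s;          /* wrapped on adding the carry */
        rp[i] = t;
        carry = c1 | c2;            /* at most one of the two can be set */
    }
    return carry;
}

/* rand() only guarantees 15 random bits, so combine several calls. */
limb_t rand_limb(void)
{
    limb_t x = 0;
    for (int i = 0; i < 5; i++)
        x = (x << 15) ^ (limb_t)rand();
    return x;
}

/* Run `rounds` random trials; return how many disagreed with the reference. */
int count_mismatches(unsigned seed, int rounds)
{
    limb_t a[MAX_N], b[MAX_N], r1[MAX_N], r2[MAX_N];
    int bad = 0;
    srand(seed);
    for (int t = 0; t < rounds; t++) {
        size_t n = 1 + (size_t)rand() % MAX_N;
        for (size_t i = 0; i < n; i++) {
            /* Mix in all-ones limbs to provoke long carry chains. */
            a[i] = (rand() % 4 == 0) ? UINT64_MAX : rand_limb();
            b[i] = (rand() % 4 == 0) ? UINT64_MAX : rand_limb();
        }
        limb_t c1 = ref_add_n(r1, a, b, n);
        limb_t c2 = fast_add_n(r2, a, b, n);
        if (c1 != c2 || memcmp(r1, r2, n * sizeof(limb_t)) != 0)
            bad++;
    }
    return bad;
}
```

A call such as `count_mismatches(1u, 1000)` returning 0 means the candidate agreed with the reference on every trial. Random agreement of this kind raises confidence but does not prove correctness, which is why edge cases such as all-ones limbs are mixed in deliberately.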


Please send comments and bug reports to and *not* to the official GMP developers: they have nothing to do with this code.


I get a gmpbench multiply score of around 55000 on a 2.4 GHz Opteron (41500 with plain 4.2). The overall gmpbench score is about 10000 (8200 before the patch).

Download / install