Assembly support for GMP on AMD64


Purpose

This is a patch for gmp-4.1.4 that adds a few assembly loops for the AMD64 architecture. According to the GMP webpage, assembly support for AMD64 is planned only for version 5.0. For those who (like me) can hardly wait, this patch gives decent timings in the meantime. However, it is still far from the estimated optimal performance given on the GMPBench web page (see the Performance section below).
Only a few functions have been written:
  • add_n
  • sub_n
  • addmul_1
  • submul_1
  • mul_basecase
  • sqr_basecase
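
As a rough illustration of what two of these routines compute, here is a portable C sketch of the semantics of add_n and addmul_1 on 64-bit limbs. It is not the code from the patch (which is assembly) and is not meant to be fast. The names limb_t, ref_add_n and ref_addmul_1 are made up for this example, and the sketch assumes a compiler that provides unsigned __int128 (GCC and Clang do on AMD64).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for this sketch; GMP's own limb type is mp_limb_t. */
typedef uint64_t limb_t;

/* rp[0..n-1] = ap[0..n-1] + bp[0..n-1]; returns the carry out (0 or 1). */
static limb_t ref_add_n(limb_t *rp, const limb_t *ap, const limb_t *bp,
                        size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t s = ap[i] + bp[i];   /* may wrap around */
        limb_t c1 = s < ap[i];      /* 1 if the first addition wrapped */
        limb_t r = s + carry;       /* add the incoming carry */
        limb_t c2 = r < s;          /* 1 if the second addition wrapped */
        rp[i] = r;
        carry = c1 | c2;            /* at most one of the two can be set */
    }
    return carry;
}

/* rp[0..n-1] += ap[0..n-1] * b; returns the high limb carried out. */
static limb_t ref_addmul_1(limb_t *rp, const limb_t *ap, size_t n, limb_t b)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* 64x64 -> 128 bit product plus two 64-bit terms never overflows
           128 bits. unsigned __int128 is a compiler extension. */
        unsigned __int128 t = (unsigned __int128)ap[i] * b + rp[i] + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 64);
    }
    return carry;
}
```

sub_n and submul_1 follow the same pattern with a borrow instead of a carry. The assembly versions get their speed from the hardware carry flag and the 64x64 to 128 bit multiply instruction, which this C sketch can only imitate.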

The assembly code is mostly a 64-bit translation of the k7 assembly code that is already available in GMP. The main modifications are:
  • The ABI for function calls is not the same: up to 6 parameters are passed in registers, not on the stack.
  • Change movl to movq, eax to rax, etc. That's the easy part.
  • In an unrolled loop, the size of the unrolled code is not the same, so the computation of the jump is different.
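
The last point can be pictured with a C analogue. In an unrolled loop, the first pass usually has to enter the loop body somewhere in the middle so that the leftover iterations are handled. In assembly, that entry point is presumably an address computed from the byte size of one unrolled step, which is why it changes when the instructions become 64-bit. The sketch below shows the same idea in C with a switch that jumps into a 4-way unrolled loop (a "Duff's device"). The function sum_unrolled4 is invented for this example and does not appear in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum n limbs using a 4-way unrolled loop with a computed entry point.
   Hypothetical helper for illustration only. */
static uint64_t sum_unrolled4(const uint64_t *p, size_t n)
{
    uint64_t s = 0;
    if (n == 0)
        return 0;

    size_t passes = (n + 3) / 4;    /* number of trips through the loop */

    /* n % 4 selects where the first pass enters the unrolled body.
       In assembly this would be a jump to
       loop_start + skipped_steps * bytes_per_step. */
    switch (n % 4) {
    case 0: do { s += *p++;
    case 3:      s += *p++;
    case 2:      s += *p++;
    case 1:      s += *p++;
            } while (--passes > 0);
    }
    return s;
}
```

In C the compiler works out the jump targets. In hand-written assembly the per-step code size is a constant that has to be updated by hand whenever the instruction encodings change.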

Disclaimer

The code has been reasonably well tested. I used the program tests/devel/try, which checks for quite a few kinds of bugs. Nonetheless, since there are fewer users and only one developer (!), this code is less likely to be correct than any official GMP code.

Bugs

Please send comments and bug reports to gaudry@lix.polytechnique.fr and *not* to the official GMP developers: they have nothing to do with this code.

Performance

I get a multiply bench of around 48000 on a 2.4 GHz Opteron (27000 without the asm functions). The whole gmpbench score is about 8850 (5700 without asm; the estimated optimum is 11000).
Options: CFLAGS = "-O2 -fomit-frame-pointer -funroll-loops -mcpu=k8"

Download / install

  • Get the gmp-4.1.4 archive and unpack it, thus creating a directory /path_to_gmp/gmp-4.1.4/
  • Download the mpn_amd64 archive and unpack it. From inside the mpn_amd64 directory, run ./install /path_to_gmp/gmp-4.1.4
  • cd /path_to_gmp/gmp-4.1.4
  • ./configure with your favorite options
  • make ; make check ; make install