We describe a new, highly optimized implementation of number theoretic
transforms on processors with SIMD support (AVX, AVX-512, and Neon). For
any prime modulus $p$ and any order $n$ of the form […], our
implementation can automatically generate a dedicated codelet to compute
the number theoretic transform of order $n$ over $\mathbb{F}_p$. New
speed-ups were achieved by relying heavily on non-normalized modular
arithmetic and allowing for orders $n$ that are not necessarily powers
of two.
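
To make the objects in this abstract concrete, here is a minimal Python sketch of a number theoretic transform of order $n$ over $\mathbb{F}_p$ for an order that is not a power of two. It is a naive quadratic-time reference, not the SIMD codelets described above. The function names (`prime_factors`, `find_root_of_unity`, `ntt`, `intt`) and the sample values $p = 97$, $n = 12$ are assumptions chosen for illustration. Each output is reduced modulo $p$ only once, after the whole sum has been accumulated; this is only a loose analogue of non-normalized modular arithmetic, made easy here by Python's arbitrary-precision integers.

```python
def prime_factors(n):
    """Return the set of prime divisors of n (trial division)."""
    out, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            out.add(d)
            n //= d
        d += 1
    if n > 1:
        out.add(n)
    return out


def find_root_of_unity(n, p):
    """Return an element of multiplicative order exactly n in F_p.

    Assumes p is an odd prime and n divides p - 1.
    """
    if (p - 1) % n != 0:
        raise ValueError("n must divide p - 1")
    for g in range(2, p):
        w = pow(g, (p - 1) // n, p)  # the order of w divides n
        # The order is exactly n iff w^(n/q) != 1 for every prime q | n.
        if all(pow(w, n // q, p) != 1 for q in prime_factors(n)):
            return w
    raise ValueError("no root of unity of order n found")


def ntt(a, w, p):
    """Naive O(n^2) transform: A[i] = sum_j a[j] * w^(i*j) mod p."""
    n = len(a)
    # The sum is accumulated without intermediate reductions and
    # reduced modulo p once per output coefficient.
    return [sum(a[j] * pow(w, i * j, p) for j in range(n)) % p
            for i in range(n)]


def intt(A, w, p):
    """Inverse transform: apply ntt with w^(-1), then divide by n."""
    n = len(A)
    n_inv = pow(n, -1, p)  # modular inverse, Python 3.8+
    w_inv = pow(w, -1, p)
    return [x * n_inv % p for x in ntt(A, w_inv, p)]


# Hypothetical sample parameters: 12 = 2^2 * 3 divides 97 - 1 = 96.
p, n = 97, 12
w = find_root_of_unity(n, p)
a = list(range(n))
assert intt(ntt(a, w, p), w, p) == a  # round trip recovers the input
```

A production implementation would replace the double loop by a fast transform that factors $n$ into small radices, which is where dedicated codelets come in.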
Authors:
Keywords: number theoretic transform, finite field, codelet, algorithm, integer multiplication, FFT