We describe a new, highly optimized implementation of number theoretic transforms on processors with SIMD support (AVX, AVX-512, and Neon). For any prime modulus $p$ and any order $n$ of the form …, our implementation can automatically generate a dedicated codelet to compute the number theoretic transform of order $n$ over $\mathbb{F}_p$. New speed-ups were achieved by relying heavily on non-normalized modular arithmetic and allowing for orders $n$ that are not necessarily powers of two.
Authors:
Keywords: number theoretic transform, finite field, codelet, algorithm, integer multiplication, FFT
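To make the object of the abstract concrete, here is a minimal sketch of a number theoretic transform whose order is not a power of two. It is a naive quadratic-time reference, not the optimized SIMD codelets described above. The modulus 7, the order 6, the root of unity 3, and the function names `ntt` and `intt` are illustrative assumptions that do not come from the source. The inner sum is reduced only once per output coefficient, which loosely echoes the idea of postponing reductions; the non-normalized arithmetic of the actual implementation may well differ.

```python
def ntt(a, omega, p):
    """Naive O(n^2) transform of the list a over the integers modulo p.

    omega is assumed to be a primitive n-th root of unity modulo p,
    where n = len(a).
    """
    n = len(a)
    out = []
    for i in range(n):
        acc = 0
        for j in range(n):
            # omega^n = 1, so the exponent can be reduced modulo n.
            # The products are accumulated without intermediate reduction.
            acc += a[j] * pow(omega, (i * j) % n, p)
        out.append(acc % p)  # single reduction per output coefficient
    return out


def intt(b, omega, p):
    """Inverse transform: apply ntt with omega^(-1), then divide by n."""
    n = len(b)
    omega_inv = pow(omega, -1, p)  # modular inverse, Python 3.8+
    n_inv = pow(n, -1, p)
    return [(x * n_inv) % p for x in ntt(b, omega_inv, p)]


# Illustrative parameters (assumptions): 3 has multiplicative order 6 modulo 7.
P, N, OMEGA = 7, 6, 3

a = [1, 2, 3, 4, 5, 6]
b = ntt(a, OMEGA, P)
assert intt(b, OMEGA, P) == a  # the round trip recovers the input
```

The order 6 = 2 · 3 was chosen only because it is the smallest convenient example with a factor other than two; a fast implementation would decompose such an order into smaller transforms rather than evaluate the full double loop.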