Parallel modular multiplication using 512-bit advanced vector instructions: RSA fault-injection countermeasure via interleaved parallel multiplication

Benjamin Buhrow, Barry Gilbert, Clifton Haider

Research output: Contribution to journal › Article › peer-review

Abstract

Public-key cryptography and similar applications depend critically on the speed of modular multiplication. This paper introduces a new block-based variant of Montgomery multiplication, the Block Product Scanning (BPS) method, which is particularly efficient with the 512-bit advanced vector instructions (AVX-512) available on modern Intel processor families. Our parallel-multiplication approach also accommodates squaring and sub-quadratic Karatsuba enhancements. We demonstrate a 1.9× improvement in decryption throughput over OpenSSL and a 1.5× improvement in modular exponentiation throughput over GMP-6.1.2 on an Intel Xeon CPU. In addition, we show a 1.4× improvement in decryption throughput over state-of-the-art vector implementations on many-core Knights Landing Xeon Phi hardware. Finally, we show how interleaving Chinese remainder theorem-based RSA calculations within our parallel BPS technique halves decryption latency while providing protection against fault-injection attacks.
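
For readers unfamiliar with the building blocks named in the abstract, the following is a minimal textbook sketch of Montgomery multiplication and CRT-based RSA decryption with a generic result check. It is not the BPS method, uses no AVX-512 vectorization, and does not reproduce the paper's interleaved countermeasure. All function names (`montgomery_setup`, `redc`, `mont_mul`, `mont_pow`, `crt_rsa_decrypt`) are hypothetical, and the re-encryption check is a commonly used stand-in chosen for illustration only. It assumes Python 3.8+ (for `pow(x, -1, m)`) and odd moduli.

```python
def montgomery_setup(n, k):
    """Precompute constants for an odd modulus n with R = 2**k > n."""
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r   # satisfies n * n_prime ≡ -1 (mod R)
    r2 = (r * r) % n                 # R^2 mod n, used to enter Montgomery form
    return n_prime, r2


def redc(t, n, k, n_prime):
    """Montgomery reduction: return t * R^-1 mod n, for 0 <= t < n * R."""
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask  # m = (t mod R) * n' mod R
    u = (t + m * n) >> k               # exact division by R
    return u - n if u >= n else u      # single conditional subtraction


def mont_mul(a_bar, b_bar, n, k, n_prime):
    """Multiply two values that are already in Montgomery form."""
    return redc(a_bar * b_bar, n, k, n_prime)


def mont_pow(base, exp, n):
    """Left-to-right square-and-multiply using Montgomery multiplication."""
    k = n.bit_length()                              # guarantees 2**k > n
    n_prime, r2 = montgomery_setup(n, k)
    x_bar = redc((base % n) * r2, n, k, n_prime)    # base * R mod n
    acc = redc(r2, n, k, n_prime)                   # 1 * R mod n
    for bit in bin(exp)[2:]:
        acc = mont_mul(acc, acc, n, k, n_prime)
        if bit == "1":
            acc = mont_mul(acc, x_bar, n, k, n_prime)
    return redc(acc, n, k, n_prime)                 # leave Montgomery form


def crt_rsa_decrypt(c, p, q, d, e):
    """CRT-RSA decryption with a generic re-encryption fault check.

    Assumption: the check below is a simple illustrative countermeasure,
    not the interleaved scheme described in the abstract.
    """
    dp, dq = d % (p - 1), d % (q - 1)
    q_inv = pow(q, -1, p)
    mp = mont_pow(c, dp, p)            # half-size exponentiation mod p
    mq = mont_pow(c, dq, q)            # half-size exponentiation mod q
    h = (q_inv * (mp - mq)) % p        # Garner recombination
    m = mq + h * q
    if mont_pow(m, e, p * q) != c % (p * q):
        raise ValueError("fault detected: refusing to release result")
    return m
```

The two half-size exponentiations modulo `p` and `q` are independent of each other, which is the property that makes CRT-RSA a natural fit for the parallel multiplication the abstract describes. A fault in either half would corrupt the recombined result, which is why some consistency check is normally performed before the plaintext is released.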

Original language: English (US)
Pages (from-to): 95-105
Number of pages: 11
Journal: Journal of Cryptographic Engineering
Volume: 12
Issue number: 1
State: Published - Apr 2022

Keywords

  • AVX-512
  • CRT-RSA
  • Fault-injection countermeasure
  • Montgomery multiplication

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
