Breaking RSA with too few random padding bytes
Table of Contents
Introduction
I was presented with a challenge by a friend who dared me to solve it, specifically targeting the exploitation of RSA with insufficient random padding bytes.
The challenge had the following description:
To maximize the amount of space for a text message, Alice had the very dubious idea of requiring that all text messages sent to her be padded using only one random byte. The format of the plaintext messages sent to her is
| 0 | random_byte | padding_zeros | text_message | --- the bytes n-1 n-2 n-3 k k-1 0 --- the byte numbers MSB LSB
where
• n is the number of bytes of her public modulus, i.e., its number of bits divided by 8 (rounded upwards),
• random_byte is a single byte with a random value (0 to 255),
• padding_zeros are one or more bytes with value zero, and
• text_message is a UTF-8 encoded text message (the first character of the text message is stored in the least significant byte of the message, see example below). The text message, without the NUL C-style string terminator, is stored in k bytes
Alice’s 2048-bit public modulus is
$$ \begin{aligned} &n=32256584853881668819643226749483462544517280786196195255582\\ &67807412552284139095654078296201440843156590987754424571814090\\ &09228314141139330943826858041810287156287891738485415520908563\\ &81959547800109346250918023077411316486779059209905502142149463\\ &49691567542517177994080214123680253910044644586494846900992974\\ &54249961612370685073608213864469821870612710050353962139785366\\ &61615000916707597139419106311209764388516023878827425893886409\\ &70165327741764697900069661961528413593944646774167931434958123\\ &40592927067214756828744957792258547598849849659332782499778983\\ &57852295765822796204816960068087631764580573733332129094366877 \end{aligned} $$
and her public exponent is
$$ \begin{aligned} &e=65537=2^{16}+1. \end{aligned} $$
We were given two tasks in this challenge, firstly to complete a small training example to test our code and secondly to complete our final mission.
Walkthrough
Small training example
The small training example is based on the mission challenge but with a smaller scope. For this example, we are given the following description:
Before tackling the true mission, develop code that solves the following smaller example. Although we provide here all information for this smaller example (including Alice’s private data and the actual padding bytes and original plaintext message), recover the plaintext using only n, e, c1, and c2.
Private data:
p = 91385177868893188439606205446657433154550775543347736767839078918307136071207 q = 126919665154092992838405484421396465545153294878449134175022513263373469020623
Public data:
n = 11598576175167152956424461782394541439955617267494049729274868685493179955542949179810730474765255703849636600732122155392474369596059442252066674279501961 e = 17
The plaintext message is ”Test”, which becomes (in ASCII and UTF-8, ’T’ is 84, ’e’ is 101,’s’ is 115 and ’t’ is 116)
$$ \begin{aligned} &m=84+256×101+256^{2}×115+256^{3}×116=1953719636 \end{aligned} $$
The padding random bytes of the two messages are 133 and 147, and so the properly padded messages to be encrypted are, respectively,
$$ \begin{gather*} &m_1=m+256^{62}×133 \\ &300742762100458034307461803396035672109\\ &903987980206577533777394443832292544679474729900439732442444438\\ &23771410421958152397183544515322048677457800947028 \end{gather*} $$
The corresponding encrypted messages are
$$ \begin{gather*} &c_1=45689543713982413127880466796619733702071362918064735282188 \\ &139557967018242699881750940143160979352750713318253423828978300\\ &26928706806283261576848215938045 \end{gather*} $$ and $$ \begin{gather*} &c_2=10774777440091235821269290404723502603355688369384505562693\\ &265479722439768712101419248994746280675715013811585517386981365\\ &362526469147816190061167310265864 \end{gather*} $$
We were also given the following hint: Coppersmith, Franklin, Patarin, and Reiter
.
By searching the hint we can see that it’s a reference to the paper Low-Exponent RSA with Related Messages, in short, the paper explains how to retrieve the plaintext if we have two related cyphertexts and the messages are related to each other.
On the challenge, the ciphertexts given are the same message but with different padding, in this case, a different bit. Knowing this we know that the messages are related to each other. In the training example, we saw that the difference between the paddings is 14. In this case, we will use this value as the variable t.
The first step to using the approach given in the paper is to represent the messages as polynomials and get the relation between them. By checking the way the messages are generated we can achieve the following results:
$$
\begin{gather*}
&m_1=m+256^{62}×133\\
&m_2=m+256^{62}×147\\
&m_1=m_2+256^{62}×14
\end{gather*}
$$
Now we suppose that the messages are encrypted with an exponent of 17.
We have the following expression:
$$
\begin{gather*}
&c_i=m_i\times133\bmod N,\space i=1,2
\end{gather*}
$$
Then we can calculate c1, c2 related to m by using the following formula:
$$
\begin{gather*}
&c_1=m^{17}\bmod N\\
&c_2=(m+256^{62}\times t)^{17}\bmod N
\end{gather*}
$$
We can then relate them to each other:
$$
\begin{gather*}
&c_2-c_1=(m+256^{62}\times t)^{17}-m^{17}\bmod N
\end{gather*}
$$
Due to lack efficiency in this aproach, our approach won’t be as direct as the first part of the paper. Instead, we will take advantage of the following shortcut given by the paper:
Fortunately there is an easier method. Let z denote the unknown message m. Then z satisfies the following two polynomial relations: $$ \begin{gather*} &z^{2}-c_1= 0\bmod N\\ &(z+1)^{5}-c_2= 0\bmod N \end{gather*} $$ where the ci are treated as known constants. Apply the Euclidean algorithm to find the greatest common divisor of these two univariate polynomials over the ring Z/N: $$ \begin{gather*} &gcd(z^{5}-c_1, (z+1)^{5}-c_2)\in\Zeta/\Nu[z]. \end{gather*} $$ This should yield the linear polynomial z - m (except possibly in rare cases).
Knowing this, we can let z denote the unknown message m. Therefore z satisfies the following two polynomial relations: $$ \begin{gather*} &z^{17}-c_1= 0\bmod N\\ &(z+256^{62}\times t)^{17}-c_2= 0\bmod N \end{gather*} $$ We can finally translate this into code. To help us with the task of doing the complicated math in python, we will be using the library sagemath and the project crypto-attacks. By translating the math to code we get the following code:
from sage.all import Zmod
from helper import fast_polynomial_gcd
N = 11598576175167152956424461782394541439955617267494049729274868685493179955542949179810730474765255703849636600732122155392474369596059442252066674279501961
c1 = 4568954371398241312788046679661973370207136291806473528218813955796701824269988175094014316097935275071331825342382897830026928706806283261576848215938045
c2 = 10774777440091235821269290404723502603355688369384505562693265479722439768712101419248994746280675715013811585517386981365362526469147816190061167310265864
t = 14
e = 17
# Create the polynomial ring over Z field
x = Zmod(N)["x"].gen()
g1 = x ** e - c1
g2 = (x + 256**62 * t) ** e - c2
g = -fast_polynomial_gcd(g1, g2).monic()
# Get the value of the plaintext
m = int(g[0])
# Remove the padding
m = m.to_bytes((m.bit_length() + 7) // 8, 'little')
# Remove random bit at the end of the message
m = m[:-1]
# Convert to ascii
m = m.decode('ascii')
print("Flag:", m)
And with this, we were able to complete the training example.
Mission
Our final mission was described as follows:
Bob sent Alice the following ciphertext (using Alice’s public modulus and exponent)
$$ \begin{gather*} &c_1=1399970625549014455421308400736210030945662558362953572598\\ &382444074245011451002192362849938580065576870469751387389273086\\ &206217906450301744222857785063104818003557444229393404173050387\\ &741532623600586655046171463075044697765722226994570144187331662\\ &652782035424087087418856432695472090608317855177421617414712003\\ &761481102837113900141860890235282302483616444962916326328759432\\ &975306029490358302629740992800354180841301004276530602003470566\\ &643500357632172084094528868862301495303453451552190565880607255\\ &925026878441544105927652895555098514816004352740078530089879045\\ &3936675517505792768625755579501439957445247311207034994.\\ \end{gather*} $$ That was a message Mallory was interested in, so he intercepted it and forced Bob to resend the message again (but with a different padding byte). The ciphertext of the second message was
$$ \begin{gather*} &c_2=7330646540408462499487732988734435881882600143917250125234\\ &725645295265436844855956345867015460257851867152226082992604583\\ &380722353979123823769153663750127378381181947124303124717022628\\ &317971445602980523815915703319718627892152151763224999403696019\\ &840134053621522434775044673990084838715747559195164385146719276\\ &035186448119274230106962051610865462775732255984526858086851588\\ &348829328052439726843671068196575220476292945021334692065439186\\ &080314715245574910218285041885631078655049765029101223651172336\\ &925026878441544105927652895555098514816004352740078530089879045\\ &086726816604586051074113029899282669743291791900574376.\\ \end{gather*} $$ Your mission, should you choose to accept it, is to decipher the message.
We can see that the approach to carrying out the mission is similar. The only differences are that we don’t know the difference between the random bits used as padding (t
) and in this case, the public modulus and exponent to be used are Alice’s.
To get to know the variable t we must brute force it. This will lead to the code taking a lot of time to get the ciphertext. However, it’s the only approach.
The final code to solve the mission, taking into account the need for brute force and the change to the public modulus and exponent, is the following:
from sage.all import Zmod
from helper import fast_polynomial_gcd
N = 32256584853881668819643226749483462544517280786196195255582678074125522841390956540782962014408431565909877544245718140900922831414113933094382685804181028715628789173848541552090856381959547800109346250918023077411316486779059209905502142149463496915675425171779940802141236802539100446445864948469009929745424996161237068507360821386446982187061271005035396213978536661615000916707597139419106311209764388516023878827425893886409701653277417646979000696619615284135939446467741679314349581234059292706721475682874495779225854759884984965933278249977898357852295765822796204816960068087631764580573733332129094366877
e = 65537
c1 = 13999706255490144554213084007362100309456625583629535725983824440742450114510021923628499385800655768704697513873892730862062179064503017442228577850631048180035574442293934041730503877415326236005866550461714630750446977657222269945701441873316626527820354240870874188564326954720906083178551774216174147120037614811028371139001418608902352823024836164449629163263287594329753060294903583026297409928003541808413010042765306020034705666435003576321720840945288688623014953034534515521905658806072559250268784415441059276528955550985148160043527400785300898790453936675517505792768625755579501439957445247311207034994
c2 = 7330646540408462499487732988734435881882600143917250125234725645295265436844855956345867015460257851867152226082992604583380722353979123823769153663750127378381181947124303124717022628317971445602980523815915703319718627892152151763224999403696019840134053621522434775044673990084838715747559195164385146719276035186448119274230106962051610865462775732255984526858086851588534412481383870183265812676914984402719899236047964431291778087348829328052439726843671068196575220476292945021334692065439186080314715245574910218285041885631078655049765029101223651172336086726816604586051074113029899282669743291791900574376
# t = ?
# t is the difference between the value of the random bit used on m1 and m2, this bit value is from 0 to 255
# The value of t is unknown, so we we need to bruteforce it to find the value of the plaintext
# We try every possible value of t from -255 to 255
for t in range(-255, 256):
print("Trying t = {}".format(t), end="\r")
# We calculate the value of the plaintext by using the formula from the paper
# We use the function fast_polynomial_gcd to calculate the gcd of the two polynomials
# The function fast_polynomial_gcd is based on the code from the github repository "https://github.com/jvdsn/crypto-attacks"
# Create the polynomial ring Z[x]
x = Zmod(N)["x"].gen()
# Create the two polynomials
p1 = x**e - c1
p2 = (x + 256**254 * t)**e - c2
# Calculate the gcd of the two polynomials
gcd = -fast_polynomial_gcd(p1, p2).monic()
# Try to find the value of the plaintext
try:
m = int(gcd[0])
# Remove the padding
m = m.to_bytes((m.bit_length() + 7) // 8, 'little')
# Remove random bit at the end of the message
m = m[:-1]
# Convert to ascii
m = m.decode('ascii')
print("Flag:", m)
break
except:
pass
After running it for a couple of hours we should be greeted with the following plaintext:
Flag: You shall not use too few random padding bytes.
This was only achieved due to the generation of the ciphertext using only one random byte as padding and due to the replay of the message.