Vulnerabilities of RSA
Not only that prime numbers for RSA should be chosen sufficiently large but also “sufficiently smart”. The following post demonstrates some of the attacks on the RSA, whose security relies on the problem of factorizing a large composite number into two prime factors.
I will assume you know, how the RSA works and how variables are derived from one another. If not, first have a look a the key generation process.
n must NOT be prime
If n is prime, the Euler totient function(tells us how many non-negative numbers smaller than n-1 is coprime with n) is equal to n-1, therefore we can solve the following system of equations for p and q:
Choose safe primes
p is a safe prime if the following holds true:
where p1 is a sufficiently large prime in this form also referred to as Sophie Germain prime. SG primes are safe against the p-1 Pollard factorization method.
The difference |p-q| should not be too small
Factorization goes like this:
- Pick the smallest possible natural number d, such that d² > n holds
- Check whether h is a perfect square: d²-n = h² (In other words, you calculate d²-n and try to take a square root of the result — if it is a natural number you found a perfect square)
- If h is a perfect square, we can factorize n = (d +h)*(d - h)=p*q (just turned the equation around)
- If h is not a perfect square we continue by checking: (d+1)²-n = h²
- Continue until you get a perfect square. Eventually, you should get a square, but the question is rather whether this happens in a reasonable time.
This is called a Fermat factorization algorithm and for minor differences between prime factors, you can effectively factorize n. More on that you can read on Wiki.
Let’s see an example on a huge number n (I’m using gmpy2 library that optimizes arithmetic operations on large numbers in python) how we can decrypt a cipher-text:
import gmpy2# Extended Euclidian algorithm
def egcd(a, b):
x,y,u,v = 0,1,1,0
while a != 0:
q,r = b//a,b%a
m,n = x-u*q, y-v*q
b,a,x,y,u,v = a,r,u,v,m,n
gcd = b return x# Let this be a cipher text encrypted with public key (n, e)
# Generally only the persos with the private key d should be able to # decrypt the text
# ct = m^e mod n
ct = 57325663599943282953411176778760672692001588182270568285477299701301601855034844543857749488699188792444564252696999671126121225262339599684941190805281481530036909901633557747562177212143921951993688236795819575234528918435597825076881022591304477615266001534550688094803477541040688025582469652272459312306980037204095860344390987430032504297556253624723178455630374903473945937195328657623095335348825262669742156357630798697690477854408730009055709420320421471878170419877
8076757624273410630218770228694776965608326700246677477443108683578524190717830814601948297008202853563977842409909023771577825773256513437271035474705285420915255439169706383775886106307200607630073448902808142803254757759974342996672000749920462973505249898850941070843559266264297882666503549349135549583204269040628470247815066953770676158517215889006175528203645217775080795057961639783677482580753503078199490697308594767235780716748270076726# Public key (n, e)
n = 1869578692404843692322851853790504675785490458941418816789937357228584999694229481781082576516004784967793396134042249283731541326522693989189672781989605367915467207173921917764601173241003275079144891798109323179882563958434067801607159395991293988128979523500485675739199386415393849926760028894957048405194696264543096632626266113459760373772648042130342168769029700837387351405620281318401168210834520710877989913969345751805302638927721067161987258238119709324242788800684044845924947196434207131958051848544612331436580528999574831395608419325858276755014492392928240007721159544167457326123991416480753423984310450347787167210183775589168689021239431549646289225543921262925762854468723108863565081485561010957193372906716767453636146225697688527085520729322590370367846553675015245648558243086523704781262012616424483189936559549788076211532572418932359209054902532723352725511759965792485553995457211293158370352311
e = 65537# Assume the difference between prime factors is small
# and that we have found a perfect square in the form (d+1)²-n = h²
nn = gmpy2.mpz(n)
d = gmpy2.add(gmpy2.isqrt(nn), 1) # d^2 ~ n
hs = gmpy2.sub(gmpy2.mul(d, d), nn) # hs is perfect square
h = gmpy2.isqrt(hs)
p = gmpy2.add(d, h) # p = d + h
q = gmpy2.sub(d, h) # p = d — h
assert nn == gmpy2.mul(p, q) # sanity checkprint(“p = “, p)
print(“q = “, q)
print(“Difference =”, p — q)# Now that we manage to factorize the n, let's decrpyt the text
# Calculate Euler-totient function
fi = gmpy2.add(gmpy2.sub(gmpy2.sub(nn, p), q), 1)# Calculate exponent d using EEA (Note: e*d mod φ(n) = 1)
d = egcd(e, fi)# Decrypt ciphertext
pt = pow(ct, d, nn)
print( "pt: " + int(pt).to_bytes(384, byteorder='big').decode("utf-8"))
Size of e and d exponents
Besides the fact that the following relation should hold for the exponents:
there are still some things you need to keep in mind when determining the values:
- Parameter d should be chosen sufficiently large(80 bit ≤), since if d is a small number an attacker could potentially use brute force to decrypt ciphertext by trying all possible combinations of d:
- Parameter e usually has a value of either 3 or 65537. This way there are only two 1 in a binary representation of both numbers. Consequently, the Square & Multiply algorithm can perform exponentiation much more efficiently (only two multiplications).
Follow along with my stories, next up I will post about different factorization methods and intuitively explain them to gain better insight into how we can factorize large numbers such as RSA number n.
Happy Cryptographing!