Long Long Time (VKA CTF 2024; crypto)

Автор: Inssurg3nt
Один из жителей, наплевав на все ограничения, запустил этот скрипт в далеком 1987 году, когда переехал в город-Z, и вот наконец-то он отработал и выдал нам результат! Правда, он забыл добавить вывод ключа…

Author: Inssurg3nt
One of the residents, ignoring all the restrictions, ran this script back in 1987, when he moved to city Z, and finally it completed execution and displayed the result! However, he forgot to display the key…

Task files: give
Solve script: solve.py

Truly a long time

The task contains the script that supposedly completed execution since 1987. It contains a VERY inefficient implementation of a linear recurrence in $Z_{p}$ (aka integers modulo a prime) that can barely compute itself at $n = 32$ , let alone the values used, which are around $2^{128}$ (yes, 37 years is not even close to being enough).
The script is appropriately named DH.py, as it does do Diffie-Hellman key exchange, except it uses a linear recurrence: both parties calculate the terms corresponding to their private keys, exchange those values and advance the recurrence starting at what they got from the other party. It’s plain to see that this algorithm does generate a valid shared key.

Cool, and…?

Basically, we are Bob and we need to recover Alice’s secret. This amounts to inverting the recurrence (calculating $n$ ), analogous to the discrete logarithm problem for classic DH. As we will see later down the line, this problem actually is equivalent to computing a certain discrete logarithm.

Fibonacci, but not quite

Linear recurrence reminded me of lateralus. This is an interesting task, which involves inverting a closed-form solution for Fibonacci sequence in $Z_{p^{2}}$ to obtain a discrete logarithm problem, which can be solved using a clever trick. It’s actually very similar to what we have here, as the Fibonacci sequence is a prime example of a linear recurrence with constant coefficients (but the coefficients in it are ones). There is a helpful wiki page about those: click. This page contains everything needed to make the script fast.

Characteristic polynomial

Imagine a recurrence:

f (n) = a_{1} f (n - 1) + a_{2} f (n - 2) + \dots + a_{k} f (n - k)

As per that wiki page, we are interested in this polynomial, called characteristic (letting recurrence coefficients equal $a_{i}$ ):

p (x) = x^{k} - a_{1} x^{k - 1} - a_{2} x^{k - 2} - \dots - a_{k}

It’s roots $x_{i}$ partially define the closed form of the recurrence:

f (n) = c_{1} x_{1}^{n} + c_{2} x_{2}^{n} + \dots + c_{k} x_{k}^{n}

$c_{i}$ are dependent on the starting values of the recurrence.

For example, Fibonacci:

Recurrence: $f (n) = f (n - 1) + f (n - 2)$
Characteristic polynomial: $p (x) = x^{2} - x - 1; p (ϕ) = 0, p (ψ) = 0$
Closed form: $f (n) = c_{1} ϕ^{n} + c_{2} ψ^{n}$
As it turns out, for starting values $f (0) = 0; f (1) = 1$ we have $c_{1} = - c_{2} = \frac{1}{5}$
So, in the end, we have: $f (n) = \frac{ϕ ^{n} - ψ ^{n}}{5}$

Praise Sagemath

Let’s start by finding the roots:

F = GF(modulus)
(x,) = PolynomialRing(F, "x").gens()
characteristic = x**SIZE - sum(
	F(coef) * x ** (SIZE - i - 1) for i, coef in enumerate(coefs)
)
roots = characteristic.roots()

You should get $k$ roots, accounting for multiplicities.

Lacking roots?
Sometimes¹, you might not get the expected $k$ roots. There may even be no roots at all! That doesn’t mean the recurrence is unsolvable. Instead, it makes it even more interesting :)
This situation is similar to the familiar reals $R$ . If we don’t have enough roots, we just make it a bit more complex, switching over to an extension field $C$ , aka $R [i]$ ( $R$ adjoint $i$ ).
In the case of $Z_{p}$ (or $GF (p)$ , equivalently), the extension is $GF (p^{n})$ (which is not integers modulo $p^{n}$ but something like a polynomial ring over $Z_{p}$ ), and it can be automatically found by sagemath (as always):
SF = characteristic.splitting_field("a")
print(SF) # Finite Field in a of size 17^13
roots = characteristic.roots(SF, multiplicities=False)
# [a ^ 7 + 10*a^3 + 10, ... ] (not a real example)

Now we only need to find $c_{i}$ values. Consider the equations for the (known) starting values:

f (0) = c_{1} x_{1}^{0} + c_{2} x_{2}^{0} + \dots + c_{k} x_{k}^{0}

f (1) = c_{1} x_{1}^{1} + c_{2} x_{2}^{1} + \dots + c_{k} x_{k}^{1}

⋮

f (k - 1) = c_{1} x_{1}^{k - 1} + c_{2} x_{2}^{k - 1} + \dots + c_{k} x_{k}^{k - 1}

In terms of the unknowns $c_{i}$ this is a linear system, which can be solved in terms of matrices, as usual:

mat = matrix(
	SF,
	[[root**i for (root, mul) in roots] for i, init in enumerate(inits)],
)
vec = vector(SF, [SF(init) for i, init in enumerate(inits)])
closed_form_coefs = mat.solve_right(vec)

Now we have everything needed to assemble the closed form solution to the recurrence:

def closed_form(n):
	return sum(
		coef * root**n for (root, mul), coef in zip(roots, closed_form_coefs)
	)

What now?

We have the terms $f (n), f (n + 1), \dots, f (n + d)$ at our disposal. Let’s see what kind of equations we are working with (let n be the secret, which we want to find out):

f (n) = c_{1} x_{1}^{n} + c_{2} x_{2}^{n} + \dots + c_{k} x_{k}^{n}

f (n + 1) = c_{1} x_{1}^{n + 1} + c_{2} x_{2}^{n + 1} + \dots + c_{k} x_{k}^{n + 1}

⋮

f (n + d) = c_{1} x_{1}^{n + d} + c_{2} x_{2}^{n + d} + \dots + c_{k} x_{k}^{n + d}

After some simple algebraic manipulation:

f (n) = c_{1} x_{1}^{n} + c_{2} x_{2}^{n} + \dots + c_{k} x_{k}^{n}

f (n + 1) = c_{1} x_{1} x_{1}^{n} + c_{2} x_{2} x_{2}^{n} + \dots + c_{k} x_{k} x_{k}^{n}

⋮

f (n + d) = c_{1} x_{1}^{d} x_{1}^{n} + c_{2} x_{2}^{d} x_{2}^{n} + \dots + c_{k} x_{k}^{d} x_{k}^{n}

If we (similarly to lateralus!) focus on $x_{i}^{n}$ , this turns into a linear system again. Except in lateralus, the resulting discrete log was in $Z_{p^{2}}$ , which allowed to calculate it efficiently.
This time, the problem is in $Z_{p}$ (and $p - 1$ isn’t smooth), so it’s practically unsolvable. So no luck, huh…

Not so fast!

Actually, if the coefficients were random and the roots would’ve been all distinct, that would’ve been true. But in reality, I lied to you. In this task the coefficients are special: there is a double root for the characteristic polynomial. That makes the closed form a bit different (let $x_{1}$ be the double root):

f (n) = c_{1} x_{1}^{n} + c_{2} n x_{1}^{n} + \dots + c_{k} x_{k - 1}^{n}

The process of finding the coefficients is practically the same (see solve script).

Is that even worse? Or…

You might think there’s no way having $n$ both in the exponent and in the coefficient is better:

f (n) = c_{1} x_{1}^{n} + c_{2} n x_{1}^{n} + \dots + c_{k} x_{k - 1}^{n}

f (n + 1) = c_{1} x_{1}^{n + 1} + c_{2} (n + 1) x_{1}^{n + 1} + \dots + c_{k} x_{k - 1}^{n + 1}

⋮

f (n + d) = c_{1} x_{1}^{n + d} + c_{2} (n + d) x_{1}^{n + d} + \dots + c_{k} x_{k - 1}^{n + d}

But just look at how we can again choose $k$ variables to make the system linear:

f (n) = c_{1} x_{1}^{n} + c_{2} n x_{1}^{n} + \dots + c_{k} x_{k - 1}^{n}

f (n + 1) = (c_{1} x_{1} + c_{2} x_{1}) x_{1}^{n} + c_{2} x_{1} n x_{1}^{n} + \dots + c_{k} x_{k - 1} x_{k - 1}^{n}

⋮

f (n + d) = (c_{1} x_{1}^{d} + c_{2} d x_{1}^{d}) x_{1}^{n} + c_{2} x_{1}^{d} n x_{1}^{n} + \dots + c_{k} x_{k - 1}^{d} x_{k - 1}^{n}

See it? The variables are $x_{i}^{n}$ and a special one $n x_{1}^{n}$ . So we bypass discrete logarithm entirely by solving this system and calculating $\frac{n x _{1}^{n}}{x _{1}^{n}}$ !

Problem is, this doesn’t work at all for the values given…

Please check your tasks…

The reason turns out to be that in process of constructing the output of the script that supposedly was ran, the author somehow swapped the order of the given terms, ruining the solution. I have spent hours looking for a non-existent bug in my program just because of that.

S_b.reverse()

…
Run solve_test.py to verify that the method works for a random $n$ .

Anyway, how did it happen?

I don’t know. This probably was more reasonable for them, because their solution is very different. The proposed solution uses matrices all the way, as there is a way to express the recurrence as matrix multiplication, and the given values are a special case in which calculating the logarithm is easy. The corresponding math is Linear Recurrences and Jordan normal form.
There is definitely a deeper connection between my solution and theirs, but I’m yet to think that through.

Anyway, this is all I wanted to tell, thank you for reading!

Even though this doesn’t happen in this task (or you have just confused the order of coefficients as I did at first), I really wanted to include this here to showcase how there are natural extensions for $GF (p)$ . ↩

📝 Writeup dump

Long Long Time (VKA CTF 2024; crypto)

Truly a long time

Cool, and…?

Fibonacci, but not quite

Characteristic polynomial

Praise Sagemath

What now?

Not so fast!

Is that even worse? Or…

Please check your tasks…

Anyway, how did it happen?

Table of Contents

📝 Writeup dump

Long Long Time (VKA CTF 2024; crypto)

Truly a long time

Cool, and…?

Fibonacci, but not quite

Characteristic polynomial

Praise Sagemath

What now?

Not so fast!

Is that even worse? Or…

Please check your tasks…

Anyway, how did it happen?

Footnotes

Table of Contents