Related Skills:

Data Science and Databases27-minute read

Getting Started With the SRVB Cryptosystem

This article will give you an introduction to the principles behind public-key cryptosystems and introduce you to the Santana Rocha-Villas Boas (SRVB) cryptosystem, developed by the author of the article and prof. Daniel Santana Rocha. The algorithm authors are making a campaign that includes a financial reward to anyone who manages to crack the code.

Last updated: Sep 21, 2023

authors are vetted experts in their fields and write on topics in which they have demonstrated experience. All of our content is peer reviewed and validated by Toptal experts in the same field.

Last updated: Sep 21, 2023

authors are vetted experts in their fields and write on topics in which they have demonstrated experience. All of our content is peer reviewed and validated by Toptal experts in the same field.

Yuri da Silva Villas Boas

Yuri has experience in C++ and a good background in mathematics, statistics, and physics. He developed the SRVB cryptosystems.

Previously At

Introduction

Information security is a fascinating field of knowledge that can involve anything from theoretical computer science to software engineering, and even observe the psychology of human error.

Cryptography is now one of the many anonymous technological heroes of our daily life. Social networks, web banking, military intelligence, and any other information system that deals with sensitive information all rely heavily on cryptography. Cryptography allows us to have privacy, which some consider the 12th human right.

This article will give you an introduction to the principles behind public-key cryptosystems and introduce you to the Santana Rocha-Villas Boas (SRVB), a cryptosystem developed by the author of the article and Prof. Daniel Santana Rocha. At the time of writing, the algorithm authors are cooking up a campaign that includes a financial reward to anyone who manages to crack the code. Since the article will cover the algorithm functionality in detail, this is the best place to start the pursuit for the prize. More information is available on the SRVB site.

What Is a Cryptosystem?

Alice and Bob talk over an insecure channel

Cryptography is any method to hamper the interpretability of a message, while still allowing a way to feasibly interpret it as long as a specific instruction is provided, which is usually the so called “key.” While this is a very broad definition that encompases even the earliest techniques, it is worthy to mention that this does not cover everything there is to information security.

The technological race between encryption methods and ways to crack them is expected to never have a definitive winner. Each new generation is expected to raise the standards of information security and cryptanalysis, which is the set of techniques to systematically decipher/break encrypted messages, i.e., bypass the security or encryption.

In order for a cryptosystem (given technique of cryptography) to be deemed secure by its users, it has to receive the approval of the international community of experts, and thus be publicly known, which is known as Kerckhoffs’ principle. Yet, this very condition exposes the system to scrutiny from the world’s cryptanalysis community, who will try to devise ways of systematically breaking the encryptions.

How can one make a given encryption process secret enough that only the intended agents can decrypt it, while at the same time public enough that the world’s cryptanalysis community can attest its robustness? The answer is a component which is a key element of Cryptology: the key. A key of a cryptosystem is a parameter for either the encryption or decryption algorithms, or both. By keeping the parameters secret, rather than the algorithm family, both contradictory requirements can be achieved. Provided that the family of parameters is large enough (possibly infinite) and each of its components can be proved to have adequate properties, it will not be feasible for attackers to determine the parameters merely by inspection.

Finally, in order for a key to work effectively, it must be easily produced but nearly impossible to guess. With the computational power of today’s computers, a computer would need less time to decipher a human generated key than any human would need to even imagine it, on top of the fact that it’s not cost effective to generate keys in that way anyway. Because of that, keys tend to be generated by an algorithm.

A key generating algorithm must not create predictable/reproducible output as a result of typical use. Since algorithms perform procedures without any intelligent process to it, the question becomes how this can be done. The answer is turning key generating algorithms into maps that transform a large amount of truly random bits into keys. Truly random bits can be acquired from the operating system, which generates them from uncertainty in the universe. Some sources would be your mouse movement, network package latencies, or even dedicated hardware, depending on the application.

Public-Key Cryptosystems

Prepare to be surprised yet again, because we shall now introduce a concept that seemingly contradicts what we have just said: the public key.

So far, we have seen the creation of secure communication after secret parameters (keys) have been securely exchanged, but the problem of exchanging the parameters over a public channel remains. Right now, we will introduce a concept that solves a slightly more palpable problem: the creation of a secure channel.

Suppose Alice is working with Bob, and they want to keep their work interactions secure using encryption. They can meet and exchange their encryption keys by giving each other a USB flash drive with their keys on them. But what if Alice and Bob are located on different continents. How to establish a secure channel where there are none existing?

Sending keys via email wouldn’t be an option, since their competitor Eve can intercept the exchange and use their keys to read all the encrypted data afterward. If they had any other digital channel through which they could transmit this sensitive data, then they wouldn’t need encryption, and thus keys, in the first place. Sending the key via physical mail still could also be intercepted, for it requires exchanging sensitive information to begin with. Sending a steganographed key by hiding within other data would be clever, but useless unless the sender is sure that the recipient, and the recipient alone, is aware of the existence of such a message and knows how extract it. As it happens, the awareness of the existence of a message together with the description of how to extract the key from the data is a type of key by itself, which brings us back to square one.

The solution is devising a cryptosystem for which the encryption parameter is not enough to feasibly construe the original message^[1], and keeping to yourself the parameter that would allow construing of the message. We call that parameter the private key. Based on the private key, one can feasibly generate a set of parameters for an encryption tool without making those new parameters themself able to reveal what the private key is. That set of parameters can be publicly shared, because it’s not that important who can encrypt something, as long as only one person can decrypt it. Since this set of parameters for the encryption tool can be made public, it is called a public key.

Cryptography where the encryption and decryption keys differ, and the former can not feasibly be used to deduce the latter, is called asymmetric cryptography, while symmetric cryptography is what we have when those keys are the same, or are easily deduced from one another.

Alice sends Bob her public key, which can only be used to encrypt things that only she can decrypt (with her private key, which she keeps privately) and, conversely, Bob sends Alice his public key, which can only be used to encrypt things that he alone can decrypt (by means of his private key, which he also keeps privately). But how can one possibly publish a parameter for an encryption algorithm from which one cannot deduce the exact inverse algorithm?

The answer lies in the field of mathematics that is closest related to programming, the computational complexity theory. Anyone who has delved deep enough into mathematical problems has heard about transformations that are easy to do in one way, but hard to do the inverse. In calculus, for example, finding a textbook derivative typically is a straightforward exercise, while doing the inverse (like solving any slightly non trivial integral or textbook physical ODE or PDE, for example) requires a more investigative process of first hypothesizing families of functions that are guaranteed (or at least plausible) to contain the solution(s) and solve inverse problems to find solutions from these families.

An example that is actually utilized in the RSA cryptosystem is multiplying big primes versus factoring the resulting product without already knowing the factors. Doing multiplication is trivial, but factoring requires you to randomly^[2] guess the prime factors, which have hundreds of digits. An important property of the operation is the need for it to scale well. Adding a few digits on the size of the prime numbers in RSA will result in a key that requires thousands of times more operations to crack while adding a tiny increase in complexity to the encryption process. Very roughly speaking, the product of prime numbers is used for encrypting, while the pair of prime factors are used for decrypting.

With all of this in mind, let’s take a look at how the SRVB cryptosystem works.

The Underlying Algorithm - Looking into SRVB

The Subset Sum Problem

Like any other public-key cryptosystem, SRVB relies on the difficulty of solving a particular problem that is easy to produce. In SRVB’s case, it’s the subset sum problem, which can be described as follows:

Given the integer $w$ and $v_1, \cdot \cdot \cdot, v_N \in Z$ find the sequence $b_1, \cdot \cdot \cdot, b_N \in {0,1}$, such that $ \sum_{i = 1}^{N} v_i b_i = w $.

Clearly, this problem can be produced by randomly picking N integers, randomly choosing a subset of them and summing up this subset - which is trivial.

A brute-force search would have a complexity of $ O( N * 2^N ) $, calculating for each combination of values of the $b$s. A more efficient approach was given by Horowitz and Sahni in 1972, with a complexity of $ O ( N * 2 ^ {N / 2} ) $. The problem is at least as hard if we replace the $b$s and $w$ with $k$-dimensional vectors of integers. The realm where this more difficult problem is to be held, however, must also have an isomorphism with a ring where an easier version of the same problem takes place, as we shall see below. For this reason, SRVB exploits the subset sum problem within Gaussian integers, where $ k = 2 $.

There is a special case where this problem gets easy to compute through the use of a greedy algorithm. If we sort a sequence scaling factors $ v_1, \cdot \cdot \cdot, v_N $ in which each integer in the sequence is larger than the sum of all the integers that came before it ($ \forall i , \sum_{j=1}^{i-1} v_j < v_i $), one could use a greedy approach to find the sequence b in linear time. A sequence with the described properties is called a superincreasing sequence.

Here is a simple description of the greedy solution for this case:

Start with $ i = N $, the currently observed factor, and $ w’ = w $, the remainder of $w$
If the current scaling factor is too big to fit into what remains of $w$, meaning $ v_i > w’$, set $b_i = 0$ and continue to the next step. Otherwise we know that the scaling factor needs to be in the sequence (since the rest of the factors is smaller than $v_i$) and we set $b_i = 1$.
$ w’ \Leftarrow w’ - v_i * b_i$, $i \Leftarrow i - 1$. If $i > 0$, return to step 2.
Verify that, now, $w’ == 0$, otherwise the problem was corrupted

This works because we know that all multipliers smaller than the currently observed one can’t collectively cover as much of the (sub)sum $ w’ $ as the current multiplier can. So if the remaining sum is larger than the current factor, we know for sure that all following factors together will fail to sum up to it without the current factor’s help. On the other hand, since all multipliers are positive, if the current factor $ v_i $ is larger than the remaining sum $ w’ $, adding any other factor would make the result surpass $ w’ $ even more.

Let’s set up a notation for the rest of the article. We choose $ v_1, \cdot \cdot \cdot, v_n $ and $w$ to be the factors and sum of the superincreasing sequence, while $ u_1, \cdot \cdot \cdot, u_n $ and $y$ will be the publicly available parameters that are provided to recover $ b_1, \cdot \cdot \cdot, b_n $.

With a sequence $ u_1, \cdot \cdot \cdot, u_n $ picked so it is not superincreasing, and number $y$ being publicly available, not enough information is publicly provided to recover the sequence $ b_1, \cdot \cdot \cdot, b_n $. However, if there is a superincreasing sequence $ v_1, \cdot \cdot \cdot, v_n $ that could take the place of sequence $ u_1, \cdot \cdot \cdot, u_n $, one could use this sequence to solve the problem with a greedy approach.

Below we shall show how that works.

Use of Subset Sums in a Previous Cryptosystem

In 1978, Ralph Merkle and Martin Helman, devised a way to exploit those two knapsack paradigms and the linearity of the modulus operation to build the connection between the two sequences described in the previous section - thus conceiving a Public-Key Cryptosystem. The idea was to transform the easy knapsack (the one consisting of the superincreasing vector of integers) into the hard one (the one lacking this property) by means of a linear operation (the modulus) with secret operands. The transformation consists of multiplying every $v_i$ by a $\theta$ and taking the remainder of this product by an $\alpha$, where $\alpha \gt \sum_{i=1}^N v_i$ and $\gcd(\alpha, \theta) = 1$ (two constraints that will soon be justified). The result, the sequence $u_1, \ldots, u_N$, is not superincreasing anymore, and can be considered a hard knapsack.

A random permutation of the sequence $u_1, \ldots, u_N$ would be given to the party that wants to encrypt and send a message to us. The permutation is done so that a third party has a harder time of guessing what the original superincreasing sequence is. Bit blocks of a message are used as the coefficients $b_1, \ldots, b_N$. Encryption is done by multiplying the key sequence with this sequence of coefficients, and summing up the multiplications into a result, that we shall label $y$. Only the owner of the private key can transform $y$ into the corresponding $w$ that would be obtained if these same $b_1, \ldots, b_N$ blocks would have been multiplied with the easy integers (the sequence $v_1, \ldots, v_N$). That is done by means of multiplying $y$ by $\theta^{-1}$, the multiplicative inverse of $\theta$ modulus $\alpha$ (whose existence depends on that condition of $\gcd(\alpha, \theta) = 1$). In other words, $(\theta * \theta^{-1}) \bmod \alpha = 1$. After that, we calculate $w = (y * \theta^{-1}) \bmod a$. The reason this is guaranteed to work is the linearity of modulus, i.e., that the linear combination of the remainders equals the remainder of the linear combination.

If we consecutively apply the definition of $u$, the quotient ring, and the property of linearity of the modulus operator, we see the correspondence:

$ \begin{align} y & = \sum_{i=1}^N b_iu_i \newline & = \sum_{i=1}^N b_i (v_i * \theta \bmod \alpha) \newline & = \sum_{i=1}^N b_i * v_i * \theta \bmod \alpha \newline & = \left[ \sum_{i=1}^N b_i * v_i * \theta \right] \bmod \alpha \newline & = \left[ \sum_{i=1}^N b_i * v_i \right] * \theta \bmod \alpha \newline & = w * \theta \bmod \alpha \end{align} $

And thus the easy sum $w$ can be recovered by multiplying both sides with $\theta^{-1}$ and taking the modulus with $\alpha$. To do so, one has to know both $\alpha$ and $\theta$ (that allow one to easily calculate $\theta^{-1}$), which are to be kept private as parts of the private key.

One single constraint, $\alpha \gt \sum_{i=1}^N v_i$, was left unjustified and here goes the explanation for it: The equality between the second and third lines consists of an equality between residue classes of the quotient ring of integers modulo $\alpha$. In other words, it only states the equality of the remainder of the members when divided by $\alpha$, which is not a sufficient condition for the equality of the members themselves. As a result, more than one vector of $b$ values could be mapped into a single $y$, which is prevented by limiting the maximum possible subsum (i.e. the sum of all parcels $v_i$) $w_{max}$ to be smaller than $\alpha$ for any combination of $b$ values.

Just like any other instance of daily life, complete knowledge of all hypotheses is often impossible and never easy. As a result, we have to rely on mathematical intuition when judging if a cryptosystem is safe to use, which provides us no actual guarantees. Six years after its creation, the MH cryptosystem was broken with an attack by A. Shamir. There were several more attempts to revive MH, many of which also failed.

The Santana Rocha - Villas Boas (SRVB) Cryptosystem

In 2016, after a few brainstorms with this article’s author about differently inspired^[3] possibilities of a cryptosystem, Daniel Santana Rocha had the idea of substituting $\theta$ and $\alpha$ by Gaussian Integers. For more technical reasons, this approach leads to the resistance against the aforementioned Shamir attack.

He also conceived a bit block consisting of many steps of the previously described linear combination of a hard knapsack. In each one of them, a new (Gaussian) integer, equivalent to one, higher than the sum of all previous would be added at the end of the sequence, while the currently smallest would be dropped.

A different and yet elegantly analogous constraint applies for $\alpha$, which is now a Gaussian integer. We require $w_{max} \leq \vert\alpha\vert^2$. The reason is very hard to formalize, but fortunately it can be easily intuited from an elegant description:

Imagine a square lattice in the plane of complex numbers, whose side is a hypotenuse of a right triangle of catheti a and b, parallel to the real and imaginary axes. An example of such lattice is given below. Guassian integers modulo $\alpha = a + bi$ can be represented by points located within such a lattice. Within such a lattice there are $\vert\alpha\vert^2$ distinct points. If we pick a large enough $\alpha$, we can fit all of the linear combinations of the easy knapsack.

This is the graphic representation of the isomorphism $f : Z/n \rightarrow Z[i]/(\alpha)$, where $n = 13$ and $\alpha = a + bi = 3 + 2i$ (note that $n$ and $\alpha$ indeed satisfy $n = \vert \alpha \vert ^ 2 = a^2 + b^2 $ as required). The dots represent the Gaussian Integers, i.e., the complex numbers $a + bi$, where $a$ and $b$ are integers. As usual, the horizontal axis represents the real part, while the vertical represents the imaginary part. Thus, moving one dot to the right or left is equivalent to adding +1 or -1 to its current value, respectively. Likewise, moving one dot upwards or downwards corresponds to adding +i or -i, respectively.

The red dots are those equivalents to $0 \bmod(\alpha)$. Apart from these, we also colored 4 more pairs of dots.

Color	Equivalent to … mod α
Orange	$1$
Green	$i$
Blue	$-1-i$
Violet	$1-i$

The isomorphism $f$ is defined by the transformation of the $i$th element of the cyclic sequence $(0,1,2,\cdot\cdot\cdot,10,11,12,0,1,2,\cdot\cdot\cdot)$ into the $i$th element of the also cyclic sequence of dots in the figure, that adheres to the following rules:

It starts from the red dot of the first row;
It follows the horizontal right arrows; except that
When following the arrows leads the sequence outside of the lattice, it would reach one of the non-black points. Instead of going to that point, it jumps to the same colored point (i.e., the equivalent point modulo $\alpha$) inside of the same square; and finally
When this non-black point happens to be red (which happens after all the other colors have been passed through), the sequence jumps to the uppermost red dot, thus reinitiating the cycle;

In order to map at least $Y$ natural integers, one must take a square with area $\vert\alpha\vert^2$ (where $\vert\alpha\vert = \sqrt{a^2 + b^2}$ is, by Pythagorean theorem, the measure of its side) at least as high.

Notice that each “jump” changes the row number $r$ to $r \leftarrow (r + b)(mod a + b)$ if one counts the rows from top down, and, equivalently, to $r \leftarrow (r + a)(mod a + b)$ if one counts from bottom up. Hence, the necessary and sufficient condition for each row (and dot) to be roamed exactly once on each cycle is that the size of the jumps is coprime with the number of rows, or, in other words, $gcd(a,a + b) = gcd(b,a + b) = 1$. Due to properties of the greatest common divisor operator, both of these are equivalent to $gcd(a,b) = 1$.

Y. S. Villas Boas noticed the need for coprimality of $a$ and $b$. In order to preserve superincreasingness, each new integer $w$ added at the end of the sequence needs to surpass the sum of all current ones ($w > \sum_{i=1}^k v_i$). He observed that in order to achieve this, their multiplying coefficients $b_i$ would have to be at least 1, and thus, each bit could not be mapped into the coefficients 0 and 1. If the coefficients were 0 and 1, only the block composed only of ones would satisfy the superincreasingness. For this reason, the bits 0 and 1 were then mapped respectively to the multiplying coefficients 1 and 2.

Finally, and less trivially: during each step of the decoding, one new integer $v_1$ is to be found as solution of the equation $b_1 v_1 = v_{n+1} - \sum_{i = 2}^{n} b_i v_i$, where all $v_i$ and $b_i$ are known for $1 < i \leq n$. Since we also don’t know $b_1$, we end up with a system with one equation and two variables, which leaves us with one degree of freedom. To correct this, we have to arbitrate one positive value (for the sake of simplicity, 1) to be always assigned to $b_1$, thus eliminating the ambiguity. Thus, since the first coefficient is fixed to 1, in order to encode $n$ bits on each step, our sequences of integers must be $n + 1$ elements long.

One final technicality to resolve is the case in which the size in bytes of the message is not a multiple of the block size. In other words, what to do with the possible remaining bytes of the last data block so that, once the data blocks are recovered, all the bytes of the original content are preserved, but no more than them? We solved that by repeating the last byte of the message once. This copy is, then, followed by a random sequence on which each component is required only to be different from the previous one. Thus, when the data blocks are retrieved, the last of them or, in the worst case, the one before the last is truncated in the last repetition of bytes.^[4]

Now let’s show an example of the SRVB cryptosystem.

We start with the parameters:

$k = 4$;

$m = 4$;

which yields a block length of $l = 4 * 4 = 16$, and a superincreasing sequence of $k + 1 = 5$ natural integers, that is operated—i.e., linearly combined, appended with the result of this linear combination, and reduced to its last $k + 1$ elements—$m = 4$ times;

For the sake of simplicity, let the following be the (superincreasing) vector of $v_i$:

$(1,2,4,8,16)$

Indeed, the sequence of the first five powers of 2, just because this is the superincreasing sequence with five elements and the smallest strictly positive possible values (thus avoiding a 0 in the public key, which would trivially give away the correspondent 0 of its counterpart).

As we said earlier, for $\alpha = a + bi$, the integers $a$ and $b$ must be coprime, and according to the new constraint we mentioned, we have to request that $a^2 + b^2 = \vert\alpha\vert^2 \geq W$. A quick calculation yields $W = 1590$. Since $\sqrt{1590} \simeq 39.8$, a very convenient $\alpha$ to be chosen would be $\alpha = 39 + 40i$, for it is just large enough to accommodate an isomorphism with a ring of integers with at least 1590 components, while also satisfying $gcd(a,b)=1$. Also, we need to pick a $\theta$ such that $gcd(a,\theta)=1$^[5] and preferably that is not too low, so that $(a_1 * \theta) \% \alpha \neq v_1 * \theta$, (also to avoid giving away information). $\theta = 60$ does the job, yielding:

$-19-1i,1+38i,3-3i,6-6i,12-12i$^[6]

Let us get our hands dirty, then. Take the message Hello Toptal!. First we map it into an array of bytes according to ASCII and our convention for truncating data blocks:

01001000 01100101
01101100 01101100
01101111 00100000
01010100 01101111
01110000 01110100
01100001 01101100
00100001 00100001

Since our data block is 16 bits = 2 bytes long, and our message has 13 characters, we end up with 7 data blocks of 2 bytes each, where the last one contains two times the bit representation of the character ‘!’ . Let us encrypt the first block step by step. Pay close attention because the bits of each byte are taken in order of their significance.

$m=01001000 01100101$

First nibble of first byte: $(0,0,0,1)\rightarrow(1,1,1,1,2)\cdot(-19-1i,1+38i,3-3i,6-6i,12-12i)=15+4i$
Second nibble of first byte: $(0,0,1,0)\rightarrow(1,1,1,2,1)\cdot(1+38i,3-3i,6-6i,12-12i,15+4)=49+9i$
First nibble of second byte: $(0,1,0,0)\rightarrow(1,1,2,1,2)\cdot(3-3i,6-6i,12-12i,15+4i,49+9i)=106-10i$
Second nibble of second byte: $(0,1,1,0)\rightarrow(1,1,2,2,1)\cdot(6-6i,12-12i,15+4i,49+9i,106-10i)=252-2i$

And thus, we have just mapped

“He” $\Rightarrow(12-12i,15+4i,49+9i,106-10i,252-2i)$

The rest of the encrypted message is as follows:

“ll” $\Rightarrow(12-12i,21-2i,61-3i,185-31i,367-59i)$

“o “ $\Rightarrow(12-12i,25+33i,65+32i,111+44i,244+124i)$

“To” $\Rightarrow(12-12i,9+10i,46+12i,149+5i,277+31i)$

“pt” $\Rightarrow(12-12i,3+16i,46+12i,73+23i,201+49i)$

“al” $\Rightarrow(12-12i,4+54i,44+53i,117+193i,231+389i)$

”!!” $\Rightarrow(12-12i,4+54i,32+65i,63+92i,121+247i)$

Now, for the decryption, we have $\theta^{-1}=60^{-1}=27+i$:

$($”He” $\Rightarrow)$ $(12-12i,15+4i,49+9i,106-10i,252-2i)*\theta^{-1}\%\alpha=(16,47,93,223,527)$

Now, comes the greedy algorithm:

First, we subtract each contributing factor multiplied by one, because they are known to have contributed at least once for the last, yielding:

Second nibble of the second byte:

$T \leftarrow (527-233-93-47-16) = 148$

$(T \geq 223) = (148 \geq 223) = false \Rightarrow b_1=0 ; T \leftarrow (T - 0 * 223) = 148$

$(T \geq 93) = (148 \geq 93) = true \Rightarrow b_2 = 1; T \leftarrow (T - 1 * 93) = 55$

$(T \geq 47) = 55 \geq 47) = true \Rightarrow b_3 = 1; T \leftarrow (T - 1 * 47) = 8$

$(T \geq 16) = 8 \geq 16) = false \Rightarrow b_4 = 0; T \leftarrow (T - 0 * 16) = 8$

Result: 0110;

Remainder: 8 (added to the beginning of the sequence as new lowest element);

Drop 527 from the final of the current sequence;

First nibble of the second byte:

$T \leftarrow (233-93-47-16-8) = 59$

$(T \geq 93) = (59 \geq 93) = false \Rightarrow b_1 = 0; T \leftarrow(T - 0 * 93) = 59$

$(T \geq 47) = (59 \geq 47) = true \Rightarrow b_2 = 1; T \leftarrow (T - 1 * 47) = 12$

$(T \geq 16) = (12 \geq 16) = false \Rightarrow b_3 = 0; T \leftarrow (T - 0 8 16) = 12$

$(T \geq 8) = (12 \geq 8) = true \Rightarrow b_4 = 1; T \leftarrow (T - 0 * 12) = 4$

Result: 0101;

Remainder: 4 (added to the beginning of the sequence as new lowest element);

Drop 233 from the final of the current sequence;

Second nibble of the first byte:

$T \leftarrow (93 - 47 - 16 - 8 - 4) = 18$

$(T \geq 47) = (18 \geq 47) = false \Rightarrow b_1 = 0; T \leftarrow (T - 0 * 47) = 18$

$(T \geq 16) = (18 \geq 16) = true \Rightarrow b_2 = 1; T \leftarrow (T - 1 * 16) = 2$

$(T \geq 8) = (2 \geq 8) = false \Rightarrow b_3 = 0; T \leftarrow (T - 0 * 8) = 2$

$(T \geq 4) = (2 \geq 4) = false \Rightarrow b_4 = 0; T \leftarrow (T - 0 * 4) = 2$

Result: 0100;

Remainder: 2 (added to the beginning of the sequence as new lowest element);

Drop 93 from the final of the current sequence;

First nibble of the first byte:

$T \leftarrow (47-16-8-4-2) = 17$

$(T \geq 16) = (17 \geq 16) = true \Rightarrow b_1 = 1; T \leftarrow (T - 1 * 16) = 1$

$(T \geq 8) = (1 \geq 8) = false \Rightarrow b_2 = 0; T \leftarrow (T - 0 * 8) = 1$

$(T \geq 4) = (1 \geq 4) = false \Rightarrow b_3 = 0; T \leftarrow (T - 0 * 4) = 1$

$(T \geq 2) = (1 \geq 4) = false \Rightarrow b_4 = 0; T \leftarrow (T - 0 * 2) = 1$

Result: 1000;

Remainder: 1 (added to the beginning of the sequence as new lowest element);

Drop 47 from the final of the current sequence;

As a result, we have recovered the data block: “01001000 01100101” containing the first two characters “He”, of our message. Not only that, we have also correctly retrieved our private-key superincreasing sequence $(1,2,4,8,16)$.

The other data blocks maps go in the same way.

Comparison with the Major Public-Key Cryptosystems

SRVB is:

Totally free and public;
Considerably simpler and easier to understand and implement than ECC^[7];
Abundant on keys and hence virtually costless, contrary, for example, to RSA, which relies on prime numbers, which are expensive.

We can already sum up the most likely vulnerabilities. Since SRVB descends from MH, it is easy to suspect it would be vulnerable to a generalization of the Shamir attack, or some other that breaks it. Though the author had reasons to believe that this would not be the case, no confirmation of it has yet been done (which is very typical when dealing with cryptosystems).

What’s Next?

Rocha observed a few generalizations for the quotient rings to be used, that can possibly lead to an increase in the complexity of the cryptanalysis. We shall investigate these possibilities as well.

The Linear Algebraic Obscuring

As it happens, during the development and documentation of SRVB, Villas Boas came up with yet another approach to improve the concept of knapsack public-key cryptosystem that will not be explained in much details herein, in order to this article not to become too long and tiresome, but here is a skimming of it. If you don’t succeed in grasping it, don’t worry, just stay tuned for the release of our next article, in which we shall go into the details more thoroughtly: See the subset sum $y$, of the, say, $N$ quotient ring elements $u_i$ (that are isomorphically correspondent to the positive integers $v_i$ of the super increasing sequence, as before) as a multiplication of a row vector of these $u_i$ by a column vector $B$ (for binary) of zeroes and ones^[8].

$y = \sum_{i=1}^n u_i b_i = (u_1,\cdot\cdot\cdot,u_n)^T\cdot(b_1,\cdot\cdot\cdot,b_n)$=UB,

where $b_i \in {0,1}$

Now, imagine that, instead of this vector of $u_i$, you have, on the left a $n$ by $N$ (with $n < N$) matrix V, having the $v_i$ (the integers from the superincreasing sequence) vector as (without loss of generality) its first row and all the other entries filled up with noise. Notice that, now, multiplying V by the same bits vector B gives you a column vector W that has $w$ as its first entry and noise in the others. Now, take this V matrix and multiply it by a random^[9] n by n matrix R on the left, defining the n by N matrix P:

P = RV

The idea is that Bob uses P as his new public key. Because of the randomness of R, the vector

$Y = PB = RV B = RW$

has the information $w$ obscured by the noise in other rows of V. Bob also calculates in advance and stores the row vector S that satisfies:

$R^T S^T = e_1$

When, Alice sends Y to Bob, he just finds SY, because:

$SY=S(PB)=S((RV)B)=SRVB= {e_1}^T R^{-1}((RV)B)=$

(so far only definitions)

${e_1}^T (V B)={e_1}^T W = w$

(Now, exploiting the associativity to cancel the Rs)

and then proceeds as before to extract the vector of $b_i$ from $w$ by means of the greedy algorithm.

So, in one word, the Linear Algebraic Obscuring exploits the associativity of matricial multiplication (the fact that we could expand the terms and then operate their components in a new order provided we that preserved the sequence of all the operands in the sequence) to ‘linearly’ scramble and then filter (in the encryption and in the decryption respectively) noise to/from $w$. And in limit case $n = 1$, the system elegantly goes back to original to Rocha’s approach.

The SRVB Campaign – a prize challenge

As explained before, in accordance to Kerckhoffs’ principle, experience shows that antiquity of a publicly known unbroken cryptosystem is the main source of reliability of it, far more than any theoretical argumentation by the own authors, apart from anything else, because definitive proof of efficacy of the algorithms usually are not available.

And here it is our great chance to put these concepts to practice to earn a big money prize: Aware of this fact, we launched the aforementioned campaign, which is basically a crowdfunding for a prize automatically awarded to the first one that manages to break a message. The money is going to be converted to bitcoins in a given wallet whose private key will be SRVB encrypted and published so that anyone able to decipher it can simply take the money anonymously, no questions asked.

Acknowledgments

This article in particular and the whole SRVB Crypto project in general have been greatly helped by Prof. Charles F. de Barros, assistant professor at the Federal University of São João Del Rei. Prof. Barros provided us a technical review of the SRVB cryptosystem itself. I judged it necessary to submit to before daring to publish, and that I would certainly take much longer to do by myself.

In addition, I also would like to extend my deep gratitude to professor Adnan Ademovic for his extraordinarily insightful, attentive, and patient work as the editor of this article.

Glossary

Cryptology/Cryptography: The science/practice of making a message nearly impossible to interpret without a specific set of instructions (in this context, the so-called "key"), thus allowing agents who share these instructions to communicate securely even if tapped by third parties;
Steganography: The science/practice of hiding a message within another by means of adding apparently innocuous modifications to the latter;
Key generation: the mapping of (expectedly) random inputs into (as random) valid keys;
Encryption/Encoding: The easily computable mapping of an easily interpretable message into another that is either hard or impossible to construe, by means of an (as randomly specified) element the one correspondent to the key of a (by Kerckhoffs' principle, publicly known and validated) family of algorithms;
Decryption/Decoding: The easily computable mapping that consists of the inverse of the previous, also being definable by an (as randomly specified, and therefore, unknown and hard to be guessed by third parties) element _again, the one correspondent to the key_ of a (by the same principle, usually known) family of algorithms, that thus outputs the original message when input with the encrypted one;
Cryptosystem: A triad of the family of encoding algorithms, the family of corresponding decrypting algorithms, and a key generating algorithm;
Allies: The agents with whom the communication is intended, and who are expected to act according to the protocol's rules;
Enemies/Third Parties: The agents with whom the communication is not intended, but try, nevertheless, to eavesdrop the communication and bypass the security enhanced by the cryptosystem;
Secure Channel: Any protocol (procedure) for communication that is easy to use while also effectively preventing third parties to feasibly construe what is meant by its users;
Insecure Channel: any channel (i.e., protocol or procedure) that, as the name suggests, is not a secure channel;
Breaking a Key: The process of discovering a key by means of public pieces of information (like an encrypted message or public key) other than the key itself to be discovered and that was not expected to feasibly allow the discovery of the key. Since the information that results from this process grants the decryption of the messages, this is a particular case of…
Breaking/Deciphering a message: Any way to deduce the original content of an encrypted message only by means of the encrypted message itself and other pieces of information that were not expected to suffice for the deduction of the original content;
Breaking a Cryptosystem: Discovery of a systematic way to feasibly break whatever message is encrypted by this method under whatever parameter;
Kerckhoffs' Principle/Desideratum/Axiom/Law: A cryptologic principle named after the dutch cryptographer Auguste Kerckhoffs, according to which, in order for a cryptosystem to be deemed secure, everything about it except by its (private) keys must be of public knowledge;
Key: Secret parametrization of a cryptosystem allowing it to be infeasible to be guessed (and consequently broken) by third parties while also being validated by the community of [cryptanalysts](https://www.toptal.com/developers) (in accordance to Kerckhoffs' principle);
Symmetric Cryptosystem: Any cryptosystem on which any parameter for the encoding suffices for easily deducing the parameter for decoding, and, for this reason, must be kept private. One can rephrase it by saying that the encryption and decryption parameters are both equivalent to the key;
Asymmetric/Public Key Cryptosystem: Any cryptosystem for which there is a way to express the parameters for the encoding that does not suffice to feasibly deduce the correspondent parameter for decoding, allowing it to be sent to allies through an insecure channel, or even made public, and thus creating a secure channel where there was none;
Public Key: A component of an asymmetric cryptosystem that suffices for parameterizing the encryption but does not suffice for feasibly deducing the decryption parameter, i.e., the…
Private Key: A component of an asymmetric cryptosystem that suffices for parameterizing the decryption and thus must be kept privately and usually also suffices for parameterizing the encryption;

^[1] Note that, here, the phrases decipher or decrypt do not apply, because before a given cryptosystem (with all its components, including its keys) is well defined, one cannot classify a given method of construing the original message from the encrypted one as the intended communication (decryption) or an attack (deciphering).

^[2] Though there are techniques to improve this guessing work, it still remains far more difficult discovering than multiplying them.

^[3] The first inspiration was to look at the tree of the tau conjecture, an infinite tree rooted in the number 1, whose other nodes consist of integers resulting in one binary operation of sum, subtraction, or multiplication between previous nodes, possibly one node operated with itself. The theory’s goal relates to the depth, in this tree, in which each integer first appears. It is apparently hard to find large numbers in low branches (even if we relax the need for it), but is it immediate to check if a given sequence of operations indeed produces a given result.

^[4] This is surely not the most natural approach, but by adopting this, one ensures that this byte filling (called padding)…

Conceals the size of the padding (differently, for example of ciphertext stealing) and obscures the end of the message, thus rendering chosen-plaintext attacks more difficult;
Enhances a distribution of bits as close to uniform as possible;

If the last blocks of every message were known to be systematically biased in a far from uniform distribution, an attacker could exploit this fact to do a statistical cryptanalysis, for any given sample of messages would be statistically more representative than otherwise. In other words, the last blocks being statistically less diverse, they become they become the weakest links of the message.

^[5] Make no mistake: this greatest common divisor is Gaussian, while the previous is real.

^[6] …which is not perfect, because it easily gives away the fact that the last three components are proportional to 1, 2, and 4. But, again, for the sake of simplicity, we will ignore this detail.

^[7] …which is so complex that there are a few notorious cases of failure to implement and maintain protocols based on it.

^[8] Here, we will not adopt Rocha’s approach of a multiple linear combinations block, which also allows us to randomly permute the bis to obscure them even more.

^[9] Though not totally random. Its transpose must span the subspace generated by the vector $ e_1 = (1,0,…,0) $.

Hire a Toptal expert on this topic.

Hire Now

World-class articles, delivered weekly.

Join the Toptal^® community.

Hire a Developer or Apply as a Developer