Lookups in Lurk: Part 2
Photo credit: Gary Miller, Attribution (CC BY 2.0)
This is a continuation of the first blog post about the architecture of lookups in Lurk. In part 1 we discussed the use of the logUp lookup technique for different shards of the Lurk execution to communicate their computed values. We continue in this post by noting a soundness issue if one were to apply the logUp technique naively to Lurk and SP1.
Overflowing multiplicities
As stated in part 1, the soundness of logUp is restricted to the case when the number of lookups is less than the order of the field $\left \FF \right$. For small fields like BabyBear, and reasonablysized programs like the Tendermint light client example in SP1, this upper bound on the soundness is met in practice.
The problem
The soundness of logUp being dependent on the number of lookups amounts to the issue of overflowing multiplicities. Recall that a verifier should be confident of a lookup argument provided by the prover if the total summed accumulator across all shards equals zero.
In the context of SP1’s VM, the main CPU chip checks the state transitions between every consecutive
clock cycles. Depending on the current state of the registers, the table will dispatch an operation
to be performed by a separate and specialized chip, such as a 32bit arithmetic operation, or a call
to one of the available precompiles. These operations are encoded as queries which contain both the
input and result, which the prover then send
s via the lookup argument to be received
and
constrained by the specialized chip. We refer the reader to part 1 of this
series on the details about send
and receive
. Briefly, send
and receive
can add or subtract
the quantity.
from the Memoset accumulator. Here $m$ is the multiplicity of the query $q$, and $r$ and $\gamma$ is randomness provided by the verifier (or a random oracle in the noninteractive setting).
Concretely, a query is always sent using a multiplicity equal to 1, though it is possible to receive it with an arbitrary multiplicity if it was sent multiple times. This is the case when the CPU reads an instruction from the fixed program chip, or when a byte operation is requested from a preprocessed table containing all possible input/output pairs.
Since the logUp accumulator counts the number of occurrences of a query using an element in $\FF$, an issue emerges when we consider what happens when a query was sent exactly $\FF$ times, since the contribution to the sum would be
$\left\FF\right \cdot \frac{1}{r  t} = 0 \;\; \mathrm{in} \;\; \FF.$A cheating prover would be able to avoid sending the value entirely, while still convincing the verifier that the logUp accumulator summed to zero.
This was an issue in previous versions of SP1’s VM where it would’ve theoretically been possible to return an incorrect result to the CPU chip and then run the same query in a loop until it would effectively be nullified and not need to be proved. This issue was solved by including the CPU’s clock as part of the query, thereby ensuring that queries occurring at different cycles would always be different. While it is still possible for the same query to be sent and received within the same cycle, it is easy to count the maximum possible number of interactions within a cycle and check that it is less than $\FF$.
The main drawback of this approach is that the specialized chips may have to prove the same
operation multiple times since each query will be matched up with a separate receive
. In essence,
we are forced to always use multiplicity equal to 1. We refer the reader to the excellent
presentation on shared randomness in distributed proving for more details ^{1}.
The solution
The fix for this overflow issue implemented in Lair (a fieldoriented intermediary representation used by Lurk) is
surprisingly simple, and the original inspiration again comes from the memory checking work of ^{2}. An especially
attractive feature of this fix is that it does not require any changes to the Sphinx prover as it is
built on top of the send
and receive
primitives already available. Similar memory checking principles underlie the
Nexus and Jolt ZKVM memory arguments ^{3}^{4}.
Taking inspiration from the fix implemented in SP1, we add a requirement that whenever a query is inserted into the multiset, the prover must prepend it with a unique nonce. The goal is to ensure that if the same query was inserted from another location in the execution trace, the elements added to the multiset would be different, and therefore both have multiplicity equal to 1. As a sideeffect, whenever the prover wants to remove an element from the set, they must nondeterministically supply the nonce used during insertion.
For simplicity, we can define the nonce as the index of the row within the trace where the insertion occurred, as this is both easy to constrain and to keep track of during proving. Note though that this may allow some queries to be duplicated in the set, and it can be extended with additional information such as the shard or trace index within a proof.
The Lair lookup argument builds upon the above abstraction and removes the restriction that queries may only be read once. The goal is to be able to prove queries once, and read them as many times as it is requested across the entire proof. The mechanism which enables this is a form of memorychecking argument, though simplified to the case where items from memory are written once but are immutable.
More details
We begin by introducing the following small wrapping functions that restrict the usage of the primitives to only allow singular multiplicity.
fn insert(&mut acc, q: &[F]) {
// equivalent to
// acc += 1 / (r + rlc(q, gamma))
send(acc, q, multiplicity = 1)
}
fn remove(&mut acc, q: &[F]) {
// equivalent to
// acc = 1 / (r + rlc(q, gamma))
receive(acc, q, multiplicity = 1)
}
The comments above describe the update that is performed over the logUp accumulator, using the
verifier’s random challenges r
and gamma
. The function rlc(q, gamma)
evaluates the random
linear combination of the query q
with powers of gamma
(this is the fingerprint from above).
These wrappers establish an interface for dealing with abstract multiset, where elements (which take the form of queries) can only be inserted and removed one by one. This abstraction only allows us to prove that a query was copied from one position to another, since each inserted query must be removed exactly once to ensure the multiset is empty across all traces in a proof. This in itself does not prevent a malicious prover from performing the same set operation $\FF$ times, and it remains the responsibility of the caller to ensure that each query can only be included a bounded number of times.
More precisely, the following pseudocode implements the new provide
and require
methods
described above:
fn provide(&mut acc: F, final_nonce: F, final_count: F, query: [F; m]) {
// `receive` the last query with its nonce and counter
remove(acc, [final_nonce, final_count] ++ query);
// `send` the query back with nonce = counter = 0
insert(acc, [0, 0] ++ query);
}
fn require(&mut acc: F, prev_nonce: Option<F>, prev_count: Option<F>, query: [F; m]) {
// increment the counter
let new_count = prev_count.unwrap_or(0) + 1;
// Assert that the `new_count` is nonzero
assert_ne(new_count, 0);
// Generate a new nonce using the `row_id`, or return 0 on the first `require`
let new_nonce = if prev_nonce.is_some() { get_row_id() } else { 0 };
// `receive` the previous query with its nonce and counter
remove(acc, [prev_nonce.unwrap_or(0), prev_count.unwrap_or(0)] ++ values);
// `send` the query back with new nonce and counter
insert(acc, [new_nonce, new_count] ++ values);
}
A few notes are in order:
new_count
inrequire
is not constrained to belast_count
incremented. The only constraint onnew_count
is that it is nonzero. This constraint is implemented by adding an additional column ofnew_count⁻¹
and constraint thatnew_count * new_count⁻¹ = 1
. We mention this for when we eventually discuss the performance impact of these changes. The value of
new_nonce
is determined by therow_id
for the lookup query. In practice, this is implemented by adding an additional incrementing counter to the main execution trace with a constraint ensuring therow_id
increments for adjacent rows. Eventually, using powers of the multiplicative generator of the 2adic subgroup of $\FF$ as a row identifier could remove the cost for the prover committing to therow_id
column.  Finally, the
final_nonce
,final_count
,prev_nonce
, andprev_count
inputs are all kept track of by the prover. There are no guarantees that these are actually the corresponding quantities. The soundness guarantee does not assume the prover is honest though, so only if the prover behaves rationally they will be able to generate a verifiable lookup argument.
The way the prover and verifier use the above constructions is simple: In the original Sphinx lookup
argument wherever one originally used a require
or send
, the claim is that replacing them with a
receive
or provide
will yield a correct and sound lookup argument which avoids the issues of
overflowing multiplicities.
Completeness and soundness
We now give informal arguments for the completeness and soundness of the above protocol. An analysis of the protocol can be performed on the level of the global logUp accumulator, or the multiset abstraction.
For completeness, a prover can simply follow the pseudocode and recommendations above for the lookup
queries: Incrementing the counter and using the row_id
as the nonce. As long as the same query is
not require
d more than $\left \FF \right$ for the same row_id
then the constraint that
new_counter ≠ 0
will not restrict the prover from generating a permutation trace satisfying the
constraints. The state of the accumulator after all the require calls for a fixed lookup value $z$
is given as:
which is a telescoping sum simplifying to
$\frac{1}{r + \rlc(nonce_{N}, count_{N}, z;\gamma)}  \frac{1}{r + \rlc(0, 0, z;\gamma)}.$After all the require
queries are filled in nondeterministically by the prover, they can generate
the provide
permutation traces by utilizing the final_nonce
and final_count
values (denoted
$nonce_N$ and $count_N$ above). The term added to the accumulator is exactly the inverse of the
above telescoped value, and the final accumulator can be verified to equal zero.
Soundness follows by noting that if the final accumulator sum is zero (and that there were a nonzero
number of lookups) then the first require
would contribute a summand of the form
Because a require
call asserts that new_count
be nonzero, the only way an honest prover can
eliminate this term from the accumulator is by provide
ing the appropriate query with its
final_count
and final_nonce
.
Recall that the only constraint on new_count
is that it be nonzero, so the prover may supply
whatever nonzero value they want for new_count
in a require
call. A cheating prover may
therefore try to overflow the accumulator by require
ing the same query a large number of times. If
we further assume that the number of traces and lookup queries per row are bounded (so that their
product is less than $\left \FF \right$) then this term does not run the risk of overflowing.^{5}
Another avenue a cheating prover can take is by provide
ing a special query with the same
fingerprint as one that was being require
d (but not being equal to it). But an appeal to the
SchwartzZippel bound similar to the argument found in ^{6} ensures that it is exceedingly unlikely
a cheating prover can cook up this cheating query.
Cost
In an ideal world, each lookup query could be handled by a single send
or receive
, but there is
additional computational overhead by moving to the provide
and require
semantics. Though this
extra cost is unfortunate, you cannot put a price on security.
Because provide
requires a call to send
and receive
under the hood, there is a $2 \times$
overhead on the number of additional lookup columns. Just as for provide
, a call to require
also
entails two columns for each call. There is also an additional column and constraint corresponding
to the extra check that new_counter ≠ 0
. Finally, there is an additional row_id
column with
additional constraint that is added to all of the traces participating in the lookup argument.
Because of the batching of constraints for two interactions, this means that every call to provide
will add an additional permutation trace column and every receive
has two. Overall this means
there is roughly a $2 \times$ overhead for the new provide
and require
semantics. Despite this
we have found this overhead to be manageable in practice. See the recent performance
benchmarks to see the benefits.
Try it out!
Try out the Lurk VM using the current implementation of Lurk. Also come join us on our public Zulip to chat with the team and ask any questions you may have. Let us know if you’d like to see any other explainers like this, or if you have any other Lurk content you’re interested in. Thanks for reading!
$\gdef\FF{\mathbb F} \gdef\rlc{\mathrm{rlc}}$Footnotes

ZK11: Sharing Randomness in Distributed Proving  Tamir Hemo (Succinct) (https://www.youtube.com/watch?v=S5HFiI0KspM) ↩

M. Blum, W. Evans, P. Gemmell, S. Kannan, and M. Naor, “Checking the correctness of memories,” in [1991] Proceedings 32nd Annual Symposium of Foundations of Computer Science, 1991, pp. 90–99. doi: 10.1109/SFCS.1991.185352. ↩

Keynote: Memory checking in foldingbased zkVMs  Jens Groth (Nexus) (https://www.youtube.com/watch?v=xnzjC_9vUzs) ↩

A. Arun, S. Setty, and J. Thaler, “Jolt: SNARKs for Virtual Machines via Lookups,” 2023. Accessed: Aug. 11, 2023. [Online]. Available: https://eprint.iacr.org/2023/1217 ↩

This is not an onerous requirement as the bound is far from being met in practice. ↩

U. Haböck, “Multivariate lookups based on logarithmic derivatives,” 2022, 2022/1530. Available: https://eprint.iacr.org/2022/1530 ↩