-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement IdpfPoplar #381
Implement IdpfPoplar #381
Conversation
Heeerree we go!!! 🛵 A few high-level, somewhat opinionated suggestions (worth discussing). They mostly have to do with not over-indexing on generality:
|
c6b7478
to
e4586c4
Compare
Rebased to pick up a Clippy fix, and pushed another commit to get rid of the |
Some notes before I stop thinking about this for a while: On my machine, the IDPF key generation benchmark is consistently running at a throughput of ~75KiB/s. Right now, the evaluation benchmark has a throughput of ~100 KiB/s on 1-byte inputs, ~80 KiB/s on 16-byte inputs, and ~20 KiB/s on 256-byte inputs. Profiling the largest case shows that CPU time is dominated by hashing bit slices for cache lookups and insertions. With a better designed cache, we could keep the throughput from dropping so much. Since IDPF evaluations will be almost certainly part of a tree walking algorithm, I think a cache built on a very small ring buffer, with lookups done via a linear scan, would make sense. So long as the ring buffer capacity is more than 1.5x the number of candidate prefixes being considered, LRU eviction from the ring buffer would work great. We can make sure our eventual replacement for On the other hand, production deployments will either bring their own cache implementation, backed by durable/distributed storage, or choose to trade off recomputing seeds instead of storing them. Perhaps the performance of the cache implementations Separately, I have an idea for an optimization for the evaluation side. Currently, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've started digging in. I think the first step is to carve out a PR for the Field255 implementation; see inline comments.
@@ -199,6 +181,26 @@ pub trait FieldElement: | |||
} | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pow()
and inv()
are defined for any field, not just FFT-friendly ones. The only methods that are FFT-specific are generator_order()
, generator()
, and root()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, I had just moved those two because Poplar wouldn't need them (it just squares), so "FFT friendly field element" is a slight misnomer because of that. I could implement those two for a cleaner API boundary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, I would not have Field255
implement them. Rather, I would just stick a todo!("implement when needed")
in there.
Cargo.toml
Outdated
@@ -11,10 +11,12 @@ resolver = "2" | |||
|
|||
[dependencies] | |||
aes = { version = "0.8.2", optional = true } | |||
bitvec = "1.0.1" # TODO: replace this with a simplified bit vector struct after getting a working prototype together |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, re-implementing bitvec is not necessarily the right choice. At least while IDPF is behind the experimental flag, it would be worth keeping this dependency.
src/field255.rs
Outdated
@@ -0,0 +1,305 @@ | |||
//! Finite field arithmetic for `GF(2^255 - 19)`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping there would be a higher-level API for arithmetic in this field that we could take advantage of, i.e., that we wouldn't use fiat-crypto directly. However after doing some research, this appears not to be the case.
The best rust implementation of GF(2^255-19) is dalek, but they don't expose the field arithemtic directly. I also noticed that the code here is essentially a re-implementation of dalek's old fiat backend. As I understand it, maintenance has moved to the zkcrypto project (https://github.com/zkcrypto/curve25519-dalek-ng). Interestingly, there the maintainers have removed the fiat backend entirely.
I'm not sure what the best path forward here would be. If the zkcrypto folks are willing to consider exposing the field arithmetic (zkcrypto/curve25519-dalek-ng#20), then I think that would be our best bet. In the meantime, re-implementing the fiat backend is probably our best option.
This code is delicate and will require careful review. Let's move splitting the FFT functionality of FieldElement
and implementing Field255
into a separate PR, then rebase this PR on top. I will see if I can get some feedback from folks on my team who have experience with fiat-crypto.
src/vdaf/idpf.rs
Outdated
@@ -0,0 +1,1620 @@ | |||
//! This module implements the incremental distributed point function (IDPF) described in | |||
//! [[BBCG+21]] and [[draft-irtf-cfrg-vdaf-03]]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec is based on BBCG+21, so I would just say "This module implements the IDPF described in [[draft-irtf-cfrg-vdaf-03]]`.
src/vdaf/idpf.rs
Outdated
@@ -0,0 +1,1620 @@ | |||
//! This module implements the incremental distributed point function (IDPF) described in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this module go in the root of the crate? Like flp
, it's not necessarily VDAF-specific.
src/vdaf/idpf.rs
Outdated
Leaf([FL; OUT_LEN]), | ||
} | ||
|
||
/// An additive secret share of an IDPF output. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the rationale for having a different type for "secret shared" outputs? Mathematically speaking, IdpfOutput
and IdpfOutputShare
are necessarily the same. In my opinion, this distinction doesn't need to be made a this level of abstraction, and in fact can get in the way. (C.f. the combinatorial blow up in merge()
method below.)
FWIW, @tgeoghegan and I had a similar debate about whether to define a Polynomial
type in order to distinguish it from a vector. My point of view is that the cost of this abstraction is too high.
src/vdaf/idpf.rs
Outdated
|
||
/// An output from evaluation of an IDPF at some level and index. | ||
#[derive(Debug, PartialEq, Eq)] | ||
pub enum IdpfOutput<const OUT_LEN: usize, FI, FL> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a heads up: If/when we decide to implement Doplar, we're going to need to generalize the type here: Instead of a vector over a field, we will have a mixture of field elements and bit strings. (Mathematically speaking, all that is required is that the output forms a group.)
I don't think we should change anything now, just keep in mind that we'd have to refactor at some point.
src/vdaf/idpf.rs
Outdated
#[derive(Debug, Clone)] | ||
pub struct IdpfPoplar<P, const L: usize, const OUT_LEN: usize> { | ||
bits: usize, | ||
_phantom: PhantomData<(P, [(); OUT_LEN])>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The compiler seems to be fine with not using OUT_LEN
.
_phantom: PhantomData<(P, [(); OUT_LEN])>, | |
_phantom: PhantomData<P>, |
src/vdaf/idpf.rs
Outdated
_phantom: PhantomData<(P, [(); OUT_LEN])>, | ||
} | ||
|
||
impl<P, const L: usize, const OUT_LEN: usize> IdpfPoplar<P, L, OUT_LEN> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple observations:
- The only only stored by
IpdfPoplar
is the bit length. - The type is quite complex, although the complexity is not actually used.
I wonder if the public methods on IdpfPoplar
(gen
and eval
) should just be standalone functions that take the bit length as input.
src/vdaf/idpf.rs
Outdated
|
||
let mut byte = [0u8]; | ||
seed_stream.fill(&mut byte); | ||
let control_bits = [(byte[0] & 1) != 0, ((byte[0] >> 1) & 1) != 0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm slightly worried about the comparisons here (i.e., x != 0
) being a side channel. It should be sufficient to represent the control bits as u8
's right?
src/vdaf/idpf.rs
Outdated
control_bits_1[keep] ^ (previous_control_bits[1] & control_bit_correction_words[keep]); | ||
|
||
let seeds_corrected = [ | ||
if previous_control_bits[0] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to rewrite this code to be constant time, i.e., not branch on the value of the secret control bits. (I'm making a note to adjust the spec accordingly.) One idea is to convert the control bits to bitmasks via wrapping subtraction:
let control_bit_mask_0 = 0_u8.wrapping_sub(previous_control_bits[0] as u8);
let control_bit_mask_1 = 0_u8.wrapping_sub(previous_control_bits[1] as u8);
for i in 0..L {
seed_0[keep].0[i] ^= seed_correction_word.0[i] & control_bit_mask_0;
seed_1[keep].0[i] ^= seed_correction_word.0[i] & control_bit_mask_1;
}
src/vdaf/idpf.rs
Outdated
} | ||
|
||
/// An auxiliary function that acts as a pseudorandom generator, returning field elements. | ||
fn convert<F>(&self, seed: &Seed<L>) -> (Seed<L>, [F; OUT_LEN]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: It might be easier to oparate on [u8; L]
directly rather than Seed<L>
.
As an intermediate step, I think we could just run the Coq code ourselves to stamp out functions implementing field math for our fields. One of the binaries produces implementations using Montgomery representations, much like our existing code. The prime modulus can be specified, along with limb size, and the list of functions we want. The invocation would look like:
|
Yup! Let's hold off for now, however. |
f5f80bd
to
ed8c1c9
Compare
* Split FFT support out from FieldElement trait * Move Integer associated type to a separate trait * Implement Field255 * Use Shl<usize>/Shr<usize> on F::Integer
a93a5e3
to
553ea30
Compare
dc6e149
to
024b2f7
Compare
553ea30
to
1ee21ca
Compare
This is a working prototype of IdpfPoplar, as specified in VDAF-03. Key generation and evaluation work, and the codec traits are implemented for the public share.
This needs the following issues to be addressed before it's production-ready:
fiat-crypto
is used forField255
. This could be easily replaced with a handwritten type using fouru64
limbs.bitvec
is used for storage of IDPF inputs, to simplify looking up bits in their semantic order, without fiddling with manual shifts. This could be replaced with much simpler types, since we don't need the ability to mutate bit vectors, etc.bitvec
is also used in serialization/deserialization of packed control bit correction words, for convenience.ToyIdpf
and the existing, olderPoplar1
implementation. I renamed the existingIdpf
trait and some existing types to avoid having to change them too much. These will need to be overhauled to work with the complete Poplar1 specification in VDAF-03. I think it might make sense to wait and mergeIdpfPoplar
and the newPoplar1
together once both are ready.Other notes: I split the$p=2^{255}-19$ is not FFT-friendly, thus certain methods don't make sense on it, and so that I could avoid implementing field operations that IdpfPoplar and Poplar1 do not use.
FieldElement
trait in two, both because