Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement IdpfPoplar #381

Closed

Conversation

divergentdave
Copy link
Collaborator

This is a working prototype of IdpfPoplar, as specified in VDAF-03. Key generation and evaluation work, and the codec traits are implemented for the public share.

This needs the following issues to be addressed before it's production-ready:

  • Cut down on dependencies as needed. (I was liberal about adding dependencies to get things working, with the intention of replacing some or all of them)
    • fiat-crypto is used for $GF(2^{255}-19)$ field math. While the formal methods work behind this provides a great deal of assurance, we only need a tiny slice of the library. This dependency could be troublesome for consumers that are conservative about crypto libraries.
    • A crate from Parity's Ethereum ecosystem is used to implement a 256-bit unsigned integer type. This is needed for an associated type on Field255. This could be easily replaced with a handwritten type using four u64 limbs.
    • bitvec is used for storage of IDPF inputs, to simplify looking up bits in their semantic order, without fiddling with manual shifts. This could be replaced with much simpler types, since we don't need the ability to mutate bit vectors, etc. bitvec is also used in serialization/deserialization of packed control bit correction words, for convenience.
  • Implement a stateful API, so that keys and control bits from previous evaluations can be reused in later evaluations. This will be necessary for efficient evaluation over practical size inputs, as this memoization will bring runtime from $O(n^2)$ to $O(n)$.
  • Clean up ToyIdpf and the existing, older Poplar1 implementation. I renamed the existing Idpf trait and some existing types to avoid having to change them too much. These will need to be overhauled to work with the complete Poplar1 specification in VDAF-03. I think it might make sense to wait and merge IdpfPoplar and the new Poplar1 together once both are ready.
  • We should have benchmarks for Poplar1.

Other notes: I split the FieldElement trait in two, both because $p=2^{255}-19$ is not FFT-friendly, thus certain methods don't make sense on it, and so that I could avoid implementing field operations that IdpfPoplar and Poplar1 do not use.

@cjpatton
Copy link
Collaborator

Heeerree we go!!! 🛵

A few high-level, somewhat opinionated suggestions (worth discussing). They mostly have to do with not over-indexing on generality:

  1. Implementing finite field arithmetic is tricky, and I think we're better off using implementations that already exist. In particular, I think that a dependency on fiat-crypto for Field255 is perfectly fine. In fact, ideally we would upstream implementations of Field64 and Field128 to fiat-crypto :)
  2. The IDPF should be moved into its own module, similar to how FLP has its own module. The reason is that this thing is likely to be useful independent of Poplar1.
  3. I don't think it's worth defining a trait for IDPF, as we will in all likelihood end up with a single instantiation of it. Accordingly, I think we should drop ToyIdpf altogether.

@divergentdave
Copy link
Collaborator Author

Rebased to pick up a Clippy fix, and pushed another commit to get rid of the primitive-types dependency.

@divergentdave
Copy link
Collaborator Author

Some notes before I stop thinking about this for a while:

On my machine, the IDPF key generation benchmark is consistently running at a throughput of ~75KiB/s.

Right now, the evaluation benchmark has a throughput of ~100 KiB/s on 1-byte inputs, ~80 KiB/s on 16-byte inputs, and ~20 KiB/s on 256-byte inputs. Profiling the largest case shows that CPU time is dominated by hashing bit slices for cache lookups and insertions.

With a better designed cache, we could keep the throughput from dropping so much. Since IDPF evaluations will be almost certainly part of a tree walking algorithm, I think a cache built on a very small ring buffer, with lookups done via a linear scan, would make sense. So long as the ring buffer capacity is more than 1.5x the number of candidate prefixes being considered, LRU eviction from the ring buffer would work great.

We can make sure our eventual replacement for bitvec is more sympathetic to comparison. (I see that the bitvec hashing code uses a stack of iterators, and calls Hasher::write_u8() many times)

On the other hand, production deployments will either bring their own cache implementation, backed by durable/distributed storage, or choose to trade off recomputing seeds instead of storing them. Perhaps the performance of the cache implementations prio provides only matters to the extent they get out of the way of our benchmarks.

Separately, I have an idea for an optimization for the evaluation side. Currently, eval_next() computes the IDPF's values, i.e. the pair of field elements, for every level. eval_inner() discards these from all but the last eval_next() call. This is approximately one wasted field addition and one wasted field negation per level, if no caching is used, though no field computations are wasted if a perfect cache is used. With a little bit of rearranging, we could pull application of the field element correction words out of eval_next(), and only run this at the end of eval_inner(), for the level where we know we will use the result.

Copy link
Collaborator

@cjpatton cjpatton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've started digging in. I think the first step is to carve out a PR for the Field255 implementation; see inline comments.

@@ -199,6 +181,26 @@ pub trait FieldElement:
}
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pow() and inv() are defined for any field, not just FFT-friendly ones. The only methods that are FFT-specific are generator_order(), generator(), and root().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I had just moved those two because Poplar wouldn't need them (it just squares), so "FFT friendly field element" is a slight misnomer because of that. I could implement those two for a cleaner API boundary.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I would not have Field255 implement them. Rather, I would just stick a todo!("implement when needed") in there.

Cargo.toml Outdated
@@ -11,10 +11,12 @@ resolver = "2"

[dependencies]
aes = { version = "0.8.2", optional = true }
bitvec = "1.0.1" # TODO: replace this with a simplified bit vector struct after getting a working prototype together
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, re-implementing bitvec is not necessarily the right choice. At least while IDPF is behind the experimental flag, it would be worth keeping this dependency.

src/field255.rs Outdated
@@ -0,0 +1,305 @@
//! Finite field arithmetic for `GF(2^255 - 19)`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping there would be a higher-level API for arithmetic in this field that we could take advantage of, i.e., that we wouldn't use fiat-crypto directly. However after doing some research, this appears not to be the case.

The best rust implementation of GF(2^255-19) is dalek, but they don't expose the field arithemtic directly. I also noticed that the code here is essentially a re-implementation of dalek's old fiat backend. As I understand it, maintenance has moved to the zkcrypto project (https://github.com/zkcrypto/curve25519-dalek-ng). Interestingly, there the maintainers have removed the fiat backend entirely.

I'm not sure what the best path forward here would be. If the zkcrypto folks are willing to consider exposing the field arithmetic (zkcrypto/curve25519-dalek-ng#20), then I think that would be our best bet. In the meantime, re-implementing the fiat backend is probably our best option.

This code is delicate and will require careful review. Let's move splitting the FFT functionality of FieldElement and implementing Field255 into a separate PR, then rebase this PR on top. I will see if I can get some feedback from folks on my team who have experience with fiat-crypto.

src/vdaf/idpf.rs Outdated
@@ -0,0 +1,1620 @@
//! This module implements the incremental distributed point function (IDPF) described in
//! [[BBCG+21]] and [[draft-irtf-cfrg-vdaf-03]].
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec is based on BBCG+21, so I would just say "This module implements the IDPF described in [[draft-irtf-cfrg-vdaf-03]]`.

src/vdaf/idpf.rs Outdated
@@ -0,0 +1,1620 @@
//! This module implements the incremental distributed point function (IDPF) described in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this module go in the root of the crate? Like flp, it's not necessarily VDAF-specific.

src/vdaf/idpf.rs Outdated
Leaf([FL; OUT_LEN]),
}

/// An additive secret share of an IDPF output.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale for having a different type for "secret shared" outputs? Mathematically speaking, IdpfOutput and IdpfOutputShare are necessarily the same. In my opinion, this distinction doesn't need to be made a this level of abstraction, and in fact can get in the way. (C.f. the combinatorial blow up in merge() method below.)

FWIW, @tgeoghegan and I had a similar debate about whether to define a Polynomial type in order to distinguish it from a vector. My point of view is that the cost of this abstraction is too high.

src/vdaf/idpf.rs Outdated

/// An output from evaluation of an IDPF at some level and index.
#[derive(Debug, PartialEq, Eq)]
pub enum IdpfOutput<const OUT_LEN: usize, FI, FL> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a heads up: If/when we decide to implement Doplar, we're going to need to generalize the type here: Instead of a vector over a field, we will have a mixture of field elements and bit strings. (Mathematically speaking, all that is required is that the output forms a group.)

I don't think we should change anything now, just keep in mind that we'd have to refactor at some point.

src/vdaf/idpf.rs Outdated
#[derive(Debug, Clone)]
pub struct IdpfPoplar<P, const L: usize, const OUT_LEN: usize> {
bits: usize,
_phantom: PhantomData<(P, [(); OUT_LEN])>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler seems to be fine with not using OUT_LEN.

Suggested change
_phantom: PhantomData<(P, [(); OUT_LEN])>,
_phantom: PhantomData<P>,

src/vdaf/idpf.rs Outdated
_phantom: PhantomData<(P, [(); OUT_LEN])>,
}

impl<P, const L: usize, const OUT_LEN: usize> IdpfPoplar<P, L, OUT_LEN>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple observations:

  1. The only only stored by IpdfPoplar is the bit length.
  2. The type is quite complex, although the complexity is not actually used.

I wonder if the public methods on IdpfPoplar (gen and eval) should just be standalone functions that take the bit length as input.

src/vdaf/idpf.rs Outdated

let mut byte = [0u8];
seed_stream.fill(&mut byte);
let control_bits = [(byte[0] & 1) != 0, ((byte[0] >> 1) & 1) != 0];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly worried about the comparisons here (i.e., x != 0) being a side channel. It should be sufficient to represent the control bits as u8's right?

src/vdaf/idpf.rs Outdated
control_bits_1[keep] ^ (previous_control_bits[1] & control_bit_correction_words[keep]);

let seeds_corrected = [
if previous_control_bits[0] {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to rewrite this code to be constant time, i.e., not branch on the value of the secret control bits. (I'm making a note to adjust the spec accordingly.) One idea is to convert the control bits to bitmasks via wrapping subtraction:

let control_bit_mask_0 = 0_u8.wrapping_sub(previous_control_bits[0] as u8);
let control_bit_mask_1 = 0_u8.wrapping_sub(previous_control_bits[1] as u8);
for i in 0..L {
    seed_0[keep].0[i] ^= seed_correction_word.0[i] & control_bit_mask_0;
    seed_1[keep].0[i] ^= seed_correction_word.0[i] & control_bit_mask_1;
}

src/vdaf/idpf.rs Outdated
}

/// An auxiliary function that acts as a pseudorandom generator, returning field elements.
fn convert<F>(&self, seed: &Seed<L>) -> (Seed<L>, [F; OUT_LEN])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: It might be easier to oparate on [u8; L] directly rather than Seed<L>.

@divergentdave
Copy link
Collaborator Author

  1. Implementing finite field arithmetic is tricky, and I think we're better off using implementations that already exist. In particular, I think that a dependency on fiat-crypto for Field255 is perfectly fine. In fact, ideally we would upstream implementations of Field64 and Field128 to fiat-crypto :)

As an intermediate step, I think we could just run the Coq code ourselves to stamp out functions implementing field math for our fields. One of the binaries produces implementations using Montgomery representations, much like our existing code. The prime modulus can be specified, along with limb size, and the list of functions we want. The invocation would look like:

src/ExtractionOCaml/word_by_word_montgomery --lang Rust --inline field64 64 '18446744069414584321' mul square add sub opp from_montgomery to_montgomery ...

@cjpatton
Copy link
Collaborator

cjpatton commented Jan 6, 2023

  1. Implementing finite field arithmetic is tricky, and I think we're better off using implementations that already exist. In particular, I think that a dependency on fiat-crypto for Field255 is perfectly fine. In fact, ideally we would upstream implementations of Field64 and Field128 to fiat-crypto :)

As an intermediate step, I think we could just run the Coq code ourselves to stamp out functions implementing field math for our fields. One of the binaries produces implementations using Montgomery representations, much like our existing code. The prime modulus can be specified, along with limb size, and the list of functions we want. The invocation would look like:

src/ExtractionOCaml/word_by_word_montgomery --lang Rust --inline field64 64 '18446744069414584321' mul square add sub opp from_montgomery to_montgomery ...

Yup! Let's hold off for now, however.

@cjpatton cjpatton mentioned this pull request Jan 6, 2023
@divergentdave divergentdave changed the base branch from main to david/field255 January 12, 2023 23:56
@divergentdave divergentdave force-pushed the david/idpf branch 2 times, most recently from f5f80bd to ed8c1c9 Compare January 13, 2023 00:04
* Split FFT support out from FieldElement trait
* Move Integer associated type to a separate trait
* Implement Field255
* Use Shl<usize>/Shr<usize> on F::Integer
@divergentdave divergentdave deleted the branch divviup:david/field255 January 13, 2023 23:19
@divergentdave divergentdave deleted the david/idpf branch January 13, 2023 23:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants