Implement IdpfPoplar #381

divergentdave · 2022-12-13T00:57:43Z

This is a working prototype of IdpfPoplar, as specified in VDAF-03. Key generation and evaluation work, and the codec traits are implemented for the public share.

This needs the following issues to be addressed before it's production-ready:

- Cut down on dependencies as needed. (I was liberal about adding dependencies to get things working, with the intention of replacing some or all of them.)
  - fiat-crypto is used for $GF(2^{255}-19)$ field math. While the formal methods work behind this provides a great deal of assurance, we only need a tiny slice of the library. This dependency could be troublesome for consumers that are conservative about crypto libraries.
  - A crate from Parity's Ethereum ecosystem is used to implement a 256-bit unsigned integer type. This is needed for an associated type on Field255. This could be easily replaced with a handwritten type using four u64 limbs.
  - bitvec is used for storage of IDPF inputs, to simplify looking up bits in their semantic order, without fiddling with manual shifts. This could be replaced with much simpler types, since we don't need the ability to mutate bit vectors, etc. bitvec is also used in serialization/deserialization of packed control bit correction words, for convenience.
- Implement a stateful API, so that keys and control bits from previous evaluations can be reused in later evaluations. This will be necessary for efficient evaluation over practical size inputs, as this memoization will bring runtime from $O(n^2)$ to $O(n)$.
- Clean up ToyIdpf and the existing, older Poplar1 implementation. I renamed the existing Idpf trait and some existing types to avoid having to change them too much. These will need to be overhauled to work with the complete Poplar1 specification in VDAF-03. I think it might make sense to wait and merge IdpfPoplar and the new Poplar1 together once both are ready.
- We should have benchmarks for Poplar1.
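
To make the complexity claim in the stateful-API item concrete: evaluating every prefix of an n-bit input from the root costs 1 + 2 + ... + n = n(n+1)/2 level expansions, while reusing the state of the parent prefix costs n. The sketch below only counts expansions. `expand` is a hypothetical stand-in for one level of IDPF evaluation (it is not the real PRG), and the function names are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one level of IDPF evaluation: derives the child
/// state from the parent state and the next input bit, and counts the call.
fn expand(state: u64, bit: bool, steps: &mut usize) -> u64 {
    *steps += 1;
    state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(bit as u64 + 1)
}

/// Evaluate every prefix of `bits`, restarting from the root each time.
fn eval_prefixes_stateless(root: u64, bits: &[bool]) -> (Vec<u64>, usize) {
    let mut steps = 0;
    let mut out = Vec::new();
    for len in 1..=bits.len() {
        let mut state = root;
        for &bit in &bits[..len] {
            state = expand(state, bit, &mut steps);
        }
        out.push(state);
    }
    (out, steps)
}

/// Evaluate every prefix of `bits`, reusing the cached state of the parent prefix.
fn eval_prefixes_cached(root: u64, bits: &[bool]) -> (Vec<u64>, usize) {
    let mut steps = 0;
    let mut cache: HashMap<Vec<bool>, u64> = HashMap::new();
    let mut out = Vec::new();
    for len in 1..=bits.len() {
        // The empty prefix is never cached, so level 1 starts from the root.
        let parent = cache.get(&bits[..len - 1]).copied().unwrap_or(root);
        let state = expand(parent, bits[len - 1], &mut steps);
        cache.insert(bits[..len].to_vec(), state);
        out.push(state);
    }
    (out, steps)
}
```

Both functions return the same states; only the number of expansions differs.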

Other notes: I split the FieldElement trait in two, both because $p=2^{255}-19$ is not FFT-friendly, thus certain methods don't make sense on it, and so that I could avoid implementing field operations that IdpfPoplar and Poplar1 do not use.

cjpatton · 2022-12-13T01:27:35Z

Heeerree we go!!! 🛵

A few high-level, somewhat opinionated suggestions (worth discussing). They mostly have to do with not over-indexing on generality:

- Implementing finite field arithmetic is tricky, and I think we're better off using implementations that already exist. In particular, I think that a dependency on fiat-crypto for Field255 is perfectly fine. In fact, ideally we would upstream implementations of Field64 and Field128 to fiat-crypto :)
- The IDPF should be moved into its own module, similar to how FLP has its own module. The reason is that this thing is likely to be useful independent of Poplar1.
- I don't think it's worth defining a trait for IDPF, as we will in all likelihood end up with a single instantiation of it. Accordingly, I think we should drop ToyIdpf altogether.

divergentdave · 2022-12-21T16:16:41Z

Rebased to pick up a Clippy fix, and pushed another commit to get rid of the primitive-types dependency.
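
For readers wondering what a handwritten replacement with four u64 limbs could look like, here is a minimal sketch. The type name `U256` and its methods are hypothetical and are not taken from the commit; only carry-propagating addition and ordering are shown.

```rust
use std::cmp::Ordering;

/// Hypothetical 256-bit unsigned integer stored as four little-endian u64 limbs
/// (limb 0 is the least significant).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct U256(pub [u64; 4]);

impl U256 {
    pub const ZERO: Self = Self([0; 4]);

    pub fn from_u64(x: u64) -> Self {
        Self([x, 0, 0, 0])
    }

    /// Addition with carry propagation; returns the wrapped sum and the carry-out.
    pub fn overflowing_add(self, rhs: Self) -> (Self, bool) {
        let mut out = [0u64; 4];
        let mut carry = false;
        for i in 0..4 {
            let (s1, c1) = self.0[i].overflowing_add(rhs.0[i]);
            let (s2, c2) = s1.overflowing_add(carry as u64);
            out[i] = s2;
            // At most one of c1 and c2 can be set for a given limb.
            carry = c1 | c2;
        }
        (Self(out), carry)
    }
}

impl Ord for U256 {
    fn cmp(&self, other: &Self) -> Ordering {
        // Compare from the most significant limb down.
        self.0.iter().rev().cmp(other.0.iter().rev())
    }
}

impl PartialOrd for U256 {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
```

Ordering is implemented by hand because a derived `Ord` would compare the least significant limb first.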

divergentdave · 2022-12-23T23:38:19Z

Some notes before I stop thinking about this for a while:

On my machine, the IDPF key generation benchmark is consistently running at a throughput of ~75 KiB/s.

Right now, the evaluation benchmark has a throughput of ~100 KiB/s on 1-byte inputs, ~80 KiB/s on 16-byte inputs, and ~20 KiB/s on 256-byte inputs. Profiling the largest case shows that CPU time is dominated by hashing bit slices for cache lookups and insertions.

With a better-designed cache, we could keep the throughput from dropping so much. Since IDPF evaluations will almost certainly be part of a tree-walking algorithm, I think a cache built on a very small ring buffer, with lookups done via a linear scan, would make sense. So long as the ring buffer capacity is more than 1.5x the number of candidate prefixes being considered, LRU eviction from the ring buffer would work great.
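
A minimal sketch of such a cache, assuming a small fixed capacity: lookups are a linear scan using only `PartialEq` (no hashing), a hit moves the entry to the back, and a full buffer evicts from the front, which gives LRU behaviour. `RingCache` and its methods are hypothetical names, not an API from this PR.

```rust
use std::collections::VecDeque;

/// Hypothetical small cache: linear-scan lookups, least recently used entry at the front.
pub struct RingCache<K, V> {
    capacity: usize,
    entries: VecDeque<(K, V)>,
}

impl<K: PartialEq, V: Clone> RingCache<K, V> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0);
        Self {
            capacity,
            entries: VecDeque::with_capacity(capacity),
        }
    }

    /// Linear scan; on a hit the entry moves to the back (most recently used).
    pub fn get(&mut self, key: &K) -> Option<V> {
        let idx = self.entries.iter().position(|(k, _)| k == key)?;
        let entry = self.entries.remove(idx)?;
        let value = entry.1.clone();
        self.entries.push_back(entry);
        Some(value)
    }

    /// Insert or replace; evicts the least recently used entry when full.
    pub fn insert(&mut self, key: K, value: V) {
        if let Some(idx) = self.entries.iter().position(|(k, _)| *k == key) {
            self.entries.remove(idx);
        } else if self.entries.len() == self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back((key, value));
    }
}
```

Moving an entry on a hit costs time linear in the capacity, which is acceptable only because the buffer is meant to be very small.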

We can make sure our eventual replacement for bitvec is more sympathetic to comparison. (I see that the bitvec hashing code uses a stack of iterators, and calls Hasher::write_u8() many times)
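
One possible shape for such a replacement, as a sketch only: store the bits packed MSB-first in bytes with the unused trailing bits zeroed, so that the derived `PartialEq` and `Hash` work on whole bytes rather than bit by bit. `PackedBits` and its methods are hypothetical names chosen for illustration.

```rust
/// Hypothetical packed bit string: bits are stored MSB-first and unused
/// trailing bits are always zero, so derived equality and hashing are canonical.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct PackedBits {
    bytes: Vec<u8>,
    len: usize,
}

impl PackedBits {
    pub fn from_bools(bits: &[bool]) -> Self {
        let mut bytes = vec![0u8; (bits.len() + 7) / 8];
        for (i, &bit) in bits.iter().enumerate() {
            bytes[i / 8] |= (bit as u8) << (7 - i % 8);
        }
        Self {
            bytes,
            len: bits.len(),
        }
    }

    pub fn len(&self) -> usize {
        self.len
    }

    /// Bit `i` in semantic order (bit 0 is the first bit of the input).
    pub fn get(&self, i: usize) -> Option<bool> {
        if i >= self.len {
            return None;
        }
        Some((self.bytes[i / 8] >> (7 - i % 8)) & 1 == 1)
    }

    /// The first `n` bits, with trailing bits of the last byte cleared.
    pub fn prefix(&self, n: usize) -> Self {
        let n = n.min(self.len);
        let mut bytes = self.bytes[..(n + 7) / 8].to_vec();
        if n % 8 != 0 {
            let last = bytes.len() - 1;
            bytes[last] &= 0xffu8 << (8 - n % 8);
        }
        Self { bytes, len: n }
    }
}
```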

On the other hand, production deployments will either bring their own cache implementation, backed by durable/distributed storage, or choose to trade off recomputing seeds instead of storing them. Perhaps the performance of the cache implementations prio provides only matters to the extent they get out of the way of our benchmarks.

Separately, I have an idea for an optimization for the evaluation side. Currently, eval_next() computes the IDPF's values, i.e. the pair of field elements, for every level. eval_inner() discards these from all but the last eval_next() call. This is approximately one wasted field addition and one wasted field negation per level, if no caching is used, though no field computations are wasted if a perfect cache is used. With a little bit of rearranging, we could pull application of the field element correction words out of eval_next(), and only run this at the end of eval_inner(), for the level where we know we will use the result.

cjpatton

I've started digging in. I think the first step is to carve out a PR for the Field255 implementation; see inline comments.

cjpatton · 2023-01-04T17:28:24Z

src/field.rs

@@ -199,6 +181,26 @@ pub trait FieldElement:
    }
 }



pow() and inv() are defined for any field, not just FFT-friendly ones. The only methods that are FFT-specific are generator_order(), generator(), and root().

True, I had just moved those two because Poplar wouldn't need them (it just squares), so "FFT friendly field element" is a slight misnomer because of that. I could implement those two for a cleaner API boundary.

For now, I would not have Field255 implement them. Rather, I would just stick a todo!("implement when needed") in there.
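
The boundary discussed in this thread can be sketched with two toy fields. This is an illustration under stated assumptions, not the crate's actual traits: the trait names, `F17`, and `F23` are hypothetical, and `root()` is omitted for brevity. `pow()` and `inv()` live on the base trait, the FFT-specific methods live on a sub-trait, and the non-FFT-friendly field stubs out what it does not need yet.

```rust
use std::fmt::Debug;

/// Operations that make sense in any prime field.
pub trait FieldElement: Copy + PartialEq + Debug {
    fn one() -> Self;
    fn mul(self, rhs: Self) -> Self;
    /// Square-and-multiply exponentiation; valid in any field.
    fn pow(self, exp: u64) -> Self {
        let (mut base, mut acc, mut exp) = (self, Self::one(), exp);
        while exp > 0 {
            if exp & 1 == 1 {
                acc = acc.mul(base);
            }
            base = base.mul(base);
            exp >>= 1;
        }
        acc
    }
    fn inv(self) -> Self;
}

/// Operations that only make sense when p - 1 is divisible by a large power of two.
pub trait FftFriendlyFieldElement: FieldElement {
    fn generator_order() -> u64;
    fn generator() -> Self;
}

/// Toy FFT-friendly field: p = 17, so p - 1 = 2^4.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F17(pub u64);

impl FieldElement for F17 {
    fn one() -> Self { F17(1) }
    fn mul(self, rhs: Self) -> Self { F17(self.0 * rhs.0 % 17) }
    fn inv(self) -> Self { self.pow(15) } // Fermat: x^(p-2)
}

impl FftFriendlyFieldElement for F17 {
    fn generator_order() -> u64 { 16 }
    fn generator() -> Self { F17(3) } // 3 has order 16 modulo 17
}

/// Toy field that is not FFT-friendly: p = 23, so p - 1 = 2 * 11.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F23(pub u64);

impl FieldElement for F23 {
    fn one() -> Self { F23(1) }
    fn mul(self, rhs: Self) -> Self { F23(self.0 * rhs.0 % 23) }
    fn inv(self) -> Self { todo!("implement when needed") }
}
```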

cjpatton · 2023-01-04T17:44:38Z

Cargo.toml

@@ -11,10 +11,12 @@ resolver = "2"

 [dependencies]
 aes = { version = "0.8.2", optional = true }
+bitvec = "1.0.1"  # TODO: replace this with a simplified bit vector struct after getting a working prototype together


IMO, re-implementing bitvec is not necessarily the right choice. At least while IDPF is behind the experimental flag, it would be worth keeping this dependency.

cjpatton · 2023-01-04T18:56:52Z

src/field255.rs

@@ -0,0 +1,305 @@
+//! Finite field arithmetic for `GF(2^255 - 19)`.


I was hoping there would be a higher-level API for arithmetic in this field that we could take advantage of, i.e., that we wouldn't use fiat-crypto directly. However after doing some research, this appears not to be the case.

The best Rust implementation of GF(2^255-19) is dalek, but they don't expose the field arithmetic directly. I also noticed that the code here is essentially a re-implementation of dalek's old fiat backend. As I understand it, maintenance has moved to the zkcrypto project (https://github.com/zkcrypto/curve25519-dalek-ng). Interestingly, there the maintainers have removed the fiat backend entirely.

I'm not sure what the best path forward here would be. If the zkcrypto folks are willing to consider exposing the field arithmetic (zkcrypto/curve25519-dalek-ng#20), then I think that would be our best bet. In the meantime, re-implementing the fiat backend is probably our best option.

This code is delicate and will require careful review. Let's move splitting the FFT functionality of FieldElement and implementing Field255 into a separate PR, then rebase this PR on top. I will see if I can get some feedback from folks on my team who have experience with fiat-crypto.

cjpatton · 2023-01-05T19:50:22Z

src/vdaf/idpf.rs

@@ -0,0 +1,1620 @@
+//! This module implements the incremental distributed point function (IDPF) described in
+//! [[BBCG+21]] and [[draft-irtf-cfrg-vdaf-03]].


The spec is based on BBCG+21, so I would just say "This module implements the IDPF described in [[draft-irtf-cfrg-vdaf-03]]".

cjpatton · 2023-01-05T19:55:47Z

src/vdaf/idpf.rs

@@ -0,0 +1,1620 @@
+//! This module implements the incremental distributed point function (IDPF) described in


Should this module go in the root of the crate? Like flp, it's not necessarily VDAF-specific.

cjpatton · 2023-01-05T20:08:44Z

src/vdaf/idpf.rs

+    Leaf([FL; OUT_LEN]),
+}
+
+/// An additive secret share of an IDPF output.


What's the rationale for having a different type for "secret shared" outputs? Mathematically speaking, IdpfOutput and IdpfOutputShare are necessarily the same. In my opinion, this distinction doesn't need to be made at this level of abstraction, and in fact can get in the way. (Cf. the combinatorial blow-up in the merge() method below.)

FWIW, @tgeoghegan and I had a similar debate about whether to define a Polynomial type in order to distinguish it from a vector. My point of view is that the cost of this abstraction is too high.

cjpatton · 2023-01-05T20:16:55Z

src/vdaf/idpf.rs

+
+/// An output from evaluation of an IDPF at some level and index.
+#[derive(Debug, PartialEq, Eq)]
+pub enum IdpfOutput<const OUT_LEN: usize, FI, FL> {


Just a heads up: If/when we decide to implement Doplar, we're going to need to generalize the type here: Instead of a vector over a field, we will have a mixture of field elements and bit strings. (Mathematically speaking, all that is required is that the output forms a group.)

I don't think we should change anything now, just keep in mind that we'd have to refactor at some point.

cjpatton · 2023-01-05T20:29:21Z

src/vdaf/idpf.rs

+#[derive(Debug, Clone)]
+pub struct IdpfPoplar<P, const L: usize, const OUT_LEN: usize> {
+    bits: usize,
+    _phantom: PhantomData<(P, [(); OUT_LEN])>,


The compiler seems to be fine with not using OUT_LEN.

Suggested change

-    _phantom: PhantomData<(P, [(); OUT_LEN])>,
+    _phantom: PhantomData<P>,
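
The reason this compiles is that Rust only requires type and lifetime parameters to be used in a struct's fields; const parameters carry no such requirement. A minimal sketch with a hypothetical `Example` type standing in for IdpfPoplar:

```rust
use std::marker::PhantomData;

/// Only the type parameter `P` needs a marker; `L` and `OUT_LEN` may go
/// unused in the fields and still be used in method signatures and bodies.
#[derive(Debug, Clone)]
pub struct Example<P, const L: usize, const OUT_LEN: usize> {
    bits: usize,
    _phantom: PhantomData<P>,
}

impl<P, const L: usize, const OUT_LEN: usize> Example<P, L, OUT_LEN> {
    pub fn new(bits: usize) -> Self {
        Self {
            bits,
            _phantom: PhantomData,
        }
    }

    pub fn bits(&self) -> usize {
        self.bits
    }

    /// The const parameter sizes the return type.
    pub fn output(&self) -> [u8; OUT_LEN] {
        [0; OUT_LEN]
    }

    pub fn seed_len(&self) -> usize {
        L
    }
}
```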

cjpatton · 2023-01-05T20:45:09Z

src/vdaf/idpf.rs

+    _phantom: PhantomData<(P, [(); OUT_LEN])>,
+}
+
+impl<P, const L: usize, const OUT_LEN: usize> IdpfPoplar<P, L, OUT_LEN>


A couple observations:

The only thing stored by IdpfPoplar is the bit length.

The type is quite complex, although the complexity is not actually used.

I wonder if the public methods on IdpfPoplar (gen and eval) should just be standalone functions that take the bit length as input.

cjpatton · 2023-01-05T21:05:08Z

src/vdaf/idpf.rs

+
+        let mut byte = [0u8];
+        seed_stream.fill(&mut byte);
+        let control_bits = [(byte[0] & 1) != 0, ((byte[0] >> 1) & 1) != 0];


I'm slightly worried about the comparisons here (i.e., x != 0) being a side channel. It should be sufficient to represent the control bits as u8s, right?
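
A sketch of the u8 representation, assuming control bits are kept as 0 or 1 and combined only with bitwise operators. The function names are hypothetical. This removes the explicit comparisons from the source, but it does not by itself guarantee what the compiler emits.

```rust
/// Split the two low bits of a PRG output byte into control bits held as u8 (0 or 1).
fn control_bits_from_byte(byte: u8) -> [u8; 2] {
    [byte & 1, (byte >> 1) & 1]
}

/// Apply a control bit correction word using only XOR and AND on u8 values.
fn correct_control_bit(bit: u8, previous_control_bit: u8, correction_word: u8) -> u8 {
    bit ^ (previous_control_bit & correction_word)
}
```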

cjpatton · 2023-01-05T21:22:19Z

src/vdaf/idpf.rs

+            control_bits_1[keep] ^ (previous_control_bits[1] & control_bit_correction_words[keep]);
+
+        let seeds_corrected = [
+            if previous_control_bits[0] {


We need to rewrite this code to be constant time, i.e., not branch on the value of the secret control bits. (I'm making a note to adjust the spec accordingly.) One idea is to convert the control bits to bitmasks via wrapping subtraction:

    let control_bit_mask_0 = 0_u8.wrapping_sub(previous_control_bits[0] as u8);
    let control_bit_mask_1 = 0_u8.wrapping_sub(previous_control_bits[1] as u8);
    for i in 0..L {
        seed_0[keep].0[i] ^= seed_correction_word.0[i] & control_bit_mask_0;
        seed_1[keep].0[i] ^= seed_correction_word.0[i] & control_bit_mask_1;
    }
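
The same masking idea as a self-contained function, for reference. `conditional_xor` is a hypothetical helper that works on plain byte arrays rather than the PR's Seed type. As with any hand-written masking, the optimizer is free to transform the code, so this is a sketch of the technique rather than a constant-time guarantee; a vetted crate such as `subtle` is the usual choice when that guarantee matters.

```rust
/// XOR `correction` into `seed` if `control_bit` is 1, and leave it unchanged
/// if `control_bit` is 0, without branching on the control bit.
fn conditional_xor<const L: usize>(seed: &mut [u8; L], correction: &[u8; L], control_bit: u8) {
    debug_assert!(control_bit <= 1);
    // 0 becomes 0x00 and 1 becomes 0xff.
    let mask = 0u8.wrapping_sub(control_bit);
    for (s, c) in seed.iter_mut().zip(correction.iter()) {
        *s ^= c & mask;
    }
}
```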

cjpatton · 2023-01-05T21:32:09Z

src/vdaf/idpf.rs

+    }
+
+    /// An auxiliary function that acts as a pseudorandom generator, returning field elements.
+    fn convert<F>(&self, seed: &Seed<L>) -> (Seed<L>, [F; OUT_LEN])


Suggestion: It might be easier to operate on [u8; L] directly rather than Seed<L>.

divergentdave · 2023-01-06T14:48:49Z

> Implementing finite field arithmetic is tricky, and I think we're better off using implementations that already exist. In particular, I think that a dependency on fiat-crypto for Field255 is perfectly fine. In fact, ideally we would upstream implementations of Field64 and Field128 to fiat-crypto :)

As an intermediate step, I think we could just run the Coq code ourselves to stamp out functions implementing field math for our fields. One of the binaries produces implementations using Montgomery representations, much like our existing code. The prime modulus can be specified, along with limb size, and the list of functions we want. The invocation would look like:

src/ExtractionOCaml/word_by_word_montgomery --lang Rust --inline field64 64 '18446744069414584321' mul square add sub opp from_montgomery to_montgomery ...

cjpatton · 2023-01-06T16:48:15Z

Yup! Let's hold off for now, however.

divergentdave force-pushed the david/idpf branch from c6b7478 to e4586c4 on December 21, 2022 16:15

cjpatton reviewed Jan 4, 2023

cjpatton mentioned this pull request Jan 5, 2023

Securely instantiating the PRG cfrg/draft-irtf-cfrg-vdaf#106

Closed

cjpatton reviewed Jan 5, 2023

cjpatton mentioned this pull request Jan 6, 2023

Implement Field255 #408

Merged

divergentdave changed the base branch from main to david/field255 January 12, 2023 23:56

divergentdave force-pushed the david/idpf branch 2 times, most recently from f5f80bd to ed8c1c9 on January 13, 2023 00:04

Add Field255, split FieldElement trait in three

553ea30

* Split FFT support out from FieldElement trait
* Move Integer associated type to a separate trait
* Implement Field255
* Use Shl<usize>/Shr<usize> on F::Integer

divergentdave force-pushed the david/field255 branch from a93a5e3 to 553ea30 on January 13, 2023 22:23

divergentdave added 6 commits January 13, 2023 16:34

Remove ToyIdpf, stub out Poplar1

bdd085e

Add bitvec dependency

79ca595

Implement IdpfPoplar

9e91765

Use two specialized conversions for IdpfInput

9967bd3

Reduce copying of bit vectors

e9ea490

Add dedicated IdpfPoplar test vector

024b2f7

divergentdave force-pushed the david/idpf branch from dc6e149 to 024b2f7 on January 13, 2023 22:39

divergentdave force-pushed the david/field255 branch from 553ea30 to 1ee21ca on January 13, 2023 23:10

divergentdave deleted the branch divviup:david/field255 January 13, 2023 23:19

divergentdave closed this Jan 13, 2023

divergentdave deleted the david/idpf branch January 13, 2023 23:20

divergentdave mentioned this pull request Jan 13, 2023

Implement IdpfPoplar #414

Merged
