Last 12 weeks · 0 commits
4 of 6 standards met
Quick heads up first: I built this on plonky2, not Winterfell. But it runs over the same field (your is the 2^64 - 2^32 + 1 Goldilocks prime I use), and the moving parts line up with what , , and the Merkle tree in do, so this felt like the right place to ask. The idea: a DEEP-FRI STARK where none of the stages that normally hold the whole trace actually keep it in RAM. They stream to disk instead, so peak memory ends up tracking a tile size rather than the trace length. Roughly how each stage goes: NTT/LDE uses the four-step (Bailey) split, N = n1*n2: transform the rows, apply twiddles, transform the columns, transpose. Every pass only touches one row or column, so the working set is O(sqrt N). The matrix sits on NVMe and tiles move through a fixed buffer. The Merkle tree is built bottom-up off a leaf file, one chunk per pread, so it never fully lives in memory the way a does. Leaves are batched, one leaf hashing a group of columns at a row. FRI reads each round's file and writes the next. I commit every other layer and let the verifier recompute the skipped one. There's a page-cache flusher too, because under a hard cgroup with swap off the dirty pages alone will OOM you even when the heap is flat. The thing I was careful about: the out-of-core proof comes out byte-for-byte identical to an in-core run, and a test checks that on every AIR (the in-core path is itself checked against plonky2). So streaming only moves where the work happens. And every memory number is measured under a real systemd cgroup with swap off, so "it survived" is something I watched, not a model. Some numbers under a 256 MB cgroup: 2^28 NTT, 2 GB of data: in-core gets killed, out-of-core runs in 40 MB with the same checksum. A full FRI low-degree proof from 2^26 to 2^28 (512 MB up to 2 GB): in-core OOMs every time, out-of-core stays flat at 34 MB. A Fibonacci STARK at 2^22, around 7 GB in core: killed in core, 45 MB and 61 s out of core. A sponge hash proving Sponge(public msg) = digest, 54 columns: killed in core, 121 MB out of core. The resident set follows the tile, not N. The CPU side is already parallel (a 2^24 self-proof drops from 65.7 s to 11.2 s on 16 cores), and the real cost is the Poseidon hashing in the commitment, not the I/O. What I actually want to know: I'm Goldilocks-only and on plonky2, so this is a design question, not a PR. Winterfell's and assume the trace lives in memory with random row access, and the is fully resident. Is there room in that abstraction for a disk-backed trace and a streaming commitment, or do the trait signatures bake the in-memory assumption in too deeply? And if it did fit, would a streaming prover ever belong in Winterfell, or only as an external thing built on the traits? Upfront about the gaps: it's a research artifact. Small AIRs (they do go up to the sponge), Goldilocks only, a research Poseidon rather than your Rescue, and since the bottleneck is the hash, the obvious next step is GPU hashing, which is the one part I haven't had the hardware to try. Writeup with diagrams and a recording of the cgroup run: https://dev.to/nzengi/a-zk-provers-ram-is-a-dial-not-a-wall-3blg Code: https://github.com/nzengi/zk-stream
In winterfell/air/src/proof/table.rs, the comment and assertion's message requires num_rows and num_cols Result { assert!(num_rows > 0, "number of rows must be greater than 0"); assert!( num_rows 0, "number of columns must be greater than 0"); assert!( num_cols < MAX_ROWS, "number of columns cannot exceed {MAX_COLS}, but was {num_cols}" ); let mut reader = SliceReader::new(bytes); let num_elements = num_rows * num_cols; Ok(Self { data: reader.read_many(num_elements)?, row_width: num_cols, }) } ```
uninit_vector previously returned Vec with uninitialized memory, which is undefined behavior (especially for types implementing Drop). Change return type to Vec> and add assume_init_vec helper for zero-cost conversion after initialization. Update all ~30 call sites across math, crypto, fri, prover, and examples crates. Closes #396
Repository: facebook/winterfell. Description: A STARK prover and verifier for arbitrary computations Stars: 897, Forks: 231. Primary language: Rust. Languages: Rust (99.7%), HTML (0.1%), Makefile (0.1%), Shell (0%). License: MIT. Open PRs: 19, open issues: 48. Last activity: 11mo ago. Community health: 75%. Top contributors: irakliyk, Nashtare, Al-Kindi-0, 0xkanekiken, plafer, hackaugusto, grjte, andrewmilson, Jasleen1, Fumuran and others.