Multiple optimizations of vector copies, spans, and Snappy API raw compression#222
avalerio-tkd
commented
Mar 4, 2026
- Multiple optimizations to switch to using tcb::span instead of std::vector.
- Removed unnecessary copied returns of std::vector values.
- Switched to the more performant Snappy raw API for compression.
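To illustrate why passing a span instead of a `std::vector` avoids copies, here is a minimal sketch. `ByteView`, `MakeView`, `Checksum`, and `Subview` are hypothetical names invented for this example; `ByteView` is a hand-rolled stand-in for `tcb::span<const uint8_t>` so the snippet compiles without the third-party header, and it is not the code from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for tcb::span<const uint8_t>: a non-owning
// pointer + length pair. It never allocates and never copies the payload.
struct ByteView {
    const std::uint8_t* data;
    std::size_t size;
};

// Build a view over an existing vector. The vector must outlive the view.
inline ByteView MakeView(const std::vector<std::uint8_t>& v) {
    return ByteView{v.data(), v.size()};
}

// Taking the view by value copies two machine words, not the bytes.
inline std::uint32_t Checksum(ByteView bytes) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < bytes.size; ++i) {
        sum += bytes.data[i];
    }
    return sum;
}

// Slicing yields another view into the same storage, again without copying.
inline ByteView Subview(ByteView bytes, std::size_t offset, std::size_t count) {
    assert(offset <= bytes.size && count <= bytes.size - offset);
    return ByteView{bytes.data + offset, count};
}
```

A caller would write something like `Checksum(Subview(MakeView(buffer), 2, 3))`, where the equivalent vector-based signature would have to materialize a new `std::vector` for the slice.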
argmarco-tkd
left a comment
Thanks for this. Overall LGTM. Left a few questions, none of which are blockers if they are not red flags or concerns for you. Approving.
snappy::Compress(reinterpret_cast<const char*>(bytes.data()), bytes.size(), &compressed);
return std::vector<uint8_t>(compressed.begin(), compressed.end());
std::vector<uint8_t> out_buffer;
out_buffer.resize(snappy::MaxCompressedLength(bytes.size()));
Just how large can this get? (In other words, is there a risk of this causing an OOM because the estimated compressed length is too large?)
Ah, this is the MaxCompressedLength for this particular byte array, not an absolute max. It is a heuristic upper bound derived from the input size.
Yeah, the question was more about how 'accurate' the heuristic is. I also checked: the operation is very quick, and the heuristic is fairly decent. All good.
https://github.com/google/snappy/blob/main/snappy.cc#L197-L219
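For readers following the sizing discussion, here is a rough sketch of the allocate-then-shrink pattern. Everything in it is an assumption for illustration: `MaxCompressedLengthSketch` mirrors the bound in the linked snappy.cc as it reads at the time of writing (32 + n + n/6), and `StandInRawCompress` is a hypothetical placeholder that takes the same parameters as `snappy::RawCompress` but only copies the input, so the snippet builds without linking Snappy. It is not the PR's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Local mirror of the worst-case bound (assumption: 32 + n + n/6, as in the
// linked source at the time of writing). Overhead is about 17% plus 32 bytes.
inline std::size_t MaxCompressedLengthSketch(std::size_t n) {
    return 32 + n + n / 6;
}

// Hypothetical stand-in with the parameter shape of snappy::RawCompress.
// It stores the input verbatim; a real build would call the Snappy function.
inline void StandInRawCompress(const char* input, std::size_t input_length,
                               char* compressed, std::size_t* compressed_length) {
    if (input_length > 0) {
        std::memcpy(compressed, input, input_length);
    }
    *compressed_length = input_length;
}

// Allocate the worst-case buffer once, write into it directly, then shrink
// to the number of bytes actually produced. No intermediate std::string.
inline std::vector<std::uint8_t> CompressSketch(const std::vector<std::uint8_t>& bytes) {
    std::vector<std::uint8_t> out_buffer(MaxCompressedLengthSketch(bytes.size()));
    std::size_t written = 0;
    StandInRawCompress(reinterpret_cast<const char*>(bytes.data()), bytes.size(),
                       reinterpret_cast<char*>(out_buffer.data()), &written);
    assert(written <= out_buffer.size());
    out_buffer.resize(written);  // shrinking keeps the data, no reallocation needed
    return out_buffer;
}
```

Under this bound, a 1 MiB input reserves roughly 1.17 MiB before shrinking, so the temporary overshoot scales with the input rather than being a fixed large allocation.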
- Improvements from code review.
avalerio-tkd
left a comment
Thanks for all the comments, @argmarco-tkd, especially the dedup of Split. Merging.