Open
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)
Description
Summary
Design and run comprehensive benchmarks to compare the current pattern of heap allocations for byte[] (via new byte[n]) against alternatives using stackalloc and ArrayPool<byte> throughout the NLightning.Bolt11 assembly. Measure execution time, memory allocation, and CPU usage under realistic workloads (e.g., decoding BOLT11 invoices). If results show meaningful improvements, follow up with a PR to apply the preferred approach consistently across the project.
Motivation
- new byte[n] causes heap allocations and GC pressure for transient buffers used in parsing/encoding BOLT11 invoices.
- stackalloc can avoid heap allocations for small, short-lived buffers, but it increases stack usage and requires Span<T>-based code, which is mostly fine in this project.
- ArrayPool<byte> can amortize buffer costs for larger or variable-size buffers, but it adds rent/return complexity and potential for misuse.
- We need data to guide a project-wide change to one of these strategies (or to stay as is).
Scope
- Benchmark the existing code paths that allocate temporary byte[] buffers in NLightning.Bolt11.
- Compare three strategies:
  - Baseline: new byte[n] (status quo)
  - stackalloc (where possible: small, fixed-size buffers)
  - ArrayPool<byte>.Shared.Rent/Return (for larger/variable sizes)
- Workloads: realistic scenarios such as full invoice decode (and optionally encode) across small, typical, and large inputs.
- Metrics: execution time, total allocations (bytes/GC count), and CPU usage.
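The strategies in scope can also be combined. The following is a minimal sketch of such a hybrid, not code from NLightning.Bolt11: the class name, the method name, and the 256-byte threshold are assumptions chosen for illustration, and the right threshold is exactly what the benchmarks are meant to determine.

```csharp
using System;
using System.Buffers;

public static class HybridBufferSketch
{
    // Hypothetical threshold; the benchmark results should decide the real value.
    private const int MaxStackBytes = 256;

    // Fills a scratch buffer of `size` bytes and returns the sum of its contents.
    public static int SumOfIndices(int size)
    {
        byte[]? rented = null;

        // Small sizes use the stack; larger sizes rent from the shared pool.
        Span<byte> buffer = size <= MaxStackBytes
            ? stackalloc byte[MaxStackBytes]
            : (rented = ArrayPool<byte>.Shared.Rent(size));

        // Both sources may be larger than requested, so slice to the exact size.
        buffer = buffer.Slice(0, size);

        try
        {
            int sum = 0;
            for (int i = 0; i < buffer.Length; i++)
            {
                buffer[i] = (byte)i;
                sum += buffer[i];
            }
            return sum;
        }
        finally
        {
            // Only the pooled path has anything to give back.
            if (rented is not null)
                ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

The fixed-size stackalloc keeps the stack frame size constant regardless of input, at the cost of always reserving the full threshold.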
Out of Scope
- Immediate refactoring across the codebase. That will be proposed only if benchmarks show a clear benefit.
Proposed Work
- Create a dedicated benchmark project under benchmark/NLightning.Bolt11.Benchmarks using BenchmarkDotNet.
- Implement benchmarks that:
  - Drive end-to-end decode of BOLT11 invoices with datasets representing common and worst-case sizes.
  - Include micro-benchmarks for the most allocation-heavy routines (e.g., bit readers/writers, tagged field parsing, bech32 operations) to isolate buffer behavior.
- Provide three variants for each benchmarked routine:
  - Baseline (new byte[n])
  - stackalloc (guard with size thresholds and safe spans)
  - ArrayPool<byte>
- Collect metrics:
  - BenchmarkDotNet statistics (Mean, StdDev, and P95; P95 is not a default column and has to be enabled explicitly)
  - Allocated bytes, Gen0/1/2 counts
  - Optional CPU sampling/tracing corroboration using external tools
- Document results and provide a recommendation (strategy/thresholds). If beneficial, open a follow-up PR to apply the chosen strategy consistently.
Methodology & Metrics
- Use BenchmarkDotNet with Release builds, RunStrategy.Monitoring, and GcForce disabled to reflect realistic GC.
- Configure multiple input sizes:
  - Small invoices
  - Typical invoices
  - Large invoices (many tagged fields, long route info)
- Metrics from BDN:
  - Mean, Error, StdDev, Median
  - Allocated (bytes), Gen0/1/2
- External validation (optional but recommended):
  - CPU: dotnet-trace + Speedscope/PerfView
  - Counters: dotnet-counters for GC (alloc rate, GC count)
- Ensure warmup and multiple iterations; include environment info in the report (TFM, OS, CPU model, .NET version).
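The settings listed above could be collected in one BenchmarkDotNet config class. This is a sketch under assumptions: the class name InvoiceBenchmarkConfig and the warmup/iteration counts are illustrative, not decided values. Note that RunStrategy.Monitoring is generally aimed at longer-running scenarios, so it may suit the end-to-end decode benchmarks better than the micro-benchmarks.

```csharp
using BenchmarkDotNet.Columns;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Engines;
using BenchmarkDotNet.Jobs;

// Hypothetical config class; apply it with [Config(typeof(InvoiceBenchmarkConfig))].
public class InvoiceBenchmarkConfig : ManualConfig
{
    public InvoiceBenchmarkConfig()
    {
        AddJob(Job.Default
            .WithStrategy(RunStrategy.Monitoring) // no pilot/overhead stages
            .WithGcForce(false)                   // do not force a GC between iterations
            .WithWarmupCount(3)                   // illustrative counts
            .WithIterationCount(15));

        // Reports Allocated bytes and Gen0/1/2 collection counts.
        AddDiagnoser(MemoryDiagnoser.Default);

        // Median and P95 are added explicitly so they always appear in the summary.
        AddColumn(StatisticColumn.Median, StatisticColumn.P95);
    }
}
```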
Benchmark Project Structure
- benchmark/NLightning.Bolt11.Benchmarks/ (new project)
  - References src/NLightning.Bolt11
  - Contains:
    - DecodeInvoiceBenchmarks.cs (end-to-end scenarios)
    - BufferStrategyBenchmarks.cs (microbenchmarks comparing allocation strategies)
    - TestData/ with representative invoice samples
    - README.md with instructions to run and interpret results
Example BenchmarkDotNet Template
```csharp
using System;
using System.Buffers;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;

[SimpleJob(RuntimeMoniker.Net80, warmupCount: 3, iterationCount: 15)]
[MemoryDiagnoser]
public class BufferStrategyBenchmarks
{
    // Buffer size in bytes for each benchmark run.
    [Params(16, 64, 256, 1024, 4096)]
    public int N;

    [Benchmark(Baseline = true)]
    public int Baseline_NewArray()
    {
        var buf = new byte[N];
        return Touch(buf);
    }

    [Benchmark]
    public int Stackalloc_WhenSmall()
    {
        // Stack-allocate only up to 256 bytes; larger sizes fall back to the heap.
        if (N <= 256)
        {
            Span<byte> span = stackalloc byte[N];
            return Touch(span);
        }
        else
        {
            var buf = new byte[N];
            return Touch(buf);
        }
    }

    [Benchmark]
    public int ArrayPool_RentReturn()
    {
        var pool = ArrayPool<byte>.Shared;
        var buf = pool.Rent(N); // may return an array longer than N
        try { return Touch(buf.AsSpan(0, N)); }
        finally { pool.Return(buf, clearArray: false); }
    }

    // Writes and reads every byte so the buffer is actually used
    // and the JIT cannot treat it as dead.
    private static int Touch(Span<byte> s)
    {
        int x = 0;
        for (int i = 0; i < s.Length; i++)
        {
            s[i] = (byte)i;
            x ^= s[i];
        }
        return x;
    }
}
```
Datasets
- Curate a set of real-world and synthetic BOLT11 invoice samples:
  - Minimal invoices
  - Typical invoices (median size from your logs/fixtures)
  - Stress invoices (max fields, large route info, long descriptions)
- Reuse samples from existing tests under test/NLightning.Bolt11.Tests and test/NLightning.Integration.Tests where possible.
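One way to feed such a dataset into the end-to-end benchmarks is BenchmarkDotNet's [ParamsSource]. The skeleton below is a sketch only: the sample file names are hypothetical, it assumes the TestData/ files are copied to the output directory, and DecodePlaceholder stands in for the real NLightning.Bolt11 decode call, whose API is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class DecodeInvoiceBenchmarks
{
    // Hypothetical sample files, one per dataset category.
    public IEnumerable<string> SampleFiles =>
        new[] { "minimal.txt", "typical.txt", "stress.txt" };

    [ParamsSource(nameof(SampleFiles))]
    public string SampleFile = "";

    private string _invoice = "";

    [GlobalSetup]
    public void Setup()
    {
        // Load the invoice once so file I/O stays out of the measurement.
        var path = Path.Combine(AppContext.BaseDirectory, "TestData", SampleFile);
        _invoice = File.ReadAllText(path).Trim();
    }

    [Benchmark]
    public int Decode() => DecodePlaceholder(_invoice);

    // Placeholder: replace with the actual invoice decoder under test.
    private static int DecodePlaceholder(string invoice) => invoice.Length;
}
```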
Tooling (suggested) — links
- BenchmarkDotNet: https://benchmarkdotnet.org/
- dotnet-counters: https://learn.microsoft.com/dotnet/core/diagnostics/dotnet-counters
- dotnet-trace: https://learn.microsoft.com/dotnet/core/diagnostics/dotnet-trace
- PerfView: https://github.com/microsoft/perfview
- Speedscope (for viewing traces): https://www.speedscope.app/
- JetBrains Rider Profiler: https://www.jetbrains.com/help/rider/Performance_Profiler.html
- Visual Studio Profiler: https://learn.microsoft.com/visualstudio/profiling/
- GC Tuning/Docs: https://learn.microsoft.com/dotnet/standard/garbage-collection/
Risks / Considerations
- stackalloc only for small buffers; large stack allocations risk stack overflow.
- ArrayPool<byte> requires careful zeroing policy and correct Return usage to avoid data leakage and correctness issues.
- Some APIs may need Span<T> overloads; refactoring effort should be considered in the follow-up PR.
- Ensure benchmarks aren't over-optimized by the JIT; vary inputs and prevent dead-code elimination.
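The ArrayPool<byte> risks above can be illustrated with a short sketch. The class and method names are hypothetical. It shows two points: Rent may hand back an array longer than requested (and with stale contents), so the code slices to the requested length, and Return is called in a finally block with clearArray: true so that sensitive bytes do not remain in the pool.

```csharp
using System;
using System.Buffers;

public static class PooledScratchSketch
{
    // Copies the input into a pooled scratch buffer and sums its bytes.
    public static int SumViaPool(ReadOnlySpan<byte> input)
    {
        byte[] rented = ArrayPool<byte>.Shared.Rent(input.Length);
        try
        {
            // Never rely on rented.Length; use only the slice that was requested.
            Span<byte> work = rented.AsSpan(0, input.Length);
            input.CopyTo(work);

            int sum = 0;
            foreach (byte b in work)
                sum += b;
            return sum;
        }
        finally
        {
            // Clearing costs time but avoids leaking data to the next renter.
            ArrayPool<byte>.Shared.Return(rented, clearArray: true);
        }
    }
}
```

Whether clearing is needed for every buffer is a policy question; the cost of clearArray: true is itself something the benchmarks could measure.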
Acceptance Criteria
- Benchmark project added under benchmark/ that can be run locally and in CI (optional) for reproducible results.
- Clear scripts/instructions to run benchmarks and collect results for time, allocations, and CPU.
- Report comparing strategies across representative workloads, with environment details.
- Decision documented: keep new byte[n], switch to stackalloc for ≤X bytes, or prefer ArrayPool<byte> beyond a threshold (or hybrid).
- If improvement is substantial, open a follow-up PR to apply the chosen strategy consistently in NLightning.Bolt11.
Deliverables
- Benchmark project + source.
- Benchmark results (Markdown/CSV) checked into benchmark/results/ with date and environment metadata.
- Recommendation summary and next steps.
How to Run
- From repo root:
  - dotnet build -c Release
  - dotnet run -c Release --project benchmark/NLightning.Bolt11.Benchmarks
  - Optional: dotnet-counters monitor --counters System.Runtime -- dotnet run ...
  - Optional: dotnet-trace collect -- dotnet run ... and analyze with PerfView/Speedscope.
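For dotnet run to start the benchmarks, the project needs an entry point. A minimal sketch, assuming a conventional Program class in the benchmark project, hands the command-line arguments to BenchmarkDotNet's BenchmarkSwitcher:

```csharp
using BenchmarkDotNet.Running;

public static class Program
{
    // Lets BenchmarkDotNet pick benchmarks from this assembly based on CLI arguments.
    public static void Main(string[] args) =>
        BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);
}
```

With this in place, a subset can be selected from the command line, for example by appending -- --filter "*BufferStrategy*" to the dotnet run command.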