Benchmark: Evaluate replacing `new byte[n]` allocations with `stackalloc` or `ArrayPool<byte>` in `NLightning.Bolt11` #72

### Summary
Design and run comprehensive benchmarks to compare the current pattern of heap allocations for `byte[]` (via `new byte[n]`) against alternatives using `stackalloc` and `ArrayPool<byte>` throughout the `NLightning.Bolt11` assembly. Measure execution time, memory allocation, and CPU usage under realistic workloads (e.g., decoding BOLT11 invoices). If results show meaningful improvements, follow up with a PR to apply the preferred approach consistently across the project.

### Motivation
- `new byte[n]` causes heap allocations and GC pressure for transient buffers used in parsing/encoding BOLT11 invoices.
- `stackalloc` can avoid heap allocations for small, short‑lived buffers, but it increases stack usage and requires `Span<T>`-based code, which is mostly fine in this project.
- `ArrayPool<byte>` can amortize buffer costs for larger or variable‑size buffers but adds rental/return complexity and potential for misuse.
- We need data to guide a project‑wide change to one of these strategies (or to stay as is).

### Scope
- Benchmark the existing code paths that allocate temporary `byte[]` buffers in `NLightning.Bolt11`.
- Compare three strategies:
  1. Baseline: `new byte[n]` (status quo)
  2. `stackalloc` (where possible – small, fixed‑size buffers)
  3. `ArrayPool<byte>.Shared.Rent/Return` (for larger/variable sizes)
- Workloads: realistic scenarios such as full invoice decode (and optionally encode) across small, typical, and large inputs.
- Metrics: execution time, total allocations (bytes/GC count), and CPU usage.
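
Before the full benchmark project exists, the allocation metric can be sanity-checked with nothing but the runtime. The sketch below is an assumption-laden probe, not part of the proposed benchmark suite: `AllocationProbe` and its members are hypothetical names, and the only runtime API it relies on is `GC.GetAllocatedBytesForCurrentThread()`.

```csharp
using System;
using System.Buffers;

public static class AllocationProbe
{
    // Keeps the freshly allocated array reachable so the JIT cannot drop the allocation.
    private static byte[]? _sink;

    // Runs the action once as warm-up, then reports the managed bytes
    // allocated on the current thread by a second run.
    public static long MeasureAllocatedBytes(Action action)
    {
        action();
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Status quo: a fresh heap array per call.
    public static void NewArray(int n)
    {
        _sink = new byte[n];
    }

    // Pooled variant: after warm-up the same array is normally handed back out.
    public static void Pooled(int n)
    {
        byte[] buf = ArrayPool<byte>.Shared.Rent(n);
        buf[0] = 1;
        ArrayPool<byte>.Shared.Return(buf);
    }
}
```

Calling `AllocationProbe.MeasureAllocatedBytes(() => AllocationProbe.NewArray(1024))` should report at least 1024 bytes, while the pooled variant should report far less once warmed up. This only corroborates the `Allocated` column; it says nothing about execution time.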

### Out of Scope
- Immediate refactoring across the codebase. That will be proposed only if benchmarks show a clear benefit.

### Proposed Work
1. Create a dedicated benchmark project under `benchmark/NLightning.Bolt11.Benchmarks` using BenchmarkDotNet.
2. Implement benchmarks that:
   - Drive end‑to‑end decode of BOLT11 invoices with datasets representing common and worst‑case sizes.
   - Include micro-benchmarks for the most allocation‑heavy routines (e.g., bit readers/writers, tagged field parsing, bech32 operations) to isolate buffer behavior.
3. Provide three variants for each benchmarked routine:
   - Baseline (`new byte[n]`)
   - `stackalloc` (guard with size thresholds and safe spans)
   - `ArrayPool<byte>`
4. Collect metrics:
   - BenchmarkDotNet statistics (Mean, Error, StdDev, Median; P95 is an opt-in column)
   - Allocated bytes and Gen0/1/2 counts (via `MemoryDiagnoser`)
   - Optional CPU sampling/tracing corroboration using external tools
5. Document results and provide a recommendation (strategy/thresholds). If beneficial, open a follow‑up PR to apply the chosen strategy consistently.
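
The "guard with size thresholds" idea in step 3 is often combined with the pool into a single hybrid pattern: use the stack below a cut-off and rent above it. The following is a minimal sketch under stated assumptions: `ScratchBufferDemo`, `ChecksumWithScratch`, and the 256-byte `StackThreshold` are hypothetical, and the loop only stands in for a real parsing routine.

```csharp
using System;
using System.Buffers;

public static class ScratchBufferDemo
{
    // Hypothetical cut-off; the real value should come from the benchmark results.
    private const int StackThreshold = 256;

    // Fills a scratch buffer of `length` bytes and returns the sum of its contents.
    public static int ChecksumWithScratch(int length)
    {
        byte[]? rented = null;

        // Small requests use a fixed-size stack buffer; larger ones rent from the pool.
        Span<byte> scratch = length <= StackThreshold
            ? stackalloc byte[StackThreshold]
            : (rented = ArrayPool<byte>.Shared.Rent(length));

        // Both sources may be larger than requested, so trim to the exact length.
        scratch = scratch[..length];

        try
        {
            int sum = 0;
            for (int i = 0; i < scratch.Length; i++)
            {
                scratch[i] = (byte)i;
                sum += scratch[i];
            }
            return sum;
        }
        finally
        {
            // Only rented arrays go back to the pool; stack memory is released on return.
            if (rented is not null)
                ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

Allocating a constant `StackThreshold` bytes on the stack, rather than `length`, keeps the stack frame size predictable. Whether this hybrid beats either pure strategy is exactly what the benchmarks are meant to show.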

### Methodology & Metrics
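- Use BenchmarkDotNet with `Release` builds and `GcForce` disabled so GC behavior stays realistic. `RunStrategy.Monitoring` suits the end‑to‑end scenarios because it runs one invocation per iteration with no pilot stage; the micro-benchmarks are likely better served by the default `Throughput` strategy.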
- Configure multiple input sizes:
  - Small invoices
  - Typical invoices
  - Large invoices (many tagged fields, long route info)
- Metrics from BDN:
  - `Mean`, `Error`, `StdDev`, `Median`
  - `Allocated` (bytes), `Gen0/1/2`
- External validation (optional but recommended):
  - CPU: `dotnet-trace` + `Speedscope`/`PerfView`
  - Counters: `dotnet-counters` for GC (alloc rate, GC count)
- Ensure warmup and multiple iterations; include environment info in the report (TFM, OS, CPU model, .NET version).
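
The settings above could be collected in one reusable configuration class. This is a sketch only: `Bolt11BenchmarkConfig` is a hypothetical name, the warmup and iteration counts are copied from the example template in this issue, and the `Monitoring` strategy is shown as it would apply to the end‑to‑end scenarios.

```csharp
using BenchmarkDotNet.Columns;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Engines;
using BenchmarkDotNet.Environments;
using BenchmarkDotNet.Jobs;

// Hypothetical shared config for the end-to-end benchmarks.
public sealed class Bolt11BenchmarkConfig : ManualConfig
{
    public Bolt11BenchmarkConfig()
    {
        AddJob(Job.Default
            .WithRuntime(CoreRuntime.Core80)
            .WithStrategy(RunStrategy.Monitoring) // one invocation per iteration
            .WithWarmupCount(3)
            .WithIterationCount(15)
            .WithGcForce(false));                 // do not force a GC between iterations

        // Adds the Allocated and Gen0/1/2 columns.
        AddDiagnoser(MemoryDiagnoser.Default);

        // Median and P95 are requested explicitly here.
        AddColumn(StatisticColumn.Median, StatisticColumn.P95);
    }
}
```

A benchmark class would opt in with `[Config(typeof(Bolt11BenchmarkConfig))]`. For the micro-benchmarks, dropping the `WithStrategy` call falls back to the default throughput strategy.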

### Benchmark Project Structure
- `benchmark/NLightning.Bolt11.Benchmarks/` (new project)
  - References `src/NLightning.Bolt11`
  - Contains:
    - `DecodeInvoiceBenchmarks.cs` (end‑to‑end scenarios)
    - `BufferStrategyBenchmarks.cs` (microbenchmarks comparing allocation strategies)
    - `TestData/` with representative invoice samples
    - `README.md` with instructions to run and interpret results

### Example BenchmarkDotNet Template
```csharp
using System;
using System.Buffers;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;

[SimpleJob(RuntimeMoniker.Net80, warmupCount: 3, iterationCount: 15)]
[MemoryDiagnoser]
public class BufferStrategyBenchmarks
{
    [Params(16, 64, 256, 1024, 4096)]
    public int N;

    [Benchmark(Baseline = true)]
    public int Baseline_NewArray()
    {
        var buf = new byte[N];
        return Touch(buf);
    }

    [Benchmark]
    public int Stackalloc_WhenSmall()
    {
        if (N <= 256)
        {
            Span<byte> span = stackalloc byte[N];
            return Touch(span);
        }
        else
        {
            var buf = new byte[N];
            return Touch(buf);
        }
    }

    [Benchmark]
    public int ArrayPool_RentReturn()
    {
        var pool = ArrayPool<byte>.Shared;
        var buf = pool.Rent(N);
        try { return Touch(buf.AsSpan(0, N)); }
        finally { pool.Return(buf, clearArray: false); }
    }

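    // Writes and then reads every byte so each strategy does real work on its
    // buffer and the result depends on the buffer contents.
    private static int Touch(Span<byte> s)
    {
        int x = 0;
        for (int i = 0; i < s.Length; i++)
        {
            s[i] = (byte)i;
            x ^= s[i];
        }
        return x;
    }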
}
```

### Datasets
- Curate a set of real‑world and synthetic BOLT11 invoice samples:
  - Minimal invoices
  - Typical invoices (median size from our logs/fixtures)
  - Stress invoices (max fields, large route info, long descriptions)
- Reuse samples from existing tests under `test/NLightning.Bolt11.Tests` and `test/NLightning.Integration.Tests` where possible.

### Tooling (suggested) — links
- BenchmarkDotNet: https://benchmarkdotnet.org/
- dotnet-counters: https://learn.microsoft.com/dotnet/core/diagnostics/dotnet-counters
- dotnet-trace: https://learn.microsoft.com/dotnet/core/diagnostics/dotnet-trace
- PerfView: https://github.com/microsoft/perfview
- Speedscope (for viewing traces): https://www.speedscope.app/
- JetBrains Rider Profiler: https://www.jetbrains.com/help/rider/Performance_Profiler.html
- Visual Studio Profiler: https://learn.microsoft.com/visualstudio/profiling/
- GC Tuning/Docs: https://learn.microsoft.com/dotnet/standard/garbage-collection/

### Risks / Considerations
- `stackalloc` only for small buffers; large stack allocations risk stack overflow.
- `ArrayPool<byte>` requires careful zeroing policy and correct `Return` usage to avoid data leakage and correctness issues.
- Some APIs may need `Span<T>` overloads; refactoring effort should be considered in the follow‑up PR.
- Ensure benchmarks aren’t over‑optimized by the JIT; vary inputs and prevent dead‑code elimination.
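
The `ArrayPool<byte>` risk can be made concrete with a small sketch. It assumes a hypothetical helper, `PooledSecretDemo.UseSensitiveBuffer`, and uses `Fill(0xAB)` as a stand-in for real key or signature material. It shows two points: `Rent` may return a larger array with stale contents, and the used region should be wiped before `Return`.

```csharp
using System;
using System.Buffers;
using System.Security.Cryptography;

public static class PooledSecretDemo
{
    // Processes `length` bytes of stand-in secret data in a pooled buffer
    // and returns the sum of those bytes.
    public static int UseSensitiveBuffer(int length)
    {
        byte[] rented = ArrayPool<byte>.Shared.Rent(length);

        // The rented array can be longer than requested and is not zeroed,
        // so work on an exact-length slice and overwrite it before reading.
        Span<byte> secret = rented.AsSpan(0, length);
        try
        {
            secret.Fill(0xAB); // stand-in for key material
            int sum = 0;
            foreach (byte b in secret)
                sum += b;
            return sum;
        }
        finally
        {
            // Wipe the region that held sensitive data, then return exactly once.
            CryptographicOperations.ZeroMemory(secret);
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

Passing `clearArray: true` to `Return` is an alternative that clears the whole array. Either choice adds cost, so the benchmarks should include whichever policy the project would actually adopt.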

### Acceptance Criteria
- [ ] Benchmark project added under `benchmark/` that can be run locally and in CI (optional) for reproducible results.
- [ ] Clear scripts/instructions to run benchmarks and collect results for time, allocations, and CPU.
- [ ] Report comparing strategies across representative workloads, with environment details.
- [ ] Decision documented: keep `new`, switch to `stackalloc` for ≤X bytes, or prefer `ArrayPool<byte>` beyond a threshold (or hybrid).
- [ ] If improvement is substantial, open a follow‑up PR to apply the chosen strategy consistently in `NLightning.Bolt11`.

### Deliverables
- Benchmark project + source.
- Benchmark results (Markdown/CSV) checked into `benchmark/results/` with date and environment metadata.
- Recommendation summary and next steps.

### How to Run
- From repo root:
  - `dotnet build -c Release`
  - `dotnet run -c Release --project benchmark/NLightning.Bolt11.Benchmarks`
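  - Optional: `dotnet-counters monitor --counters System.Runtime -- dotnet run ...`
  - Optional: `dotnet-trace collect -- dotnet run ...` and analyze with PerfView/Speedscope.
  - Note: BenchmarkDotNet runs each benchmark in a separate child process by default, so confirm that the counters and traces cover the benchmark process and not only the host.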
