Add Stratum V2 (SV2) protocol support #1553
warioishere wants to merge 18 commits into bitaxeorg:master
Force-pushed from 8882a06 to 9551b7e
|
Please add some screenshots. I've had a cursory look at the code and will do a more in-depth review later. I'm not opposed to agentic development, but only if the author is familiar with the codebase and does first-line code reviews as well. This codebase looks decent, but I did see some areas where it could benefit from more separation of concerns. For example: would it be cleaner to have a dedicated If you have Claude work on it, please make it aware of #901, and have it keep in mind that, if we're splitting off work production protocols, this is something I might like to add in the future as well. |
Hey, I'll refactor this using a protocol coordinator pattern. I'll extract V1 into its own task (stratum_v1_task) and remove the cross-protocol dependencies from sv2_task. I'm keeping create_jobs_task unified since it already works fine, and I don't think the global_state union or vtable stuff is worth the complexity. I'll remove the old stratum_task.c once everything tests out. I read #901 already, but I had thought about using SV2's capability to connect to a separate jd-client that can connect to Bitcoin Core via Sjors' TP. That would also give us full control over templates and decentralize things further. But yes, I can look into the option of getting templates directly from Core, as it's far simpler than using a jd-client. We can look at this in further development, okay? |
Force-pushed from 9551b7e to 0470129
|
Refactoring is done, everything squashed into a single commit. Here's what changed since last time: V1 and V2 now live in their own task files (stratum_v1_task.c and stratum_v2_task.c), with a protocol coordinator handling all the fallback and recovery logic. The old stratum_task.c is gone. Fixed the sv2_api.h naming mismatch, it's consistently sv2_protocol now. Both primary and fallback pool support SV2 selection, and the UI only shows relevant options per protocol. Heartbeat probing works for both V1 and V2 primaries, so liveness checks are covered regardless of protocol combo. Also fixed a bunch of stability issues with protocol switching - the old implementation would crash or restart the device when switching between V1 and V2 at runtime. That's sorted now. Extended channel support and GetBlockTemplate mining are left for follow-up PRs as discussed. Ready for review 🚀 |
|
No need to do squash and force-push. This makes it harder to review incremental changes. |
|
I made a backup of the branch before squashing. Should I revert? |
No, it's fine for this PR. Sometimes a squash and/or force-push is necessary to fix merges, so no problem. Just for future reference: the code will be squashed into a single commit on merge anyway. |
|
Can you see if you can update your branch, it looks like some changes that were added to master have been added here as well. These might drop out if you update. |
Force-pushed from 0470129 to b3d3a46
Updated the branch, rebased onto latest master. The 3 duplicate commits dropped out. Should be a cleaner diff now. |
Add full Stratum V2 mining protocol support to the bitaxe, enabling encrypted communication with SV2 pools via Noise_NX handshake (secp256k1+EllSwift, ChaChaPoly, SHA256). Includes a robust protocol coordinator for clean failover between any combination of V1/V2 primary and fallback pools without device restarts.

Protocol implementation:
- SV2 binary protocol (components/stratum_v2/): SetupConnection, OpenStandardMiningChannel, NewMiningJob, SetNewPrevHash, SetTarget, SubmitShares with proper frame encoding/decoding
- Noise encryption (sv2_noise.c): Full NX handshake with optional authority public key verification (TOFU mode when unconfigured)
- libsecp256k1 v0.6.0 as git submodule for elliptic curve operations

Protocol coordinator and fallback:
- Non-blocking event-driven coordinator manages protocol task lifecycle
- Supports all 4 failover combinations: V2->V1, V2->V2, V1->V2, V1->V1
- Timer-based heartbeat probes primary pool during fallback operation
- User-selected fallback (dashboard toggle) disables auto-recovery
- Clean state transitions: queue clear, share stats reset, proper task shutdown with event synchronization

Key reliability fixes:
- Heap-allocate sv2_conn to prevent dangling pointer after task exit
- Dynamic protocol check in create_jobs_task (was cached at startup, causing memory corruption on protocol switch)
- Single event per task exit (was double-signaling coordinator)
- Remove esp_restart() from V1 task, notify coordinator instead
- Fix V1 transport handle leak (destroy after close)
- Remove close_connection race from asic_result_task

Frontend and configuration:
- NVS settings for SV2 authority pubkey and fallback pool protocol
- Pool settings UI: protocol selector and SV2 pubkey for both pools, V1-only options hidden when SV2 selected
- Display: hide block height and scriptsig in SV2 mode (not available in standard channel), show protocol indicator instead
- OpenAPI spec updated with new SV2 configuration fields
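The failover rules listed in this commit message can be pictured as a small state machine. The following is only a sketch under assumptions: the type and function names (`pool_slot_t`, `coord_event_t`, `coord_state_t`, `coordinator_handle`) are hypothetical and are not taken from protocol_coordinator.c, and the real coordinator also handles queue clearing, stats reset and task shutdown, which are left out here.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not the PR's actual types. */
typedef enum { POOL_PRIMARY, POOL_FALLBACK } pool_slot_t;

typedef enum {
    EV_TASK_EXITED,          /* running protocol task lost its pool and shut down */
    EV_PRIMARY_HEARTBEAT_OK, /* heartbeat probe reached the primary pool */
    EV_USER_FALLBACK_ON,     /* dashboard toggle: user selects the fallback pool */
    EV_USER_FALLBACK_OFF     /* dashboard toggle released */
} coord_event_t;

typedef struct {
    pool_slot_t active;
    bool user_forced_fallback;
} coord_state_t;

/* Updates the state for one event. Returns true when the caller has to
 * start (or restart) a protocol task for s->active. The protocol (V1 or V2)
 * is a property of the pool slot, so the same logic covers all four
 * primary/fallback combinations. */
bool coordinator_handle(coord_state_t *s, coord_event_t ev)
{
    switch (ev) {
    case EV_TASK_EXITED:
        /* Swap pools unless the user pinned the fallback pool. */
        if (!s->user_forced_fallback) {
            s->active = (s->active == POOL_PRIMARY) ? POOL_FALLBACK : POOL_PRIMARY;
        }
        return true; /* a task exited, so one must be started either way */
    case EV_PRIMARY_HEARTBEAT_OK:
        /* Auto-recovery only applies when the user did not force fallback. */
        if (s->active == POOL_FALLBACK && !s->user_forced_fallback) {
            s->active = POOL_PRIMARY;
            return true;
        }
        return false;
    case EV_USER_FALLBACK_ON:
        s->user_forced_fallback = true;
        if (s->active == POOL_PRIMARY) {
            s->active = POOL_FALLBACK;
            return true;
        }
        return false;
    case EV_USER_FALLBACK_OFF:
        s->user_forced_fallback = false;
        return false; /* the next successful heartbeat switches back */
    }
    return false;
}
```

Keeping the decision in one pure function like this would make the "user-selected fallback disables auto-recovery" rule easy to unit test without any sockets or FreeRTOS tasks.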
Force-pushed from b3d3a46 to 51787d9
coordinator_state_t and coordinator_event_t are only used inside protocol_coordinator.c, no need to expose them in the header.
|
Which parts of SV2 protocol does this implement, is it Mining protocol or JD as well? Does it include both extended and standard channels support? |
|
Currently no extended channels; that will be part of a next PR. And no JD-client, but I am building an easy-to-set-up full-stack deployment which you could run on a Raspberry Pi, or even on the node itself. It contains a JD-Client, the TP from Sjors, and a Bitcoin Core setup built with --enable-multiprocess so that the IPC Unix socket connection method can be used: |
|
Just confirmed the implementation also works against the original SV2 reference server: |
|
Do you have a Max (BM1397) device to test against? The others are functionally equivalent with respect to job construction/asic comms. |
ntime rolling on a bitaxe seems quite backwards? |
|
Elaborating on the comment above: IIUC the BM1370 doesn't support version rolling, which is forcing you to roll ntime so you can stick with a standard channel, right?
"allowed" but definitely not encouraged... Rolling ntime can have bad consequences at the consensus level. As a rule of thumb, ntime should only be increased when we want to reset the search space and we know we've been hashing for longer than 1s, and it should happen like this:
or at least some variation that respects ntime as something that progresses together with real time, and not something that rolls indefinitely into the past or future (with undesired consequences). Overall, I think it's a very good idea to support Sv2 Standard Channels as a "first class citizen" on AxeOS, but for edge cases where ASICs cannot do version rolling, I would go with Extended Channels and not try to reinvent the wheel. |
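One possible variation on "ntime progresses together with real time" is to derive the header ntime from the job's ntime plus the whole seconds actually spent hashing on that job. This is a minimal sketch under assumptions, not the snippet referred to in the comment above; the function name `ntime_for_work` and its parameters are hypothetical.

```c
#include <stdint.h>

/* job_ntime:      ntime delivered with the job by the pool
 * job_received_s: monotonic clock (seconds) when the job arrived
 * now_s:          monotonic clock (seconds) right now
 *
 * ntime only advances by the number of seconds that have really elapsed,
 * so it can never run ahead of wall-clock time or roll backwards. */
uint32_t ntime_for_work(uint32_t job_ntime, uint32_t job_received_s, uint32_t now_s)
{
    if (now_s < job_received_s) {
        return job_ntime; /* clock glitch: never move ntime into the past */
    }
    return job_ntime + (now_s - job_received_s);
}
```

With this shape, re-sending work every 500 ms does not change ntime by itself; the value only moves once a full second has passed, which matches the "hashing for longer than 1s" condition.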
Can you elaborate on this? From what I see in the codebase, the BM1370 does do hardware version rolling:
These are latest-gen Antminer S21 chips — would be surprising if they dropped version rolling support. The BM1397 is the only one where That said, you're right that the ntime rolling approach is wrong regardless. Even if the ASIC does version roll, bumping ntime by +1 every 500ms job send is not how ntime should be used. I'll fix that — either remove the offset entirely (since version rolling gives enough search space) or clamp it to real elapsed wall-clock time. What would you suggest as the right approach here for standard channels? |
The BM1397 is the only ASIC where set_version_mask is a no-op; it doesn't do hardware version rolling. That means SV2 standard channels won't give it enough search space even with just ntime rolling, which isn't a good solution (see discussion with plebhash above). I think there are two options here:
What do you think makes more sense? |
|
Excluding a chip from a protocol is a bit of a nasty dependency. Maybe adding Extended channels also puts it more in line with how SV1 currently works. I can't really oversee how much more work that is to support though. |
|
Yeah, I agree excluding a chip entirely from a protocol isn't great. I think the clean approach is:
The channel type decision is just a runtime check on the ASIC ID — use For this PR I'd scope it to standard channels only, which means BM1397 stays on SV1 for now. Extended channel support in a follow-up PR would unlock SV2 for the Max as well. |
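A runtime check of this kind could look roughly like the sketch below. All names (`asic_id_t`, `sv2_channel_t`, `asic_has_version_rolling`, `sv2_pick_channel`) are hypothetical, and the mapping itself is an assumption based on the thread: chips with hardware version rolling take a standard channel, and the BM1397 would take an extended channel once that is supported.

```c
#include <stdbool.h>

/* Hypothetical identifiers; only chips mentioned in this thread are listed. */
typedef enum { ASIC_BM1397, ASIC_BM1368, ASIC_BM1370 } asic_id_t;
typedef enum { SV2_CHANNEL_STANDARD, SV2_CHANNEL_EXTENDED } sv2_channel_t;

/* Per the discussion, BM1397 is the only chip where set_version_mask
 * is a no-op, i.e. it has no hardware version rolling. */
bool asic_has_version_rolling(asic_id_t id)
{
    return id != ASIC_BM1397;
}

/* Standard channels rely on version rolling for search space; without it
 * the miner needs an extended channel so it can vary the extranonce. */
sv2_channel_t sv2_pick_channel(asic_id_t id)
{
    return asic_has_version_rolling(id) ? SV2_CHANNEL_STANDARD
                                        : SV2_CHANNEL_EXTENDED;
}
```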
|
I looked into the effort for Extended Channel support. The good news is most of the heavy lifting already exists — One option would be to drop Standard Channels entirely and go straight to Extended Channels. That way:
The tradeoff is that Extended Channels put more work on the miner (coinbase + merkle computation), but the ESP32 already does this for SV1 without issues. What would you suggest: Extended-only, or keep both with a runtime ASIC check? That's about a week more effort or so. |
I'm not claiming this is true and I don't know whether BM1370 can do version rolling or not. I just got this understanding from the PR description.
|
Add activeProtocolLabel field to /api/system/info endpoint that returns the human-readable protocol string based on runtime state. Frontend now just reads the backend-provided label.
Replace protocol/channel type logic with a simple blockHeight > 0 check. The backend only populates coinbase fields when it has the data, so the frontend just needs to check if the data is present.
sv2AuthorityPubkey → stratumV2AuthorityPubkey
sv2ChannelType → stratumV2ChannelType
fallbackSv2AuthorityPubkey → fallbackStratumV2AuthorityPubkey
fallbackSv2ChannelType → fallbackStratumV2ChannelType
Updated across NVS rest_name, http_server JSON, openapi.yaml, and frontend components. NVS storage keys unchanged (no flash migration).
|
Thanks for the thorough review! Addressing your three key points:
1. Protocol separation / coordinator leakage: That said, I agree the protocol-specific details in
2. Unified job struct: The main friction is that V1 uses hex strings throughout (coinbase_1/2, prev_block_hash as
3. Unrelated changes: |
The work loop re-sends the current job to the ASIC on timeout. For V1/SV2-extended this produces unique work (extranonce_2 changes), but for SV2 standard channel the data is identical, restarting the ASIC nonce search from 0 and producing duplicate shares. Skip the re-send for SV2 standard channel. Once PR bitaxeorg#420 (fullscan nonce space) is merged, the ASIC will have enough search space (2^32 nonces x 2^16 version rolls) to keep mining without re-feeding.
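The rule in this commit message boils down to one predicate plus a bit of arithmetic. The sketch below uses hypothetical names (`work_source_t`, `should_resend_on_timeout`, `standard_channel_search_space`) and is not the code from the work loop itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of where the current job came from. */
typedef enum { WORK_SV1, WORK_SV2_EXTENDED, WORK_SV2_STANDARD } work_source_t;

/* On timeout, V1 and SV2 extended jobs can be rebuilt with a new
 * extranonce_2, so re-sending yields fresh work. A standard-channel job
 * would be byte-identical and restart the nonce search from 0, which
 * produces duplicate shares, so it is skipped. */
bool should_resend_on_timeout(work_source_t src)
{
    return src != WORK_SV2_STANDARD;
}

/* Search space per standard-channel job with full nonce scan and
 * 16 bits of version rolling: 2^32 * 2^16 = 2^48 headers. */
uint64_t standard_channel_search_space(void)
{
    return (UINT64_C(1) << 32) * (UINT64_C(1) << 16);
}
```

At the 1.3 TH/s quoted in the PR summary, 2^48 headers last roughly 2^48 / 1.3e12 ≈ 216 seconds, which is the rough budget a single standard-channel job would have before the space is exhausted.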
To cover all comments: I think these issues can be done in a follow-up. There's something to say to leave the SV1 path as intact as possible, and only rework/integrate/unify stuff when the SV2 implementation is somewhat battle-tested. |
Author: wario_is_here <mario.hofmann@yourdevice.ch>
- Set poolConnectionInfo on all SV2 error paths with user-friendly messages (e.g. "Pool unreachable", "Auth failed - check key")
- Replace manual byte parsing of certificate with packed struct
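For readers unfamiliar with the packed-struct approach, here is a rough sketch. It assumes the certificate layout is a 2-byte version, two 4-byte validity timestamps and a 64-byte signature (74 bytes in total, little-endian), and it assumes a little-endian host such as the ESP32. The names `sv2_cert_t` and `sv2_cert_parse` are hypothetical and the PR's actual struct may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed wire layout of the Noise certificate message; GCC/Clang
 * attribute syntax. Fields are little-endian on the wire, so a plain
 * copy is only correct on a little-endian host. */
typedef struct __attribute__((packed)) {
    uint16_t version;
    uint32_t valid_from;       /* Unix time, seconds */
    uint32_t not_valid_after;  /* Unix time, seconds */
    uint8_t  signature[64];
} sv2_cert_t;

/* Catch accidental padding at compile time. */
_Static_assert(sizeof(sv2_cert_t) == 74, "unexpected certificate size");

/* Copies the raw bytes into the struct after a strict length check.
 * Returns false on a NULL argument or a wrong length. */
bool sv2_cert_parse(const uint8_t *buf, size_t len, sv2_cert_t *out)
{
    if (buf == NULL || out == NULL || len != sizeof(*out)) {
        return false;
    }
    memcpy(out, buf, sizeof(*out));
    return true;
}
```

Compared with manual offset arithmetic, the struct documents the layout in one place, and the static assertion fails the build if the compiler ever inserts padding.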
Handle return values of secp256k1_context_randomize and secp256k1_xonly_pubkey_from_pubkey to fix warn_unused_result warnings.
|
Built https://github.com/bitaxeorg/ESP-Miner/tree/early-access, which includes this PR. Upon setting up Braiins via SV2, my 601 Gamma fails over to the secondary pool with this error:
Full Braiins SV2 stratum URL for reference: |
Can you try standard channels? It looks like Braiins doesn't support extended channels. Also, they don't completely follow the SRI reference. |
Working as intended with standard channels. |
Are you sure? If Extended is not supported, why does it report |
I'm not 100% sure on all the details, but what I can confirm is that BraiinsOS itself does not follow the SRI reference implementation fully. I tested with an S9 running BraiinsOS and couldn't get it to connect to the SRI reference server or my own TypeScript implementation. After hours of testing I found the following differences from the miner side:
Interestingly, though, their server itself also seems to accept the SRI reference handshake. But maybe on Extended Channels they use different message types and formats? The problem is that their server is closed source, so I have no insight into it. I wrote them a long mail 3 weeks ago asking for some details, but of course they didn't bother to answer me. Maybe @GitGab19 has some insights. Edit: I'll add some logging to my dev firmware tomorrow and see if I can find out what formats they send. |
|
Braiins is using an older version of the That's the reason why the parsing is failing when using extended channels. We're in touch with them, and they are gonna update their implementation soon, in the meantime you can still test this PR using standard channels if you want to connect to Braiins. |
|
this was the breaking change: stratum-mining/stratum#2044 |
|
I and @jayrmotta are testing this PR on a Bitaxe Supra (BM1368) against the SRI pool (75.119.150.111:3333) with both extended and standard channels. Extended channels work fine, but when using standard channels, we noticed that almost every share is submitted multiple times, and so we get many |
I'm testing with the early-access release and both test servers from the opening post result in a 100% error rate and do not generate a proper hashrate. It's possible I messed up something with the merge? |
For testing I merged PR420 into this PR and everything works as it should. I didn't try the early-access tree. |






Summary
Adds Stratum V2 binary protocol support alongside the existing V1 JSON-RPC implementation. Tested on a BM1370 bitaxe against a local SRI server and the SRI reference pool — full 1.3 TH/s hashrate with shares accepted.
What's included:
What works:
Open decisions
none
Test plan
Test servers
Public SV2 test server:
Host: blitzpool-test.yourdevice.ch
Port: 3333
Authority pubkey: 9bCoFxTszKCuffyywH5uS5o6WcU4vsjTH2axxc7wE86y2HhvULU
SRI reference pool (confirmed working):
Host: 75.119.150.111
Port: 3333
Authority pubkey: 9auqWEzQDVyd2oe1JVGFLMLHZtCo2FFqZwtKA5gd9xbuEu7PH72
For transparency: most of this implementation was done with the help of Claude (Opus 4.6). I hope that doesn't detract from the goal of bringing SV2 support to bitaxe.