[mpt] Abstract out hash function from Merkle Compute#1997

Open
kkuehlz wants to merge 1 commit into main from kkuehler/merkle_hasher

Conversation


@kkuehlz kkuehlz commented Dec 19, 2025

Introduces the MerkleHasher trait, which is plumbed through as a template parameter for the different Compute implementations. This PR is fully backwards compatible with the current implementation: every Compute implementation uses the Keccak256Hasher. This PR prepares the compute module for pluggable hash functions, namely to support the coming BSTORE subtrie, which will use blake3.

The trait allows us to reuse all the existing code for merkleization while customizing the hash function used. The blake3 addition will look roughly like the following:

struct Blake3Hasher
{
    static_assert(BLAKE3_OUT_LEN == HASH_SIZE);

    static void hash(unsigned char const *in, size_t len, unsigned char *out)
    {
        blake3_hasher hasher;
        blake3_hasher_init(&hasher);
        blake3_hasher_update(&hasher, in, len);
        blake3_hasher_finalize(&hasher, out, HASH_SIZE);
    }
};

using BlockStorageMerkleCompute =
        MerkleComputeBase<Blake3Hasher, ComputeBlockStorageLeaf>;

Copilot AI review requested due to automatic review settings December 19, 2025 23:08

Copilot AI left a comment

Pull request overview

This PR introduces a MerkleHasher trait to abstract hash functions from the Merkle Compute implementations, preparing the codebase for pluggable hash functions while maintaining full backwards compatibility. Currently, all implementations continue to use Keccak256, but the abstraction enables future support for alternative hash functions like Blake3 for the upcoming BSTORE subtrie.

  • Introduces MerkleHasher C++20 concept and Keccak256Hasher implementation
  • Threads the Hasher template parameter through all Compute classes
  • Refactors functions from .cpp to header as templates while improving code quality
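
The second bullet, threading the Hasher template parameter through the Compute classes, can be sketched as follows. The class and member names here are illustrative guesses, not the actual declarations in category/mpt/compute.hpp, and XorHasher is a made-up stand-in rather than the real Keccak256Hasher:

```cpp
#include <cassert>
#include <cstddef>

inline constexpr std::size_t HASH_SIZE = 32; // assumed hash width

// Made-up stand-in hasher (not the real Keccak256Hasher).
struct XorHasher
{
    static void hash(unsigned char const *in, std::size_t len, unsigned char *out)
    {
        for (std::size_t i = 0; i < HASH_SIZE; ++i) {
            out[i] = 0;
        }
        for (std::size_t i = 0; i < len; ++i) {
            out[i % HASH_SIZE] ^= in[i];
        }
    }
};

// Illustrative Compute base: instead of calling keccak256 directly,
// the hash step delegates to the Hasher template parameter.
template <typename Hasher>
struct ComputeSketch
{
    static void node_hash(unsigned char const *rlp, std::size_t len, unsigned char *out)
    {
        Hasher::hash(rlp, len, out);
    }
};
```

Swapping the hash function then only requires a different first template argument, which is how the blake3 subtrie support is expected to slot in.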

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

Show a summary per file
File Description
category/mpt/merkle_hasher.hpp New file defining the MerkleHasher concept, HASH_SIZE constant, and Keccak256Hasher implementation
category/mpt/merkle/node_reference.hpp Templated to_node_reference function with MerkleHasher parameter and replaced KECCAK256_SIZE with HASH_SIZE
category/mpt/compute.hpp Added MerkleHasher template parameter to all Compute classes and helper functions; moved previously non-templated functions from .cpp to header as templates; renamed hash methods for generality
category/mpt/compute.cpp Removed functions that are now templated in the header, keeping only encode_empty_string
category/mpt/test/test_fixtures_base.hpp Updated test fixtures to explicitly specify Keccak256Hasher template parameter
category/execution/ethereum/db/util.cpp Updated all Compute type instantiations to include Keccak256Hasher as the first template parameter
category/mpt/CMakeLists.txt Added merkle_hasher.hpp to the build configuration


@kkuehlz kkuehlz force-pushed the kkuehler/merkle_hasher branch 2 times, most recently from 356ab2a to 8495c7f Compare December 19, 2025 23:26
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
@kkuehlz kkuehlz force-pushed the kkuehler/merkle_hasher branch from 8495c7f to da7986b Compare December 19, 2025 23:31
// leaf and hashed node ref requires rlp encoding,
// rlp encoded but unhashed branch node ref doesn't
bool const need_encode_second = has_value || second.size() >= HASH_SIZE;
auto const concat_len =

Would it be possible to check second.size() with MONAD_ASSERT, similarly to first.size()?

Also, in theory concat_len could integer overflow. That is unlikely, since the data whose sizes we are summing is backed by memory, but we could explore options like checked arithmetic.
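
The checked-arithmetic idea could look roughly like this. checked_add is a hypothetical helper (a plain assert stands in for MONAD_ASSERT here), and __builtin_add_overflow is a GCC/Clang builtin; C23 offers ckd_add for the same purpose:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: add two sizes, asserting instead of silently
// wrapping on overflow. __builtin_add_overflow returns true when the
// sum does not fit in std::size_t.
inline std::size_t checked_add(std::size_t a, std::size_t b)
{
    std::size_t sum = 0;
    bool const overflowed = __builtin_add_overflow(a, b, &sum);
    assert(!overflowed && "size sum overflowed");
    return sum;
}
```

concat_len could then be built with checked_add(first.size(), second.size()) rather than a raw +.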
