Skip to content

SHA256 for nRF54L15#9

Closed
WilliamTakeshi wants to merge 37 commits intolake-rs:mainfrom
WilliamTakeshi:nrf54l15-sha256-impl
Closed

SHA256 for nRF54L15#9
WilliamTakeshi wants to merge 37 commits intolake-rs:mainfrom
WilliamTakeshi:nrf54l15-sha256-impl

Conversation

@WilliamTakeshi
Copy link
Collaborator

This PR introduces hardware-accelerated SHA-256 for the nRF54L15.

The implementation still contains some duplication and a few magic numbers because it was translated from the vendor’s C code, and the CRACEN documentation appears to be incomplete.

There is also room for improvement in the “oneshot” API. Doing directly with less copying.

  • For testing, I added default-members so that cargo test at the workspace root does not attempt to run the nRF54L15 example. Is this the correct approach? I might need to check how other projects handle chip-specific code.

Let me know if you need explanations about any part of the code. It was really hard to understand and adapt from the C examples.

Copy link
Member

@malishav malishav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks very nice, congrats! A left a couple of clarifying questions inline, mainly for my understanding.

MEMORY
{
FLASH : ORIGIN = 0x00000000, LENGTH = 1524K
RAM : ORIGIN = 0x20000000, LENGTH = 128K
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At https://www.nordicsemi.com/Products/nRF54L15, I see 1.5 MB flash, but there seems to be 256 kB of RAM. Why have 128 kB as length here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed here!
ac7b381

But I have a question. The nRF54L15 has 2 cores, the app-core and the flpr-core (link). By using the whole memory.x on the app-core, we will be locking the library users to only use the app-core?

(group_end | DMA_REALIGN) as u32
}

fn sha256_padding(msg_len: usize, out: &mut [u8; 128]) -> usize {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Padding needs to be done in software only?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question that can go back to the API.

Apparently, the hardware accelerator does SHA256 but only of full blocks. (So we need a) this padding, and b) block spooling).

Maybe a takeaway from both is to offer a low-level block-only hashing, which the formally-verified part would build on and do the buffering and finalization?

Tracked in #12. Until that is resolved, I think we can leave the code in here, maybe with a remark that it is to be removed, referencing #12.

When that is done, a lot of code here can go out, also out of the struct (removing the block, processed_bytes and block_bytes_used).

let next_full_boundary = total & !(BLOCK_SIZE - 1); // round down to nearest multiple of BLOCK_SIZE
let bytes_from_data = next_full_boundary.saturating_sub(instance.block_bytes_used);

let dma = self.p.global_cracencore_s.cryptmstrdma();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that this line only instantiates a struct, could you confirm?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it just take a handler to the CRACEN CryptoMaster DMA. It is a zero-cost and probably it disappear on the compilation.

Its just to make it easier to read, instead of having :
self.p.global_cracencore_s.cryptmstrdma().fetchaddrlsb().write(|w| unsafe { w.bits(descriptors.first() as u32) });
we have
dma.fetchaddrlsb().write(|w| unsafe { w.bits(descriptors.first() as u32) });

which is slightly easier to read

Copy link
Member

@chrysn chrysn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks; some initial comments.

(group_end | DMA_REALIGN) as u32
}

fn sha256_padding(msg_len: usize, out: &mut [u8; 128]) -> usize {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question that can go back to the API.

Apparently, the hardware accelerator does SHA256 but only of full blocks. (So we need a) this padding, and b) block spooling).

Maybe a takeaway from both is to offer a low-level block-only hashing, which the formally-verified part would build on and do the buffering and finalization?

Tracked in #12. Until that is resolved, I think we can leave the code in here, maybe with a remark that it is to be removed, referencing #12.

When that is done, a lot of code here can go out, also out of the struct (removing the block, processed_bytes and block_bytes_used).

let mut data_desc = Descriptor::empty();
let mut padding_desc = Descriptor::empty();

match instance.state {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks familiar from above. Can't the padding data be created, sent into self.update(), with an assertion that (by construction of the padding) there is no data left in the buffer after it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont fully get this one. But I rewrote using DescriptorChain, I think its better documented this way. lmk if still have questions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue (that AIU still persists) is that there is a large amount of duplication between the update and the finish function. This would be way more readable and less prone to editing errors if some of that update/finish code were refactored into (non-trait) methods that are then used commonly between update and finalize.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, now I get it. I refactored one execute_cryptomaster_dma here: 3af1f96

chrysn added a commit that referenced this pull request Dec 15, 2025
The -10 value and associated commented-on trouble was for *selecting KDF
algorithms*, not for selecting hash algorithms, which do have proper
numbers.

See-Also: #9 (comment)
// https://www.rfc-editor.org/rfc/rfc9054.html#name-sha-2-hash-algorithms
fn from_cose_number(number: impl Into<i128>) -> Option<Self> {
match number.into() {
-10 => Some(HashAlgorithm::Sha256),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused, where does -10 come from? I am looking at https://www.iana.org/assignments/cose/cose.xhtml

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my mistake from mixing up SHA256 with HKDF based on SHA256; this was already discussed (and presumably fixed?) in #9 (comment).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, just merged with recent main (do we prefer merges or rebases?) and fixed on 192920d

@WilliamTakeshi WilliamTakeshi self-assigned this Dec 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants