fix: prevent apps from crashing when LLMs are loaded by chmjkb · Pull Request #1063 · software-mansion/react-native-executorch

chmjkb · 2026-04-08T10:02:19Z

Description

Fix crashes when loading LLMs by using mmap and by no longer reporting the model file size as external memory pressure. This PR changes how LLM models are loaded to prevent crashes with large models. Previously, reporting the full model file size via setExternalMemoryPressure() caused Hermes to crash, because it breaks the GC's heap accounting when external memory approaches or exceeds the 3 GB max heap size.
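To make the accounting problem concrete, here is a toy sketch of the decision described above. It is not the code from this PR and does not call Hermes or JSI: `LoadMode`, `externalBytesToReport`, and `reachesMaxHeap` are hypothetical names, and the 3 GB limit is simply the figure quoted in the description.

```cpp
#include <cstdint>

// Stand-in for the runtime's load mode; not the real ExecuTorch type.
enum class LoadMode { File, Mmap };

constexpr std::uint64_t kGiB = 1024ull * 1024ull * 1024ull;
// Assumption: the "3GB max heap size" figure quoted in the description.
constexpr std::uint64_t kMaxHeapBytes = 3 * kGiB;

// Hypothetical helper: how many bytes to report to the JS GC as external
// memory for a loaded model. In File mode the whole file is held in RAM, so
// the naive figure is the file size plus overhead. In Mmap mode the weights
// are file-backed pages, so only a fixed bookkeeping overhead is reported.
std::uint64_t externalBytesToReport(LoadMode mode,
                                    std::uint64_t modelFileBytes,
                                    std::uint64_t overheadBytes) {
  return mode == LoadMode::Mmap ? overheadBytes
                                : modelFileBytes + overheadBytes;
}

// True when a reported figure reaches or exceeds the assumed max heap size,
// the situation the description says breaks the GC's heap accounting.
bool reachesMaxHeap(std::uint64_t reportedBytes) {
  return reportedBytes >= kMaxHeapBytes;
}
```

With a 4 GiB model, the File-mode figure reaches the assumed limit while the Mmap-mode figure stays at the overhead value, which is the contrast the description draws.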

We also set the LoadMode to Mmap instead of File, so the ExecuTorch (ET) runtime lazily loads weights into RAM on demand instead of holding the entire file contents in memory. This prevents the OS from killing the app.

Introduces a breaking change?

- [ ] Yes
- [x] No

Type of change

- [x] Bug fix (change which fixes an issue)
- [ ] New feature (change which adds functionality)
- [ ] Documentation update (improves or adds clarity to existing documentation)
- [ ] Other (chores, tests, code style improvements etc.)

Tested on

- [x] iOS
- [x] Android

Testing instructions

- [ ] Take a large model, verify that it crashes the app on main, and note the memory consumption
- [ ] Run the same model on this branch, make sure it doesn't crash, and note the memory consumption
- [ ] Verify that models that would usually fit in your RAM are not slowed down significantly
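For the "note the memory consumption" steps, platform profilers are the usual tool. If a number from inside the native process is preferred, a helper along these lines might do. This is a sketch, not part of the PR, and `peakResidentBytes` is a hypothetical name. It relies on POSIX `getrusage`, whose `ru_maxrss` field is in bytes on Apple platforms and in kilobytes on Linux-based ones.

```cpp
#include <cstdint>

#include <sys/resource.h>

// Peak resident set size of the current process, in bytes.
// Returns 0 if getrusage fails.
std::uint64_t peakResidentBytes() {
  struct rusage usage {};
  if (::getrusage(RUSAGE_SELF, &usage) != 0) return 0;
#if defined(__APPLE__)
  // ru_maxrss is reported in bytes on macOS/iOS.
  return static_cast<std::uint64_t>(usage.ru_maxrss);
#else
  // ru_maxrss is reported in kilobytes on Linux/Android.
  return static_cast<std::uint64_t>(usage.ru_maxrss) * 1024ull;
#endif
}
```

Logging this value after model load and again after generation, on main and on this branch, gives two comparable peaks. Mapped pages that have been touched count toward the resident set, so the mmap figure depends on how much of the model was actually read.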

Screenshots

Related issues

Checklist

- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly
- [ ] My changes generate no new warnings

Additional notes

NorbertKlockiewicz

I wasn't able to crash Private Mind during model load, only during generation, and only on an emulator with 2 GB of RAM. I'm wondering whether we should also load vision encoders in a similar way, since they can also use a lot of RAM and the app can crash when they are loaded together with the text_decoder.

chmjkb · 2026-04-08T15:10:56Z

> I wasn't able to crash Private Mind during model load, only during generation, and only on an emulator with 2 GB of RAM. I'm wondering whether we should also load vision encoders in a similar way, since they can also use a lot of RAM and the app can crash when they are loaded together with the text_decoder.

yeah, we do. I'll add the changes tomorrow.

chmjkb · 2026-04-09T06:24:03Z

> > I wasn't able to crash Private Mind during model load, only during generation, and only on an emulator with 2 GB of RAM. I'm wondering whether we should also load vision encoders in a similar way, since they can also use a lot of RAM and the app can crash when they are loaded together with the text_decoder.
>
> yeah, we do. I'll add the changes tomorrow.

turns out we already do that, as vision encoders are a part of the same module.

## Summary

Patch release v0.8.3: cherry-picks the following bug fixes from `main` into `release/0.8`:

- fix: add mutex to VoiceActivityDetection to prevent race between `generate()` and `unload()` (#1056)
- fix: prevent apps from crashing when LLMs are loaded (#1063)
- fix: add inference mutex to Text Embedding and Text-to-Image (#1060)

## Checklist

- [x] Commits cherry-picked from `main` in chronological order
- [x] Version bumped to `0.8.3` in `packages/react-native-executorch/package.json`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Radek Czemerys <7029942+radko93@users.noreply.github.com>
Co-authored-by: Bartosz Hanc <bartosz.hanc02@gmail.com>
Co-authored-by: Jakub Chmura <92989966+chmjkb@users.noreply.github.com>

chmjkb added 3 commits April 7, 2026 11:09

wip

4de1a86

fix: update memoryLowerBound for llm

1aac7aa

chore: remove unused header

aeac249

chmjkb marked this pull request as ready for review April 8, 2026 10:02

chmjkb linked an issue Apr 8, 2026 that may be closed by this pull request

Prevent OOM-based app crashes #726

Closed

msluszniak assigned chmjkb Apr 8, 2026

msluszniak added the "bug fix" label (PRs that are fixing bugs) Apr 8, 2026

msluszniak reviewed Apr 8, 2026

Comment thread packages/react-native-executorch/common/rnexecutorch/models/llm/LLM.cpp Outdated

Update packages/react-native-executorch/common/rnexecutorch/models/ll…

e76e2cc

…m/LLM.cpp Co-authored-by: Mateusz Sluszniak <56299341+msluszniak@users.noreply.github.com>

msluszniak requested a review from NorbertKlockiewicz April 8, 2026 10:27

mkopcins approved these changes Apr 8, 2026

NorbertKlockiewicz reviewed Apr 8, 2026

NorbertKlockiewicz approved these changes Apr 9, 2026

chmjkb merged commit dc5664f into main Apr 9, 2026
4 checks passed

chmjkb deleted the @chmjkb/mmap-load-mode branch April 9, 2026 06:52

msluszniak mentioned this pull request Apr 10, 2026

chore: Release v0.8.3 #1072

Merged

2 tasks

chmjkb merged 4 commits into main from @chmjkb/mmap-load-mode
