
fix: biencoder PEFT adapter key remapping for merge_lora #1479

Merged
adil-a merged 3 commits into main from fix/biencoder-peft-key-remapping on Mar 7, 2026

Conversation

@adil-a (Collaborator) commented Mar 7, 2026

Summary

Fixes merge_lora.py failures when merging biencoder LoRA adapters into the standalone base model (nvidia/llama-nemotron-embed-1b-v2).

Root cause

The biencoder wraps the base model as self.lm_q / self.lm_p, so adapter checkpoint keys and adapter_config.json target modules get prefixed with lm_q.. When merge_lora.py loads the adapter against the standalone base model, module names like lm_q.layers.0.self_attn.q_proj don't exist — the base model has layers.0.self_attn.q_proj.
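The key remapping this implies can be sketched as below. The function and constant names are illustrative only, not the actual helpers in bidirectional.py:

```python
# Sketch of the to_hf PEFT key remapping (hypothetical helper names).
PREFIX = "base_model.model.lm_q."

def to_hf_peft_key(key: str) -> str:
    """Map a biencoder PEFT key into the standalone base-model namespace.

    base_model.model.lm_q.layers.0.self_attn.q_proj.lora_A.weight
      -> base_model.model.layers.0.self_attn.q_proj.lora_A.weight
    """
    if key.startswith(PREFIX):
        # Strip only the lm_q. segment; do NOT insert an extra "model."
        # because LlamaBidirectionalModel extends LlamaModel, whose
        # submodules are addressed as layers.N..., not model.layers.N...
        return "base_model.model." + key[len(PREFIX):]
    return key

print(to_hf_peft_key("base_model.model.lm_q.layers.0.self_attn.q_proj.lora_A.weight"))
# base_model.model.layers.0.self_attn.q_proj.lora_A.weight
```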

Changes

  • addons.py `_extract_target_modules`: Strip the `lm_q.` prefix from target module names when saving adapter_config.json for biencoder models, so merge_lora.py can match them against the base model
  • bidirectional.py `BiencoderStateDictAdapter.to_hf`: PEFT keys `base_model.model.lm_q.X` are now mapped to `base_model.model.X` (strips `lm_q.` without adding a `model.` prefix, since LlamaBidirectionalModel extends LlamaModel, not LlamaForCausalLM)
  • bidirectional.py `BiencoderStateDictAdapter.from_hf`: PEFT keys are restored only to `lm_q` (not fanned out to `lm_p`), which fixes FSDP DTensor loading failures with the shared encoder
  • train_biencoder.py: Add an `apply_te_patches()` call
  • Tests: Updated to verify the PEFT key roundtrip, `lm_q.` stripping, and no `lm_p` fanout
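The `from_hf` direction described above can be sketched as the inverse per-key mapping, targeting `lm_q` only. This is a simplified standalone sketch; the real adapter method operates on whole state dicts:

```python
# Sketch of the from_hf PEFT key restoration (hypothetical helper name).
def from_hf_peft_key(key: str) -> str:
    """Restore an HF PEFT key into the biencoder namespace.

    Keys are restored to lm_q only. lm_q and lm_p share the encoder,
    so fanning the same tensor out to both paths caused FSDP DTensor
    loading failures.
    """
    prefix = "base_model.model."
    if key.startswith(prefix):
        return prefix + "lm_q." + key[len(prefix):]
    return key

print(from_hf_peft_key("base_model.model.layers.0.self_attn.q_proj.lora_A.weight"))
# base_model.model.lm_q.layers.0.self_attn.q_proj.lora_A.weight
```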

Test plan

  • Unit tests pass (test_state_dict_adapter.py, test_addons.py)
  • End-to-end: biencoder LoRA training → checkpoint save → merge_lora.py merge
  • Verify merged model produces correct embeddings

🤖 Generated with Claude Code

The biencoder wraps the base model as `lm_q`, so PEFT adapter weights
and target modules were saved with incorrect prefixes. The standalone
base model (LlamaBidirectionalModel extends LlamaModel) uses bare
module names like `layers.0.self_attn.q_proj` — no `model.` prefix.

This caused merge_lora.py to fail because PEFT could not match the
target modules or weight keys against the base model.

Changes:
- bidirectional.py: fix PEFT key path in to_hf via peft_dst parameter
  (base_model.model.lm_q.X → base_model.model.X, was incorrectly
  producing base_model.model.model.X). Fix from_hf to handle PEFT
  keys explicitly.
- addons.py: strip lm_q. prefix from target modules in
  _extract_target_modules for biencoder so adapter_config.json has
  module names matching the standalone HF base model.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
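The target-module stripping in `_extract_target_modules` might look roughly like this (an assumed sketch; the actual function in addons.py may differ in shape):

```python
# Sketch: strip the biencoder's lm_q. wrapper prefix from the target
# module names written to adapter_config.json, so PEFT can match them
# against the standalone base model's module names.
def strip_biencoder_prefix(target_modules: list[str]) -> list[str]:
    return sorted({m.removeprefix("lm_q.") for m in target_modules})

print(strip_biencoder_prefix([
    "lm_q.layers.0.self_attn.q_proj",
    "lm_q.layers.0.self_attn.k_proj",
]))
# ['layers.0.self_attn.k_proj', 'layers.0.self_attn.q_proj']
```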
@copy-pr-bot

copy-pr-bot bot commented Mar 7, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.


adil-a and others added 2 commits March 7, 2026 02:33
- Strip lm_q. prefix from target modules in adapter_config.json so
  merge_lora.py can match them against the standalone base model
- Fix to_hf to not add extra model. prefix for PEFT keys (lm_q. → bare)
- Fix from_hf to only restore PEFT keys to lm_q (not lm_p) to avoid
  FSDP DTensor loading failures with shared encoder
- Add apply_te_patches() to biencoder training recipe

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@adil-a (Collaborator, Author) commented Mar 7, 2026

/ok to test 1a6ee91

@adil-a adil-a merged commit 1b56f4f into main Mar 7, 2026
52 checks passed
@adil-a adil-a deleted the fix/biencoder-peft-key-remapping branch March 7, 2026 18:14