[peft] Define PEFT base class and LoRA transform #71
ananthsub merged 5 commits into NVIDIA-NeMo:main
/ok to test 46f3623
What's the timeline for adding PEFT to the hub? In my previous work I re-implemented PEFT and was thinking of contributing that. But I first need to finish the bridge, so I can likely start on that in a few weeks. The new implementation would also leverage the bridge to support 2-way binding with HF. It doesn't really change the external API; the changes are mostly internal. So it could make sense to first merge a NeMo-inspired PEFT and then iterate on it.
@marcromeyn I'd like to have some basic PEFT support in for 25.07, so as soon as possible. For now I am following the existing NeMo pattern as closely as possible to simplify the migration.
/ok to test 09c7d73
Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>
/ok to test 4256a0b
Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>
/ok to test c9c5b74
Signed-off-by: ashors1 <ashors@nvidia.com>
Signed-off-by: Anna Shors <ashors@nvidia.com>
Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
The base class implementation leans heavily on the PEFT implementation in NeMo: https://github.com/NVIDIA/NeMo/blob/ec9c486557f1b5fb211a903e299d4cb5be1fd3b9/nemo/lightning/pytorch/callbacks/peft.py#L44
PEFT base class differences:
- `PEFT` is an abstract dataclass so that implementations neatly plug into the `ConfigContainer` (`params_to_save` is now set in the post-init, which subclasses need to explicitly initialize)
- Adds a `training` flag to `__call__` and `freeze_model`, as we cannot rely on the presence of a Lightning trainer object to indicate what stage is being requested

LoRA is otherwise nearly identical to what's in NeMo minus the following:
LoRAMerge is identical to what's in NeMo
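
The base class differences listed above can be sketched roughly as follows. This is a minimal, torch-free illustration and not the code in this PR: `Module`, `AdapterWrapper`, `ToyLoRA`, `build_model`, and the `_apply` helper are hypothetical stand-ins, and the `requires_grad`/`training` attributes only mimic what a real `torch.nn.Module` tracks. The sketch shows an abstract dataclass whose `params_to_save` is set in `__post_init__` (which a subclass has to call explicitly if it overrides it), and a `training` flag passed to `__call__` and `freeze_model` in place of any trainer object.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class Module:
    """Tiny stand-in for torch.nn.Module (hypothetical, illustration only)."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.requires_grad = True  # models "this module's own params are trainable"
        self.training = True

    def walk(self):
        # Yield this module and every descendant, depth first.
        yield self
        for child in self.children:
            yield from child.walk()


class AdapterWrapper(Module):
    """Hypothetical wrapper: keeps the base module and adds an adapter next to it."""

    def __init__(self, base):
        super().__init__(base.name, [base, Module(base.name + ".adapter")])
        self.requires_grad = False  # the wrapper itself owns no parameters


@dataclass
class PEFT(ABC):
    # Not an __init__ argument; it is populated in __post_init__.
    params_to_save: set = field(init=False)

    def __post_init__(self):
        self.params_to_save = set()

    @abstractmethod
    def transform(self, module):
        """Return the module unchanged, or a wrapped replacement."""

    def freeze_model(self, model, training=True):
        # The stage comes from the explicit flag, not from a trainer object.
        for m in model.walk():
            m.requires_grad = False
            m.training = training

    def __call__(self, model, training=True):
        self.freeze_model(model, training)
        self._apply(model)
        # Assumption of this sketch: record trainable names right after the transform.
        self.params_to_save = {m.name for m in model.walk() if m.requires_grad}
        if not training:
            # New adapters start out trainable; freeze them too for inference.
            self.freeze_model(model, training=False)
        return model

    def _apply(self, module):
        # Bottom-up, so a freshly created wrapper is never transformed again.
        for child in module.children:
            self._apply(child)
        module.children = [self.transform(c) for c in module.children]


@dataclass
class ToyLoRA(PEFT):
    target_modules: tuple = ("linear_qkv",)

    def __post_init__(self):
        super().__post_init__()  # subclasses must initialize params_to_save explicitly

    def transform(self, module):
        return AdapterWrapper(module) if module.name in self.target_modules else module


def build_model():
    return Module("decoder", [
        Module("attention", [Module("linear_qkv"), Module("linear_proj")]),
        Module("mlp", [Module("linear_fc1")]),
    ])
```

With `training=True`, only the adapter entries remain trainable and the model stays in train mode; with `training=False`, everything, including the new adapters, ends up frozen and in eval mode.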
This PR includes fixes to `walk_utils` found from new unit tests.

See #68 for the ModuleMatcher, which should get merged first as it's independently testable.