Experts based on the Phi-3 model with the PEFT library #124
Comments
@pclucas14 can you help with this issue? I also have a PR that builds a library compatible with ours out of PEFT adapters; maybe we can merge that and write an example of how to run Arrow using experts trained with TRL and PEFT. Thank you
@AliEdalat would you be able to share with us the name of your PEFT adapters?
In #86, we provide a script in … you can reload the model with … Once you have the model, you can …
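For reference, reloading a trained expert with PEFT typically looks like the following minimal sketch (the base model ID and adapter path are placeholders; the exact script referenced in #86 may differ):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Placeholder names: substitute your own base model and adapter checkpoint.
base = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
model = PeftModel.from_pretrained(base, "path/to/expert-checkpoint")
model.eval()  # ready for inference or for inspecting the LoRA weights
```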
Hey @sordonia, I'm @AliEdalat's teammate, and I have recently implemented the Arrow algorithm for our project using the PEFT library, achieving promising results. To do this, I modified the forward pass in the bnb.py file (since we use QLoRA in our workflow) and made adjustments to methods in the peft_model.py file, along with some other PEFT library files. I thought you might be interested in a dedicated PEFT adapter for the Arrow algorithm, similar to existing ones like Poly and MHR. Let me know if you'd like any help on this! Thanks!
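For readers following along, here is a library-free sketch of the Arrow routing idea (my own reconstruction, not the code discussed in this thread): each expert's prototype is the top right singular vector of its LoRA update `B @ A`, and each token is routed by the absolute dot product between its hidden state and each prototype.

```python
import numpy as np

def arrow_prototypes(experts):
    """experts: list of (A, B) LoRA factors, A: (r, d_in), B: (d_out, r).
    Returns one unit-norm prototype per expert: the top right singular
    vector of delta_W = B @ A."""
    protos = []
    for A, B in experts:
        delta_w = B @ A                                   # (d_out, d_in)
        _, _, vt = np.linalg.svd(delta_w, full_matrices=False)
        protos.append(vt[0])                              # top right singular vector
    return np.stack(protos)                               # (n_experts, d_in)

def arrow_route(x, protos, temperature=1.0):
    """x: (n_tokens, d_in) hidden states. The routing weight for expert i
    is a softmax over |proto_i . x| per token (absolute value, because
    singular vectors have arbitrary sign)."""
    logits = np.abs(x @ protos.T) / temperature           # (n_tokens, n_experts)
    logits -= logits.max(axis=-1, keepdims=True)          # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=-1, keepdims=True)
```

In an actual adapter this routing would sit inside the LoRA layer's forward pass, mixing the expert outputs per token with these weights.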
It'd be great! Do you plan to open a PR into PEFT? I'd be willing to help.
Yes, I'd be glad to contribute! Which branch should I (we) open a PR on?
Just to clarify my understanding: you are planning to PR your work into the Hugging Face PEFT library, right? Or into the MTTL library?
Yes! The Hugging Face PEFT library. I initially asked about the branch because I thought there might already be a feature request for implementing Arrow in PEFT. Should I go ahead and open a feature request for this, or would you prefer to handle it? (Sorry for any ambiguity!)
Hi, super exciting to hear you are getting promising results! Please go ahead and PR into PEFT. I am happy to help if needed, but you probably have better intuition about how PEFT works than we do :)
Yes, please go ahead and open a feature request; we can jump into the PR when needed!
(We merged our own PEFT support in …)
@sordonia @pclucas14 I'll do it for sure! However, I first need to make sure that my code can roughly reproduce the results on the zero-shot datasets, such as PIQA, reported in the paper. Furthermore, I think I need to refactor the code so that Arrow can be added as a Tuner class in the PEFT library, similar to Polytropon and π-Tuning.
Hello,
Thank you for your interesting research work.
I have 10 experts trained on the Phi-3 base model (with datasets selected based on the clustering described in the paper). I used the TRL and PEFT libraries for training, ensuring the checkpoint structures are compatible with these libraries.
In training the experts, I used LoRA in 4-bit quantized mode, applied to the o and qkv attention projections in each layer.
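For comparison, a QLoRA setup along those lines might look like the sketch below. The target module names `qkv_proj` and `o_proj` are an assumption based on Phi-3's fused attention projections, and the hyperparameters are illustrative only:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization of the frozen base weights (QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA on the attention projections; Phi-3 fuses q/k/v into qkv_proj
# (an assumption -- check your checkpoint's module names).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct", quantization_config=bnb_config
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```

The resulting checkpoints can then be trained with a TRL trainer and saved as ordinary PEFT adapters.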
I would like to know how I can use your code to run Arrow, merging these experts per token in every model layer.
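On the PEFT side, loading all ten experts onto one base model can be sketched as follows (the adapter paths and names are placeholders, and MTTL's own loading utilities may differ):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Placeholder base model; substitute the quantized variant if needed.
base = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Register the first expert, then attach the remaining nine by name.
model = PeftModel.from_pretrained(base, "experts/expert_0", adapter_name="expert_0")
for i in range(1, 10):
    model.load_adapter(f"experts/expert_{i}", adapter_name=f"expert_{i}")

print(list(model.peft_config))  # the ten registered expert adapters
```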
I am getting some errors when running the code.
Please explain the steps one by one; I am a beginner in this field.
Thank you, and I would appreciate your response.