
upgrade liger to 0.4.0 #1973

Merged
merged 13 commits into from
Nov 7, 2024

Conversation

winglian
Collaborator

Description

Motivation and Context

How has this been tested?

Screenshots (if appropriate)

Types of changes

Social Handles (Optional)

Files with review threads:
src/axolotl/integrations/liger/args.py (outdated)
tests/integrations/liger.py
src/axolotl/integrations/liger/__init__.py
README.md
requirements.txt (outdated)
@@ -34,7 +34,7 @@ tensorboard
python-dotenv==1.0.1
autoawq>=0.2.5
triton>=2.3.0
-liger-kernel==0.3.0
+liger-kernel==0.3.1
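As an aside on pins like the one above: a minimal sketch (not part of this PR) of checking an installed package against an exact `name==version` requirement. The helper names `meets_pin` and `check_requirement` are hypothetical; real tooling would use `packaging.version` for robust comparison.

```python
# Hypothetical helpers for validating an exact "name==x.y.z" pin.
from importlib import metadata


def meets_pin(installed: str, pin: str) -> bool:
    """Return True if `installed` exactly satisfies a `name==version` pin."""
    _, _, required = pin.partition("==")
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(installed) == as_tuple(required)


def check_requirement(pin: str) -> bool:
    """Look up the distribution named in `pin` and compare its version."""
    name = pin.split("==", 1)[0]
    try:
        return meets_pin(metadata.version(name), pin)
    except metadata.PackageNotFoundError:
        return False
```

For example, `meets_pin("0.3.0", "liger-kernel==0.3.1")` is False, which is exactly the mismatch this diff resolves.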

Do we need to wait for their latest release or point to this commit for GA fix? linkedin/Liger-Kernel#333


bursteratom commented Nov 1, 2024

@NanoCode012 @winglian I tried to run this particular branch just now but ran into this error:

File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/triton/compiler/code_generator.py", line 1066, in visit_Attribute
    return getattr(lhs, node.attr)
AttributeError: 'tensor' object has no attribute 'cast'

which happens during

File "/workspace/axolotl/src/axolotl/core/trainer_builder.py", line 678, in compute_loss
    return super().compute_loss(model, inputs, return_outputs=return_outputs)

I'm using the default axolotl template on RunPod and made sure to install the dependencies associated with this branch.

And my yaml is as follows:

base_model: NousResearch/Meta-Llama-3.1-8B

plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rope: true
liger_rms_norm: true
liger_glu_activation: true
liger_fused_linear_cross_entropy: true

strict: false

datasets:
    - path: tatsu-lab/alpaca
      type: alpaca
dataset_prepared_path: last_run_prepared
val_set_size: 0.02
output_dir: ./outputs/out

sequence_len: 4096
sample_packing: true
pad_to_sequence_len: true

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 1
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 2e-5

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
early_stopping_patience:
resume_from_checkpoint:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 100
evals_per_epoch: 2
eval_table_size:
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
  - full_shard
  - auto_wrap
fsdp_config:
  fsdp_limit_all_gathers: true
  fsdp_sync_module_states: true
  fsdp_offload_params: true
  fsdp_use_orig_params: false
  fsdp_cpu_ram_efficient_loading: true
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer
  fsdp_state_dict_type: FULL_STATE_DICT
  fsdp_sharding_strategy: FULL_SHARD
  fsdp_backward_prefetch: BACKWARD_PRE
special_tokens:
  pad_token: <|finetune_right_pad_id|>
  eos_token: <|eot_id|>
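The commit list below includes "use kwargs to support patch release", which points at a useful pattern for config-driven patching like the `liger_*` flags above: call the patch function with only the keyword arguments its signature declares, so a Liger point release that adds or removes a parameter does not break the caller. A hedged sketch follows; `apply_patch` and the stand-in patch function are illustrative, not axolotl's or Liger-Kernel's actual API.

```python
# Sketch: forward only the kwargs a patch function actually accepts, so the
# caller survives signature changes across patch releases. Illustrative only.
import inspect


def apply_patch(patch_fn, **kwargs):
    """Invoke patch_fn with the subset of kwargs it declares."""
    accepted = set(inspect.signature(patch_fn).parameters)
    return patch_fn(**{k: v for k, v in kwargs.items() if k in accepted})


def apply_liger_kernel_to_llama(rope=False, rms_norm=False, swiglu=False):
    # Stand-in for a per-model patch entry point; a real one would monkey-patch
    # the model's modules rather than return a dict.
    return {"rope": rope, "rms_norm": rms_norm, "swiglu": swiglu}


enabled = apply_patch(
    apply_liger_kernel_to_llama,
    rope=True,
    rms_norm=True,
    swiglu=True,
    fused_linear_cross_entropy=True,  # dropped: not in the stand-in's signature
)
```

The unknown flag is silently filtered out instead of raising a `TypeError`, which is the forward-compatibility property the commit message describes.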

@winglian winglian changed the title upgrade liger to 0.3.1 upgrade liger to 0.4.0 Nov 6, 2024
@winglian winglian merged commit 02ce520 into main Nov 7, 2024
14 checks passed
@winglian winglian deleted the upgrade-liger branch November 7, 2024 17:53
@winglian winglian mentioned this pull request Nov 7, 2024
bursteratom pushed a commit that referenced this pull request Nov 18, 2024
* upgrade liger to 0.3.1

* update docs and example

* skip duplicate code check

* Update src/axolotl/integrations/liger/args.py

Co-authored-by: NanoCode012 <[email protected]>

* Update README.md

Co-authored-by: NanoCode012 <[email protected]>

* add logging

* chore: lint

* add test case

* upgrade liger and transformers

* also upgrade accelerate

* use kwargs to support patch release

* make sure prepared path is empty for test

* use transformers 4.46.1 since 4.46.2 breaks fsdp

---------

Co-authored-by: NanoCode012 <[email protected]>
djsaunde pushed a commit that referenced this pull request Dec 17, 2024
(same commit list as above)