Throws unsupported-model message in text-generation task with DeepSeek R1 and PeftModel #36783

Open
2 of 4 tasks
falconlee236 opened this issue Mar 18, 2025 · 9 comments · May be fixed by #36887
@falconlee236

System Info


  • transformers version: 4.49.0
  • Platform: Linux-5.15.0-134-generic-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • Huggingface_hub version: 0.29.3
  • Safetensors version: 0.5.3
  • Accelerate version: 1.3.0
  • Accelerate config: - compute_environment: LOCAL_MACHINE
    - distributed_type: DEEPSPEED
    - use_cpu: False
    - debug: False
    - num_processes: 1
    - machine_rank: 0
    - num_machines: 0
    - rdzv_backend: static
    - same_network: True
    - main_training_function: main
    - enable_cpu_affinity: False
    - deepspeed_config: {'deepspeed_config_file': '/opt/config/train_config.json', 'zero3_init_flag': True}
    - downcast_bf16: no
    - tpu_use_cluster: False
    - tpu_use_sudo: False
    - tpu_env: []
  • DeepSpeed version: 0.16.4
  • PyTorch version (GPU?): 2.5.1+cu124 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: No
  • Using GPU in script?: No
  • GPU type: NVIDIA H100 80GB HBM3

Who can help?

@ArthurZucker @Rocketknight1 @muellerzr

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

import torch

from transformers import pipeline, AutoModelForCausalLM, BitsAndBytesConfig, AutoTokenizer
from peft import PeftModel

ADAPTER_PATH = "./output/adapter/mnc_adapter"
BASE_PATH = "./output/model"
BNB_CONFIG = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)


# input
text = "Who is a Elon Musk?"

# load the 4-bit quantized base model
model = AutoModelForCausalLM.from_pretrained(
    BASE_PATH,
    quantization_config=BNB_CONFIG,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(BASE_PATH)
# attach the LoRA adapter to the quantized base model
lora_model = PeftModel.from_pretrained(
    model,
    ADAPTER_PATH,
    quantization_config=BNB_CONFIG,
    torch_dtype=torch.float16,
    device_map="auto",
)

default_generator = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    device_map="auto",
    torch_dtype=torch.float16
)
print(f"this is base model result: {default_generator(text)}")

lora_generator = pipeline(
    task="text-generation",
    model=lora_model,
    tokenizer=tokenizer,
    device_map="auto",
    torch_dtype=torch.float16
)
print(f"this is lora model result: {lora_generator(text)}")
  1. Execute lora_generator(text).
  2. The warning messages shown below are printed.
  3. After some debugging, I traced the problem to this section of transformers/pipelines/base.py:
    def check_model_type(self, supported_models: Union[List[str], dict]):
        """
        Check if the model class is in supported by the pipeline.

        Args:
            supported_models (`List[str]` or `dict`):
                The list of models supported by the pipeline, or a dictionary with model class values.
        """
        if not isinstance(supported_models, list):  # Create from a model mapping
            supported_models_names = []
            for _, model_name in supported_models.items():
                # Mapping can now contain tuples of models for the same configuration.
                if isinstance(model_name, tuple):
                    supported_models_names.extend(list(model_name))
                else:
                    supported_models_names.append(model_name)
            if hasattr(supported_models, "_model_mapping"):
                for _, model in supported_models._model_mapping._extra_content.items():
                    if isinstance(model_name, tuple):
                        supported_models_names.extend([m.__name__ for m in model])
                    else:
                        supported_models_names.append(model.__name__)
            supported_models = supported_models_names
        if self.model.__class__.__name__ not in supported_models:
            logger.error(
                f"The model '{self.model.__class__.__name__}' is not supported for {self.task}. Supported models are"
                f" {supported_models}."
            )
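
For illustration only, here is a minimal sketch (assuming peft is installed) of why this name-based check misses PEFT-wrapped models: the wrapper reports its own class name, while get_base_model() exposes the underlying transformers class that the check expects.

from peft import PeftModel

def resolve_model_class_name(model) -> str:
    # Illustrative helper, not part of transformers: a PEFT-wrapped model reports
    # "PeftModel" (or a subclass name), so the membership test above fails even
    # though the wrapped model, e.g. a LlamaForCausalLM, is fully supported.
    if isinstance(model, PeftModel):
        return model.get_base_model().__class__.__name__
    return model.__class__.__name__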

Expected behavior

The pipeline should run without the unsupported-model message.

The message is most likely emitted because the PeftModel wrapper around the DeepSeek model is not in the supported_models list.

  • The pipeline works correctly, but I want to get rid of this noisy message:
python hug_inference.py 
/root/workspace/lora_test/.venv/lib/python3.10/site-packages/transformers/quantizers/auto.py:206: UserWarning: You passed `quantization_config` or equivalent parameters to `from_pretrained` but the model you're loading already has a `quantization_config` attribute. The `quantization_config` from the model will be used.
  warnings.warn(warning_msg)
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:07<00:00,  1.12it/s]
Device set to use cuda:0
/root/workspace/lora_test/.venv/lib/python3.10/site-packages/bitsandbytes/nn/modules.py:451: UserWarning: Input type into Linear4bit is torch.float16, but bnb_4bit_compute_dtype=torch.float32 (default). This will lead to slow inference or training speed.
  warnings.warn(
this is base model result: [{'generated_text': "Who is a Elon Musk? Well, he's a business magnate, investor, and entrepreneur. He's known for his ambitious"}]
Device set to use cuda:0
The model 'PeftModel' is not supported for text-generation. Supported models are ['AriaTextForCausalLM', 'BambaForCausalLM', 'BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'LlamaForCausalLM', 'CodeGenForCausalLM', 'CohereForCausalLM', 'Cohere2ForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'DbrxForCausalLM', 'DiffLlamaForCausalLM', 'ElectraForCausalLM', 'Emu3ForCausalLM', 'ErnieForCausalLM', 'FalconForCausalLM', 'FalconMambaForCausalLM', 'FuyuForCausalLM', 'GemmaForCausalLM', 'Gemma2ForCausalLM', 'GitForCausalLM', 'GlmForCausalLM', 'GotOcr2ForConditionalGeneration', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'GraniteForCausalLM', 'GraniteMoeForCausalLM', 'GraniteMoeSharedForCausalLM', 'HeliumForCausalLM', 'JambaForCausalLM', 'JetMoeForCausalLM', 'LlamaForCausalLM', 'MambaForCausalLM', 'Mamba2ForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'MllamaForCausalLM', 'MoshiForCausalLM', 'MptForCausalLM', 'MusicgenForCausalLM', 'MusicgenMelodyForCausalLM', 'MvpForCausalLM', 'NemotronForCausalLM', 'OlmoForCausalLM', 'Olmo2ForCausalLM', 'OlmoeForCausalLM', 'OpenLlamaForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'PegasusForCausalLM', 'PersimmonForCausalLM', 'PhiForCausalLM', 'Phi3ForCausalLM', 'PhimoeForCausalLM', 'PLBartForCausalLM', 'ProphetNetForCausalLM', 'QDQBertLMHeadModel', 'Qwen2ForCausalLM', 'Qwen2MoeForCausalLM', 'RecurrentGemmaForCausalLM', 'ReformerModelWithLMHead', 'RemBertForCausalLM', 'RobertaForCausalLM', 'RobertaPreLayerNormForCausalLM', 'RoCBertForCausalLM', 'RoFormerForCausalLM', 'RwkvForCausalLM', 'Speech2Text2ForCausalLM', 'StableLmForCausalLM', 'Starcoder2ForCausalLM', 'TransfoXLLMHeadModel', 'TrOCRForCausalLM', 'WhisperForCausalLM', 'XGLMForCausalLM', 'XLMWithLMHeadModel', 'XLMProphetNetForCausalLM', 'XLMRobertaForCausalLM', 'XLMRobertaXLForCausalLM', 'XLNetLMHeadModel', 'XmodForCausalLM', 'ZambaForCausalLM', 'Zamba2ForCausalLM'].
this is lora model result: [{'generated_text': "Who is a Elon Musk? I mean, I know he's a business magnate or something, but what has he actually done"}]
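
As a stop-gap, the message can be filtered out on the Python logging side. This is only a hedged workaround sketch, not an official transformers API; it assumes the message is emitted by the logger named after the module shown above (transformers.pipelines.base).

import logging

# Workaround sketch (assumption: the logger name follows the module path
# transformers.pipelines.base, as created via logging.get_logger(__name__)).
class SuppressUnsupportedModelMessage(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Drop only the "... is not supported for ..." pipeline message.
        return "is not supported for" not in record.getMessage()

logging.getLogger("transformers.pipelines.base").addFilter(SuppressUnsupportedModelMessage())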
@Rocketknight1
Member

cc @sayakpaul @BenjaminBossan for PEFT - if you think this is an issue in pipelines instead, let me know and I'll try to update our class matching logic!

@BenjaminBossan
Member

I'm not very familiar with pipelines, but this is what I gather: I think we should check whether peft is installed and, if it is, add the PEFT model class names to the supported_models list. However, I'm not sure why self.model.__class__.__name__ not in supported_models is checked instead of isinstance, maybe to avoid imports? The issue with that is that we have many PeftModel subclasses, so the list would need to be extended with:

"PeftModel", "PeftModelForSequenceClassification", "PeftModelForCausalLM", "PeftModelForSeq2SeqLM", "PeftModelForTokenClassification", "PeftModelForQuestionAnswering", "PeftModelForFeatureExtraction"

@Rocketknight1
Member

@BenjaminBossan that makes sense! @falconlee236 would you be willing to attempt a PR for that?

@sambhavnoobcoder
Contributor

Hi @Rocketknight1, I found this to be an interesting issue and raised a PR fixing it in #36868. Please have a look at it; I'll make any required changes as soon as possible. Thank you @falconlee236 for raising this issue.

@falconlee236
Author

falconlee236 commented Mar 20, 2025

Hi @Rocketknight1, I found this to be an interesting issue and raised a PR fixing it in #36868. Please have a look at it; I'll make any required changes as soon as possible. Thank you @falconlee236 for raising this issue.

I was trying to resolve the issue first, but @sambhavnoobcoder resolved it before me, so I don't feel great about it. At the very least, I wish you had waited for my answer before submitting the PR.

I still want to attempt a PR, @Rocketknight1.

@sambhavnoobcoder
Contributor

So sorry @falconlee236, that was not my intention in any way. Please submit your PR; my curiosity just got the best of me. Please ignore my attempt and go ahead with your implementation. Apologies again for any inconvenience.

@Rocketknight1
Member

I'm happy for anyone to make the PR as long as it gets fixed! We generally don't "assign" issues to specific people - there's more than enough work to be done in the library

@falconlee236
Author

I'm happy for anyone to make the PR as long as it gets fixed! We generally don't "assign" issues to specific people - there's more than enough work to be done in the library

I think I said that because I also want to contribute to Transformers. I'm sorry if it made you feel bad. @sambhavnoobcoder

@sambhavnoobcoder
Contributor

Cool. In that case, I have reopened the PR and would appreciate your review on it, @Rocketknight1. Also, no worries @falconlee236, I understand you also want to contribute to Transformers, and it would be my pleasure to contribute alongside you. I would also appreciate learning from your PR as well.
