fix: Use trust_remote_code_for_model() in all from_pretrained calls #229
kendrickb-nvidia wants to merge 1 commit into main
Conversation
AutoConfig, AutoModelForCausalLM, and AutoTokenizer from_pretrained calls were missing trust_remote_code=True for nvidia/ models (e.g. Nemotron), causing ValueError when loading models with custom code. Consolidates the check into a single trust_remote_code_for_model() in llm/utils.py used by both the ModelMetadata subclasses and HuggingFaceBackend.

Signed-off-by: Kendrick Boyd <kendrickb@nvidia.com>
Made-with: Cursor
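The helper's actual body is not shown in this PR excerpt; a minimal sketch of what a consolidated predicate like this might look like, assuming the check is a simple nvidia/ name-prefix test:

```python
# Hypothetical sketch of the consolidated helper; the real predicate in
# llm/utils.py may differ (the nvidia/ prefix check is an assumption).
def trust_remote_code_for_model(model_id: str) -> bool:
    """Return True when loading `model_id` requires trust_remote_code=True."""
    return model_id.startswith("nvidia/")


# Call sites can then pass the flag uniformly instead of hardcoding it, e.g.:
#   AutoConfig.from_pretrained(
#       model_id, trust_remote_code=trust_remote_code_for_model(model_id)
#   )
```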
good consolidation -- the populate_derived_fields fix is the important one, and the duplicate removal from TrainingBackend is clean. one thing: the utils.py change narrows the type from str | Path to str and drops the str() coercion. this will conflict with #286 and #287, since both touch every subclass init in metadata.py. I'd plan to land #286 first and then either #287 or this; the rebases should be pretty straightforward. longer term the repeated …
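One way to address the type-narrowing concern above would be to keep the str | Path union and restore the str() coercion inside the helper; a sketch under that assumption (not the PR's actual code, and the nvidia/ prefix test is hypothetical):

```python
from __future__ import annotations

from pathlib import Path


def trust_remote_code_for_model(model_id: str | Path) -> bool:
    """Hypothetical variant that accepts both model IDs and local paths."""
    # str() coercion keeps Path arguments (e.g. local checkout dirs) working,
    # preserving the wider signature the reviewer points out was dropped.
    return str(model_id).startswith("nvidia/")
```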
Summary
- AutoConfig, AutoModelForCausalLM, and AutoTokenizer from_pretrained calls were missing trust_remote_code=True for nvidia/ models (e.g. Nemotron), causing ValueError when loading models with custom code
- Consolidates the check into trust_remote_code_for_model() in llm/utils.py, called by all 8 ModelMetadata subclasses, populate_derived_fields, LLMPromptConfig.from_tokenizer, and HuggingFaceBackend
- Removes the duplicate TrainingBackend._trust_remote_code_for_model() method

Test plan
- nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 loads successfully through the SDK SafeSynthesizer pipeline
- Existing tests pass (make test)

Other notes
Created #231 as a follow-up to make this behavior configurable.
Made with Cursor