Pull requests: NVIDIA/TensorRT-Model-Optimizer
Pull requests list
[OMNIML-2917] handle lm_head and other un-quantized modules correctly
#504 opened Nov 4, 2025 by shengliangxu • Draft

Fix the onnx checker to use model path when model size > 2gib
#502 opened Nov 4, 2025 by hthadicherla

[NVBUG: 5608888] Update link in vlm_ptq README for support matrix details
#499 opened Nov 4, 2025 by cjluo-nv

[NVBUG: 5617733] Update LLM generate API for modelopt LLM eval
#498 opened Nov 4, 2025 by cjluo-nv

[NVBUG: 5612606] Clear GPU cache for large models layer quantization during export
#497 opened Nov 4, 2025 by cjluo-nv

fix qdq utils issues and remove global cast replacements
#489 opened Oct 31, 2025 by nvluxiaoz

[Draft] [5526696] Add kv cache quantization support for onnx quantization
#486 opened Oct 31, 2025 by zhanghaoc

[5590225] Fixed regression introduced by PR #364 (FP64-to-FP32 conversion)
#462 opened Oct 24, 2025 by gcunhase

Add functional test cases for published checkpoints on HF
#455 opened Oct 21, 2025 by noeyy-mino

Preserve original rope scaling type in export due to transformers library AutoConfig issue
#452 opened Oct 17, 2025 by Edwardf0t1

[1/2] Registry interface for custom quantization functional backend
#449 opened Oct 17, 2025 by realAsma