Commit 69c8bab
perf(inspect): skip redundant Auto* calls for text-only models (#746)
## Summary
Two extra Strategy-2 gating fixes on top of #719. Profile on
`cardiffnlp/twitter-roberta-base-sentiment-latest` (text-only) showed
the inspect command was still taking ~16 s warm — most of it spent on
Auto* calls that didn't need to run.
Targeting **`release/v0.1.0`** since #717 / #718 / #719 are already on
that release; this is the natural follow-up.
## Profile (warm cache, `cardiffnlp/twitter-roberta-base-sentiment-latest`)
**Before this PR**:
```
[0] AutoConfig 0.74s (parent_hf_config — already deduped by #719)
[1] AutoProcessor 4.27s returns RobertaTokenizerFast
[7] AutoTokenizer 2.22s ← redundant, AutoProcessor already returned the tokenizer
[11] AutoImageProcessor 1.39s FAIL (text model has no preprocessor_config.json)
[12] AutoFeatureExtractor 0.64s FAIL (same)
─────
~16 s total
```
**After this PR**:
```
AutoConfig 5 calls (vs 8)
AutoProcessor 1 call
AutoTokenizer 1 call (vs 2 — the redundant standalone load is gone)
AutoImageProcessor 0 calls (skipped — no preprocessor_config.json)
AutoFeatureExtractor 0 calls (skipped — same)
─────
~12 s total (~25% faster)
```
## Change
### 1. Detect when `AutoProcessor` returns a leaf class
For single-modality models, `AutoProcessor.from_pretrained` returns the
leaf class directly — e.g. RoBERTa → `RobertaTokenizerFast`. Such a
return has no `.tokenizer` wrapper attribute, so the old code couldn't
populate `tokenizer_class` and fell through to a redundant
`AutoTokenizer.from_pretrained` (~2 s warm).
Pattern-match the returned class name (`*Tokenizer` / `*TokenizerFast`,
`*ImageProcessor` / `*ImageProcessorFast`, `*FeatureExtractor`) and
populate the corresponding field. The `.tokenizer` / `.image_processor`
/ `.feature_extractor` attribute path still wins for genuine multimodal
`ProcessorMixin` returns (CLIP, etc.) — see the
`test_autoprocessor_with_wrapped_pieces_uses_attributes` regression
test.
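A minimal sketch of the detection order described above, not the code from this PR: the `ResolvedProcessor` container, the `classify_processor` helper, and the `image_processor_class` / `feature_extractor_class` field names are assumptions made for illustration (only `tokenizer_class` is named in the description). Wrapped attributes are checked first, and the class-name suffix is used only when none are present.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ResolvedProcessor:
    # Hypothetical result container; only `tokenizer_class` is named in the PR text.
    tokenizer_class: Optional[str] = None
    image_processor_class: Optional[str] = None
    feature_extractor_class: Optional[str] = None


# Wrapper attributes exposed by multimodal ProcessorMixin-style objects.
_WRAPPED_ATTRS = (
    ("tokenizer", "tokenizer_class"),
    ("image_processor", "image_processor_class"),
    ("feature_extractor", "feature_extractor_class"),
)

# Class-name suffixes that identify a leaf class returned directly.
_LEAF_SUFFIXES = (
    (("Tokenizer", "TokenizerFast"), "tokenizer_class"),
    (("ImageProcessor", "ImageProcessorFast"), "image_processor_class"),
    (("FeatureExtractor",), "feature_extractor_class"),
)


def classify_processor(processor: Any) -> ResolvedProcessor:
    """Fill class-name fields from whatever AutoProcessor returned."""
    result = ResolvedProcessor()

    # 1. Attribute path: a genuine multimodal processor wraps its pieces.
    found_wrapped = False
    for attr, field in _WRAPPED_ATTRS:
        piece = getattr(processor, attr, None)
        if piece is not None:
            setattr(result, field, type(piece).__name__)
            found_wrapped = True
    if found_wrapped:
        return result

    # 2. Leaf path: match the returned object's own class name by suffix.
    name = type(processor).__name__
    for suffixes, field in _LEAF_SUFFIXES:
        if name.endswith(suffixes):  # str.endswith accepts a tuple
            setattr(result, field, name)
            break
    return result
```

With this ordering, a caller can skip the standalone `AutoTokenizer.from_pretrained` whenever `tokenizer_class` is already populated.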
### 2. `preprocessor_config.json` absence is authoritative
`_resolve_processor_from_hub_configs` already tries to download
`preprocessor_config.json`. When the hub returns 404, the model has *no*
image processor or feature extractor, period. Surface this as a
`has_preprocessor_config` bool from the helper so the caller can skip
the `AutoImageProcessor` / `AutoFeatureExtractor` round-trips (~2 s
total wasted confirming 404s).
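The gating can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `resolve_image_and_feature` and its loader parameters are hypothetical names, and the loaders are injected as plain callables so the example runs without network access. In the real caller they would presumably wrap `AutoImageProcessor.from_pretrained` and `AutoFeatureExtractor.from_pretrained`.

```python
from typing import Callable, Dict, Optional

# A loader takes a model id and returns a class name, or None on failure.
Loader = Callable[[str], Optional[str]]


def resolve_image_and_feature(
    model_id: str,
    has_preprocessor_config: bool,
    load_image_processor: Loader,
    load_feature_extractor: Loader,
) -> Dict[str, Optional[str]]:
    """Skip both loaders when preprocessor_config.json is known to be absent."""
    result: Dict[str, Optional[str]] = {
        "image_processor_class": None,
        "feature_extractor_class": None,
    }

    # Absence of the config is treated as authoritative: neither loader
    # could succeed, so avoid the round-trips entirely.
    if not has_preprocessor_config:
        return result

    result["image_processor_class"] = load_image_processor(model_id)
    result["feature_extractor_class"] = load_feature_extractor(model_id)
    return result
```

Passing the flag in, rather than re-checking the hub inside this function, keeps the 404 lookup in one place (the helper that already attempts the download).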
## Tests
`tests/unit/inspect/test_resolve_processor_gating.py`:
- `test_autoprocessor_returns_tokenizer_fills_tokenizer_class` —
leaf-class detection populates `tokenizer_class` from class-name suffix
and skips standalone `AutoTokenizer`
- `test_autoprocessor_returns_image_processor_fills_image_class` — same
for `*ImageProcessor`
- `test_autoprocessor_returns_feature_extractor_fills_feature_class` —
same for `*FeatureExtractor`
- `test_autoprocessor_with_wrapped_pieces_uses_attributes` — multimodal
`ProcessorMixin` with `.tokenizer` attribute wins over name suffix
- `test_missing_preprocessor_config_skips_image_and_feature` —
`has_preprocessor_config=False` skips `AutoImageProcessor` /
`AutoFeatureExtractor`
55 targeted tests pass.

1 parent 90b9cec commit 69c8bab
2 files changed
Lines changed: 323 additions & 44 deletions