Fixed QNN data format config issue. (#480)

shubhagr-qc · quic-amitraj · commit f29d1553a782 · 2025-07-10T06:26:25.000Z
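Generating the data format config file failed for encoder ONNX graphs
that have no past_key or past_value outputs.
In generate_data_format_config, kv_overrides was initialized only inside
the loop branch that matches those outputs, so it was never defined for
such graphs. It is now initialized after the loop.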
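
The failure mode can be illustrated with a small stand-alone sketch. It does not use the real QEfficient function or the onnx package: the helper names, the `kv_nodes` dictionary key, and the output names are hypothetical stand-ins, and only the placement of `kv_overrides = {}` mirrors the diff.

```python
from types import SimpleNamespace


def build_overrides_buggy(outputs, model_dlc_name):
    """Pre-fix shape: kv_overrides is bound only if a KV output is seen."""
    kv_nodes = []
    for output in outputs:
        if "past_key" in output.name or "past_value" in output.name:
            kv_nodes.append(output.name)
            kv_overrides = {}
    # Raises UnboundLocalError when no output name matched above.
    kv_overrides["graphs"] = [
        {
            "graph_name": model_dlc_name + "_configuration_1",
            "kv_nodes": kv_nodes,  # hypothetical key, for illustration only
        }
    ]
    return kv_overrides


def build_overrides_fixed(outputs, model_dlc_name):
    """Post-fix shape: kv_overrides is always bound before it is used."""
    kv_nodes = []
    for output in outputs:
        if "past_key" in output.name or "past_value" in output.name:
            kv_nodes.append(output.name)

    kv_overrides = {}
    kv_overrides["graphs"] = [
        {
            "graph_name": model_dlc_name + "_configuration_1",
            "kv_nodes": kv_nodes,  # hypothetical key, for illustration only
        }
    ]
    return kv_overrides


# Stand-ins for onnx_model.graph.output entries; the names are made up.
encoder_outputs = [SimpleNamespace(name="logits")]
decoder_outputs = [
    SimpleNamespace(name="logits"),
    SimpleNamespace(name="past_key.0_RetainedState"),
]

try:
    build_overrides_buggy(encoder_outputs, "encoder")
    buggy_failed = False
except UnboundLocalError:
    buggy_failed = True

fixed_result = build_overrides_fixed(encoder_outputs, "encoder")
```

Both versions behave the same when at least one KV output exists, which is presumably why the problem only surfaced for encoder graphs.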

---------

Signed-off-by: Shubham Agrawal <shubhagr@qti.qualcomm.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
diff --git a/QEfficient/utils/generate_qnn_network_specialization_config.py b/QEfficient/utils/generate_qnn_network_specialization_config.py
@@ -166,8 +166,8 @@ def generate_data_format_config(
     for output in onnx_model.graph.output:
         if "past_key" in output.name or "past_value" in output.name:
             kv_nodes.append(output.name)
-            kv_overrides = {}
 
+    kv_overrides = {}
     kv_overrides["graphs"] = [
         {
             "graph_name": model_dlc_name + "_configuration_1",
diff --git a/docs/source/quick_start.md b/docs/source/quick_start.md
@@ -94,7 +94,7 @@ python -m QEfficient.cloud.execute --model_name gpt2 --qpc_path qeff_models/gpt2
 You can run the finetune with set of predefined existing datasets on QAIC using the eager pipeline
 
 ```bash
-python -m QEfficient.cloud.finetune --device qaic:0 --use-peft --output_dir ./meta-sam --num_epochs 2 --context_length 256 
+python -m QEfficient.cloud.finetune --device qaic:0 --use-peft --output_dir ./meta-sam --num_epochs 2 --context_length 256
 ```
 For more details on finetune, checkout the subsection.
 
@@ -138,6 +138,28 @@ Users can compile a model with QNN SDK by following the steps below:
 * Enabled QNN by passing enable_qnn flag, add --enable_qnn in the cli command.
 * An optional config file can be passed to override the default parameters.
 
+**Default Parameters**
+
+QNN Converter Stage:
+
+    "--float_bias_bitwidth 32 --float_bitwidth 16 --preserve_io_datatype --onnx_skip_simplification --target_backend AIC"
+
+QNN Context Binary Stage:
+
+    LOG_LEVEL = "error"
+    COMPILER_COMPILATION_TARGET = "hardware"
+    COMPILER_CONVERT_TO_FP16 = True
+    COMPILER_DO_DDR_TO_MULTICAST = True
+    COMPILER_HARDWARE_VERSION = "2.0"
+    COMPILER_PERF_WARNINGS = False
+    COMPILER_PRINT_DDR_STATS = False
+    COMPILER_PRINT_PERF_METRICS = False
+    COMPILER_RETAINED_STATE = True
+    COMPILER_STAT_LEVEL = 10
+    COMPILER_STATS_BATCH_SIZE = 1
+    COMPILER_TIME_PASSES = False
+
+
 **CLI Inference Command**
 
 Without QNN Config
