What feature would you like to see in FastForward?
Hi Qualcomm Team,
Just wondering if FastForward will support profiling inference time and memory footprint for both full-precision and quantized models on a deployment platform such as TensorRT (e.g., export the model to .onnx and then profile it with TensorRT), to examine the real-world effect of quantization.
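The kind of measurement described above could be sketched, independent of backend, as a small latency/memory harness. This is only an illustration of the feature being requested: `run_inference` is a hypothetical callable standing in for whatever executes the exported model (e.g., a TensorRT engine wrapper), not an existing FastForward or TensorRT API.

```python
import time
import tracemalloc

def profile_inference(run_inference, inputs, warmup=3, iters=20):
    """Measure average latency and peak Python-side memory for a callable."""
    for _ in range(warmup):           # warm up caches / lazy initialization
        run_inference(inputs)
    tracemalloc.start()
    start = time.perf_counter()
    for _ in range(iters):
        run_inference(inputs)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {"avg_latency_s": elapsed / iters, "peak_mem_bytes": peak}

# Comparing full-precision vs. quantized is then running the same harness
# on both engines and diffing the numbers (names here are illustrative):
# fp32_stats = profile_inference(fp32_engine, sample_batch)
# int8_stats = profile_inference(int8_engine, sample_batch)
```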
Why is this feature important to you?
This would measure the real-world inference speedup and memory reduction that quantization actually delivers, rather than relying on simulated estimates.
How would you prioritize this feature?
Must Have - Now
List related issue numbers (if any)
No response