Performance #12

@mdegans

Description

Right now the model used is an fp32 model provided by Nvidia. It's the only trained model I can find for detecting faces, other than MTCNN, which is actually three models and harder to plug into the pipeline. In any case, any trained models available would require a clusterf**k of conversions.

Around the same time I created this app, Nvidia seemed to be thinking the same thing, so they produced this tutorial which is fantastic, but they don't provide a trained model... which means I must train it myself.

The dataset required to train is over 1TB -- more than is available on my GPU box, meaning I would have to use my NAS and NFS or something for the actual storage. All this is doable, but it means tying up my GPUs for what would likely be days. According to Nvidia it takes 8 hours on a DGX-1, but two 1080s are probably quite a bit slower.

If anybody is willing to follow that tutorial, pay for the cloud time, and send me the trained model, it would be greatly appreciated (and you'd get your name in the credits). Otherwise performance enhancements will have to wait until my GPU box has a week free, which might be a while since I'm using nvidia-docker on it all the time.

Alternatively, if somebody knows how to quantize the model I have, that works too. Most of the tools I've found expect an .onnx, but it looks like the code could be modified to use a different parser. Something worth exploring.
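For anyone curious what quantization actually does here: the idea is just mapping fp32 values onto int8 with a per-tensor scale, then running inference in the smaller type. The tooling is the hard part, not the math. A minimal numpy sketch of symmetric post-training quantization (function names are mine, not from any toolkit):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor INT8 quantization of fp32 weights."""
    scale = np.abs(w).max() / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate fp32 values from the int8 tensor."""
    return q.astype(np.float32) * scale

# fake "weights" standing in for a real layer
w = np.random.randn(64, 64).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# worst-case rounding error is about half a quantization step
err = np.abs(w - w_hat).max()
assert err <= scale * 0.5 + 1e-6
```

Real toolchains (TensorRT etc.) additionally calibrate activation ranges with sample data, which is why they want the model in a format their parser understands in the first place.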

Metadata

Labels: enhancement (New feature or request), help wanted (Extra attention is needed)
