Does python library work with audio files? #17

Open
esphoenixc opened this issue Jan 8, 2025 · 8 comments

Comments

@esphoenixc

Does it work with audio files? If I provide two files, a mic recording and a system audio recording, can it remove the system audio from the mic file?

Thank you for your work!

@thewh1teagle
Owner

Yes, see the examples:
https://github.com/thewh1teagle/aec/tree/main/examples/python-examples
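
As a rough sketch of the file-based workflow asked about above (not taken from the linked examples), the two recordings can be loaded with Python's standard `wave` module and trimmed to a common length before being handed to an echo canceller. The function names `read_wav_int16` and `load_pair` are hypothetical, and the sketch assumes both files are 16-bit PCM WAV with the same sample rate.

```python
import wave

import numpy as np


def read_wav_int16(path):
    """Read a 16-bit PCM WAV file; return (sample_rate, samples[frames, channels])."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        rate = wf.getframerate()
        channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    # WAV stores little-endian samples, interleaved by channel.
    samples = np.frombuffer(raw, dtype="<i2").reshape(-1, channels)
    return rate, samples


def load_pair(mic_path, ref_path):
    """Load mic and system-audio (reference) files as equal-length mono arrays."""
    mic_rate, mic = read_wav_int16(mic_path)
    ref_rate, ref = read_wav_int16(ref_path)
    if mic_rate != ref_rate:
        raise ValueError("sample rates differ; resample one file first")
    n = min(len(mic), len(ref))
    # Assumption: the first channel is representative of each recording.
    return mic_rate, mic[:n, 0], ref[:n, 0]
```

The returned arrays can then be cut into fixed-size frames for whichever canceller is used; the frame size depends on that canceller and is not specified here.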

@esphoenixc
Author

Thank you. I tried it, and the result removed not only the system sound but also my own speech captured by the mic.

@thewh1teagle
Owner

> Thank you, I've tried and the result literally removed not only the system sound but also my speaking which was captured by mic.

You can play with the filter length; see the speexdsp docs.
What do you use it for?
I found that this Speex-based AEC is not a magic solution and may be suitable only for specific use cases.
Neural-network-based AECs work better, in my opinion.
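
To illustrate why the filter length matters, here is a minimal NumPy sketch of a normalized LMS (NLMS) adaptive echo canceller. This is a generic textbook technique swapped in for illustration, not the speexdsp algorithm or this library's API; the function name `nlms_echo_cancel` and its parameters are hypothetical.

```python
import numpy as np


def nlms_echo_cancel(mic, ref, filter_length=256, step=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of `ref` from `mic`.

    Both inputs are 1-D float arrays of the same length. This is an
    illustrative NLMS filter, not the speexdsp implementation.
    """
    mic = np.asarray(mic, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    weights = np.zeros(filter_length)
    # Pad so every sample has `filter_length` reference samples of history.
    padded = np.concatenate([np.zeros(filter_length - 1), ref])
    out = np.empty_like(mic)
    for n in range(len(mic)):
        window = padded[n:n + filter_length][::-1]  # newest reference sample first
        echo_estimate = weights @ window
        error = mic[n] - echo_estimate  # residual after removing the estimated echo
        out[n] = error
        # Normalized update: step size scaled by the reference energy in the window.
        weights += step * error * window / (window @ window + eps)
    return out
```

A filter shorter than the echo delay cannot model the echo at all, which is one reason the filter length is worth tuning. The sketch also adapts continuously and has no double-talk handling, so it is only meant to show the role of the parameter.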

@thewh1teagle
Owner

For instance:
https://modelscope.cn/models/damo/speech_dfsmn_aec_psm_16k/summary
It works like magic, but it's a bit heavy: 40-60% CPU on an AMD Ryzen 5.

@esphoenixc
Author

@thewh1teagle Thank you for your response. Is this a machine-learning-based AEC? Is there a Hugging Face version that I can try? I don't understand Chinese. I want to test this on my Mac and see if I can use the M1 chip for processing, or even the MPS GPU. I also wonder how long it would take to process a 1-hour audio file.

Thank you again for your work!

@thewh1teagle
Owner

thewh1teagle commented Jan 9, 2025

> Is this a machine learning based AEC?

Yes

> Is there a huggingface version that I can try? I don't understand Chinese. I want to test this on my mac and see if I can use mac M1 Chip for processing or even MPS GPU.

ModelScope is like Hugging Face; you can tinker with it. I don't know Chinese either, but it worked well when I tested it online.

@thewh1teagle
Owner

thewh1teagle commented Jan 9, 2025

> wonder how long it would take to process a 1-hr long audio file

On an AMD Ryzen 5 it ran in real time at 50% CPU.
Say we use 100% CPU: it would compute 1 hour in 30 minutes.
Now, the M1 is much faster. I would guess the M1 will compute 1 hour in 10-20 minutes.
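
The back-of-the-envelope estimate above can be written out as a small sketch. It assumes throughput scales linearly with CPU usage, which real workloads may not do; the function name and the `speedup` parameter are hypothetical, and the speedup values in the usage lines are simply the factors implied by the 10-20 minute guess, not measurements.

```python
def estimated_processing_minutes(audio_minutes, cpu_fraction_at_realtime, speedup=1.0):
    """Estimate offline processing time from a real-time CPU measurement.

    audio_minutes: length of the recording.
    cpu_fraction_at_realtime: CPU share used when running in real time (0.5 = 50%).
    speedup: assumed relative speed of the target machine (hypothetical factor).
    Assumes perfectly linear scaling, which is an idealization.
    """
    return audio_minutes * cpu_fraction_at_realtime / speedup


ryzen_minutes = estimated_processing_minutes(60, 0.5)            # 30.0
m1_slow_guess = estimated_processing_minutes(60, 0.5, speedup=1.5)  # 20.0
m1_fast_guess = estimated_processing_minutes(60, 0.5, speedup=3.0)  # 10.0
```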

@jacksongoode

> For instance
> modelscope.cn/models/damo/speech_dfsmn_aec_psm_16k/summary
> Works like magic but it's a bit heavy 40-60% cpu on amd ryzen 5

Hey @thewh1teagle, the only other option with an actual implementation that I've found for DL-based AEC is https://github.com/breizhn/DTLN-aec. I've perused a number of other papers that appear to have better performance:

Just wanted to drop these here given there might be some interest to find a better alternative.
