An advanced audio analysis platform that leverages AI to provide deep insights into conversations. This tool performs speaker diarization, transcription, sentiment analysis, and emotion detection to generate comprehensive conversation summaries.
- Speaker Diarization: Automatically identifies and separates different speakers
- Speech-to-Text: Accurate transcription of conversations
- Sentiment Analysis: Analyzes the sentiment of each speaker's utterances
- Emotion Detection: Identifies emotions in speech using audio features
- Conversation Summary: AI-powered detailed analysis of conversation dynamics
- Batch Processing: Analyze multiple audio files simultaneously
- GPU Acceleration: Optimized performance with CUDA support
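Under the hood, these features chain standard tools: pyannote for diarization, Whisper for transcription, and a transformers sentiment model. The following is a minimal sketch of that chain, not the project's actual code in final.py; the file name, Whisper model size, and truncation length are placeholder choices.

```python
# Minimal sketch of the analysis chain (illustrative; not the project's code).
import whisper
from pyannote.audio import Pipeline
from transformers import pipeline as hf_pipeline

AUDIO = "conversation.wav"

# 1. Speaker diarization: who spoke, and when
diarization = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="your-token-here",  # Hugging Face token (see setup below)
)(AUDIO)

# 2. Speech-to-text
transcript = whisper.load_model("base").transcribe(AUDIO)["text"]

# 3. Sentiment (truncated here; the app scores each speaker's utterances)
sentiment = hf_pipeline("sentiment-analysis")(transcript[:512])

for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{speaker}: {turn.start:.1f}s-{turn.end:.1f}s")
print(sentiment)
```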
- Single file analysis interface
- Batch processing capability (up to 10 files)
- Detailed conversation insights
- Downloadable analysis results
- Progress tracking and status updates
- Clone the repository

```bash
git clone https://github.com/Sarthakischill/Conversation_Analysis.git
cd conversation-analysis-platform
```

- Create and activate a virtual environment

```bash
# Windows
python -m venv venv
venv\Scripts\activate

# Linux/Mac
python -m venv venv
source venv/bin/activate
```

- Install required packages

```bash
pip install -r requirements.txt
```

- Set up authentication
  - Create a Hugging Face account
  - Accept the license for pyannote/speaker-diarization-3.1
  - Get your Hugging Face token
  - Replace the token in `final.py`:

```python
# In final.py
self.diarization_pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="your-token-here"
)
```
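To avoid committing the token, you could read it from an environment variable instead. A small sketch; the `HF_TOKEN` variable name is an assumption, not something the project defines:

```python
# In final.py -- read the token from the environment instead of hard-coding it.
# HF_TOKEN is an assumed variable name; export it before launching the app.
import os

self.diarization_pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token=os.environ["HF_TOKEN"],
)
```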
- Start the application

```bash
streamlit run Home.py
```

- Access the web interface at `http://localhost:8501`
- Choose between:
  - Single file analysis (Home page)
  - Batch analysis (Batch Analysis page)
- Upload WAV format audio file(s)
- Click "Analyze Conversation" to start processing
- View and download results
```
app/
├── pages/
│   └── 1_Batch_Analysis.py
├── models/
│   └── emotion_detection_model.pkl
├── uploads/
├── results/
└── Home.py
```
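The bundled `emotion_detection_model.pkl` appears to be a scikit-learn model (scikit-learn is a listed dependency). Loading it would look roughly like this; the feature dimensionality below is a placeholder, since the expected input shape is not documented here:

```python
import pickle
import numpy as np

with open("app/models/emotion_detection_model.pkl", "rb") as f:
    emotion_model = pickle.load(f)

# Placeholder feature vector; the real app derives features from the audio,
# and the expected dimensionality depends on how the model was trained.
features = np.zeros((1, 13))
print(emotion_model.predict(features))
```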
- Python 3.10 or later
- CUDA-capable GPU (recommended)
- CUDA Toolkit (for GPU acceleration)
- Minimum 8GB RAM
Major dependencies include:
- streamlit
- torch
- pyannote.audio
- whisper
- transformers
- librosa
- google-generativeai
- scikit-learn
See requirements.txt for the complete list.
- Speaker separation and identification
- High-quality speech-to-text conversion
- Real-time sentiment analysis
- Emotion detection from audio features (see the feature-extraction sketch after this list)
- Speaker interaction patterns
- Emotional tone mapping
- Sentiment progression
- Key topic identification
- Multiple file upload support
- Parallel processing capability
- Combined results in ZIP format
- Progress tracking for each file
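Emotion detection works from features extracted from the raw audio. Here is a minimal sketch using librosa (a listed dependency); MFCCs, energy, and zero-crossing rate are common choices for speech emotion but are not confirmed as this project's exact feature set:

```python
import librosa
import numpy as np

y, sr = librosa.load("conversation.wav", sr=None)

# Features commonly used for speech emotion: MFCCs, energy, zero-crossing rate
mfcc = np.mean(librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13), axis=1)
rms = np.mean(librosa.feature.rms(y=y))
zcr = np.mean(librosa.feature.zero_crossing_rate(y))

features = np.hstack([mfcc, rms, zcr])  # 15-dimensional vector
print(features.shape)
```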
The analysis generates a structured summary including:
- Main Topics
- Conversation Dynamics
- Speaker Analysis
- Key Points
- Predominant Emotions
- Sentiments
- Behavior Analysis
- Tone
- Overall Emotional Tone
- Key Insights
- Actionable Suggestions
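The detailed summary is produced with Google's Gemini API (google-generativeai is a listed dependency). A minimal sketch of prompting Gemini for a summary covering these sections; the model name, prompt wording, and `GOOGLE_API_KEY` variable are illustrative, not the project's:

```python
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")  # model name is a placeholder

transcript = "SPEAKER_00: ...\nSPEAKER_01: ..."  # from the transcription step

prompt = (
    "Analyze this conversation transcript. Cover: main topics, conversation "
    "dynamics, per-speaker analysis, key points, predominant emotions, "
    "sentiments, behavior, tone, overall emotional tone, key insights, and "
    "actionable suggestions.\n\n" + transcript
)
print(model.generate_content(prompt).text)
```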
Common issues and solutions:

- CUDA-related warnings (see the diagnostic snippet after this list):
  - Ensure the CUDA toolkit is installed
  - Update your GPU drivers
  - Check CUDA compatibility with your PyTorch version
- Memory issues:
  - Reduce the batch size
  - Process shorter audio segments
  - Close other GPU-intensive applications
- Model loading errors:
  - Verify your Hugging Face token
  - Check your internet connection
  - Ensure model files are present
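For the CUDA warnings, a quick way to check what PyTorch actually sees (standard PyTorch calls):

```python
import torch

print("PyTorch:", torch.__version__)
print("Built with CUDA:", torch.version.cuda)      # None on CPU-only builds
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```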
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- PyAnnote Audio for speaker diarization
- OpenAI Whisper for transcription
- Hugging Face for transformer models
- Google for Gemini API
- Streamlit for the web interface
Sarthak
- GitHub: [@Sarthakischill](https://github.com/Sarthakischill)
- Email: sarthakshitole@gmail.com
If you encounter any issues or have questions, please:
- Check the troubleshooting section
- Open an issue on GitHub
- Contact the author

