Skip to content

codewithfourtix/aiVoiceAnalyzer

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🎙️ Voice AI Evaluator: Unlocking the Power of Your Voice

Project Overview

Welcome to the Voice AI Evaluator, a cutting-edge application meticulously engineered to provide profound insights into vocal characteristics. This project stands as a testament to modern full-stack development, seamlessly integrating a dynamic React-based frontend with a robust Node.js backend. Our mission was to create an intuitive yet powerful tool that not only captures audio with precision but also dissects it using advanced signal processing techniques, offering users unparalleled feedback on their speaking pace, pitch, and volume. The Voice AI Evaluator is more than just an application; it's a sophisticated analytical instrument designed for clarity, performance, and empowering effective communication.

Architectural Grandeur: A Blueprint of Innovation

Our project's strength is rooted in its meticulously designed, modular architecture. This structure ensures exceptional scalability, maintainability, and a clear separation of concerns, allowing each component to perform its role with optimal efficiency. Below is a detailed directory tree illustrating our project's elegant organization:

voiceAiEvaluator/
├── .gitignore
├── README.md
├── client/                                 # The vibrant heart of our user interface
│   ├── index.html                          # Main HTML entry point
│   ├── package-lock.json                   # Frontend dependency lock file
│   ├── package.json                        # Frontend project metadata and dependencies
│   ├── postcss.config.js                   # PostCSS configuration for styling
│   ├── src/                                # Source code for the React frontend
│   │   ├── App.jsx                       # The orchestrator of our React components, defining the main application layout
│   │   ├── api/                          # Centralized location for frontend API service integrations
│   │   │   └── voiceApi.js               # API client for interacting with the backend voice analysis endpoint
│   │   ├── components/                   # Reusable, modular UI building blocks for consistency and maintainability
│   │   │   ├── Contact.jsx               # Component for contact information or forms
│   │   │   ├── Features.jsx              # Highlights key features of the application
│   │   │   ├── Footer.jsx                # Application-wide footer component
│   │   │   ├── Hero.jsx                  # Main hero section of the landing page
│   │   │   ├── Navbar.jsx                # Navigation bar component
│   │   │   ├── OverallPerformanceChart.jsx # Displays overall vocal performance metrics visually
│   │   │   ├── PaceChart.jsx             # Visualizes speaking pace analysis results
│   │   │   ├── Pricing.jsx               # Details pricing plans or tiers
│   │   │   ├── Recorder.jsx            # The gateway to voice capture, managing recording state and audio submission
│   │   │   ├── ReportGenrator.jsx      # The engine for insightful reports, rendering analysis results
│   │   │   ├── Result.jsx                # Displays individual analysis results
│   │   │   ├── Team.jsx                  # Introduces the development team
│   │   │   ├── VoiceAnalyzer.jsx         # Frontend component for displaying voice analysis (distinct from backend utility)
│   │   │   └── ui/                     # Our custom UI component library, ensuring consistent design
│   │   │       ├── button.jsx            # Reusable Button component
│   │   │       ├── card.jsx              # Reusable Card component for structured content display
│   │   │       └── separator.jsx         # Reusable Separator component for visual division
│   │   ├── index.css                     # Global CSS styles
│   │   ├── main.jsx                      # Entry point for the React application
│   │   └── pages/                        # Top-level page components
│   │       └── Home.jsx                  # The main home page of the application
│   ├── tailwind.config.js                # Tailwind CSS configuration
│   └── vite.config.js                    # Vite build and development server configuration, including proxy and aliases
├── package-lock.json                     # Root dependency lock file
├── package.json                          # Root project metadata and scripts
└── server/                                 # The robust backend processing powerhouse
    ├── README.md                         # Backend-specific README
    ├── app.js                            # Main Express application setup
    ├── controllers/                    # Logic handlers for incoming HTTP requests
    │   └── handleUpload.js               # Controller for handling audio file uploads and analysis initiation
    ├── middlewares/                    # Express middleware for request processing
    │   └── multerConfig.js               # Multer configuration for handling multipart/form-data (file uploads)
    ├── package-lock.json                 # Backend dependency lock file
    ├── package.json                      # Backend project metadata and dependencies
    ├── routes/                         # Defines our API endpoints and their respective handlers
    │   └── uploadRoute.js                # Defines the API route for voice analysis
    ├── server.js                       # The core Express server setup and startup script
    └── utils/                          # Essential utility functions and helper modules
        └── voiceAnalyzer.js            # The brain of our voice analysis, containing DSP algorithms

Feature Deep Dive: What Makes Our Project Shine

1. The Frontend: An Intuitive and Engaging User Experience

Our React-based frontend, meticulously crafted within the client directory, is designed for maximum user engagement and ease of use. It's not merely functional; it's a testament to thoughtful UX/UI design, providing a seamless and delightful interaction.

  • : This component is a marvel of audio capture technology. It leverages the browser's native MediaRecorder API, providing a responsive and high-fidelity recording experience. Users can effortlessly initiate, pause, and conclude their recording sessions. The handleAnalyze function within this component exemplifies efficient data handling: it transforms the captured audio blob into a FormData object and dispatches it to our backend's /api/analyze-voice endpoint using axios. This meticulous process ensures that every subtle nuance of the user's voice is accurately transmitted for comprehensive analysis.

  • : Once the sophisticated analysis is complete, this component takes center stage, transforming complex raw data into clear, digestible, and actionable insights. It dynamically renders the voice analysis report, categorizing vocal characteristics such as speaking pace, pitch, and volume into meaningful, human-readable feedback. We've meticulously integrated custom UI components (like Button, Card, and Separator from our client/src/components/ui library) to ensure the report is not only profoundly informative but also visually appealing, intuitive, and easy to navigate. This component truly brings the analytical prowess of our backend to life for the end-user.

  • vite.config.js: This configuration file is the unsung hero of our development workflow, powering our lightning-fast Vite development server and enabling rapid iteration cycles. Crucially, it manages proxy settings for seamless API communication with the backend and establishes intelligent path aliases (e.g., @ mapping to src). This seemingly minor detail dramatically enhances code readability, simplifies module imports, and significantly improves developer experience, making our codebase a pleasure to work with and expand upon.

2. The Backend: The Analytical Powerhouse and Data Orchestrator

Our Node.js backend, robustly structured within the server directory, serves as the intelligent core of the application. It's where the magic of voice analysis truly happens, meticulously designed for high performance, reliability, and secure data processing.

  • server.js: This is the foundational pillar of our backend, an Express.js server expertly configured to handle incoming requests with efficiency and security. It acts as the central hub where all middleware and routing logic are defined, ensuring that every request is processed systematically and reliably.

  • routes/uploadRoute.js: This file meticulously defines our critical /api/analyze-voice endpoint. It serves as the primary entry point for all audio analysis requests, working in perfect concert with multerConfig.js to manage secure and efficient file uploads. Subsequently, it intelligently forwards the processed request to handleUpload.js for the core analysis initiation.

  • middlewares/multerConfig.js: This specialized middleware provides a robust and secure solution for handling multipart/form-data uploads. It is specifically tailored for audio files, ensuring that incoming recordings are correctly parsed, validated, and temporarily stored in a secure location, ready for the next stage of sophisticated analysis.

  • controllers/handleUpload.js: This controller acts as the orchestrator of the entire analysis workflow. It receives the securely uploaded audio file, efficiently reads its contents, and then, with precision, invokes the analyzeWithAssemblyAI function (from our voiceAnalyzer.js utility). After the comprehensive analysis is performed, it diligently handles the cleanup of temporary files, maintaining a clean, secure, and highly efficient system.

  • utils/voiceAnalyzer.js: This is the crown jewel of our backend – the sophisticated voice analysis engine. It is packed with advanced algorithms and digital signal processing (DSP) techniques, transforming raw audio into actionable insights:

    • : This function serves as the primary interface for all voice analysis operations. It takes the raw audio buffer as input and orchestrates a cascade of intricate processing steps, ensuring a thorough examination of the vocal characteristics.

    • : This function is a powerhouse for extracting crucial audio metrics. It operates on the principle of analyzing short-time frames of the audio signal to capture dynamic changes and subtle nuances. Key algorithms include:

      • Root Mean Square (RMS) for Volume: A robust statistical measure of the effective amplitude or power of the sound, calculated as ( RMS = \sqrt{\frac{1}{N} \sum_{i=1}^{N} x_i^2} ). This provides a reliable indicator of perceived loudness.
      • Zero Crossing Rate (ZCR) for Voicing: Counts the rate at which the audio waveform crosses the zero amplitude axis, providing insights into the frequency content and distinguishing voiced from unvoiced speech segments.
      • Energy-Based Voice Activity Detection (VAD): Identifies segments of the audio signal that contain actual speech by comparing the energy (RMS) of frames against a dynamic threshold. This is crucial for accurate speaking rate calculation.
      • Spectral Centroid: Measures the "center of mass" of the audio spectrum, indicating the "brightness" or timbre of the voice. Calculated as ( SC = \frac{\sum_{k=1}^{N} f_k \cdot M_k}{\sum_{k=1}^{N} M_k} ).
    • : This specialized function employs the autocorrelation method for highly accurate pitch estimation. Autocorrelation measures the similarity of a signal with a delayed version of itself, revealing the fundamental period of voiced speech. The reciprocal of this period, scaled by the sampling rate, yields the fundamental frequency (pitch).

    • : This intelligent function translates the raw numerical outputs from our sophisticated DSP algorithms into qualitative, human-understandable descriptions. It categorizes speaking pace, estimated pitch, and volume levels into intuitive labels (e.g., "fast pace," "high pitch," "loud volume") based on meticulously defined thresholds. This crucial step makes the complex analysis immediately accessible and actionable for users. All simulated delays and errors have been rigorously removed, ensuring every analysis is based purely on real-time, algorithmic processing.

The Seamless Workflow: A Testament to Integration and Efficiency

The true elegance and power of the Voice AI Evaluator lie in its perfectly synchronized and highly efficient workflow:

  1. User Interaction: A user initiates an audio recording session through the intuitive interface provided by the component.
  2. Frontend-to-Backend Handshake: Upon completion of the recording, the captured audio blob is efficiently packaged and securely transmitted as a POST request to our backend's /api/analyze-voice endpoint.
  3. Backend Ingestion and Preparation: The uploadRoute.js and multerConfig.js modules work in perfect tandem to receive, validate, and prepare the incoming audio file for subsequent processing.
  4. Deep Analytical Processing: The handleUpload.js controller orchestrates the call to the function, which then performs the intricate and multi-faceted audio analysis using the algorithms detailed above.
  5. Results Delivery: The comprehensive and insightful analysis results are then securely transmitted back to the frontend.
  6. Insightful Reporting: Finally, the component takes these processed results and presents them to the user in a clear, concise, and actionable report format.

Our Commitment to Excellence: Code Quality and Maintainability

We have not merely built a functional application; we have engineered a maintainable, scalable, and high-quality codebase. Our unwavering commitment to best practices is evident in several key areas:

  • Elimination of Mock Data: A critical step in ensuring the absolute integrity and trustworthiness of our analysis. All client-side fallbacks and simulated behaviors were rigorously removed, guaranteeing that every insight provided is derived from genuine, real-time algorithmic processing.

  • Robust Dependency Management: Our package.json files in both the client and server directories are meticulously managed, ensuring that all project dependencies are clearly defined, version-controlled, and consistently reproducible across all development and deployment environments.

  • Clear Separation of Concerns: The distinct and well-defined roles of controllers, middlewares, routes, and utility functions are a hallmark of our clean and modular architecture. This inherent modularity makes the codebase exceptionally easy to navigate, comprehend, and extend with new features or functionalities.

  • Intelligent Path Aliases: The strategic implementation of path aliases in vite.config.js significantly enhances code readability and simplifies module imports, reducing boilerplate code and dramatically improving the developer experience. This attention to detail fosters a more efficient and enjoyable development process.

Getting Started

To run the Voice AI Evaluator locally, follow these steps:

  1. Clone the repository:

    git clone https://github.com/your-username/voiceAiEvaluator.git
    cd voiceAiEvaluator
  2. Install Backend Dependencies:

    cd server
    npm install
  3. Start the Backend Server:

    npm start
    # The backend server will typically run on http://localhost:3001
  4. Install Frontend Dependencies:

    cd ../client
    npm install
  5. Start the Frontend Development Server:

    npm run dev
    # The frontend application will typically run on http://localhost:3000
  6. Access the Application: Open your web browser and navigate to http://localhost:3000.

Future Enhancements

While the Voice AI Evaluator is already a powerful tool, potential future enhancements could include:

  • User Authentication: Implementing user accounts to save and track analysis history.
  • Advanced Visualizations: More interactive and detailed charts for vocal metrics.
  • Comparative Analysis: Allowing users to compare their current analysis with previous recordings or target profiles.
  • Real-time Feedback: Providing immediate visual feedback during the recording process.
  • Integration with Cloud Speech APIs: Exploring options for integrating with external speech-to-text or sentiment analysis APIs for richer insights.

We are incredibly proud of the Voice AI Evaluator and its potential to empower individuals through a deeper understanding of their vocal communication. This project is a testament to robust engineering, thoughtful design, and a commitment to delivering a high-quality, insightful product.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages