Shaurya-dev7/titan-voice-assistant

🎤 Titan Voice Assistant – Advanced Voice Control for Windows

Python Windows AI/NLP License: MIT

A voice assistant for Windows with 270+ voice commands, fuzzy-matching NLP, and a modern GUI.

Talk. Control. Execute. – Your personal AI assistant at your fingertips.


🎯 Overview

Titan Voice Assistant is a feature-rich, production-grade voice automation system that transforms your computer into an intelligent, voice-controlled machine. With over 270 voice commands, advanced natural language understanding, and a sleek modern interface, Titan enables hands-free control of applications, system functions, and online services.

Why Titan?

  • 🗣️ 270+ Voice Commands - Comprehensive control coverage
  • 🤖 Advanced NLP - Intelligent command parsing and fuzzy matching
  • 🎨 Modern GUI - Beautiful CustomTkinter interface with dark theme
  • ⚡ Lightning Fast - Optimized speech recognition and processing
  • 🔊 Natural Speech - High-quality text-to-speech responses
  • 🖥️ System Integration - Deep Windows integration with admin support
  • 🔐 Secure - Command matching and execution run locally; only speech-to-text goes through the Google Speech Recognition API, which needs an internet connection
  • 📦 Easy Setup - Simple installation with comprehensive documentation

✨ Features

🗣️ Comprehensive Voice Commands (270+)

🚀 Applications & Websites

  • 80+ Applications - Chrome, VS Code, Spotify, Steam, Discord, etc.
  • 100+ Websites - YouTube, GitHub, ChatGPT, Gmail, Reddit, etc.
  • Custom Apps - Easily add your favorite applications
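
As a rough illustration of how an "Open <site>" command could be resolved, the sketch below maps a spoken phrase to a URL and opens it with the standard-library `webbrowser` module. The `WEBSITES` table, `OPEN_PREFIXES`, and both function names are hypothetical; the project's actual site list and lookup code are not shown in this README.

```python
import webbrowser
from typing import Optional

# Hypothetical lookup table; the real list of 100+ sites is an assumption.
WEBSITES = {
    "youtube": "https://www.youtube.com",
    "github": "https://github.com",
    "gmail": "https://mail.google.com",
    "reddit": "https://www.reddit.com",
}

# Spoken verbs that may precede the site name (assumed, not from the project).
OPEN_PREFIXES = ("open ", "go to ", "launch ", "start ")


def resolve_website(utterance: str) -> Optional[str]:
    """Return the URL for a spoken 'open <site>' command, or None if unknown."""
    text = utterance.lower().strip()
    for prefix in OPEN_PREFIXES:
        if text.startswith(prefix):
            text = text[len(prefix):]
            break
    return WEBSITES.get(text.strip())


def open_website(utterance: str) -> bool:
    """Open the matching site in the default browser; False if no site matched."""
    url = resolve_website(utterance)
    if url is None:
        return False
    return webbrowser.open(url)
```

Adding a custom site in this sketch is a one-line change to `WEBSITES`; keeping resolution separate from `open_website` makes the lookup easy to test without launching a browser.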

💻 System Information

  • CPU usage, RAM usage, disk space
  • IP address, WiFi SSID
  • System uptime, battery status
  • GPU information

🪟 Window Management

  • Minimize all, maximize, snap left/right
  • Switch between windows
  • Close windows, fullscreen mode
  • Virtual desktop navigation

📋 Clipboard Operations

  • Copy, paste, clear clipboard
  • Read clipboard contents
  • Clipboard history (configurable)
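
A configurable clipboard history can be kept in a bounded queue. The sketch below is one possible shape, assuming a hypothetical `ClipboardHistory` class; it stores text only and leaves the actual clipboard access (for example via pyperclip) to the caller.

```python
from collections import deque
from typing import List, Optional


class ClipboardHistory:
    """Keep the most recent clipboard entries, newest last (hypothetical helper)."""

    def __init__(self, max_items: int = 10):
        # deque(maxlen=...) silently drops the oldest entry once it is full.
        self._items = deque(maxlen=max_items)

    def record(self, text: str) -> None:
        """Store a copied string, skipping empty values and consecutive repeats."""
        if text and (not self._items or self._items[-1] != text):
            self._items.append(text)

    def latest(self) -> Optional[str]:
        """Return the most recent entry, e.g. to answer 'What did I copy'."""
        return self._items[-1] if self._items else None

    def items(self) -> List[str]:
        """Return all stored entries, newest first."""
        return list(reversed(self._items))
```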

🧮 Calculations & Units

  • Math: "What is 25 times 4?"
  • Conversions: Temperature, distance, weight
  • Quick Calculations: Direct mathematical expressions
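
To make the math commands concrete, here is a minimal sketch of turning a phrase such as "What is 25 times 4?" into a result. It handles a single binary operation only, and the operator vocabulary and the name `evaluate_spoken_math` are assumptions rather than the project's actual parser.

```python
import operator
import re
from typing import Optional

# Spoken and symbolic operator names (an assumed vocabulary).
OPERATORS = {
    "plus": operator.add,
    "+": operator.add,
    "minus": operator.sub,
    "-": operator.sub,
    "times": operator.mul,
    "multiplied by": operator.mul,
    "*": operator.mul,
    "divided by": operator.truediv,
    "/": operator.truediv,
}

_NUMBER = r"(-?\d+(?:\.\d+)?)"
# Longest names first so "multiplied by" is tried before shorter alternatives.
_OP = "|".join(re.escape(name) for name in sorted(OPERATORS, key=len, reverse=True))
_PATTERN = re.compile(rf"{_NUMBER}\s*({_OP})\s*{_NUMBER}")


def evaluate_spoken_math(utterance: str) -> Optional[float]:
    """Evaluate '<number> <operator> <number>' found in the utterance."""
    match = _PATTERN.search(utterance.lower())
    if match is None:
        return None
    left, op_name, right = match.groups()
    try:
        return OPERATORS[op_name](float(left), float(right))
    except ZeroDivisionError:
        return None
```

Looking operators up in a table avoids calling `eval` on transcribed speech, which matters because the input comes from an external recognizer.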

📝 Text Operations

  • Type text, select all, undo/redo
  • Find and replace
  • Save file, print
  • Text formatting (uppercase, lowercase)

🎵 Media Controls

  • Play, pause, next, previous track
  • Volume control, brightness adjustment
  • Set specific volume/brightness levels

⚙️ Power Management

  • Shutdown, restart, sleep (with admin auth)
  • Hibernate, lock screen
  • Scheduled shutdown
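
One way these power actions could be issued is through the Windows `shutdown` command (`/s` shut down, `/r` restart, `/h` hibernate, `/a` abort a pending shutdown, `/t` delay in seconds) and `rundll32.exe user32.dll,LockWorkStation` for locking. The sketch below only builds the argument list and defaults to a dry run; the function names are hypothetical and how Titan actually triggers these actions is not documented here.

```python
import subprocess
from typing import List


def build_power_command(action: str, delay_seconds: int = 0) -> List[str]:
    """Translate a power action into a Windows command line (argument list)."""
    if delay_seconds < 0:
        raise ValueError("delay_seconds must not be negative")
    if action == "shutdown":
        return ["shutdown", "/s", "/t", str(delay_seconds)]
    if action == "restart":
        return ["shutdown", "/r", "/t", str(delay_seconds)]
    if action == "hibernate":
        return ["shutdown", "/h"]
    if action == "cancel":
        return ["shutdown", "/a"]  # aborts a scheduled shutdown or restart
    if action == "lock":
        return ["rundll32.exe", "user32.dll,LockWorkStation"]
    raise ValueError(f"unknown power action: {action!r}")


def run_power_command(action: str, delay_seconds: int = 0, dry_run: bool = True) -> List[str]:
    """Return the command; execute it only when dry_run is False (Windows only)."""
    argv = build_power_command(action, delay_seconds)
    if not dry_run:
        subprocess.run(argv, check=True)
    return argv
```

A scheduled shutdown in ten minutes would then be `run_power_command("shutdown", 600, dry_run=False)`, and `run_power_command("cancel", dry_run=False)` would abort it.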

🔊 Audio Management

  • Volume up/down/max/mute
  • Set custom volume levels
  • Output device selection

💡 Brightness Control

  • Brightness up/down/max/min
  • Set specific brightness percentage
  • Multi-monitor support
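
Commands such as "Set brightness to 80%" or "Brightness max" need a numeric level extracted before anything can be applied. The following is a small sketch of that extraction step, assuming a hypothetical `extract_level` helper and keyword table; applying the level (for example through screen-brightness-control or pycaw) is left out.

```python
import re
from typing import Optional

# Words that stand for fixed levels (an assumed vocabulary).
_KEYWORDS = {"max": 100, "maximum": 100, "full": 100, "min": 0, "minimum": 0}

# A standalone number of up to three digits, optionally followed by % or "percent".
_LEVEL = re.compile(r"\b(\d{1,3})\b\s*(?:%|percent)?")


def extract_level(utterance: str) -> Optional[int]:
    """Return a 0-100 level from the utterance, or None if none is mentioned."""
    text = utterance.lower()
    match = _LEVEL.search(text)
    if match:
        # Clamp so "250%" cannot push a device past its maximum.
        return max(0, min(100, int(match.group(1))))
    for word in re.findall(r"[a-z]+", text):
        if word in _KEYWORDS:
            return _KEYWORDS[word]
    return None
```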

🌐 Network Control

  • Toggle WiFi on/off
  • Bluetooth on/off
  • Network diagnostics

🎨 Modern GUI Features

  • Beautiful dark theme with customizable accent colors
  • Animated microphone button with visual feedback
  • System tray integration for background operation
  • Real-time command display and feedback
  • Responsive and compact mode support
  • Beautiful waveform animation during listening

⚡ Performance & Optimization

  • Fast speech recognition with optimized settings
  • Smart wake word detection with fuzzy matching
  • Efficient command processing and execution
  • Minimal resource usage (~50-100MB RAM)
  • Runs on Windows 10 and Windows 11

🛠️ Tech Stack

| Component | Technology | Details |
|---|---|---|
| Language | Python | 100% pure Python |
| Voice Recognition | Google Speech Recognition API | Real-time speech-to-text |
| Text-to-Speech | pyttsx3 | Natural voice output |
| GUI | CustomTkinter | Modern, sleek interface |
| NLP/Parsing | FuzzyWuzzy | Intelligent command matching |
| Audio Input | PyAudio | Microphone integration |
| System Control | psutil, pycaw, pywin32 | Windows system access |
| Display Control | screen-brightness-control | Brightness management |

📋 Command Categories

| Category | Count | Examples |
|---|---|---|
| Applications | 80+ | "Open Chrome", "Launch VS Code", "Start Spotify" |
| Websites | 100+ | "Open YouTube", "Go to GitHub", "Open ChatGPT" |
| System Info | 10+ | "CPU usage", "RAM usage", "What's my IP" |
| Window Mgmt | 8+ | "Minimize all", "Snap left", "Close window" |
| Volume | 6+ | "Volume up", "Set volume to 50%", "Mute" |
| Brightness | 6+ | "Brightness down", "Set to 80%", "Max brightness" |
| Clipboard | 4+ | "Copy that", "Paste", "What did I copy" |
| Math | 5+ | "What is 25 plus 17", "Calculate 100 / 4" |
| Hotkeys | 20+ | "New tab", "Close tab", "Refresh", "Zoom in" |
| Total | 270+ | And more... |
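
Speech recognizers often return slightly different wording than the registered phrase, so a transcript has to be matched to the closest known command. The sketch below uses the standard library's `difflib.get_close_matches` as a stand-in for FuzzyWuzzy; the `KNOWN_COMMANDS` list, the `match_command` name, and the 0.75 cutoff are illustrative assumptions.

```python
from difflib import get_close_matches
from typing import Optional

# A handful of phrases from the table above; the full command list is not shown here.
KNOWN_COMMANDS = [
    "minimize all",
    "snap left",
    "close window",
    "volume up",
    "new tab",
    "close tab",
]


def match_command(transcript: str, cutoff: float = 0.75) -> Optional[str]:
    """Return the closest known command, or None if nothing is similar enough."""
    candidates = get_close_matches(
        transcript.lower().strip(), KNOWN_COMMANDS, n=1, cutoff=cutoff
    )
    return candidates[0] if candidates else None
```

A higher cutoff reduces accidental matches at the cost of rejecting more mis-transcribed phrases.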

📋 Quick Start

Prerequisites

  • Windows 10/11 (64-bit recommended)
  • Python 3.9 or higher
  • Working microphone
  • Internet connection (for Google Speech Recognition)
  • ~2GB disk space for installation

Installation

  1. Clone the repository

    git clone https://github.com/Shaurya-dev7/titan-voice-assistant.git
    cd titan-voice-assistant
  2. Create virtual environment

    python -m venv venv
    venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Run the assistant

    GUI Version (Recommended):

    python app.py

    Console Version:

    python main.py

🎤 Usage Guide

Activation & Basic Commands

  1. Launch the app - Run python app.py
  2. Say "Hey Titan" - Activate the voice assistant
  3. Give your command - Speak clearly and naturally
  4. Receive response - Titan executes and responds

Example Voice Commands

# Application Control
"Open Chrome"
"Launch VS Code"
"Open Spotify"

# Web Navigation
"Open YouTube"
"Go to GitHub"
"Open ChatGPT"

# System Information
"CPU usage"
"What's my IP address"
"RAM usage"

# Window Management
"Minimize all"
"Snap to left"
"Switch window"

# Media Control
"Play music"
"Next track"
"Volume up"

# Brightness/Volume
"Set brightness to 80%"
"Volume to 50%"
"Brightness max"

# Clipboard
"Copy that"
"Paste"
"What did I copy"

# Math
"What is 25 plus 17?"
"Calculate 100 divided by 4"

# System Control
"Take a screenshot"
"Open settings"
"Lock screen"

📁 Project Structure

titan-voice-assistant/
├── app.py                   # GUI application (CustomTkinter)
├── main.py                  # Console application entry point
├── nlu_pipeline.py          # Natural Language Understanding
│   ├── command_parser.py
│   ├── intent_classifier.py
│   └── entity_extractor.py
├── voice_engine.py          # Speech recognition & synthesis
├── system_controller.py      # Windows system operations
├── command_handler.py       # Command execution logic
├── config.py                # Configuration settings
├── utils.py                 # Helper functions
├── assets/
│   ├── icons/
│   └── sounds/
├── requirements.txt         # Dependencies
└── README.md               # This file

⚙️ Configuration

Customize Wake Word

Edit config.py:

WAKE_WORD = "titan"  # Default: "titan"
WAKE_WORD_SENSITIVITY = 0.8  # Fuzzy match threshold
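
To show how a threshold like `WAKE_WORD_SENSITIVITY` might be applied, the sketch below compares each word of a transcript with the wake word and accepts it when the similarity ratio reaches the threshold. It uses `difflib.SequenceMatcher` from the standard library instead of FuzzyWuzzy, and `heard_wake_word` is a hypothetical function name, so the project's actual scoring may differ.

```python
from difflib import SequenceMatcher

WAKE_WORD = "titan"
WAKE_WORD_SENSITIVITY = 0.8  # similarity ratio between 0.0 and 1.0


def heard_wake_word(
    transcript: str,
    wake_word: str = WAKE_WORD,
    threshold: float = WAKE_WORD_SENSITIVITY,
) -> bool:
    """Return True if any word in the transcript is close enough to the wake word."""
    for word in transcript.lower().split():
        word = word.strip(".,!?")
        if SequenceMatcher(None, word, wake_word).ratio() >= threshold:
            return True
    return False
```

Lowering the threshold makes activation more forgiving of mis-transcriptions but increases false triggers.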

Configure Microphone

MIC_DEVICE_INDEX = 1  # Find index: python -m sounddevice
MIC_SAMPLE_RATE = 16000
MIC_CHANNELS = 1

Adjust Conversation Settings

CONVERSATION_TIMEOUT = 300  # Seconds before sleep
RESPONSE_DELAY = 0.5  # Seconds before response
COMMAND_COOLDOWN = 1  # Prevent rapid repeat commands
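
A setting like `COMMAND_COOLDOWN` suggests that a repeated command is ignored until the cooldown has elapsed. The sketch below shows one way to enforce that; the `CooldownGate` class is hypothetical, and the injectable clock exists only so the behavior can be checked without sleeping.

```python
import time
from typing import Callable, Dict

COMMAND_COOLDOWN = 1  # seconds


class CooldownGate:
    """Reject a command that repeats within the cooldown window (hypothetical helper)."""

    def __init__(
        self,
        cooldown: float = COMMAND_COOLDOWN,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._cooldown = cooldown
        self._clock = clock
        self._last_run: Dict[str, float] = {}

    def allow(self, command: str) -> bool:
        """Return True and record the time if the command may run now."""
        now = self._clock()
        last = self._last_run.get(command)
        if last is not None and now - last < self._cooldown:
            return False
        self._last_run[command] = now
        return True
```

`time.monotonic` is used rather than `time.time` so that system clock adjustments cannot shorten or extend the window.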

Add Custom Commands

Edit nlu_pipeline.py or command_handler.py:

def handle_custom_command(self, intent):
    if intent == "my_custom_command":
        # Your implementation
        self.speak("Command executed!")
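
As an alternative to a growing chain of `if intent == ...` checks, custom commands can be collected in a registry keyed by intent name. This is a sketch of that pattern only; `HANDLERS`, `command`, and `dispatch` are hypothetical names and not part of the project's code.

```python
from typing import Callable, Dict

# Maps an intent name to the function that handles it.
HANDLERS: Dict[str, Callable[[], str]] = {}


def command(intent: str) -> Callable[[Callable[[], str]], Callable[[], str]]:
    """Decorator that registers a handler under the given intent name."""
    def register(func: Callable[[], str]) -> Callable[[], str]:
        HANDLERS[intent] = func
        return func
    return register


@command("my_custom_command")
def my_custom_command() -> str:
    # Your implementation; the returned text would be passed to speak().
    return "Command executed!"


def dispatch(intent: str) -> str:
    """Run the handler for an intent, or return a fallback response."""
    handler = HANDLERS.get(intent)
    if handler is None:
        return "Sorry, I don't know that command."
    return handler()
```

With this layout a new command needs only a decorated function, and `dispatch` stays unchanged.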

🔧 Dependencies

SpeechRecognition   # Google Speech Recognition API
pyttsx3             # Text-to-speech engine
PyAudio             # Audio input/output
customtkinter       # Modern GUI framework
pycaw               # Windows volume control
screen-brightness-control  # Brightness adjustment
psutil              # System information
Pillow              # Image processing
pystray             # System tray integration
fuzzywuzzy          # Fuzzy string matching (FuzzyWuzzy)
pyperclip           # Clipboard operations
pywin32             # Windows API access

Install all:

pip install -r requirements.txt

🎨 GUI Features

Dark Theme with Accent Colors

# Customize appearance in app.py
import customtkinter as ctk

ctk.set_appearance_mode("dark")  # "dark", "light", or "system"
ctk.set_default_color_theme("blue")  # built-in themes: blue, green, dark-blue

System Tray Integration

  • Minimize to system tray
  • Quick access from taskbar
  • Context menu for common actions

Visual Feedback

  • Animated microphone button
  • Real-time command display
  • Visual listening indicator
  • Response status updates

🧪 Testing & Debugging

Test Voice Recognition

python -c "from voice_engine import VoiceEngine; ve = VoiceEngine(); print(ve.listen())"

Test Command Processing

from nlu_pipeline import CommandParser
parser = CommandParser()
result = parser.parse("open youtube")
print(result)

Debug Microphone

python -m sounddevice

Enable Debug Logging

# In config.py
DEBUG = True
LOG_LEVEL = "DEBUG"
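
The two settings above could feed the standard `logging` module roughly as follows. This is a sketch under assumptions: the logger name `"titan"` and the `configure_logging` function are hypothetical, and the project may wire its logging differently.

```python
import logging

DEBUG = True
LOG_LEVEL = "DEBUG"


def configure_logging(debug: bool = DEBUG, level_name: str = LOG_LEVEL) -> logging.Logger:
    """Configure and return the application logger based on the config values."""
    level = logging.INFO
    if debug:
        # Fall back to INFO if the configured name is not a valid logging level.
        candidate = getattr(logging, level_name.upper(), None)
        if isinstance(candidate, int):
            level = candidate
    logger = logging.getLogger("titan")
    logger.setLevel(level)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```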

🚀 Performance Metrics

| Metric | Target | Status |
|---|---|---|
| Speech Recognition Latency | <2s | |
| Command Execution | <500ms | |
| Memory Usage | <150MB | |
| CPU Usage | <10% idle | |
| Startup Time | <5s | |
| Command Accuracy | >95% | |
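
Targets such as "Command Execution <500ms" can be checked by timing a handler against a budget. The sketch below is one way to do that with `time.perf_counter`; the `timed` context manager and the results dictionary are hypothetical and not part of the project.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, Tuple


@contextmanager
def timed(
    label: str,
    budget_seconds: float,
    results: Dict[str, Tuple[float, bool]],
) -> Iterator[None]:
    """Record the elapsed time for a block and whether it stayed within budget."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        results[label] = (elapsed, elapsed <= budget_seconds)


if __name__ == "__main__":
    results: Dict[str, Tuple[float, bool]] = {}
    with timed("command_execution", 0.5, results):
        sum(range(1000))  # placeholder for a real command handler
    for name, (elapsed, within_budget) in results.items():
        print(f"{name}: {elapsed * 1000:.1f} ms, within budget: {within_budget}")
```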

🤝 Contributing

Contributions are welcome! Here's how:

  1. Fork the repository
  2. Create a feature branch
    git checkout -b feature/NewCommands
  3. Make your improvements
  4. Test thoroughly
  5. Commit with clear messages
    git commit -m 'Add: New voice commands for productivity'
  6. Push and create a Pull Request

Areas for Contribution

  • Additional voice commands
  • Improved NLP accuracy
  • GUI enhancements
  • Performance optimizations
  • Documentation improvements
  • Language support (multilingual)

🐛 Troubleshooting

Microphone Not Detected

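python -m sounddevice   # lists audio devices; sounddevice is not in the dependency list above, so install it first with: pip install sounddevice
# Update MIC_DEVICE_INDEX in config.py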

Speech Recognition Not Working

  • Check internet connection
  • Verify microphone permissions
  • Test: python -c "import speech_recognition as sr; print(sr.Microphone.list_microphone_names())"

Commands Not Executing

  • Check Windows permissions
  • Run as Administrator if needed
  • Verify application names in config

GUI Issues

  • Update CustomTkinter: pip install --upgrade customtkinter
  • Ensure Windows 10/11

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.


👨‍💻 Author

Shaurya Deep Rai - AI/ML Engineer


🙏 Acknowledgments


⭐ If you find this project helpful, please consider giving it a star!
