pragnyanramtha/autopilot

AI Automation Assistant

Control your computer with natural language and voice commands.

Powered by Google Gemini AI, this assistant understands what you want to do and executes it automatically.

IMPORTANT: The latest version of this app is under development; features may not work as expected.


🚀 Quick Start

1. Setup (First Time)

setup_venv.bat

2. Configure API Key

Create a .env file:

GEMINI_API_KEY=your_api_key_here

Get your key: https://makersuite.google.com/app/apikey
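
As a rough illustration of how a launcher could pick up the key from the .env file, here is a minimal sketch using only the standard library. The helper names `load_env_file` and `get_api_key` are hypothetical, and the project itself may read the file differently (for example with a dotenv library).

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path=".env"):
    """Return GEMINI_API_KEY from the environment, else from the .env file, else None."""
    return os.environ.get("GEMINI_API_KEY") or load_env_file(path).get("GEMINI_API_KEY")
```

A real environment variable takes precedence over the file in this sketch, which keeps the key out of the code either way.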

3. Run

run.bat

That's it! 🎉


💬 Usage

Text Commands

Just type naturally:

search for Python tutorials and open first result
write an article about AI and post to X
click the submit button

Voice Commands (Optional)

Press V then speak:

🎤 "Search for AI trends"
🎤 "Post to Twitter"

📖 Documentation

Getting Started

Features

Technical


✨ Features

🎯 Smart Command Understanding

  • Natural language processing
  • Detects simple vs complex tasks
  • Automatic workflow generation
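
To make the simple-versus-complex distinction concrete, here is a purely illustrative keyword heuristic. It is an assumption for demonstration only: the function `classify_command` and both keyword lists are hypothetical, and the assistant may instead leave this decision to the Gemini model.

```python
import re

# Hypothetical connectors that suggest a command chains several steps.
CONNECTORS = re.compile(r"\b(and|then|after that)\b", re.IGNORECASE)
# Hypothetical leading verbs that usually imply a multi-step workflow.
MULTI_STEP_VERBS = ("write", "research", "summarize", "post")


def classify_command(command):
    """Return "complex" for likely multi-step commands, otherwise "simple"."""
    text = command.strip().lower()
    if CONNECTORS.search(text):
        return "complex"
    first_word = text.split(" ", 1)[0]
    if first_word in MULTI_STEP_VERBS:
        return "complex"
    return "simple"
```

Under this heuristic, "click the OK button" is classified as simple, while "search for Python tutorials and open first result" is classified as complex because of the connector "and".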

🌐 Web Automation

  • 15+ websites supported (X, Facebook, LinkedIn, Gmail, GitHub, etc.)
  • Smart navigation with keyboard shortcuts
  • Tab-based navigation fallback

🎤 Voice Input (Optional)

  • Press V to speak commands
  • Works alongside text input
  • Install: pip install SpeechRecognition pyaudio

🔍 Search Features

  • Opens Chrome and searches
  • Can open first result automatically
  • Example: "search for Python and open first result"

✍️ Content Generation

  • AI-powered article writing
  • Social media posts
  • Customizable length and style

🔒 Security

  • API key in .env file (not in code)
  • Dry-run mode for testing
  • Emergency stop (Ctrl+C)
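
The dry-run and emergency-stop ideas can be sketched as follows. This is a minimal illustration, not the project's actual executor: `run_actions` and the `perform` callback are hypothetical names, and the action format (a name plus an argument) is assumed.

```python
def run_actions(actions, perform, dry_run=False):
    """Run (name, argument) actions; in dry-run mode only report them.

    `perform` is a caller-supplied callable that does the real work.
    Returns the list of log lines so a run can be inspected afterwards.
    """
    log = []
    try:
        for name, argument in actions:
            if dry_run:
                # Record what would happen without touching mouse or keyboard.
                log.append(f"[dry-run] {name} {argument}")
                continue
            perform(name, argument)
            log.append(f"[done] {name} {argument}")
    except KeyboardInterrupt:
        # Ctrl+C raises KeyboardInterrupt in the main thread: stop right away.
        log.append("[stopped] interrupted by user")
    return log
```

Because the real work is passed in as a callback, the same loop can be exercised in tests without moving the mouse or typing anything.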

📋 Requirements

  • Python 3.10+
  • Windows (Linux/Mac support coming)
  • Gemini API Key (free from Google)

🎮 Commands

Simple Actions

click the OK button
type hello world
press enter
open Chrome

Web Automation

search for AI trends
go to twitter.com
navigate to github.com

Complex Workflows

search for Python tutorials and open first result
write an article about AI and post to X
research machine learning and create a summary

Special Commands

help     - Show help
voice    - Toggle voice input
exit     - Quit

🛠️ Configuration

Edit config.json to customize:

{
  "social_media": {
    "posting_strategy": "tab_navigation",
    "supported_platforms": ["X/Twitter", "Facebook", "LinkedIn", ...]
  }
}

Strategies:

  • keyboard_shortcut - Fast (uses N key for Twitter)
  • tab_navigation - Reliable (presses Tab to navigate)
  • smart - With verification (uses screen capture)
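
As a sketch of how the strategy setting might be read and validated, the snippet below parses the JSON text and rejects unknown values. The function name `get_posting_strategy` and the fallback to tab_navigation when the key is missing are assumptions, not confirmed behavior of the project.

```python
import json

# The three strategy names listed above.
VALID_STRATEGIES = {"keyboard_shortcut", "tab_navigation", "smart"}


def get_posting_strategy(config_text, default="tab_navigation"):
    """Return the posting strategy from config.json text (hypothetical helper).

    Falls back to `default` when the key is absent (an assumed default) and
    raises ValueError for a strategy name that is not in the known list.
    """
    config = json.loads(config_text)
    strategy = config.get("social_media", {}).get("posting_strategy", default)
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"Unknown posting_strategy: {strategy!r}")
    return strategy
```

Typical use would be `get_posting_strategy(open("config.json", encoding="utf-8").read())`, which fails early on a misspelled strategy instead of partway through a posting workflow.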

📁 Project Structure

ai-automation-assistant/
├── run.py                  # 🚀 Main launcher (use this!)
├── run.bat                 # 🚀 Windows launcher
├── setup_venv.bat          # Setup script
├── .env                    # 🔒 API key (create this)
├── config.json             # ⚙️ Configuration
├── requirements.txt        # 📦 Dependencies
│
├── 📁 ai_brain/           # 🧠 AI command processing
├── 📁 automation_engine/  # 🤖 Mouse/keyboard control
├── 📁 shared/             # 🔧 Shared utilities
│
├── 📁 docs/               # 📖 Documentation
├── 📁 scripts/            # 🔨 Helper scripts & old launchers
├── 📁 tests/              # 🧪 Test files
└── 📁 venv/               # 🐍 Virtual environment

🐛 Troubleshooting

"Virtual environment not found"

setup_venv.bat

".env file not found"

Create .env with:

GEMINI_API_KEY=your_key_here

"Module not found"

venv\Scripts\activate.bat
pip install -r requirements.txt

Voice not working

pip install SpeechRecognition pyaudio

🎯 Examples

Example 1: Quick Search

> search for Python tutorials

✓ Opens Chrome
✓ Searches "Python tutorials"
✓ Shows results

Example 2: Search and Open

> search for best restaurants and open first result

✓ Opens Chrome
✓ Searches "best restaurants"
✓ Presses Tab+Tab+Enter
✓ Opens first result
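
The Tab+Tab+Enter step can be sketched as a small key-sequence helper. The names `first_result_keys` and `open_first_result` are hypothetical, and the number of Tab presses needed to reach the first result is an assumption that depends on the page layout. The `press` callback is injected so the logic runs without a display; with PyAutoGUI installed, `pyautogui.press` could be passed in.

```python
def first_result_keys(tab_count=2):
    """Key sequence that moves focus with Tab, then activates it with Enter."""
    if tab_count < 0:
        raise ValueError("tab_count must be non-negative")
    return ["tab"] * tab_count + ["enter"]


def open_first_result(press, tab_count=2):
    """Send the key sequence through `press`, a callable taking one key name."""
    keys = first_result_keys(tab_count)
    for key in keys:
        press(key)
    return keys
```

For a quick check without any automation library, `open_first_result(print)` simply prints the three key names in order.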

Example 3: Social Media Post

> write an article about AI and post to X

✓ Researches AI topics
✓ Generates article
✓ Opens Chrome
✓ Goes to X.com
✓ Posts content

🤝 Contributing

Contributions welcome! The codebase is clean and well-documented.


📄 License

MIT License - See LICENSE file


🙏 Acknowledgments

  • Google Gemini AI - Natural language processing
  • PyAutoGUI - Automation
  • Rich - Beautiful terminal UI

📞 Support

  • Documentation: See docs/ folder
  • Issues: Check error messages (they're helpful!)
  • Quick Check: Run run.bat and choose option 6

Made with ❤️ for automation enthusiasts

🚀 Start now: run.bat

About

Automate tasks on your native system with the power of AI.
