A real-time AI-powered musical jam session tool. Capture your melody via microphone, convert it to MIDI, send it to a Large Language Model (LLM) for a creative response, and play back the AI's musical reply through a virtual MIDI device.
- Audio-to-MIDI: Converts your live audio input into MIDI notes using pitch detection (CREPE); a short sketch of this step follows the feature list.
- LLM Musician: Sends your melody to an LLM (e.g., OpenAI GPT-4, any model via OpenRouter, or a local model via Ollama), which responds with a new, creative MIDI sequence.
- Virtual MIDI Output: Plays the AI's response through a virtual MIDI port, usable with any DAW or synth.
- Call-and-response: Designed for interactive, improvisational music sessions.
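For a rough idea of the Audio-to-MIDI step, the sketch below runs CREPE over a mono buffer and maps each confidently pitched frame to a MIDI note number. The helper name and confidence threshold are illustrative assumptions, not llmjam's exact logic:

```python
import numpy as np
import crepe

def audio_to_midi_notes(audio: np.ndarray, sample_rate: int, conf_thresh: float = 0.7):
    """Convert a mono audio buffer to per-frame MIDI note numbers (illustrative)."""
    # CREPE returns per-frame time, fundamental frequency (Hz), confidence, activation.
    _, frequency, confidence, _ = crepe.predict(audio, sample_rate, viterbi=True)

    notes = []
    for f, c in zip(frequency, confidence):
        if c < conf_thresh or f <= 0:
            continue  # skip frames where pitch detection is unsure (likely silence/noise)
        midi = int(round(69 + 12 * np.log2(f / 440.0)))  # A4 = 440 Hz = MIDI note 69
        notes.append(midi)
    return notes
```

Consecutive identical frames would still need to be merged into held notes with durations; that part is left out here.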
- Python 3.11
- macOS (tested; may work on Linux/Windows with adjustments)
- PortAudio (for `sounddevice`)
- RtMidi (for `python-rtmidi`)
- TensorFlow-macos and TensorFlow-metal (for fast pitch detection)
All dependencies are listed in `pyproject.toml`:
- sounddevice
- crepe
- openai
- requests
- python-rtmidi
- python-dotenv
- tensorflow-macos
- tensorflow-metal
- ollama (for local LLMs)
Install with:

```bash
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt  # or use a tool like uv or pip-tools
```
- LLM Provider and API Keys: Create a `.env` file in the project root to configure your LLM provider and model. Supported providers are `openai`, `openrouter`, and `ollama` (for local models). Example configuration:

  ```
  # Choose provider: openai, openrouter, or ollama
  LLM_PROVIDER=openai

  # For OpenAI
  OPENAI_API_KEY=sk-...
  OPENAI_MODEL=gpt-4.1

  # For OpenRouter (uses https://openrouter.ai, supports many models)
  OPENROUTER_API_KEY=...
  OPENROUTER_MODEL=anthropic/claude-3.7-sonnet:thinking

  # For Ollama (local LLMs, requires Ollama running locally)
  OLLAMA_MODEL=llama3.2
  ```
- If both `OPENAI_API_KEY` and `OPENROUTER_API_KEY` are set and `LLM_PROVIDER` is not specified, OpenRouter is used by default (a sketch of this fallback follows this list).
- For Ollama, make sure you have Ollama installed and running locally, and that the model you specify is available (e.g., run `ollama pull llama3.2`).
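As a sketch of how the provider fallback described above could be implemented (illustrative only; `resolve_provider` is not necessarily llmjam's actual function, and the ordering beyond the both-keys rule is an assumption):

```python
import os
from dotenv import load_dotenv

def resolve_provider() -> str:
    """Pick the LLM provider from .env, mirroring the documented fallback rule."""
    load_dotenv()  # reads .env from the project root

    provider = os.getenv("LLM_PROVIDER", "").strip().lower()
    if provider in {"openai", "openrouter", "ollama"}:
        return provider

    # No explicit provider: if an OpenRouter key is present, prefer OpenRouter,
    # as described above. The remaining ordering is illustrative.
    if os.getenv("OPENROUTER_API_KEY"):
        return "openrouter"
    if os.getenv("OPENAI_API_KEY"):
        return "openai"
    return "ollama"  # assume a local model if no API keys are configured
```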
- Audio/MIDI Devices: Ensure your system has a working microphone and a DAW or synth that can receive MIDI from a virtual port (a quick device check follows below).
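To confirm that Python can see your microphone and MIDI ports, a quick standalone check with `sounddevice` and `python-rtmidi` (independent of llmjam itself) is:

```python
import sounddevice as sd
import rtmidi

# List audio devices; the default input is marked with '>'.
print(sd.query_devices())

# List MIDI output ports currently visible to RtMidi.
# llmjam creates its own virtual "llmjam MIDI Out" port while running,
# so it is your DAW/synth that should see that port as an input.
midi_out = rtmidi.MidiOut()
print(midi_out.get_ports())
del midi_out
```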
Run the main app:

```bash
python llmjam.py
```
You can set the tempo for the session using the `--bpm` flag:

```bash
python llmjam.py --bpm 120
```
The default is 95 BPM.
- Press Enter to start jamming.
- Play or sing a melody. Recording starts on sound and stops after 1s of silence (see the recording sketch after this list).
- The melody is converted to MIDI, sent to the LLM, and the AI's response is played back as MIDI.
- To stop, press `Ctrl+C` or `Cmd+C`.
- To set or change the playing style at any time, press `s` during a jam session and enter a short phrase describing your desired style (e.g., "jazzy", "aggressive", "like a lullaby").
- To change the tempo, you must restart the application with the `--bpm` flag.
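The "starts on sound, stops after silence" behaviour can be approximated with a simple RMS gate over short `sounddevice` blocks. This is a sketch under assumed thresholds and block sizes, not llmjam's actual implementation:

```python
import numpy as np
import sounddevice as sd

def record_until_silence(samplerate=16000, block_dur=0.05,
                         threshold=0.02, silence_dur=1.0):
    """Wait for sound, then record until roughly `silence_dur` seconds of quiet."""
    blocks, started, quiet = [], False, 0.0
    block_frames = int(block_dur * samplerate)
    with sd.InputStream(samplerate=samplerate, channels=1, dtype="float32") as stream:
        while True:
            block, _ = stream.read(block_frames)       # blocking read of one chunk
            rms = float(np.sqrt(np.mean(block ** 2)))  # crude loudness estimate
            if not started:
                if rms >= threshold:                   # first audible chunk starts the take
                    started = True
                    blocks.append(block)
                continue
            blocks.append(block)
            quiet = quiet + block_dur if rms < threshold else 0.0
            if quiet >= silence_dur:                   # ~1 s of quiet ends the take
                break
    return np.concatenate(blocks)[:, 0]                # mono float32 array
```

The returned buffer could then be passed to a pitch-detection step like the CREPE sketch shown earlier.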
- Audio Capture: Listens for sound, records until silence.
- Pitch Detection: Uses CREPE (TensorFlow) to convert audio to MIDI notes.
- LLM Response: Sends MIDI to an LLM (e.g., GPT-4, any OpenRouter-supported model, or a local Ollama model) with a musician prompt; receives a new melody as CSV.
- MIDI Playback: Plays the AI's response through a virtual MIDI port.
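To illustrate the final step, the sketch below plays a list of (MIDI note, length in beats) pairs through a virtual port named "llmjam MIDI Out" (the port name mentioned under Troubleshooting). The melody format and the `play_melody` helper are assumptions for illustration, not llmjam's actual code:

```python
import time
import rtmidi

NOTE_ON, NOTE_OFF = 0x90, 0x80

def play_melody(notes, bpm: float = 95.0, velocity: int = 100):
    """Play (midi_note, duration_in_beats) pairs through a virtual MIDI port."""
    midi_out = rtmidi.MidiOut()
    midi_out.open_virtual_port("llmjam MIDI Out")  # your DAW/synth listens to this port
    try:
        for note, beats in notes:
            midi_out.send_message([NOTE_ON, note, velocity])
            time.sleep(beats * 60.0 / bpm)           # hold the note for its length in beats
            midi_out.send_message([NOTE_OFF, note, 0])
    finally:
        midi_out.close_port()
        del midi_out

# Example: C major arpeggio, one beat per note, at the default 95 BPM.
play_melody([(60, 1), (64, 1), (67, 1), (72, 1)])
```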
- No sound detected: Check your microphone and input device settings.
- No MIDI output: Ensure your DAW/synth is listening to the "llmjam MIDI Out" port.
- API errors: Check your `.env` file and API key validity.
- TensorFlow/CREPE errors: Ensure you have the correct versions for your platform (see the TensorFlow-macos/metal docs).
MIT
- CREPE for pitch detection
- OpenAI for LLM API
- OpenRouter for multi-provider LLM API
- Ollama for local LLM API
- python-rtmidi for MIDI output