An OS-level, native accessibility layer designed to provide real-time sign language translation and system control for Deaf and Hard of Hearing users, deeply integrated into a keyboard-driven Linux environment.
This project is the final submission for the ACS455A: Human Computer Interaction course at Daystar University.
- About The Project
- Architecture Overview
- Key Features
- Getting Started
- Usage
- Configuration
- Roadmap
- Acknowledgments
Traditional accessibility solutions for the D/HH community, such as live captions, often fail to provide an equitable experience. They suffer from high error rates, lack crucial non-verbal context, and present a significant barrier to the millions of Deaf individuals whose primary language is a visual sign language, not a written one.
This project critiques the limitations of browser-sandboxed accessibility tools and proposes a superior, user-centered solution: OmaSign.
Rather than a browser extension, OmaSign (presented as Accessibility Hub) is a suite of native tools that integrate directly with the operating system (Omarchy/Arch Linux with Hyprland). This native approach allows it to overcome the performance and access limitations of browsers, offering features such as:
- System-wide audio capture for universal translation.
- High-performance, GPU-accelerated AI and rendering.
- Deep integration with the window manager for a seamless user experience.
This prototype demonstrates a path toward a more genuinely inclusive computing environment, where accessibility is treated as core infrastructure, not a low-priority add-on.
OmaSign uses a modular, multi-process architecture orchestrated by a central Python daemon. This allows for high performance and clean separation of concerns, following the principles of Clean Architecture.
┌────────────────────────────────────────────────────────────┐
│ Hyprland (Window Manager)                                  │
│ ├─ Keybinds → Trigger Shell Scripts                        │
│ └─ Manages visibility/position of Signer Window            │
└────────────────────────────────────────────────────────────┘
         │                    │                    │
         ▼                    ▼                    ▼
┌──────────────────┐ ┌─────────────────┐ ┌───────────────────┐
│ capture_audio.sh │ │ capture_text.sh │ │ toggle_signer.sh  │
│ (ffmpeg/yt-dlp)  │ │ (wl-paste)      │ │ (hyprctl)         │
└──────────────────┘ └─────────────────┘ └───────────────────┘
         │                    │                    │
         └─────────┬──────────┴────────────────────┘
                   │ (Communicate via WebSocket)
                   ▼
┌────────────────────────────────────────────────────────────┐
│ sign_daemon.py (The Brain - Runs in Background)            │
│ ├─ WebSocket Server (Receives commands/data)               │
│ ├─ Whisper C++ Model (Fast Speech-to-Text)                 │
│ ├─ Gemini API Client (For smart chunking)                  │
│ └─ Manages and sends text to the Signer Window             │
└────────────────────────────────────────────────────────────┘
                   │ (Sends text via postMessage)
                   ▼
┌────────────────────────────────────────────────────────────┐
│ Signer Window (Chromium Web App)                           │
│ ├─ Borderless, transparent, always-on-top window           │
│ ├─ Runs the sign.mt WebGL frontend internally              │
│ └─ Listens for postMessage to update the pose              │
└────────────────────────────────────────────────────────────┘
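The diagram shows shell scripts handing commands and data to the daemon over a WebSocket, but this README does not document the wire format. As a rough illustration only, the sketch below assumes a JSON envelope with hypothetical `type` and `payload` fields and hypothetical command names; the real daemon may use something quite different.

```python
import json
from typing import Tuple

# Hypothetical command names; the real daemon's protocol is not documented here.
KNOWN_COMMANDS = {"translate_text", "toggle_audio", "toggle_signer"}


def encode_command(command: str, payload: str = "") -> str:
    """Serialize a command into a JSON text frame for the WebSocket."""
    if command not in KNOWN_COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    return json.dumps({"type": command, "payload": payload})


def decode_command(frame: str) -> Tuple[str, str]:
    """Parse a JSON text frame back into (command, payload)."""
    message = json.loads(frame)
    if not isinstance(message, dict):
        raise ValueError("frame must be a JSON object")
    command = message.get("type")
    if command not in KNOWN_COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    return command, message.get("payload", "")
```

Under this assumption, a script such as capture_text.sh would build one frame per keypress and pipe it to the daemon with websocat, and the daemon would validate the frame before acting on it.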
- ✅ Real-time System Audio Translation: Press a keybind to translate any audio playing on your system (YouTube, Zoom, Spotify, etc.) into sign language.
- ✅ Hybrid Transcription Engine: Seamlessly switches between local Whisper (fast) and Cloud Gemini (high accuracy) based on system resources and user preference.
- ✅ Intelligent Blank Audio Filtering: Automatically detects and filters out silence, music, and non-speech audio to prevent "ghost" signing.
- ✅ Smart Queue Compression: Automatically summarizes pending text chunks when the queue grows too long, ensuring the signer stays in sync with live audio.
- ✅ Interactive Control Hub: A native TUI built with Ratatui that provides logs, status monitoring, and settings management (toggle Live Mode, Hybrid Mode, etc.).
- ✅ Dynamic Window Management: The signer window automatically resizes based on the monitor resolution (optimized for both laptops and large displays).
- ✅ Deep WM Integration: Uses Hyprland's special workspaces for a seamless "show/hide" toggle of the signer window and control hub.
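Two of the features above, blank audio filtering and queue compression, can be sketched together. This is a minimal illustration, not the project's implementation: it assumes non-speech audio shows up in the transcript as a lone bracketed tag (whisper.cpp commonly emits tags such as `[BLANK_AUDIO]`), and the `SigningQueue` class, its `max_pending` threshold, and the injected `summarize` callable are all hypothetical names.

```python
import re
from collections import deque
from typing import Callable, Deque, Optional

# Assumed non-speech markers: a line that is only a bracketed or parenthesized
# tag, e.g. [BLANK_AUDIO] or (music). The set OmaSign filters is not documented.
NON_SPEECH = re.compile(r"^\s*[\[(][^\])]*[\])]\s*$")


def is_speech(text: str) -> bool:
    """Return False for empty text or a lone bracketed tag like [MUSIC]."""
    return bool(text.strip()) and not NON_SPEECH.match(text)


class SigningQueue:
    """Pending text chunks, merged and summarized when the backlog grows."""

    def __init__(self, max_pending: int, summarize: Callable[[str], str]):
        self.max_pending = max_pending
        self.summarize = summarize  # e.g. a call to an LLM; injected here
        self.pending: Deque[str] = deque()

    def push(self, text: str) -> None:
        if not is_speech(text):
            return  # drop silence/music markers to avoid "ghost" signing
        self.pending.append(text.strip())
        if len(self.pending) > self.max_pending:
            # Replace the whole backlog with a single summarized chunk.
            merged = " ".join(self.pending)
            self.pending.clear()
            self.pending.append(self.summarize(merged))

    def pop(self) -> Optional[str]:
        """Return the next chunk to sign, or None if the queue is empty."""
        return self.pending.popleft() if self.pending else None
```

Injecting `summarize` keeps the queue logic testable without network access; in practice it could wrap whatever summarization backend the daemon uses.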
This project is designed for an Arch Linux environment running the Hyprland window manager.
The included Makefile will check for these dependencies. Please install them using your package manager (e.g., sudo pacman -S ...).
- git
- rust and cargo
- npm (Node.js)
- python (>= 3.10)
- ffmpeg
- wl-clipboard (for wl-paste)
- cmake
- kitty (or modify the Makefile for your preferred terminal)
- websocat (can be installed with cargo install websocat)
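The Makefile performs the dependency check itself; the sketch below only illustrates the same idea in Python using the standard library's `shutil.which`. The executable names are assumptions derived from the list above (for example, `rustc` for rust and `python3` for python), and `missing_tools` is a hypothetical helper, not part of the project.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executable names assumed from the prerequisites list (rust is checked via
# its rustc binary, wl-clipboard via wl-paste).
REQUIRED_TOOLS = ["git", "rustc", "cargo", "npm", "python3", "ffmpeg",
                  "wl-paste", "cmake", "kitty", "websocat"]


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All dependencies found.")
```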
The Makefile is designed to be idempotent and handle the entire setup process.
- Clone the repository:
    git clone https://github.com/andomeder/omasign.git
    cd omasign
- Run the master setup command: This will check dependencies, set up the Python environment with uv, build the Rust components, and inject the necessary configurations into your Hyprland setup.
    make setup
- Set your API key: The system uses the Gemini API for intelligent text chunking, alongside the local Whisper model, which handles fast transcription. You must set your API key as an environment variable. Add this line to your ~/.bashrc, ~/.zshrc, or other shell configuration file:
    export GEMINI_API_KEY='your_api_key_here'
  Remember to source your config file or restart your terminal.
- Reload Hyprland: To apply the new keybinds and autostart configuration, either log out and log back in, or run hyprctl reload in your terminal.
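A forgotten or placeholder API key is an easy mistake to make at this step. The sketch below shows one way a Python process could read `GEMINI_API_KEY` and fail early with a clear message; `load_gemini_api_key` is a hypothetical helper, and how the real daemon reads the variable is not documented here.

```python
import os
from typing import Mapping


def load_gemini_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Gemini API key or raise a clear error if it is unset."""
    key = env.get("GEMINI_API_KEY", "").strip()
    # Treat the README's placeholder value as "not configured".
    if not key or key == "your_api_key_here":
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it in your shell "
            "configuration and restart the terminal."
        )
    return key
```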
The project is designed to be managed entirely through the Makefile.
- To start all services: This will launch the sign.mt frontend, the Python daemon, the Control Hub, and the Renderer window.
    make run
- To stop all services:
    make stop
- To restart all services:
    make restart
The following keybinds are injected into your ~/.config/hypr/omasign.conf:
| Keybind | Action | Description |
|---|---|---|
| SUPER + ; | Toggle Signer UI | Shows/hides the floating signer window (pinned, bottom-right). |
| SUPER + ' | Toggle Control Hub | Shows/hides the main Control Hub (floating, center). |
| SUPER + ALT + A | Translate System Audio | Toggles the real-time transcription of all system audio. |
| SUPER + ALT + T | Translate Selected Text | Translates any highlighted text on your screen. |
| SUPER + ALT + V | Activate Visual Command | (Roadmap) Listens for a sign to execute a command. |
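For the "Translate Selected Text" keybind, the architecture diagram indicates that the highlighted text is read with wl-paste. The sketch below shows that step in Python under an assumption: that the text comes from the Wayland primary selection (the `--primary` flag) rather than the regular clipboard. `read_selected_text` and the injectable `run` parameter are hypothetical names used for illustration.

```python
import subprocess
from typing import Callable, Sequence


def _run(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout as text."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


def read_selected_text(run: Callable[[Sequence[str]], str] = _run) -> str:
    """Read the primary selection (the currently highlighted text)."""
    # --primary reads the highlight selection instead of the clipboard;
    # --no-newline stops wl-paste from appending a trailing newline.
    raw = run(["wl-paste", "--primary", "--no-newline"])
    # Collapse internal whitespace so wrapped lines become one sentence.
    return " ".join(raw.split())
```

Passing the command runner as a parameter lets the whitespace handling be tested without a running Wayland session.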
- Keybinds & Rules: Edit ~/.config/hypr/omasign.conf to change key combinations and window behaviors.
- Window Sizing: Modify scripts/position_signer.sh to adjust the dynamic resizing logic.
- Python Dependencies: Modify pyproject.toml and run make update-deps followed by make venv.
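The actual resizing logic lives in scripts/position_signer.sh and is not reproduced in this README. To illustrate the kind of calculation involved, here is a sketch that derives a bottom-right window geometry from the monitor resolution. The width fraction, aspect ratio, and margin are made-up defaults, and `signer_geometry` is a hypothetical function.

```python
from typing import NamedTuple


class Geometry(NamedTuple):
    width: int
    height: int
    x: int
    y: int


def signer_geometry(
    monitor_width: int,
    monitor_height: int,
    width_fraction: float = 0.25,  # hypothetical: window is 1/4 of monitor width
    aspect_ratio: float = 1.0,     # hypothetical: square window
    margin: int = 20,              # hypothetical gap from the screen edges, px
) -> Geometry:
    """Compute a bottom-right window size and position for a monitor."""
    width = int(monitor_width * width_fraction)
    height = int(width / aspect_ratio)
    # Never let the window exceed the usable height on short displays.
    height = min(height, monitor_height - 2 * margin)
    x = monitor_width - width - margin
    y = monitor_height - height - margin
    return Geometry(width, height, x, y)
```

On Hyprland, the monitor resolution could be read from the JSON output of hyprctl monitors -j and the result applied with hyprctl, though the exact commands the script uses are not shown here.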
This prototype is the foundation for a fully-featured accessible operating environment. Future work includes:
- Sign-to-Command: Implement the sign_to_command.sh script to allow users to control the OS with sign language gestures.
- Hyprlock Sign-In: Complete the unlock_with_sign.sh script to enable passwordless login via a signed phrase.
- Interactive TUI: Add buttons and commands to the Ratatui Control Hub to manage the daemon directly.
- Refined Audio Buffering: Improve the real-time audio transcription to use a more sophisticated buffering strategy for better accuracy.
- Official Omarchy Fork: Package all components and configurations into a dedicated, installable version of the OS.
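The roadmap does not say which buffering strategy is planned. One common option, shown here purely as an example, is to transcribe overlapping windows so that a word cut off at the end of one window appears whole in the next. `overlapping_windows` and its parameters are hypothetical and unrelated to the current implementation.

```python
from typing import Iterator, Sequence


def overlapping_windows(
    samples: Sequence[float],
    window: int,
    overlap: int,
) -> Iterator[Sequence[float]]:
    """Yield windows of up to `window` samples, each sharing `overlap`
    samples with the previous one. The final window may be shorter."""
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    if len(samples) == 0:
        return
    step = window - overlap
    # Stop once the remaining tail is already covered by the previous window.
    stop = max(len(samples) - overlap, 1)
    for start in range(0, stop, step):
        yield samples[start:start + window]
```

Overlap trades extra transcription work for fewer split words; the duplicated text at window boundaries would still need to be de-duplicated before signing.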
This project would not be possible without the incredible open-source work of others:
- The sign.mt team for their groundbreaking sign language translation models and frontend.
- The Hyprland and Omarchy developers for creating a powerful and scriptable desktop environment.
- The creators of yt-dlp, ffmpeg, whisper.cpp, and all the other tools that power this system.
