andomeder/omasign

OmaSign 🤟

Status: Prototype | Platform: Arch Linux / Hyprland | License: MIT

An OS-level, native accessibility layer designed to provide real-time sign language translation and system control for Deaf and Hard of Hearing users, deeply integrated into a keyboard-driven Linux environment.

This project is the final submission for the ACS455A: Human Computer Interaction course at Daystar University.


OmaSign Demo GIF


About The Project

Traditional accessibility solutions for the Deaf and Hard of Hearing (D/HH) community, such as live captions, often fail to provide an equitable experience. They suffer from high error rates, lack crucial non-verbal context, and present a significant barrier to the millions of Deaf individuals whose primary language is a visual sign language, not a written one.

This project critiques the limitations of browser-sandboxed accessibility tools and proposes a superior, user-centered solution: OmaSign.

Instead of a browser extension, OmaSign (presented as the Accessibility Hub) is a suite of native tools that integrate directly with the operating system (Omarchy/Arch Linux with Hyprland). This native approach allows it to overcome the performance and access limitations of browsers, offering features like:

  • System-wide audio capture for universal translation.
  • High-performance, GPU-accelerated AI and rendering.
  • Deep integration with the window manager for a seamless user experience.

This prototype demonstrates a path toward a more genuinely inclusive computing environment, where accessibility is treated as core infrastructure, not a low-priority add-on.

Architecture Overview

OmaSign uses a modular, multi-process architecture orchestrated by a central Python daemon. This allows for high performance and clean separation of concerns, following the principles of Clean Architecture.

┌────────────────────────────────────────────────────────────┐
│  Hyprland (Window Manager)                                 │
│  ├─ Keybinds → Trigger Shell Scripts                       │
│  └─ Manages visibility/position of Signer Window           │
└────────────────────────────────────────────────────────────┘
          │                  │                     │
          ▼                  ▼                     ▼
┌──────────────────┐   ┌─────────────────┐   ┌───────────────────┐
│ capture_audio.sh │   │ capture_text.sh │   │ toggle_signer.sh  │
│ (ffmpeg/yt-dlp)  │   │ (wl-paste)      │   │ (hyprctl)         │
└──────────────────┘   └─────────────────┘   └───────────────────┘
          │                  │                     │
          └───────┬──────────┴─────────────────────┘
                  │ (Communicate via WebSocket)
                  ▼
┌──────────────────────────────────────────────────────────┐
│  sign_daemon.py (The Brain - Runs in Background)         │
│  ├─ WebSocket Server (Receives commands/data)            │
│  ├─ Whisper C++ Model (Fast Speech-to-Text)              │
│  ├─ Gemini API Client (For smart chunking)               │
│  └─ Manages and sends text to the Signer Window          │
└──────────────────────────────────────────────────────────┘
                  │ (Sends text via postMessage)
                  ▼
┌──────────────────────────────────────────────────────────┐
│  Signer Window (Chromium Web App)                        │
│  ├─ Borderless, transparent, always-on-top window        │
│  ├─ Runs the sign.mt WebGL frontend internally           │
│  └─ Listens for postMessage to update the pose           │
└──────────────────────────────────────────────────────────┘
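
As a rough illustration of the script-to-daemon hop in the diagram, the sketch below builds a small JSON message and sends it over a WebSocket. The README does not document the daemon's address or message format, so the URI, the `type`/`payload` schema, and the function names are all assumptions made for this example; it relies on the third-party `websockets` package.

```python
import json

# Hypothetical endpoint: the daemon's real host and port are not given in this README.
DAEMON_URI = "ws://127.0.0.1:8765"

# Hypothetical message types, one per capture script in the diagram.
KNOWN_TYPES = {"text", "audio_toggle"}


def build_message(kind: str, payload: str = "") -> str:
    """Serialize one command as JSON (the schema here is an assumption)."""
    if kind not in KNOWN_TYPES:
        raise ValueError(f"unknown message type: {kind}")
    return json.dumps({"type": kind, "payload": payload})


async def send_to_daemon(message: str, uri: str = DAEMON_URI) -> None:
    """Open a WebSocket connection, send a single message, then close."""
    # Imported lazily so build_message works without the third-party package.
    import websockets

    async with websockets.connect(uri) as ws:
        await ws.send(message)
```

With a daemon listening, a script could call `asyncio.run(send_to_daemon(build_message("text", "Hello")))`. The shell scripts in this project use `websocat` for the same purpose.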

Key Features

  • ✓ Real-time System Audio Translation: Press a keybind to translate any audio playing on your system (YouTube, Zoom, Spotify, etc.) into sign language.
  • ✓ Hybrid Transcription Engine: Seamlessly switches between local Whisper (fast) and Cloud Gemini (high accuracy) based on system resources and user preference.
  • ✓ Intelligent Blank Audio Filtering: Automatically detects and filters out silence, music, and non-speech audio to prevent "ghost" signing.
  • ✓ Smart Queue Compression: Automatically summarizes pending text chunks when the queue grows too long, ensuring the signer stays in sync with live audio.
  • ✓ Interactive Control Hub: A native TUI built with Ratatui that provides logs, status monitoring, and settings management (toggle Live Mode, Hybrid Mode, etc.).
  • ✓ Dynamic Window Management: The signer window automatically resizes based on the monitor resolution (optimized for both laptops and large displays).
  • ✓ Deep WM Integration: Uses Hyprland's special workspaces for a seamless "show/hide" toggle of the signer window and control hub.
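
To make the blank-audio filtering and queue compression features more concrete, here is a minimal sketch of one way they could fit together. It is not taken from the project's code: the marker pattern, the `ChunkQueue` class, the `max_pending` threshold, and the `summarize` callback (a stand-in for a Gemini call) are all assumptions for illustration.

```python
import re
from collections import deque
from typing import Callable, Deque, Optional

# Matches segments made up only of bracketed or parenthesized markers, such as
# "[BLANK_AUDIO]" or "(music)". The exact marker set is an assumption.
NON_SPEECH = re.compile(r"^\s*(?:[\[\(][^\]\)]*[\]\)]\s*)+$")


def is_speech(segment: str) -> bool:
    """Return True when a transcript segment contains more than markers."""
    return bool(segment.strip()) and not NON_SPEECH.match(segment)


class ChunkQueue:
    """Pending text chunks; collapses the backlog when it grows too long."""

    def __init__(self, max_pending: int, summarize: Callable[[str], str]) -> None:
        self.max_pending = max_pending
        self.summarize = summarize  # hypothetical hook, e.g. a Gemini request
        self.pending: Deque[str] = deque()

    def push(self, segment: str) -> None:
        if not is_speech(segment):
            return  # drop silence/music markers so nothing is signed for them
        self.pending.append(segment.strip())
        if len(self.pending) > self.max_pending:
            # Replace the whole backlog with one summarized chunk.
            backlog = " ".join(self.pending)
            self.pending.clear()
            self.pending.append(self.summarize(backlog))

    def pop(self) -> Optional[str]:
        """Return the next chunk to sign, or None when the queue is empty."""
        return self.pending.popleft() if self.pending else None
```

Summarizing the entire backlog at once trades detail for latency, which matches the stated goal of keeping the signer in sync with live audio.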

Getting Started

This project is designed for an Arch Linux environment running the Hyprland window manager.

Prerequisites

The included Makefile will check for these dependencies. Please install them using your package manager (e.g., sudo pacman -S ...).

  • git
  • rust and cargo
  • npm (Node.js)
  • python (>= 3.10)
  • ffmpeg
  • wl-clipboard (for wl-paste)
  • cmake
  • kitty (or modify the Makefile for your preferred terminal)
  • websocat (can be installed with cargo install websocat)
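
For readers who want to verify the prerequisites outside of make, the following sketch mirrors the kind of check the Makefile is described as performing. The tool list is derived from the bullets above, and the function name is hypothetical; the actual Makefile logic may differ.

```python
import shutil
from typing import Iterable, List

# Executable names implied by the prerequisites list (an approximation:
# "cargo" stands in for the Rust toolchain, "wl-paste" for wl-clipboard).
REQUIRED_TOOLS = [
    "git", "cargo", "npm", "python", "ffmpeg",
    "wl-paste", "cmake", "kitty", "websocat",
]


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

Calling `missing_tools()` returns an empty list when everything is installed, and otherwise the names still to be installed with your package manager.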

Installation

The Makefile is designed to be idempotent and handle the entire setup process.

  1. Clone the repository:

    git clone https://github.com/andomeder/omasign.git
    cd omasign
  2. Run the master setup command: This will check dependencies, set up the Python environment with uv, build the Rust components, and inject the necessary configurations into your Hyprland setup.

    make setup
  3. Set your API Key: The system uses the Gemini API for intelligent text chunking, alongside the local Whisper model for fast transcription. You must set your API key as an environment variable. Add this line to your ~/.bashrc, ~/.zshrc, or other shell configuration file:

    export GEMINI_API_KEY='your_api_key_here'

    Remember to source your config file or restart your terminal.

  4. Reload Hyprland: To apply the new keybinds and autostart configuration, either log out and log back in, or run hyprctl reload in your terminal.
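
A common failure after setup is a missing or placeholder API key. The sketch below shows one way a Python process could read GEMINI_API_KEY and fail early with a clear message. The function name and error text are hypothetical, and the daemon's actual handling is not documented here.

```python
import os
from typing import Mapping


def load_gemini_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Gemini API key, or raise a clear error if it is unusable."""
    key = env.get("GEMINI_API_KEY", "").strip()
    # Treat the placeholder from the setup instructions as "not set".
    if not key or key == "your_api_key_here":
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it in your shell config "
            "and source the file or restart your terminal."
        )
    return key
```

Passing the environment as a parameter keeps the check easy to exercise without touching the real shell environment.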

Usage

Running the System

The project is designed to be managed entirely through the Makefile.

  • To start all services: This will launch the sign.mt frontend, the Python daemon, the Control Hub, and the Renderer window.

    make run
  • To stop all services:

    make stop
  • To restart all services:

    make restart

Keybinds

The following keybinds are injected into your ~/.config/hypr/omasign.conf:

Keybind         | Action                  | Description
----------------|-------------------------|------------------------------------------------------------
SUPER + ;       | Toggle Signer UI        | Shows/hides the floating signer window (pinned, bottom-right).
SUPER + '       | Toggle Control Hub      | Shows/hides the main Control Hub (floating, center).
SUPER + ALT + A | Translate System Audio  | Toggles the real-time transcription of all system audio.
SUPER + ALT + T | Translate Selected Text | Translates any highlighted text on your screen.
SUPER + ALT + V | Activate Visual Command | (Roadmap) Listens for a sign to execute a command.
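
The show/hide keybinds rely on Hyprland's special workspaces. As a hedged illustration, the sketch below builds the `hyprctl dispatch togglespecialworkspace` call such a toggle could make. The workspace name is hypothetical (the real one is defined in omasign.conf), and the project itself does this from a shell script instead of Python.

```python
import subprocess
from typing import List

# Hypothetical special-workspace name; check omasign.conf for the real one.
SIGNER_WORKSPACE = "omasign_signer"


def toggle_command(workspace: str = SIGNER_WORKSPACE) -> List[str]:
    """Build the hyprctl invocation that shows or hides a special workspace."""
    return ["hyprctl", "dispatch", "togglespecialworkspace", workspace]


def toggle(workspace: str = SIGNER_WORKSPACE) -> None:
    """Run the toggle; raises CalledProcessError if hyprctl exits non-zero."""
    subprocess.run(toggle_command(workspace), check=True)
```

Keeping command construction separate from execution makes the argument list easy to inspect without a running Hyprland session.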

Configuration

  • Keybinds & Rules: Edit ~/.config/hypr/omasign.conf to change key combinations and window behaviors.
  • Window Sizing: Modify scripts/position_signer.sh to adjust the dynamic resizing logic.
  • Python Dependencies: Modify pyproject.toml and run make update-deps followed by make venv.
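
When adjusting the window sizing logic, it can help to reason about the geometry separately from the shell script. The sketch below computes a bottom-right window size and position from a monitor resolution. The 25% height fraction, the 20-pixel margin, and the square aspect ratio are illustrative assumptions, not values taken from scripts/position_signer.sh.

```python
from typing import NamedTuple


class Geometry(NamedTuple):
    width: int
    height: int
    x: int
    y: int


def signer_geometry(
    monitor_w: int, monitor_h: int, fraction: float = 0.25, margin: int = 20
) -> Geometry:
    """Size a square window from the monitor height and pin it bottom-right."""
    side = max(1, int(monitor_h * fraction))
    # Offset from the right and bottom edges by the margin.
    return Geometry(side, side, monitor_w - side - margin, monitor_h - side - margin)
```

Scaling from the monitor height keeps the signer proportionally similar on a laptop panel and on a large external display.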

Roadmap

This prototype is the foundation for a fully-featured accessible operating environment. Future work includes:

  • Sign-to-Command: Implement the sign_to_command.sh script to allow users to control the OS with sign language gestures.
  • Hyprlock Sign-In: Complete the unlock_with_sign.sh script to enable passwordless login via a signed phrase.
  • Interactive TUI: Add buttons and commands to the Ratatui Control Hub to manage the daemon directly.
  • Refined Audio Buffering: Improve the real-time audio transcription to use a more sophisticated buffering strategy for better accuracy.
  • Official Omarchy Fork: Package all components and configurations into a dedicated, installable version of the OS.

Acknowledgments

This project would not be possible without the incredible open-source work of others:

  • The sign.mt team for their groundbreaking sign language translation models and frontend.
  • The Hyprland and Omarchy developers for creating a powerful and scriptable desktop environment.
  • The creators of yt-dlp, ffmpeg, whisper.cpp, and all the other tools that power this system.
