Real-time fact-checking of tabular claims with SOTA reasoning LLMs.

T-REX: Table - Refute or Entail eXplainer

T-REX (Table - Refute or Entail eXplainer) is an interactive tool for intuitive, transparent, live fact-checking of claims about tabular data. Using state-of-the-art instruction-tuned reasoning Large Language Models (LLMs), T-REX analyzes claims against tables, indicates whether each claim is entailed or refuted, and provides visual explanations that highlight the relevant table cells.

🚀 Key Features

  • Live Fact-Checking: Paste or upload CSV tables or images (OCR), or select from the TabFact dataset.
  • Multiple LLMs: Support for multiple models including GPT-OSS (20B), Phi‑4 (14B), Qwen3 (8B), Cogito (8B), DeepSeek‑R1 (7B), and Gemma3 (4B).
  • Visual Explainability: Highlights cells identified by the model as relevant for the verification.
  • Precomputed Results Exploration: Explore results from various LLMs on the TabFact benchmark dataset with performance metrics and intuitive visualizations.
  • Multilingual Support: English, French, German, Spanish, Portuguese, Chinese, Arabic, Russian

🖥️ Demo

T-REX was accepted at ECML-PKDD 2025.

Experience the live demo here: https://t-rex.r2.enst.fr/

🎬 Watch the video demo:
Watch the video

📋 Usage

Live Table Fact-Checking:

  • Input custom CSV-formatted tables directly or via file/image upload (with OCR support).
  • Enter custom claims or select pre-existing claims from the TabFact dataset.
  • Real-time inference with streaming outputs from supported LLMs.
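
As a rough illustration of the inputs described above, the sketch below parses a CSV-formatted table and looks up the cells a model might flag as relevant. The table contents, the claim, and the `(row_index, column_name)` cell format are hypothetical and do not reflect T-REX's actual interface. The hard-coded comparison only stands in for the model's reasoning.

```python
import csv
import io

# Hypothetical CSV table and claim, for illustration only.
TABLE_CSV = """player,team,goals
Ana,Lions,12
Ben,Tigers,7
Cleo,Lions,9
"""
CLAIM = "Ana scored more goals than Ben."


def parse_table(text):
    """Parse CSV text into a list of row dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def lookup_cells(rows, cells):
    """Return the values at the given (row_index, column_name) pairs."""
    return [rows[row][column] for row, column in cells]


rows = parse_table(TABLE_CSV)

# Cells a model might mark as evidence for CLAIM (assumed format).
relevant = [(0, "goals"), (1, "goals")]
values = lookup_cells(rows, relevant)

# Stand-in for the model's verdict: compare the two evidence cells directly.
verdict = "entailed" if int(values[0]) > int(values[1]) else "refuted"
print(CLAIM, "->", verdict)
```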

Precomputed Results:

  • Analyze comprehensive benchmark results from various models (e.g., DeepSeek‑R1, Gemma3, Llama) on the TabFact dataset.
  • Detailed visual analytics, including confusion matrices and performance summaries.
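
For readers unfamiliar with the metrics mentioned above, here is a minimal sketch of how accuracy and a confusion matrix can be computed for binary entailed/refuted labels. The label names and the sample predictions are made up for illustration. This is not the code T-REX uses to produce its precomputed results.

```python
from collections import Counter

LABELS = ("entailed", "refuted")  # assumed label names


def confusion_matrix(gold, predicted):
    """Count (gold, predicted) pairs as a nested dict: matrix[gold][predicted]."""
    counts = Counter(zip(gold, predicted))
    return {g: {p: counts[(g, p)] for p in LABELS} for g in LABELS}


def accuracy(gold, predicted):
    """Fraction of claims where the prediction matches the gold label."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Made-up gold labels and model predictions for five claims.
gold = ["entailed", "entailed", "refuted", "refuted", "entailed"]
pred = ["entailed", "refuted", "refuted", "entailed", "entailed"]

matrix = confusion_matrix(gold, pred)
print(matrix)
print(f"accuracy = {accuracy(gold, pred):.2f}")
```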

🔧 Technology Stack

  • Frontend: HTML, CSS, JavaScript, Plotly.js, Chart.js, Choices.js
  • Backend: Python, FastAPI, Uvicorn
  • Inference Engine: Ollama
  • LLMs: GPT-OSS (20B), Phi‑4 (14B), Qwen3‑8B (8B), Cogito v1 Preview (8B), DeepSeek‑R1‑Distill‑Qwen‑7B (7B), and Gemma 3 (4B)
  • OCR: Tesseract, Granite3.2-vision (Ollama)

📚 Dataset & Credits

T-REX uses the TabFact dataset by Wenhu Chen et al. For more details, please refer to the original paper:

TabFact: A Large-scale Dataset for Table-based Fact Verification
Wenhu Chen et al., ICLR 2020.
https://github.com/wenhuchen/Table-Fact-Checking

🚀 Getting Started

Prerequisites

  1. Python: 3.10+ recommended.
  2. uv: Fast Python package manager and runner.
    • Install with pipx or Homebrew:
      pipx install uv
      # or
      brew install uv
  3. Ollama: Install from https://ollama.com/.
    • Pull the required models (depending on which ones you would like to use):
      ollama pull gpt-oss:20b
      ollama pull phi4
      ollama pull qwen3:8b
      ollama pull deepseek-r1:latest
      ollama pull gemma3
      ollama pull cogito
      ollama pull granite3.2-vision # For Ollama OCR
    • Ensure the Ollama service is running.
  4. Tesseract OCR (Optional): if you plan to use Tesseract for OCR.
    • Install the Tesseract binary:
      • macOS: brew install tesseract
      • Ubuntu/Debian: sudo apt-get install tesseract-ocr
    • Ensure it’s discoverable: which tesseract and tesseract --version
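
One way to confirm that the Ollama service is running and the models above have been pulled is to query Ollama's `/api/tags` endpoint, which lists locally available models. The sketch below is a minimal check under the assumption that Ollama listens on its default address. The helper names are illustrative.

```python
import json
import urllib.request

TAGS_URL = "http://localhost:11434/api/tags"  # Ollama's default local address

# Model names from the pull commands above.
WANTED = [
    "gpt-oss:20b", "phi4", "qwen3:8b", "deepseek-r1:latest",
    "gemma3", "cogito", "granite3.2-vision",
]


def normalize(name):
    """Add the implicit ':latest' tag when a model name has no tag."""
    return name if ":" in name else name + ":latest"


def missing_models(wanted, installed):
    """Return the wanted models that are not among the installed ones."""
    have = {normalize(name) for name in installed}
    return [model for model in wanted if normalize(model) not in have]


def installed_models(url=TAGS_URL):
    """Ask the local Ollama service which models are available."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return [entry["name"] for entry in json.load(response).get("models", [])]
```

Calling `missing_models(WANTED, installed_models())` returns the models that still need an `ollama pull`. If the service is not running, `installed_models()` raises a `URLError`.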

Installation & Local Setup

  1. Clone the repository:

    git clone https://github.com/TimLukaHorstmann/T-REX.git
    cd T-REX
  2. Sync dependencies (first time only):

    uv sync
  3. Run Everything (Single Command, Dev):

    • Ensure the Ollama service (with required models) is running (see Prerequisites).
    • From the project root, run:
      ./dev.sh
    • Then open: http://localhost:8000

    Notes:

    • The FastAPI app also serves the frontend from frontend/, so / loads the UI and /api/* serves backend endpoints. Dataset assets are served under /static/data/*.
    • Live reload is enabled; changes under backend/api auto-reload the server.
  4. Alternative (Separate Servers): If you prefer running frontend and backend separately:

    • Backend:
      uvicorn main:app --reload --host 0.0.0.0 --port 8000 --app-dir backend/api
    • Frontend (new terminal):
      cd frontend && python3 -m http.server 8080

    This requires a dev proxy to avoid CORS and path issues because the frontend fetches relative /api/* paths. The recommended approach is the single-command dev server above.
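
For the separate-servers setup, a dev proxy could look roughly like the following standard-library sketch. It serves the files in `frontend/` and forwards backend paths to the FastAPI server on port 8000. The class and function names, the choice to forward `/static/data/*` as well as `/api/*`, and the error handling are assumptions for illustration. It is not part of the repository and is not tuned for production use.

```python
import functools
import urllib.error
import urllib.request
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from typing import Optional

BACKEND = "http://localhost:8000"  # backend started with the uvicorn command above
PROXIED_PREFIXES = ("/api/", "/static/data/")  # assumption: paths owned by the backend


def backend_url(path: str) -> Optional[str]:
    """Map a request path to the backend URL, or None for frontend files."""
    return BACKEND + path if path.startswith(PROXIED_PREFIXES) else None


class DevProxyHandler(SimpleHTTPRequestHandler):
    """Serve static frontend files, but forward backend paths to FastAPI."""

    def _forward(self, target: str) -> None:
        length = int(self.headers.get("Content-Length") or 0)
        body = self.rfile.read(length) if length else None
        headers = {}
        if self.headers.get("Content-Type"):
            headers["Content-Type"] = self.headers["Content-Type"]
        request = urllib.request.Request(
            target, data=body, headers=headers, method=self.command
        )
        try:
            with urllib.request.urlopen(request) as upstream:
                self.send_response(upstream.status)
                self.send_header(
                    "Content-Type",
                    upstream.headers.get("Content-Type", "application/octet-stream"),
                )
                self.end_headers()
                # read1 returns data as it arrives, so streamed output is relayed in pieces.
                while chunk := upstream.read1(8192):
                    self.wfile.write(chunk)
        except urllib.error.HTTPError as err:
            # Pass backend error responses (4xx/5xx) through to the browser.
            self.send_response(err.code)
            self.end_headers()
            self.wfile.write(err.read())
        except urllib.error.URLError:
            self.send_error(502, "Backend not reachable")

    def do_GET(self):
        target = backend_url(self.path)
        if target:
            self._forward(target)
        else:
            super().do_GET()

    def do_POST(self):
        target = backend_url(self.path)
        if target:
            self._forward(target)
        else:
            self.send_error(405, "Static files are read-only")


def serve(port: int = 8080, directory: str = "frontend") -> None:
    """Run the proxy from the project root; blocks until interrupted."""
    handler = functools.partial(DevProxyHandler, directory=directory)
    ThreadingHTTPServer(("127.0.0.1", port), handler).serve_forever()
```

Calling `serve()` from the project root would replace the `python3 -m http.server 8080` step, so the browser only talks to port 8080 and no cross-origin requests are made.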

OCR Options

  • Recommended: Ollama Granite 3.2 Vision (pull granite3.2-vision). In the UI, select OCR engine “Ollama” + model granite3.2-vision.
  • Optional: Tesseract OCR (system binary required on PATH)
    • Install the Tesseract binary:
      • macOS: brew install tesseract
      • Ubuntu/Debian: sudo apt-get install tesseract-ocr
    • Ensure it’s discoverable: which tesseract and tesseract --version
    • Advanced: If tesseract is not on PATH, set environment variable TESSERACT_CMD to the full path (e.g., /opt/homebrew/bin/tesseract).
    • Python pieces (pytesseract, pillow) are already listed in pyproject.toml and installed via uv sync.
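
The lookup order described above (an explicit `TESSERACT_CMD` first, then `PATH`) can be sketched as follows. How the T-REX backend actually reads `TESSERACT_CMD` is not shown here, so treat `resolve_tesseract` as a hypothetical helper for checking your own environment.

```python
import os
import shutil


def resolve_tesseract(env=None):
    """Return the Tesseract binary path, preferring TESSERACT_CMD over PATH."""
    env = os.environ if env is None else env
    override = env.get("TESSERACT_CMD")
    if override:
        return override
    return shutil.which("tesseract")  # None if tesseract is not on PATH


path = resolve_tesseract()
print(path or "tesseract not found; install it or set TESSERACT_CMD")

# With pytesseract installed, the resolved path could be applied like this:
#   import pytesseract
#   if path:
#       pytesseract.pytesseract.tesseract_cmd = path
```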

Note on Deployment: The steps above describe a basic local development setup. For deploying T-REX to a server (like the live demo at t-rex.r2.enst.fr), you would typically:

  • Run the FastAPI backend under a production-grade ASGI setup, for example Gunicorn managing Uvicorn workers.
  • Set up a reverse proxy (e.g., Nginx or Caddy) to handle HTTPS, serve static frontend files efficiently, and forward API requests to the backend application.
  • Manage the backend process using a process manager (e.g., systemd, supervisor) to ensure it runs reliably.
  • Ensure the Ollama service is appropriately configured and accessible by the backend on the server.
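
As one possible starting point for the first step, Gunicorn reads its settings from a Python file. The fragment below is a hypothetical `gunicorn.conf.py`: it assumes Gunicorn is installed in addition to the project's dependencies, and the bind address and worker count are placeholders to adapt. Depending on the Uvicorn version, the worker class may instead be provided by the separate `uvicorn-worker` package.

```python
# gunicorn.conf.py (hypothetical file, not part of the repository)
import multiprocessing

# Same application target as the uvicorn command in the setup steps.
wsgi_app = "main:app"
chdir = "backend/api"

# Let Gunicorn manage Uvicorn worker processes so the ASGI app is served.
worker_class = "uvicorn.workers.UvicornWorker"

# Listen locally only; a reverse proxy is expected to handle public traffic.
bind = "127.0.0.1:8000"

# Placeholder worker count; tune for the server and the Ollama load.
workers = min(4, multiprocessing.cpu_count())
```

Such a file would be used with `gunicorn -c gunicorn.conf.py`, with the process itself started by the process manager of your choice.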

These production deployment steps are environment-specific and beyond the scope of this basic setup guide.

📖 Citation

If you use T-REX in academic work, please cite our ECML-PKDD 2025 demo paper:

@inproceedings{Horstmann2025TREX,
  author    = {Tim Luka Horstmann and Baptiste Geisenberger and Mehwish Alam},
  title     = {T-REX: Table -- Refute or Entail eXplainer},
  booktitle = {Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track and Demo Track, European Conference, ECML PKDD 2025, Porto, Portugal, September 15--19, 2025, Proceedings},
  series    = {Lecture Notes in Computer Science},
  volume    = {16022},
  pages     = {590--596},
  publisher = {Springer, Cham},
  year      = {2025},
  doi       = {10.1007/978-3-032-06129-4_33},
  url       = {http://dx.doi.org/10.1007/978-3-032-06129-4_33}
}

📄 License

This software is released under a Custom Non-Commercial License.
It is free to use for research, academic, or personal purposes.

🛑 Commercial use is prohibited without explicit written permission from the authors.

To inquire about commercial licensing, please contact:
tim.horstmann@ip-paris.fr

See the LICENSE file for full terms.

Authors

Institut Polytechnique de Paris


© 2025 T-REX: Table - Refute or Entail eXplainer




How does T-REX compare?

Here is a comparison of T-REX with other table fact-checking and question-answering tools:

| Tool (link) | Year | Live Demo / UI | Real-time* | Table Upload | OCR / Image | Evidence Viz† | LLM Backend | Code Open? |
|---|---|---|---|---|---|---|---|---|
| T-REX (ours), Demo | 2025 | ✅ Live | ✅ streaming | ✅ CSV / text / image (OCR) | ✅ Tesseract & Granite 3.2 | ✅ cell highlighting & reasoning stream | Phi-4, DeepSeek-R1, Cogito v1, Gemma3 | ✅ |
| OpenTFV, Paper | 2022 | ⚠️ Prototype UI (conference demo; no public deployment) | ✅ immediate synchronous | ✅ CSV, JSON, PDF | ❌ | ✅ NL interp. & entity linking | TAPAS & LPA | ❌ |
| Aletheia, Paper | 2024 | ⚠️ No public demo available | ⚠️ async | ❌ fixed datasets only | ❌ | ✅ interactive tables & visualizations | Proprietary LLMs GPT-3.5/4 | ❌ |
| HF Space (J. Simon), Demo | 2023 | ⚠️ HF Space (runtime errors) | ✅ immediate | ✅ CSV upload | ❌ | ❌ | TAPAS | ✅ |
| RePanda, Paper | 2025 | ❌ CLI only | ❌ offline | ✅ via Pandas API | ❌ | ✅ executable query scripts | Llama-7B | ✅ |
| TabVer, Paper | 2024 | ❌ CLI only | ❌ offline | ✅ code-based ingestion | ❌ | ✅ natural-logic proofs | LLM-generated expressions | ✅ |
| TART, Paper | 2023 | ❌ CLI only | ❌ offline | ✅ | ❌ | ❌ | Plugin-based reasoning | ✅ |

* Real-time = immediate verdict; “streaming” means the reasoning is shown token by token as it is generated.
† Evidence Viz = visual or structured justification beyond a plain label.


📊 Model Performance Overview

Performance comparison of different models on the TabFact dataset as reported by Chen, 2025 and Meta AI or evaluated as part of this work.

| Model | Test Accuracy (%) | Validation Accuracy (%) | Year |
|---|---|---|---|
| ARTEMIS-DA (Hussain et al., 2024) | 93.1 (on test-small) | - | 2024 |
| Dater (Ye et al., 2023) | 93.0 (on test-small), 85.6 (on test-all) | - | 2023 |
| Human Performance: ≈ 92% (Chen et al., 2020) | | | |
| PASTA (Gu et al., 2022) | 89.3 | 89.2 | 2022 |
| Phi4 (Zero Shot) (Ours) | 88.9 (on test-all) | - | 2024 |
| UL-20B (Tay et al., 2023) | 87.1 | | 2022 |
| Chain-of-Table (Wang et al., 2024) | 86.6 | - | 2024 |
| Binder (Cheng et al., 2023) | 86.0 | - | 2022 |
| Tab-PoT (Xiao et al., 2024) | 85.8 | - | 2024 |
| Phi4 (RAG Approach) (Ours) | 85.7 | - | 2024 |
| ReasTAP-Large (Zhao et al., 2022) | 84.9 | 84.6 | 2022 |
| TAPEX-Large (Liu et al., 2022) | 84.2 | 84.6 | 2021 |
| T5-3b (UnifiedSKG) (Xie et al., 2022) | 83.7 | 84.0 | 2022 |
| DecompTAPAS (Yang et al., 2021) | 82.7 | 82.7 | 2021 |
| Salience-aware TAPAS (Wang et al., 2021) | 82.1 | 82.7 | 2021 |
| Phi4 (Code Generation) (Ours) | 81.9 | - | 2024 |
| TAPAS-Large classifier with Counterfactual + Synthetic pre-training (Eisenschlos et al., 2020) | 81.0 | 81.0 | 2020 |
| ProgVGAT (Yang et al., 2021) | 74.4 | 74.9 | 2020 |
| SAT (Zhang et al., 2020) | 73.2 | 73.3 | 2020 |
| HeterTFV (Shi et al., 2020) | 72.3 | 72.5 | 2020 |
| LFC (Seq2Action) (Zhong et al., 2020) | 71.7 | 71.8 | 2020 |
| LFC (LPA) (Zhong et al., 2020) | 71.6 | 71.7 | 2020 |
| Num-Net (Ran et al., 2019) | 72.1 | 72.1 | 2019 |
| LPA-Ranking w/ Discriminator (Caption) (Chen et al., 2020) | 65.3 | 65.1 | 2020 |
| Table-BERT-Horizontal-T+F-Template (Chen et al., 2020) | 65.1 | 66.1 | 2020 |
| BERT classifier w/o Table (Chen et al., 2020) | 50.5 | 50.9 | 2020 |
