    Repositories list

    • harbor

      Public
      Harbor is a framework for running agent evaluations and creating and using RL environments.
      Python
      Apache License 2.0
      8981.4k79170 Updated Apr 13, 2026
    • skills

      Public
      Public agent skills catalog for Harbor
      Apache License 2.0
      1501 Updated Apr 12, 2026
    • terminal-bench-science

      Public
      Terminal-Bench-Science: Evaluating AI Agents on Complex Real-World Scientific Workflows in the Terminal
      Python
      Apache License 2.0
      3458116 Updated Apr 11, 2026
    • terminal-bench-3

      Public
      🚧 Accepting Task Submissions 🚧
      Python
      125116083 Updated Apr 11, 2026
    • benchmark-template

      Public template
      Harbor Benchmark Template
      Python
      7776 Updated Apr 11, 2026
    • harbor-cookbook

      Public
      Realistic examples of building evals and optimizing agents with Harbor
      Python
      Apache License 2.0
      45501 Updated Apr 11, 2026
    • awesome-harbor

      Public
      A curated list of awesome Harbor ecosystem projects
      22601 Updated Apr 5, 2026
    • t-bench-docs

      Public
      TypeScript
      13620 Updated Apr 3, 2026
    • terminal-bench-2

      Public
      Shell
      Apache License 2.0
      621711017 Updated Apr 1, 2026
    • harbor-docs

      Public
      MDX
      9206 Updated Mar 31, 2026
    • Shell
      3003 Updated Mar 26, 2026
    • terminal-bench

      Public
      A benchmark for LLMs on complicated tasks in the terminal
      Python
      Apache License 2.0
      4992k106186 Updated Jan 22, 2026