Popular repositories
-
mini-swe-agent (Public, forked from SWE-agent/mini-swe-agent)
The 100-line AI agent that solves GitHub issues or helps you in your command line. Radically simple, with no huge configs and no giant monorepo, but scores >70% on SWE-bench Verified!
Python
-
SWE-bench (Public, forked from SWE-bench/SWE-bench)
SWE-bench: Can Language Models Resolve Real-world GitHub Issues?
Python
-
benchmark (Public, forked from AISBench/benchmark)
AISBench Benchmark is a model evaluation tool built on OpenCompass, compatible with OpenCompass’s configuration system, dataset structure, and model backend implementation, while extending support …
Python
-
opencompass (Public, forked from open-compass/opencompass)
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Python



