Hanye SJTUyh

@AISBench @sjtu

Shanghai

Popular repositories

mini-swe-agent Public

Forked from SWE-agent/mini-swe-agent

The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >70% on SWE-bench verified!

Python

SWE-bench Public

Forked from SWE-bench/SWE-bench

SWE-bench: Can Language Models Resolve Real-world GitHub Issues?

Python

benchmark Public

Forked from AISBench/benchmark

AISBench Benchmark is a model evaluation tool built on OpenCompass, compatible with OpenCompass’s configuration system, dataset structure, and model backend implementation, while extending support …

Python

opencompass Public

Forked from open-compass/opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python

ci_test_tmp Public

Forked from AISBench/ci_test

test ci