FreeEval Visualizer is a web-based tool designed to help researchers and practitioners visualize and analyze evaluation results for large language models. It provides an intuitive interface for exploring evaluation data, conducting human evaluations, and gaining insights into model performance.
- Dashboard: Get an overview of evaluation results with interactive charts and summary statistics.
- Analysis: Dive deep into the data with detailed visualizations and correlation analysis.
- Case Browser: Easily search and filter through individual evaluation cases.
- Human Evaluation: Create and manage human evaluation sessions for more nuanced assessments.
- Multi-mode Support: Compatible with various evaluation types including pairwise comparisons, direct scoring, and matching evaluations.
- Clone the repository:

  ```bash
  git clone https://github.com/WisdomShell/FreeEval.git
  cd FreeEval
  ```
- Create a virtual environment and activate it:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
  ```
- Install the required packages:

  ```bash
  pip install -r requirements.txt
  ```
- Run an evaluation with FreeEval. The results for visualization will be saved to a JSON file; its path is shown in the console output.
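  Before launching the visualizer, you may want to confirm that the file parses as valid JSON. A minimal sanity check (the path below is a placeholder; use the one printed by FreeEval):

  ```bash
  # Replace path/to/results.json with the path from FreeEval's console output.
  python -m json.tool path/to/results.json > /dev/null && echo "results file is valid JSON"
  ```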
- Start the Flask development server:

  ```bash
  python visualizer/app.py --mode [evaluation-mode] --result-path [path-to-results-json] --port [port-number] --addr [address]
  ```
  Replace `[evaluation-mode]` with either `pairwise-comparison`, `direct-scoring`, or `matching`.
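  For example, to browse pairwise-comparison results on a local port (the result path, port, and address below are illustrative values, not defaults):

  ```bash
  python visualizer/app.py --mode pairwise-comparison --result-path outputs/results.json --port 5000 --addr 127.0.0.1
  ```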
- Open a web browser and navigate to `http://localhost:[port-number]` (replace `[port-number]` with the actual port number you specified).
- Use the sidebar navigation to explore the different features of the visualizer.
To conduct human evaluations:
- Click on "Human Evaluation" in the sidebar.
- Create a new evaluation session or load an existing one.
- Follow the on-screen instructions to annotate cases.
- Use the progress bar to track your annotation progress.
Contributions to FreeEval Visualizer are welcome! Please feel free to submit a Pull Request.
- This project is part of the FreeEval framework for evaluating large language models.
- Built with Flask, Tailwind CSS, and Flowbite components.