Our project welcomes external contributions. If you have an itch, please feel free to scratch it.
To contribute code or documentation, please submit a pull request.
To report vulnerabilities privately, you can contact the authors by email (see MAINTAINERS.md).
A good way to familiarize yourself with the codebase and contribution process is to look for and tackle low-hanging fruit in the issue tracker.
Look at the Local Development Guide for instructions on setting up a local development environment.
Note: We appreciate your effort and want to avoid situations where a contribution requires extensive rework (by you or by us), sits in the backlog for a long time, or cannot be accepted at all!
If you would like to implement a new feature, please raise an issue before sending a pull request so the feature can be discussed. This is to avoid you wasting your valuable time working on a feature that the project developers are not interested in accepting into the code base.
If you would like to fix a bug, please raise an issue before sending a pull request so it can be tracked.
The project maintainers use LGTM (Looks Good To Me) in comments on the code review to indicate acceptance.
For a list of the maintainers, see the MAINTAINERS.md page.
We have tried to make it as easy as possible to make contributions. This applies to how we handle the legal aspects of contribution. We use the same approach - the Developer's Certificate of Origin 1.1 (DCO) - that the Linux® Kernel community uses to manage code contributions.
We simply ask that when submitting a patch for review, the developer must include a sign-off statement in the commit message.
Here is an example Signed-off-by line, which indicates that the submitter accepts the DCO:
    Signed-off-by: John Doe <john.doe@example.com>

You can include this automatically when you commit a change to your local git repository using the following command:

    git commit -s

Please feel free to connect with us through issues and pull requests on this repository.
Please see the README for setup instructions.
Testing is not available at this time.
EvalAssist relies heavily on Unitxt, a Python library designed for flexible and reusable data preparation and evaluation for generative AI models. EvalAssist's LLM-as-a-Judge evaluators are part of the Unitxt source code. To benefit the broader community and promote interoperability, we encourage you to contribute your evaluation criteria and related components to Unitxt.
The most straightforward way to add a new criterion to Unitxt is to create an issue and fill it in with the required information. In the issue's body, copy and paste your criteria from EvalAssist (you can copy the JSON format from the JSON tab in the Evaluation criteria section). For example:
{
"name": "Temperature in celsius and fahrenheit",
"description": "In the response, if there is a numerical temperature present, is it denominated in both Fahrenheit and Celsius?",
"options": [
{
"name": "Yes",
"description": "The temperature reading is provided in both Fahrenheit and Celsius."
},
{
"name": "No",
"description": "The temperature reading is provided either in Fahrenheit or Celsius, but not both."
},
{
"name": "Pass",
"description": "There is no numerical temperature reading in the response."
}
]
}

In the issue, you should add information about when this criterion should be used, what data you used to test it, how you validated it, and how well it works. While we do not make decisions about new Unitxt criteria, a general guideline is how valuable a tested criterion could be to the broader AI research community; you may want to include comments in this regard as well. Please also add the names of the context variables used and the name of the response variable. In addition, add a mapping between the option names and numerical values. Unitxt uses these numerical values to compute metrics such as Pearson correlation when benchmarking evaluators on specific criteria. For example:
Context variables: []
Response variable: response
Option map: {"Yes": 1.0, "No": 0.5, "Pass": 0.0}
If you are familiar with the Unitxt environment, you can perform the following steps instead:
- Find the DirectCriteriaCatalogEnum or PairwiseCriteriaCatalogEnum enum in the llm_as_judge_constants.py file, depending on whether you want to contribute a direct or a pairwise criterion.
- Create your criteria object (using either CriteriaWithOptions or Criteria).
- Run the prepare file to create the criterion's artifact.
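The criteria object in the second step can be sketched as follows. The real Criteria and CriteriaWithOptions classes live in Unitxt's llm_as_judge_constants.py; the dataclasses below are illustrative stand-ins (their exact fields may differ from Unitxt's) that mirror the JSON shape shown above, so the example stays self-contained.

```python
from dataclasses import dataclass, field

# Illustrative stand-ins for Unitxt's criteria classes; the real
# definitions are in llm_as_judge_constants.py and may differ.
@dataclass
class CriteriaOption:
    name: str
    description: str

@dataclass
class CriteriaWithOptions:
    name: str
    description: str
    options: list
    option_map: dict = field(default_factory=dict)

# The direct criterion from the JSON example above, as an object.
temperature_criterion = CriteriaWithOptions(
    name="Temperature in celsius and fahrenheit",
    description=(
        "In the response, if there is a numerical temperature present, "
        "is it denominated in both Fahrenheit and Celsius?"
    ),
    options=[
        CriteriaOption("Yes", "The temperature reading is provided in both Fahrenheit and Celsius."),
        CriteriaOption("No", "The temperature reading is provided either in Fahrenheit or Celsius, but not both."),
        CriteriaOption("Pass", "There is no numerical temperature reading in the response."),
    ],
    option_map={"Yes": 1.0, "No": 0.5, "Pass": 0.0},
)
print(temperature_criterion.name)
```

Once the object is added to the appropriate catalog enum, running the prepare file generates the criterion's catalog artifact.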