[Feature] EvalAlgorithmInterface.evaluate should accept a list of DataConfigs for consistency #269

Today [EvalAlgorithmInterface.evaluate](https://github.com/aws/fmeval/blob/220fdc090ee4f1db7174f6e7661a282a5987d1af/src/fmeval/eval_algorithms/eval_algorithm.py#L33) is typed to return `List[EvalOutput]` ("for dataset(s)", per the docstring), but its `dataset_config` argument only accepts `Optional[DataConfig]`.

Most concrete eval algorithms (like [QAAccuracy here](https://github.com/aws/fmeval/blob/220fdc090ee4f1db7174f6e7661a282a5987d1af/src/fmeval/eval_algorithms/qa_accuracy.py#L189)) seem to **either** take the user's `dataset_config` for a single dataset, **or** fall back to *all* the [pre-defined DATASET_CONFIGS](https://github.com/aws/fmeval/blob/220fdc090ee4f1db7174f6e7661a282a5987d1af/src/fmeval/eval_algorithms/__init__.py#L242) relevant to the evaluator's problem type.

So the evaluators' internal logic is already set up to process multiple datasets and return multiple results, yet users can't call `evaluate()` with more than one of their own datasets, seemingly for no particular reason. Could `dataset_config` accept a list of `DataConfig`s as well?
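To make the request concrete, here is a rough sketch of what a widened signature could look like. It does not import fmeval: `DataConfig`, `EvalOutput`, and `BUILT_IN_CONFIGS` below are simplified stand-ins I made up for illustration, and `normalize_dataset_configs` is a hypothetical helper, not something that exists in the library today.

```python
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass(frozen=True)
class DataConfig:
    """Simplified stand-in for fmeval's DataConfig (illustration only)."""
    dataset_name: str
    dataset_uri: str


@dataclass(frozen=True)
class EvalOutput:
    """Simplified stand-in for fmeval's EvalOutput (illustration only)."""
    eval_name: str
    dataset_name: str


# Placeholder for the library's pre-defined dataset configs; values are made up.
BUILT_IN_CONFIGS = [
    DataConfig("builtin_a", "s3://example-bucket/a.jsonl"),
    DataConfig("builtin_b", "s3://example-bucket/b.jsonl"),
]

# Proposed argument type: one config, several configs, or None for the built-ins.
DatasetConfigArg = Union[DataConfig, Sequence[DataConfig], None]


def normalize_dataset_configs(dataset_config: DatasetConfigArg) -> List[DataConfig]:
    """Hypothetical helper: always hand the evaluator a list of configs."""
    if dataset_config is None:
        return list(BUILT_IN_CONFIGS)
    if isinstance(dataset_config, DataConfig):
        return [dataset_config]  # existing single-dataset call style keeps working
    return list(dataset_config)


def evaluate(dataset_config: DatasetConfigArg = None) -> List[EvalOutput]:
    """Toy evaluate(): one EvalOutput per dataset, with no real scoring."""
    configs = normalize_dataset_configs(dataset_config)
    return [EvalOutput("qa_accuracy", cfg.dataset_name) for cfg in configs]


# Usage: passing two custom datasets yields two results.
my_configs = [
    DataConfig("my_set_1", "s3://example-bucket/one.jsonl"),
    DataConfig("my_set_2", "s3://example-bucket/two.jsonl"),
]
results = evaluate(my_configs)
```

Accepting either a single `DataConfig` or a sequence would keep existing callers working while lining up the input type with the `List[EvalOutput]` return type.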
