Add `only` option to `data` entries #61

AriPerkkio · 2024-12-12T13:23:01Z

While developing test cases, I want to focus on a single input.

In my setup I have quite a large set of already verified evals. I want to add a new one and work on it in watch mode for a while. Running every single input on each change is unnecessary and can be slow and expensive.

evalite("Something", {
  data: async () => {
    return [
      // A should produce B when ...
      { input: 'A', expected: 'B' },

      // C should produce D when ...
      { input: 'C', expected: 'D' },

+     // E should produce F when ...
+     { input: 'E', expected: 'F' },
    ];
  }
});

I would like to be able to add an `only: true` flag to a data entry. This should skip all other data entries.

evalite("Something", {
  data: async () => {
    return [
       { input: 'A', expected: 'B' },
       { input: 'C', expected: 'D' },
-      { input: 'E', expected: 'F' },
+      { input: 'E', expected: 'F', only: true },
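
The selection rule requested here can be sketched in a few lines. This is only an illustration of the proposed behaviour, not Evalite's implementation: the `DataEntry` type and the `selectEntries` helper are hypothetical names, and the sketch assumes that when no entry is flagged, every entry runs as before.

```typescript
// Hypothetical shape of a data entry carrying the proposed `only` flag.
interface DataEntry<I, E> {
  input: I;
  expected?: E;
  only?: boolean;
}

// If at least one entry has `only: true`, keep just the flagged entries.
// Otherwise return every entry unchanged.
function selectEntries<I, E>(entries: DataEntry<I, E>[]): DataEntry<I, E>[] {
  const focused = entries.filter((entry) => entry.only === true);
  return focused.length > 0 ? focused : entries;
}

const allEntries: DataEntry<string, string>[] = [
  { input: "A", expected: "B" },
  { input: "C", expected: "D" },
  { input: "E", expected: "F", only: true },
];

// Only the flagged entry ("E") is left to run.
const entriesToRun = selectEntries(allEntries);
```

Allowing several entries to carry the flag at once is a design choice made for this sketch; the issue itself only describes focusing on a single input.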


mattpocock · 2024-12-13T10:45:30Z

@AriPerkkio

This is interesting! It's something I have been noodling on a fair bit.

One thing I wondered is whether an aggressive, opt-in cache would be useful for designing datasets.

For example, running `evalite watch --experimental-cache-results` would introduce an in-memory cache for data only.

There are obviously downsides to this: you might change a piece of non-data code, which would NOT invalidate the cache. You might accidentally leave it on, leading to confusion.

`only` is more explicit, visible to the user, and intuitive.
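
To make the trade-off concrete, here is one possible reading of such a cache, sketched as a wrapper that remembers results per input. Everything in it is an assumption for illustration: `withInputCache` and the `Task` type are hypothetical names, and keying on `JSON.stringify(input)` is just the simplest choice. Note that the key depends on the input alone, which is exactly the downside described above: editing the task's code does not invalidate anything.

```typescript
// A task maps one input to one (asynchronous) output.
type Task<I, O> = (input: I) => Promise<O>;

// Wrap a task so that each distinct input is only computed once per process.
// The cache lives in memory and is keyed purely on the serialized input.
function withInputCache<I, O>(task: Task<I, O>): Task<I, O> {
  const cache = new Map<string, Promise<O>>();
  return (input: I): Promise<O> => {
    const key = JSON.stringify(input);
    let result = cache.get(key);
    if (result === undefined) {
      // First time this input is seen: run the task and remember the promise.
      result = task(input);
      cache.set(key, result);
    }
    return result;
  };
}
```

With this in place, adding a new entry to a dataset would only pay for the new input, while previously seen inputs would be served from memory until the process restarts.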

mattpocock · 2024-12-13T10:49:31Z

I have also been thinking about using a separate file for data. Either a JSON file (bad for DX) or a markdown file:

# Input

My prompt here

# Expected

My expected output here

---

# Input

![](./image.png)

# Expected

A painting of a penguin in watercolours.

(image links would be used to pass images/files)

You would then reference this markdown/JSON file in tests:

evalite('Test', {
  data: './inputs.md',
});

The useful thing about a separate file is that it reduces the surface area for caching. You know that if a markdown file change causes a test re-run, only the inputs and 'expected' have changed.

Might also just be a nicer authoring experience than writing JS objects.
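
As a rough sketch of how a file in that shape could be turned into data entries: the parser below splits on `---` separator lines and collects the text under each `# Input` and `# Expected` heading. The names `ParsedCase` and `parseCases` are hypothetical, the format is taken only from the example above, and image links are left as raw markdown text rather than being loaded.

```typescript
interface ParsedCase {
  input: string;
  expected: string;
}

// Parse a markdown document where cases are separated by `---` lines and
// each case has an `# Input` and an `# Expected` section.
function parseCases(markdown: string): ParsedCase[] {
  return markdown
    .split(/^---[ \t]*$/m)
    .map((block) => block.trim())
    .filter((block) => block.length > 0)
    .map((block) => {
      // Collect the lines that follow each top-level heading.
      const sections = new Map<string, string[]>();
      let current: string | null = null;
      for (const line of block.split(/\r?\n/)) {
        const heading = /^#\s+(.+?)\s*$/.exec(line);
        if (heading) {
          current = (heading[1] ?? "").toLowerCase();
          sections.set(current, []);
        } else if (current !== null) {
          sections.get(current)?.push(line);
        }
      }
      const text = (name: string): string =>
        (sections.get(name) ?? []).join("\n").trim();
      return { input: text("input"), expected: text("expected") };
    });
}
```

A missing section simply yields an empty string here; a real implementation would probably want to report that as an error instead.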

ShiboSoftwareDev · 2024-12-13T12:08:06Z

@mattpocock We used the TOML format to represent our data. It's neat and easy to parse.

AriPerkkio · 2024-12-16T06:39:34Z

An opt-in cache also sounds good, but maybe it's a bit too magical/automatic. I like how explicit an `only` flag would be.

For cache invalidation you could probably use Vite's module graph to detect when a module's dependency tree contains changes, which is exactly how Vite's HMR and Vitest's watch mode work. In my use case it would not help, though, as I'm just calling the same function with different inputs. Changes in the function would trigger a re-run for all inputs, I guess.

So far I've liked the format of the `data` property, as in defining data in plain JS objects.
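
The module-graph idea for cache invalidation boils down to a reachability check: a cached result is stale if any module reachable from the eval file has changed. The sketch below shows that check on a plain `Map` standing in for the graph. It does not use Vite's actual module graph API, and `ModuleGraph`, `dependsOnChanged`, and the file names in the usage note are hypothetical.

```typescript
// Module id -> ids of the modules it imports directly.
type ModuleGraph = Map<string, string[]>;

// Return true if `entry` or anything in its dependency tree is in `changed`.
function dependsOnChanged(
  graph: ModuleGraph,
  entry: string,
  changed: Set<string>,
): boolean {
  const seen = new Set<string>();
  const stack: string[] = [entry];
  while (stack.length > 0) {
    const id = stack.pop();
    if (id === undefined) break;
    if (changed.has(id)) return true;
    if (seen.has(id)) continue; // already explored, also guards against cycles
    seen.add(id);
    for (const dep of graph.get(id) ?? []) {
      stack.push(dep);
    }
  }
  return false;
}
```

For a graph where `a.eval.ts` imports `task.ts`, which imports `prompt.ts`, a change to `prompt.ts` marks `a.eval.ts` as stale. As noted above, this is per file: when the shared task function changes, every input in that file is re-run.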
