Add `only` option to data entries #61
Comments
This is interesting! It's something I have been noodling on a fair bit. One thing I wondered is whether an aggressive, opt-in cache would be useful for designing datasets. I.e. running There are obviously downsides to this: you might change a piece of non-data code, which would NOT invalidate the cache. You might accidentally leave it on, leading to confusion.
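A minimal sketch of what such an opt-in cache could look like, assuming results are keyed on the serialized input only. `withCache` and `Task` are hypothetical names, not part of Evalite, and the cache here lives in memory rather than on disk. It also shows the first downside directly: editing the task's code does not change the key, so stale results would be served.

```typescript
// Hypothetical sketch: none of these names come from Evalite.
type Task<I, O> = (input: I) => Promise<O>;

// Wraps a task so repeated calls with an identical input reuse the earlier result.
// The key is derived from the input alone, so changes to the task's own code
// do NOT invalidate the cache.
function withCache<I, O>(task: Task<I, O>, enabled: boolean): Task<I, O> {
  // Opt-in: without the flag the task is returned untouched.
  if (!enabled) return task;

  const cache = new Map<string, Promise<O>>();
  return (input: I) => {
    const key = JSON.stringify(input);
    const hit = cache.get(key);
    if (hit !== undefined) return hit;

    const result = task(input);
    cache.set(key, result);
    // Do not keep failed runs around; the next call should retry.
    result.catch(() => cache.delete(key));
    return result;
  };
}
```

Making the flag an explicit argument is one way to reduce the risk of accidentally leaving the cache on.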
I have also been thinking about using a separate file for data. Either a JSON file (bad for DX) or a markdown file:

# Input
My prompt here
# Expected
My expected output here
---
# Input

# Expected
A painting of a penguin in watercolours.

(image links would be used to pass images/files)

You would then reference this markdown/JSON file in tests:

evalite('Test', {
  data: './inputs.md',
});

The useful thing about a separate file is that it reduces the surface area for caching. You know that if a markdown file change causes a test re-run, only the inputs and 'expected' have changed. Might also just be a nicer authoring experience than writing JS objects.
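For illustration, a file in the format above could be turned into data entries roughly like this. It is only a sketch: `parseDataMarkdown` and `Entry` are hypothetical names, and it assumes entries are separated by a line containing only `---`, each with one `# Input` and one `# Expected` heading. Nothing here is an existing Evalite API.

```typescript
// Hypothetical helper: not an existing Evalite feature.
interface Entry {
  input: string;
  expected: string;
}

function parseDataMarkdown(source: string): Entry[] {
  return source
    .split(/^---[ \t]*$/m) // entries are separated by a line containing only ---
    .map((block) => block.trim())
    .filter((block) => block.length > 0)
    .map((block) => {
      // Capture the text under "# Input" and the text under "# Expected".
      const match = /^# Input\s*\n([\s\S]*?)^# Expected\s*\n?([\s\S]*)$/m.exec(block);
      if (match === null) {
        throw new Error(`Malformed entry:\n${block}`);
      }
      const [, input = "", expected = ""] = match;
      return { input: input.trim(), expected: expected.trim() };
    });
}
```

An entry whose `# Input` section is empty (for example, where an image link would go) parses to an empty `input` string rather than failing.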
@mattpocock we used TOML to represent our data; it's neat and easy to parse.
An opt-in cache also sounds good, but maybe it's a bit too magical/automatic. I like how explicit For cache invalidation you could probably use Vite's module graph to see when a module's dependency tree contains changes - exactly how Vite's HMR and Vitest's watch mode work. Though in my use case it would not help, as I'm just calling the same function with different inputs. Changes in the function would trigger a re-run for all inputs, I guess. So far I've liked the format of
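The module-graph idea can be sketched without Vite itself. The snippet below uses a hand-written import graph and a hypothetical `isAffected` helper; it does not call Vite's or Vitest's real APIs, it only illustrates the traversal: a cached result is stale if the eval file, or anything it transitively imports, has changed.

```typescript
// Hypothetical stand-in for a bundler's module graph:
// module path -> paths it imports directly.
type ImportGraph = Map<string, string[]>;

// Returns true if `entry` or anything it transitively imports is in `changed`.
function isAffected(entry: string, graph: ImportGraph, changed: Set<string>): boolean {
  const seen = new Set<string>();
  const stack: string[] = [entry];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    if (changed.has(current)) return true;
    if (seen.has(current)) continue; // guards against import cycles
    seen.add(current);
    for (const dep of graph.get(current) ?? []) {
      stack.push(dep);
    }
  }
  return false;
}
```

Because every data entry goes through the same task module, a change to that module marks the whole eval as affected, which matches the limitation described above: this kind of invalidation works per file, not per input.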
While developing test cases, I want to focus on a single input.
In my setup I have quite a large set of already verified evals. I want to add a new one and work on it in watch mode for a while. Running every single input unnecessarily can be slow and expensive.
I would like to be able to add an `only: true` flag to a data entry. This should skip all other data entries.
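A rough sketch of the requested behaviour, under the assumption that entries are plain objects with an optional `only` field. `DataEntry` and `selectEntries` are hypothetical names used for illustration; this is not how Evalite implements anything.

```typescript
// Hypothetical shape of a data entry with the proposed flag.
interface DataEntry<I, E> {
  input: I;
  expected?: E;
  only?: boolean;
}

// If at least one entry is marked `only: true`, run just those entries.
// Otherwise run everything, so the flag has no effect when it is unused.
function selectEntries<I, E>(data: DataEntry<I, E>[]): DataEntry<I, E>[] {
  const focused = data.filter((entry) => entry.only === true);
  return focused.length > 0 ? focused : data;
}

// Usage: only the second entry would be evaluated.
const selected = selectEntries([
  { input: "verified prompt", expected: "verified output" },
  { input: "new prompt", expected: "new output", only: true },
]);
```

This mirrors how `only` modifiers usually behave in test runners: marking one case narrows the run, and removing the flag restores the full set.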