Description
DataChain currently uses the term "checkpoint" in user-facing messages for two different concepts:
- recovery from incomplete execution (UDF-level checkpoints)
- reuse of already computed datasets (save-level)
This makes the UX misleading. UDF-level checkpoints (the first case) are fine, but dataset reuse produces a confusing message:
Checkpoint found for dataset 'my_ds', skipping creation
However, "checkpoint" implies partial or incomplete progress and recovery from a failure, so the user may think that the previous run failed.
It should be replaced with:
Reusing dataset 'my_ds@1.0.2' (skipping recompute)
This issue is about UX, but the same ambiguity might affect internals as well.
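To make the proposal concrete, here is a minimal sketch of how the two cases could produce distinct messages. `SkipReason` and `skip_message` are hypothetical names for illustration and are not part of DataChain; the UDF-level wording is also only an assumption, since this issue does not propose changing it.

```python
from enum import Enum
from typing import Optional


class SkipReason(Enum):
    """Why work was skipped (hypothetical enum, not a DataChain API)."""

    UDF_CHECKPOINT = "udf_checkpoint"  # resuming an incomplete UDF run
    DATASET_REUSE = "dataset_reuse"    # dataset was already fully computed


def skip_message(reason: SkipReason, name: str, version: Optional[str] = None) -> str:
    """Build the user-facing message for the given skip reason."""
    if reason is SkipReason.DATASET_REUSE:
        # No "checkpoint" wording here: nothing failed, the dataset is simply reused.
        ref = f"{name}@{version}" if version else name
        return f"Reusing dataset '{ref}' (skipping recompute)"
    # Illustrative wording only; "checkpoint" stays reserved for real recovery.
    return f"Checkpoint found for UDF '{name}', resuming from saved progress"


print(skip_message(SkipReason.DATASET_REUSE, "my_ds", "1.0.2"))
```

Keeping the reason explicit at the call site means "checkpoint" appears only when an incomplete run is actually being resumed.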