What is the difference between SFT data and RL data? #35

It seems that there is only one script for generating the training dataset:

bash scripts/run_nuplan_preprocessing.sh

I have a few questions:

1. Are both the SFT data and the RL data generated by this single preprocessing script?
2. What are the key differences in how the SFT data and the RL data are constructed?
3. How should the SFT-to-RL data ratio be chosen when training on the full datasets, such as WOD-E2E and nuPlan?
