ReLIFT is a training method that interleaves RL with online fine-tuning (FT), achieving better performance and efficiency than RL or SFT alone.
RoadMa
RoadQAQ
Recent Activity
updated a dataset about 5 hours ago: RoadQAQ/sft_for_rl
published a dataset about 5 hours ago: RoadQAQ/sft_for_rl
updated a collection 3 months ago: Data for DataFlex