[SCoRe] Reinforcement Learning To Enhance LLM's Ability To Self-correct! Identify And Correct Errors In A Multi-step Process
Large Language Models