RLHF
Development Of LLM Chatbot Specialized For Multiple Choice Questions In Physics At Indian High School Level
Large Language Models
[DPO] A Method For Directly Matching Large-scale Language Models To User Preferences Without Using Reinforcement Learning
RLHF
EUREKA: Automated Reward Design With LLM
RLHF