Catch up on the latest AI articles

Semantics-Oriented Reward Design With "PrefBERT," A New Evaluation Method To Evolve Long Sentence Generation

LLMs As Mentors Instead Of Humans? Reinforcement Learning Agents Trained In Natural Language

Development Of LLM Chatbot Specialized For Multiple Choice Questions In Physics At Indian High School Level

09/09/2024 Large Language Models

[DPO] A Method For Directly Aligning Large Language Models With User Preferences Without Using Reinforcement Learning

02/02/2024 RLHF

EUREKA: Automated Reward Design With LLM

04/12/2023 RLHF