Self-assessment
Semantics-Oriented Reward Design With "PrefBERT," A New Evaluation Method To Evolve Long Sentence Generation
OpenScholar: Knowledge Synthesis And Reliability Enhancement Of Scientific Literature With LLM
[Mustango] Music Generation Model Utilizing Domain Knowledge Of Music
Audio And Speech Processing
A Method To Automatically Evaluate "the Accuracy Of LLM's Output Of Long Sentences" Was Created
Large Language Models
RLHF: How To Train Reinforcement Learning Agents Using Human Evaluation
Alignment
Accurate Modeling Of Human-like And Interestingness Of Generated Text: MAUVE
Natural Language Processing