《Fine-Tuning Language Models from Human Preferences》

2023/03/15 RLHF 147 characters, about a 1-minute read

