Hacker News
fpgaminer
on Sept 6, 2023
| on:
Can LLMs learn from a single example?
That's effectively what RLHF is: a way for LLMs to self-train exclusively on their own output, with a small human-curated dataset serving as guidance on what "good" and "bad" output looks like.
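
To make that loop concrete, here is a toy sketch, not how production RLHF is implemented. It assumes a made-up "LLM" that is just a softmax policy over four canned outputs, an invented set of six human preference pairs, a Bradley-Terry reward model with one score per output, and plain REINFORCE with a baseline standing in for PPO (real pipelines use a neural reward model and usually a KL penalty as well). Every name and number here is hypothetical.

```python
import math
import random

# Hypothetical toy "LLM": a softmax policy over four canned outputs.
OUTPUTS = ["helpful answer", "polite refusal", "rambling text", "insult"]

# Small human-curated dataset: (preferred index, rejected index).
# These pairs are invented for illustration.
PREFERENCES = [(0, 2), (0, 3), (1, 3), (0, 1), (1, 2), (2, 3)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_reward_model(prefs, n_outputs, steps=500, lr=0.1):
    """Fit one scalar score per output with a Bradley-Terry pairwise loss."""
    scores = [0.0] * n_outputs
    for _ in range(steps):
        for good, bad in prefs:
            # Probability that `good` beats `bad` under the current scores.
            p = 1.0 / (1.0 + math.exp(scores[bad] - scores[good]))
            # Gradient ascent on log p.
            scores[good] += lr * (1.0 - p)
            scores[bad] -= lr * (1.0 - p)
    return scores


def rlhf(scores, steps=300, batch=64, lr=0.05, seed=0):
    """REINFORCE: sample the policy's own outputs, score them, update the policy."""
    rng = random.Random(seed)
    n = len(scores)
    logits = [0.0] * n
    for _ in range(steps):
        probs = softmax(logits)
        # Baseline: expected reward under the current policy (reduces variance).
        baseline = sum(p * s for p, s in zip(probs, scores))
        # The policy only ever trains on outputs it sampled itself.
        samples = rng.choices(range(n), weights=probs, k=batch)
        grads = [0.0] * n
        for a in samples:
            advantage = scores[a] - baseline
            for i in range(n):
                # Gradient of log softmax probability of `a` w.r.t. logit i.
                grads[i] += advantage * ((1.0 if i == a else 0.0) - probs[i])
        for i in range(n):
            logits[i] += lr * grads[i] / batch
    return softmax(logits)


reward_scores = train_reward_model(PREFERENCES, len(OUTPUTS))
final_probs = rlhf(reward_scores)
best = OUTPUTS[final_probs.index(max(final_probs))]
print(best)
```

The human data is only used once, to fit `reward_scores`. After that, the policy never sees a human-written target; it sees only its own samples and the reward model's opinion of them.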