Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The technical term is constrained decoding. OpenAI has had this for almost a year now. They say it requires generating some artifacts to do efficiently, which slows down the first response but can be cached.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: