Hacker News

It's an issue you run into whenever the model is forced to commit to a yes/no answer first. Forward-only (autoregressive) LLMs have this problem and diffusion models don't, and standard block diffusion behaves closer to a forward-only LLM than to a full diffusion model.

You could increase the block size to act more like a full diffusion model, but you would lose some of the benefits of block diffusion.
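A toy sketch of the difference (all names and scores here are made up for illustration): an autoregressive decoder commits to the first token before seeing the rest of the answer, while block-diffusion-style decoding scores all positions in a block jointly, so the first token can still flip once later positions are considered. The `SCORES` table is a hypothetical stand-in for a model's joint plausibility over two-token answers.

```python
import itertools

# Hypothetical toy scorer: joint plausibility of a full two-token answer.
# Deliberately constructed so the globally best sequence starts with the
# token that looks worse when chosen first.
SCORES = {
    ("yes", "trivially"): 0.20,
    ("yes", "because"):   0.35,
    ("no",  "because"):   0.45,  # best overall, but "no" loses step one
    ("no",  "trivially"): 0.00,
}
FIRST_TOKENS = ["yes", "no"]
SECOND_TOKENS = ["because", "trivially"]

def autoregressive_decode():
    """Commit to each position left to right, never revisiting."""
    # First token chosen by its marginal score (sum over continuations):
    # "yes" -> 0.55, "no" -> 0.45, so "yes" wins and is locked in.
    first = max(FIRST_TOKENS,
                key=lambda t: sum(SCORES[(t, s)] for s in SECOND_TOKENS))
    second = max(SECOND_TOKENS, key=lambda s: SCORES[(first, s)])
    return (first, second)

def block_decode():
    """Score the whole block jointly, as a diffusion-style denoiser
    over the block would: the first token is free to change."""
    return max(itertools.product(FIRST_TOKENS, SECOND_TOKENS),
               key=SCORES.get)
```

Here `autoregressive_decode()` returns `("yes", "because")` with joint score 0.35, while `block_decode()` recovers the better `("no", "because")` at 0.45. A block size of 2 covers this whole answer; with smaller blocks the yes/no token would again be committed before the rest, which is the trade-off the comment above describes.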



Interesting. Makes me want to play around with an open diffusion LM. Do you have any recommendations?



