
I'm not sure if anyone is using it this way, but this seems like a great way to test LLM reasoning performance: it is created anew each day, so there is no chance it is included in the training data.

