
That's really interesting, thanks for sharing!

Are you using that approach in production for grounding when PDFs don't include embedded text, like in the case of scanned documents? I did some experiments for that use case, and it wasn't really reaching the bar I was hoping for.



Yes, this was completely image-based. I'm not quite at the point of using it in production, since I agree it can be flaky at times. I do think there are viable workarounds, though, like sending the same prompt multiple times and checking whether the returned results overlap.
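
To make the repeated-prompt idea concrete, here's a rough sketch of what I mean, not something I've run in production. It assumes the model returns one bounding box per call as (x0, y0, x1, y1), and `ask` is a hypothetical stand-in for whatever function sends the prompt and parses the reply; the run count and IoU threshold are arbitrary picks.

```python
from itertools import combinations
from typing import Callable, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), an assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def consistent_box(
    ask: Callable[[], Box],  # hypothetical: sends the same prompt, returns a box
    runs: int = 3,           # arbitrary choice
    min_iou: float = 0.8,    # arbitrary choice
) -> Optional[Box]:
    """Ask `runs` times; return the averaged box only if every pair agrees."""
    boxes = [ask() for _ in range(runs)]
    if all(iou(a, b) >= min_iou for a, b in combinations(boxes, 2)):
        # All runs overlap enough, so average the coordinates.
        return tuple(sum(coord) / runs for coord in zip(*boxes))
    # The runs disagree: treat the answer as unreliable.
    return None
```

If it returns None you can retry or fall back to a human check. The obvious cost is that you pay for every extra call.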

It really feels like we're maybe half a model generation away from this being a solved problem.



