
Cool post. We ran a similar evaluation for document segmentation using IBM's DocLayNet benchmark (https://ds4sd.github.io/icdar23-doclaynet/task/), but against modern document OCR models like Mistral, OpenAI, and Gemini. And what do you know, we found a similar result: DETR-based segmentation models are about 2x better.

Disclosure: I work for https://aryn.ai/


