Hacker News

In this case I'm building a batch workflow: images come in, images get analyzed through a pipeline, images go into a GUI for review. The idea of using a VLM was just to avoid hand-building a solution, not because I actually want to use it in a chatbot. It's just interesting that a generalist model that has expert-level handwriting recognition completely falls apart on a different, but much easier, task.

