
>Does it handle complex documents better or something?

Exactly. It can handle tables of information with a variable number of rows from document to document. It can also handle arbitrary rotation, skew, offset and cropping. To see an example of some extreme cases: https://siftrics.com/hydra.html

>What would be the difference between Siftrics vs standalone OCR software?

The reason I started Siftrics is this:

Lots and lots of businesses need to put data from their documents into a database. Standalone text recognition gets you 75% of the way there.

At the end of the day, I thought, companies still have to hire engineers to write programs that are specially tailored to their documents (programs which _do_ leverage standalone OCR/text recognition).

I want to eliminate those specially-tailored programs. Now, Hydra isn't perfect, but in many cases (enough that people are willing to pay for it), Hydra reduces those specially-tailored programs to a single function call...

  client.recognize('avionics-invoice', ['invoice_1.pdf', 'invoice_2.pdf'])
...followed by your database insert. :)
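
To make the "one call, then insert" shape concrete, here is a minimal sketch. It does not use the real Siftrics client: StubClient, the shape of the rows it returns (one dict of field name to text per file), the field names and the table schema are all assumptions made up for illustration. Only the recognize(...) call mirrors the line above.

```python
import sqlite3


class StubClient:
    """Hypothetical stand-in for the real client; returns canned rows."""

    def recognize(self, data_source, files):
        # Assumed return shape: one dict of field name -> text per input file.
        return [
            {"file": name, "invoice_number": f"INV-{i}", "total": "100.00"}
            for i, name in enumerate(files, start=1)
        ]


def store_invoices(conn, rows):
    """Insert the recognized rows and return the resulting row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices "
        "(file TEXT, invoice_number TEXT, total TEXT)"
    )
    # Named placeholders are filled from each dict's keys.
    conn.executemany(
        "INSERT INTO invoices VALUES (:file, :invoice_number, :total)", rows
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]


client = StubClient()
rows = client.recognize('avionics-invoice', ['invoice_1.pdf', 'invoice_2.pdf'])

conn = sqlite3.connect(":memory:")  # in-memory database, for the sketch only
count = store_invoices(conn, rows)
```

With a real client, everything specific to the document layout would sit behind the recognize call, and the remaining code is ordinary database plumbing.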

