I came across this guide (dated 2025) a couple years ago and thought it was interesting. Not a quant or even in finance though, so I don’t know how accurate it is:
I use markitdown[0] religiously. You’ll lose fidelity for anything complex (math equations, images), but it does a great job 95% of the time in my experience.
I’m assuming the attack surface is reduced, since I invoke it through a Docker container. But this might be a misplaced sense of safety.
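For anyone curious, here is a rough sketch of what a locked-down invocation could look like. It assumes you have built the image locally from the markitdown repo's Dockerfile and tagged it `markitdown:latest`, and that the image reads the document on stdin and writes Markdown to stdout. The helper names (`build_docker_cmd`, `convert`) are made up for illustration, and the hardening flags are my own choices that may need loosening if the converter wants more than a writable `/tmp`. A container still shares the host kernel, so this narrows the attack surface rather than removing it.

```python
import subprocess
from pathlib import Path

# Assumption: image built locally and tagged like this; adjust to your setup.
IMAGE = "markitdown:latest"


def build_docker_cmd(image: str = IMAGE) -> list[str]:
    """Return the argv for a one-shot, restricted container run."""
    return [
        "docker", "run",
        "--rm",                                  # discard the container afterwards
        "-i",                                    # keep stdin open for the input file
        "--network", "none",                     # no network access from inside
        "--read-only",                           # root filesystem mounted read-only
        "--tmpfs", "/tmp",                       # scratch space, in case it is needed
        "--cap-drop", "ALL",                     # drop all Linux capabilities
        "--security-opt", "no-new-privileges",   # block privilege escalation
        image,
    ]


def convert(pdf_path: str, image: str = IMAGE) -> str:
    """Pipe a file through the container and return the Markdown output."""
    data = Path(pdf_path).read_bytes()
    result = subprocess.run(
        build_docker_cmd(image),
        input=data,
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Usage would be something like `print(convert("paper.pdf"))`. Passing the file over stdin means no host directory has to be mounted into the container at all.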
A one-way link (data diode) transmits it to a box with simplified hardware (e.g., a RISC architecture). The box has a dedicated monitor and keyboard. Once you're finished, you sell the box on Craigslist. Then you buy a new, sealed replacement from Best Buy.
Pay-per-view was an expensive business model for cable. For PDFs, it's even more expensive.
Note: It's more convenient than full, per-app, physical security.
https://www.dropbox.com/scl/fi/da7zfjj2rplwzf2sfiriz/Buy-Sid...