
Hmm, well the super secret stuff we’re working on comes directly to mind, but if I set that aside, boring entity resolution is actually a big pain point.

Regardless of their sophistication, third-party data products in football tend to rely on manually collected and maintained player metadata, which can be unreliable. If I had a durable unique ID for every player, manager, and team in world football, along with reliable timestamps for every moment each player entered and left play, that would be pretty great. When joining disparate data sources, discrepancies in things this simple cause all sorts of pain downstream.
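To make the joining problem concrete, here is a minimal sketch of matching player records across two feeds that spell names differently. It is only an illustration, not a description of any vendor's pipeline: the record layout (`id`, `name`, `dob`), the IDs, the player names, and the 0.85 similarity threshold are all made-up assumptions, and it uses plain standard-library string similarity rather than a proper probabilistic linkage model.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation so spelling variants compare cleanly."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    spaced = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return " ".join(spaced.split())


def match_players(source_a, source_b, threshold=0.85):
    """Map each ID in source_a to the best candidate ID in source_b, or None.

    A candidate must share the same date of birth and have a normalized-name
    similarity of at least `threshold` (an arbitrary, illustrative cutoff).
    """
    matches = {}
    for a in source_a:
        best_id, best_score = None, threshold
        for b in source_b:
            if a["dob"] != b["dob"]:
                continue  # treat a date-of-birth mismatch as disqualifying
            score = SequenceMatcher(
                None, normalize(a["name"]), normalize(b["name"])
            ).ratio()
            if score >= best_score:
                best_id, best_score = b["id"], score
        matches[a["id"]] = best_id
    return matches


# Hypothetical records from two providers (invented names and IDs).
feed_a = [
    {"id": "a1", "name": "José Álvarez", "dob": "1995-03-14"},
    {"id": "a2", "name": "Tom O'Neill", "dob": "1998-11-02"},
    {"id": "a3", "name": "Sam Carter", "dob": "2001-06-30"},
]
feed_b = [
    {"id": "b7", "name": "Jose Alvarez", "dob": "1995-03-14"},
    {"id": "b9", "name": "Tom ONeill", "dob": "1998-11-02"},
    {"id": "b4", "name": "Sam Carter", "dob": "2000-06-30"},
]

matches = match_players(feed_a, feed_b)
print(matches)
```

Note that the third record goes unmatched because the two feeds disagree on the birth year, which is exactly the kind of simple discrepancy that causes trouble downstream.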



Has anyone tried to link and dedupe the various datasets using a probabilistic linkage tool like Splink?

https://moj-analytical-services.github.io/splink/

(Disclaimer: I am the lead author, but the tool is FOSS)


We've used various methods over the years, but we'll check this one out. Thanks!


With a raw feed from fixed cameras, YOLO could be trained to do that. Exciting for whoever gets it!


This works really well until you're dealing with potato-quality video, weird fonts, and shirts plastered with ads, or even pretty good quality video and white shirts with white lettering (https://www.arsenal.com/news/how-volunteer-part-no-more-red).



