
Using external memory, instead of encoding all of the knowledge in the model's weights, will take over all branches of applied ML.

A recognition model should use a similar mechanism: a memory buffer that stores short-term context from previous frames, plus a large external database of long-term key-value pairs that retain the relevant semantic information for a given embedding.

Doing so would make it possible to update and expand models without retraining, and would enable much better zero/few-shot learning.
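A minimal sketch of what that lookup could look like, in numpy. The memory layout (key embeddings paired with value vectors) and the soft top-k fusion are assumptions for illustration, not a specific published design:

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical external memory: key embeddings paired with value vectors
    # that carry the semantic payload for each key.
    keys = rng.normal(size=(10_000, 128)).astype(np.float32)
    values = rng.normal(size=(10_000, 128)).astype(np.float32)

    def memory_lookup(query, k=8):
        """Soft attention over the top-k entries of the external memory."""
        scores = keys @ query                    # dot-product similarity
        top = np.argpartition(scores, -k)[-k:]   # indices of the k best keys
        w = np.exp(scores[top] - scores[top].max())
        w /= w.sum()                             # softmax over retrieved entries
        return w @ values[top]                   # blended value vector

    frame_embedding = rng.normal(size=128).astype(np.float32)
    context = memory_lookup(frame_embedding)     # fuse downstream with model features

Because the keys/values live outside the model, you can add or overwrite entries at serving time without touching the weights.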

We already have a hacky version of this in our production app for food recognition. For new users we use a standard CNN to predict the items present in the image; once a user logs a few meals, we use nearest-neighbor search to match new images against their previously submitted entries, which works extremely well.
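Roughly, the fallback logic looks like this (a simplified sketch, not our actual code; cnn_classify and labels_for are stand-ins for our classifier and label store, and the threshold is illustrative):

    import numpy as np

    def predict_food(image_embedding, user_history, labels_for, threshold=0.8):
        """Match a new meal photo against the user's previously logged entries,
        falling back to the generic CNN classifier for cold-start users."""
        if len(user_history) > 0:
            # Cosine similarity between the new embedding and each logged meal.
            q = image_embedding / np.linalg.norm(image_embedding)
            h = user_history / np.linalg.norm(user_history, axis=1, keepdims=True)
            sims = h @ q
            best = int(np.argmax(sims))
            if sims[best] >= threshold:
                return labels_for(best)       # reuse the user's earlier label
        return cnn_classify(image_embedding)  # hypothetical standard CNN prediction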



Yes! I have long thought that GPT-type models are huge because they are forced to encode a lot of raw knowledge; giving them the ability to search a database for that knowledge solves the problem, which should help make them smaller while scaling to larger datasets.

The cherry on top is that you could get not only information from the model but also the sources it used to make up its mind!
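As a toy illustration of retrieval with provenance (the corpus, embeddings, and scoring here are all made up):

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy knowledge base: passages paired with the source they came from.
    passages = [
        ("Paris is the capital of France.", "wikipedia.org/wiki/Paris"),
        ("The Eiffel Tower opened in 1889.", "wikipedia.org/wiki/Eiffel_Tower"),
    ]
    # Stand-in embeddings; a real system would use a trained text encoder.
    doc_vecs = rng.normal(size=(len(passages), 64)).astype(np.float32)

    def retrieve(query_vec, k=1):
        """Top-k passages by similarity, returned with their sources."""
        sims = doc_vecs @ query_vec
        top = np.argsort(sims)[::-1][:k]
        return [passages[i] for i in top]

    hits = retrieve(rng.normal(size=64).astype(np.float32))
    # The generator conditions on hits[0][0] and can cite hits[0][1] as the source.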


Do you apply the NN search to the raw images themselves or to the latent vectors from the CNN?


Image embeddings. NN on raw pixels would scale horribly and wouldn't return anything relevant.
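One common way to get such an embedding is to take the CNN's penultimate-layer activations; a sketch with torchvision (ResNet-18 is an illustrative choice, not necessarily what we run):

    import torch
    import torchvision.models as models

    # Pretrained CNN with its classification head removed, so the
    # output is the pooled feature vector rather than class logits.
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()       # drop the classifier head
    backbone.eval()

    with torch.no_grad():
        x = torch.rand(1, 3, 224, 224)      # a preprocessed image tensor
        embedding = backbone(x).squeeze(0)  # 512-d vector for NN search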



