
You lose a lot of stability. Each operation's result is slightly off, and the error accumulates and compounds. For deep learning in particular, many operations are carried out in sequence and the error can become unacceptable.
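As a rough digital analogy (low-precision floats standing in for an imprecise analog substrate), here is a small sketch of how per-operation rounding error compounds over a long chain of operations:

    import numpy as np

    # Accumulate the same small increment many times in low and high precision.
    steps, inc = 10_000, 1e-4
    low = np.float16(0.0)
    high = 0.0
    for _ in range(steps):
        low = np.float16(low + np.float16(inc))
        high += inc

    print(high)        # ~1.0
    print(float(low))  # far short of 1.0: once the running sum is large
                       # enough, each tiny increment rounds away entirely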


Deep learning is actually very tolerant of imprecision, which is why it is typically given as an application for analog computing.

It is already common practice to deliberately inject noise into the network (dropout) at rates up to 50% in order to prevent overfitting.
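For concreteness, here is a minimal NumPy sketch of (inverted) dropout; the names and shapes are just illustrative:

    import numpy as np

    def dropout(x, rate=0.5, training=True):
        # Inverted dropout: zero each activation with probability `rate`
        # and rescale the survivors so the expected activation is unchanged.
        if not training or rate == 0.0:
            return x
        mask = (np.random.rand(*x.shape) >= rate).astype(x.dtype)
        return x * mask / (1.0 - rate)

    # Roughly half of the activations are dropped at rate=0.5.
    activations = np.random.randn(4, 8).astype(np.float32)
    print(dropout(activations, rate=0.5))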


Isn't it just for inference? Also, differentiating through an analog circuit looks... interesting. Keep the inputs constant, wiggle one weight a bit, record how the output changed, move on to the next weight, repeat. Is there something more efficient, I wonder.
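That brute-force scheme is essentially a finite-difference gradient estimate, costing one extra evaluation of the circuit per weight. A minimal sketch (the function names and toy loss are made up for illustration):

    import numpy as np

    def finite_difference_grads(loss_fn, weights, eps=1e-3):
        # Perturb one weight at a time and measure how the loss changes.
        # Cost: one extra forward evaluation per weight.
        grads = np.zeros_like(weights)
        base = loss_fn(weights)
        for i in range(weights.size):
            w = weights.copy()
            w.flat[i] += eps
            grads.flat[i] = (loss_fn(w) - base) / eps
        return grads

    # Toy example: quadratic loss around a target weight vector.
    target = np.array([0.5, -1.0, 2.0])
    loss = lambda w: float(np.sum((w - target) ** 2))
    print(finite_difference_grads(loss, np.zeros(3)))  # roughly -2 * target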


Definitely, if your analog substrate implements matrix-vector multiplications (one of the most common approaches in this area). Then your differentiation algorithm is the usual backpropagation, whose weight updates are rank 1. With some architectures this can be implemented in O(1) time simply by running the circuit in the "reverse" configuration (inputs become outputs and vice versa). With ideal analog devices, this would be many orders of magnitude more efficient than a GPU.
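A minimal NumPy sketch of why that works for a single linear layer: the backward pass is the same matrix-vector product with inputs and outputs swapped, and the weight gradient is a rank-1 outer product (the shapes here are arbitrary):

    import numpy as np

    # Forward pass of one linear layer on an analog crossbar: y = W @ x.
    W = np.random.randn(3, 5)
    x = np.random.randn(5)
    y = W @ x

    # Error signal coming back from the layers above: delta = dL/dy.
    delta = np.random.randn(3)

    # Propagating the error down is W.T @ delta -- the same matrix-vector
    # product with inputs and outputs swapped, i.e. the crossbar driven in
    # its "reverse" configuration.
    dL_dx = W.T @ delta

    # The weight update itself is a rank-1 outer product of error and input.
    dL_dW = np.outer(delta, x)
    assert dL_dW.shape == W.shape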



