
Most claims of "nearly as good results" are massively overblown.

Even the so-called "good" quants of huge models are extremely crippled.

Nothing is ever free, and even going from 16-bit to 8-bit will massively reduce the quality of your model, no matter what their hacked benchmarks claim.
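To see where the loss actually comes from, here is a minimal sketch of naive symmetric per-tensor int8 quantization of an fp16 weight matrix (random weights, single scale factor; real quantizers use per-channel scales, calibration data, etc., so this only illustrates the rounding error itself, not any particular model's quality drop):

    import numpy as np

    # Illustrative fp16 weight matrix; shape and values are made up.
    rng = np.random.default_rng(0)
    w = rng.standard_normal((4096, 4096)).astype(np.float16)

    # Symmetric per-tensor quantization: one scale maps fp16 onto int8.
    scale = np.abs(w).max() / 127.0
    w_q = np.clip(np.round(w.astype(np.float32) / scale), -127, 127).astype(np.int8)

    # Dequantize and measure how much of the original signal is lost.
    w_dq = w_q.astype(np.float32) * scale
    err = np.abs(w.astype(np.float32) - w_dq)
    print(f"scale={scale:.6f}  mean abs err={err.mean():.6f}  max abs err={err.max():.6f}")

Every weight picks up a rounding error of up to half the scale; whether that error is negligible or "crippling" for a given model is exactly what the benchmarks are arguing about.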

No, it doesn't help because of "free regularization" either. Dropout and batch norm were also placebo BS that didn't actually help back in the day when they were still being used.


