
DeepSeek's MLA paper was published in 2024: https://arxiv.org/abs/2405.04434

DeepSeek's Sparse Attention paper was published in February 2025: https://arxiv.org/abs/2502.11089

DeepSeek-V3.2-Exp (combining MLA and DSA) was released in September 2025.

You also had several other Chinese hybrid models, like Qwen3-Next and MiniMax-M1.


