Hacker News
menaerus
3 months ago
on:
vLLM large scale serving: DeepSeek 2.2k tok/s/h200...
It's not about Python at all. The optimization techniques operate on a completely different level: that of the chip and/or hardware platform, where the work is finding ways to utilize the hardware to the maximum by exploiting the intrinsic details of its limitations.