After working pretty closely with vLLM and SGLang over the past few months, this is EXACTLY what I had envisioned a successor project would look like: analyzing an operation dependency graph and then fusing operations (or, at a minimum, scheduling tasks smarter). Congrats to the team.
Thanks a lot for your positive feedback! We believe that MPK can enhance existing LLM serving systems, especially for low-latency serving. We are very excited about the opportunity to collaborate with others on this direction.