https://blog.lmcache.ai/2025-04-29-pdbench/ #22

2025-04-30T05:33:49Z

giscus[bot]
bot Apr 30, 2025

https://blog.lmcache.ai/2025-04-29-pdbench/

TL;DR: In our previous blog, we introduced LMCache’s integration with vLLM v1 and NVIDIA’s NIXL used in Dynamo, enabling Prefill-Decode Disaggregation (PD) for LLM inference. Today, we’re excited to share benchmark results that confirm this system achieves state-of-the-art PD performance, balancing time-to-first-token (TTFT) and inter-token latency (ITL) with unprecedented consistency....

Replies: 0 comments
