LightSeek Foundation releases TokenSpeed, an open-source LLM inference engine
MarkTechPost reports that LightSeek Foundation has released TokenSpeed, an MIT-licensed LLM inference engine, currently in preview, tailored for agentic workloads. The release claims improvements over TensorRT-LLM in decode latency and throughput, and describes a scheduler design built around KV-cache safety.