Inproceedings,

SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification.

ASPLOS (3), pages 932-949. ACM, 2024.
