Author of the publication

No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization.

J. Yang, B. Kim, J. Bae, B. Kwon, G. Park, E. Yang, S. Kwon, and D. Lee. CoRR, (2024)

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some publications already assigned to that person.

Hyogmyon Kwon

Jagun Kwon

Youngshin Kwon

Kwangsoo Kwon

Sanghee Kwon

Other publications of authors with the same name

Structured Compression by Unstructured Pruning for Sparse Quantized Neural Networks. S. Kwon, D. Lee, B. Kim, P. Kapoor, B. Park, and G. Wei. CoRR, (2019)

Network Pruning for Low-Rank Binary Indexing. D. Lee, S. Kwon, B. Kim, P. Kapoor, and G. Wei. CoRR, (2019)

Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization. J. Kim, J. Lee, S. Kim, J. Park, K. Yoo, S. Kwon, and D. Lee. CoRR, (2023)

nuQmm: Quantized MatMul for Efficient Inference of Large-Scale Generative Language Models. G. Park, B. Park, S. Kwon, B. Kim, Y. Lee, and D. Lee. CoRR, (2022)

Modulating Regularization Frequency for Efficient Compression-Aware Model Training. D. Lee, S. Kwon, B. Kim, J. Yun, B. Park, and Y. Jeon. CoRR, (2021)

Simulation-Based Optimization on the System-of-Systems Model via Model Transformation and Genetic Algorithm: A Case Study of Network-Centric Warfare. B. Kang, S. Choi, S. Kwon, J. Lee, and T. Kim. Complex., (2018)

FleXOR: Trainable Fractional Quantization. D. Lee, S. Kwon, B. Kim, Y. Jeon, B. Park, and J. Yun. NeurIPS, (2020)

Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models. J. Heo, J. Kim, B. Kwon, B. Kim, S. Kwon, and D. Lee. CoRR, (2023)

Design and implementation of event-based DEVS execution environment for faster execution of iterative simulation. S. Kwon and T. Kim. SpringSim (TMS-DEVS), page 14. SCS/ACM, (2012)

Learning Low-Rank Approximation for CNNs. D. Lee, S. Kwon, B. Kim, and G. Wei. CoRR, (2019)

BibSonomy

Disambiguation of "Kwon, Se Jung"
