Author of the publication

Lifelong Language Pretraining with Distribution-Specialized Experts.

ICML, volume 202 of Proceedings of Machine Learning Research, pages 5383-5395. PMLR, (2023)


Other publications of authors with the same name

Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation. (2016). arXiv:1611.04558.
A Linear Model for YUV 4:2:0 Chroma Intra Prediction. ISCAS, pages 1-5. IEEE, (2019)
MiC: Multi-level Characterization and Optimization of GPGPU Kernels. JETC, 15 (3): 25:1-25:24 (2019)
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism. CoRR, (2018)
Analysis of video codec buffer and delay under time-varying channel. VCIP, pages 1-6. IEEE, (2012)
Ensuring Kernel Integrity Using KIPBMFH. ICICS, volume 9543 of Lecture Notes in Computer Science, pages 10-17. Springer, (2015)
A Comparison of End-to-End Models for Long-Form Speech Recognition. ASRU, pages 889-896. IEEE, (2019)
Mining block correlations to improve storage performance. ACM Trans. Storage, 1 (2): 213-245 (2005)
Improved Estimation of Transmission Distortion for Error-Resilient Video Coding. IEEE Trans. Circuits Syst. Video Technol., 22 (4): 636-647 (2012)
GSPMD: General and Scalable Parallelization for ML Computation Graphs. CoRR, (2021)