Author of the publication

InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding.

Q. Chen, D. Gu, G. Wang, X. Chen, Y. Xiong, T. Huang, Q. Hu, X. Jin, Y. Wen, T. Zhang, and P. Sun. CoRR, (2024)

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to the name to display some publications already assigned to the person.

Tianwei Chen

Zhang Zhang

Meng Zhang

Methods and implementations of road-network matching. M. Zhang. TU München, (2009)

Yan Zhang

Other publications of authors with the same name

Shape-Sensitive Feature Extraction for Large-Aspect-Ratio Object Detection. T. Zhang, X. Sun, L. Zhuang, L. Gao, B. Zhang, and K. Zheng. IEEE Geosci. Remote. Sens. Lett., (2024)

Share Your Data Carefree: An Efficient, Scalable and Privacy-Preserving Data Sharing Service in Cloud Computing. J. Sun, G. Xu, T. Zhang, H. Xiong, H. Li, and R. Deng. IEEE Trans. Cloud Comput., 11 (1): 822-838 (January 2023)

Attacking and Protecting Data Privacy in Edge-Cloud Collaborative Inference Systems. Z. He, T. Zhang, and R. Lee. IEEE Internet Things J., 8 (12): 9706-9716 (2021)

Weighted Pseudo-θ-Almost Periodic Sequence and Finite-Time Guaranteed Cost Control for Discrete-Space and Discrete-Time Stochastic Genetic Regulatory Networks with Time Delays. S. Sun, T. Zhang, and Z. Li. Axioms, 12 (7): 682 (July 2023)

Exponential stability and synchronisation of fuzzy Mittag-Leffler discrete-time Cohen-Grossberg neural networks with time delays. S. Rao, T. Zhang, and L. Xu. Int. J. Syst. Sci., 53 (11): 2318-2340 (2022)

Dynamic behaviours for semi-discrete stochastic Cohen-Grossberg neural networks with time delays. T. Zhang, S. Han, and J. Zhou. J. Frankl. Inst., 357 (17): 13006-13040 (2020)

Adaptive Region Boosting method with biased entropy for path planning in changing environment. R. Kang, T. Zhang, H. Tang, and W. Zhao. CAAI Trans. Intell. Technol., 1 (2): 179-188 (2016)

Performance releaser with smart anchor learning for arbitrary-oriented object detection. T. Zhang, X. Dong, X. Sun, L. Gao, Y. Qu, B. Zhang, and K. Zheng. CAAI Trans. Intell. Technol., 8 (4): 1213-1225 (December 2023)

Design, Implementation and Verification of Cloud Architecture for Monitoring a Virtual Machine's Security Health. T. Zhang, and R. Lee. IEEE Trans. Computers, 67 (6): 799-815 (2018)

InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding. Q. Chen, D. Gu, G. Wang, X. Chen, Y. Xiong, T. Huang, Q. Hu, X. Jin, Y. Wen, T. Zhang and 1 other author(s). CoRR, (2024)

BibSonomy

Disambiguation of "Zhang, Tianwei"
