What Language Model Architecture and Pretraining Objective Works Best for Zero-Shot Generalization?

ICML, volume 162 of Proceedings of Machine Learning Research, pages 22964-22984. PMLR, 2022.


Other publications of authors with the same name

What Language Model Architecture and Pretraining Objective Works Best for Zero-Shot Generalization? ICML, volume 162 of Proceedings of Machine Learning Research, pages 22964-22984. PMLR, 2022.

The Falcon Series of Open Language Models. CoRR, 2023.

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only. CoRR, 2023.

What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization? CoRR, 2022.

Is the Number of Trainable Parameters All That Actually Matters? ICBINB@NeurIPS, volume 163 of Proceedings of Machine Learning Research, pages 27-32. PMLR, 2021.

What Language Model to Train if You Have One Million GPU Hours? CoRR, 2022.

RITA: a Study on Scaling Up Generative Protein Sequence Models. CoRR, 2022.

LightOn Optical Processing Unit: Scaling-up AI and HPC with a Non von Neumann co-processor. HCS, pages 1-11. IEEE, 2021.

What Language Model to Train if You Have One Million GPU Hours? EMNLP (Findings), pages 765-782. Association for Computational Linguistics, 2022.

Is the Number of Trainable Parameters All That Actually Matters? CoRR, 2021.