We observed that the embedding representation is generally rich and information-dense. For example, reducing the dimensionality of the inputs with SVD or PCA, even by as little as 10%, generally degrades downstream performance on specific tasks.
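To make the kind of reduction described here concrete, the following is a minimal sketch of PCA computed through an SVD that drops roughly 10% of the dimensions. It is not the setup used for the observation above: the function name `pca_reduce`, the `keep_fraction` parameter, and the random stand-in embeddings are all assumptions made for illustration.

```python
import numpy as np


def pca_reduce(embeddings, keep_fraction=0.9):
    """Project embeddings onto their top principal components.

    Hypothetical helper: keep_fraction=0.9 keeps 90% of the original
    dimensions, i.e. a 10% reduction.
    """
    X = np.asarray(embeddings, dtype=np.float64)
    n_samples, dim = X.shape

    # Center the data so the SVD directions are the principal components.
    Xc = X - X.mean(axis=0)

    # Thin SVD: rows of Vt are principal directions, sorted by singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)

    # Number of components to keep, capped by the number available.
    k = max(1, int(round(dim * keep_fraction)))
    k = min(k, Vt.shape[0])

    reduced = Xc @ Vt[:k].T

    # Fraction of total variance retained by the kept components.
    total = float(np.sum(S ** 2))
    retained = float(np.sum(S[:k] ** 2) / total) if total > 0 else 1.0
    return reduced, retained


# Stand-in data (assumption): 200 random 50-dimensional "embeddings".
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 50))

reduced, retained = pca_reduce(embeddings, keep_fraction=0.9)
print(reduced.shape, retained)
```

The retained-variance figure is only a proxy. The claim in the text concerns downstream task performance, which would have to be measured by retraining or re-evaluating the task model on `reduced` in place of the original inputs.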
E. Nie, S. Liang, H. Schmid, and H. Schütze. Findings of the Association for Computational Linguistics: ACL 2023, pages 8320--8340. Toronto, Canada, Association for Computational Linguistics, July 2023.