ELIXR: Towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders

S. Xu, L. Yang, C. Kelly, M. Sieniek, T. Kohlberger, M. Ma, W. Weng, A. Kiraly, S. Kazemzadeh, Z. Melamed, J. Park, P. Strachan, Y. Liu, C. Lau, P. Singh, C. Chen, M. Etemadi, S. Kalidindi, Y. Matias, K. Chou, G. Corrado, S. Shetty, D. Tse, S. Prabhakara, D. Golden, R. Pilgrim, K. Eswaran, and A. Sellergren. (2023). arXiv:2308.01317.

Abstract

Our approach, which we call Embeddings for Language/Image-aligned X-Rays, or ELIXR, leverages a language-aligned image encoder combined or grafted onto a fixed LLM, PaLM 2, to perform a broad range of tasks. We train this lightweight adapter architecture using images paired with corresponding free-text radiology reports from the MIMIC-CXR dataset. ELIXR achieved state-of-the-art performance on zero-shot chest X-ray (CXR) classification (mean AUC of 0.850 across 13 findings), data-efficient CXR classification (mean AUCs of 0.893 and 0.898 across five findings (atelectasis, cardiomegaly, consolidation, pleural effusion, and pulmonary edema) for 1% (~2,200 images) and 10% (~22,000 images) training data), and semantic search (0.76 normalized discounted cumulative gain (NDCG) across nineteen queries, including perfect retrieval on twelve of them). Compared to existing data-efficient methods including supervised contrastive learning (SupCon), ELIXR required two orders of magnitude less data to reach similar performance. ELIXR also showed promise on CXR vision-language tasks, demonstrating overall accuracies of 58.7% and 62.5% on visual question answering and report quality assurance tasks, respectively. These results suggest that ELIXR is a robust and versatile approach to CXR AI.
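For readers unfamiliar with the semantic search metric quoted in the abstract, the following is a minimal sketch of normalized discounted cumulative gain (NDCG) for a single query, averaged over several queries. It assumes the common linear-gain, log2-discount formulation and uses made-up relevance grades; the paper's exact variant (cutoff, gain function, grading scheme) is not stated in this record, and the function names `dcg`, `ndcg`, and `mean_ndcg` are illustrative only.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain: each grade is divided by log2(rank + 1), ranks starting at 1."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))


def ndcg(relevances: Sequence[float]) -> float:
    """DCG of the returned ranking divided by the DCG of the ideal (descending) ranking."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def mean_ndcg(per_query_relevances: Sequence[Sequence[float]]) -> float:
    """Average NDCG across queries (the abstract reports an average over nineteen queries)."""
    return sum(ndcg(r) for r in per_query_relevances) / len(per_query_relevances)


# Hypothetical relevance grades of the retrieved images, in ranked order, for three queries.
example_queries = [
    [2, 2, 1, 0],  # already in ideal order, so this query scores a perfect 1.0
    [0, 2, 1, 0],  # best result is ranked second, so the score falls below 1.0
    [1, 0, 2, 0],
]
print(f"mean NDCG over {len(example_queries)} queries: {mean_ndcg(example_queries):.3f}")
```

Under this definition, "perfect retrieval" on a query corresponds to an NDCG of 1.0, meaning the results were returned in ideal relevance order.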

Links and resources

BibTeX key: xu2023elixr
entry type: misc
year: 2023
url: http://arxiv.org/abs/2308.01317
note: cite arxiv:2308.01317

Tags (@msn2708)

elixr

Cite this publication

%0 Generic
%1 xu2023elixr
%A Xu, Shawn
%A Yang, Lin
%A Kelly, Christopher
%A Sieniek, Marcin
%A Kohlberger, Timo
%A Ma, Martin
%A Weng, Wei-Hung
%A Kiraly, Attila
%A Kazemzadeh, Sahar
%A Melamed, Zakkai
%A Park, Jungyeon
%A Strachan, Patricia
%A Liu, Yun
%A Lau, Chuck
%A Singh, Preeti
%A Chen, Christina
%A Etemadi, Mozziyar
%A Kalidindi, Sreenivasa Raju
%A Matias, Yossi
%A Chou, Katherine
%A Corrado, Greg S.
%A Shetty, Shravya
%A Tse, Daniel
%A Prabhakara, Shruthi
%A Golden, Daniel
%A Pilgrim, Rory
%A Eswaran, Krish
%A Sellergren, Andrew
%D 2023
%K Elixr
%T ELIXR: Towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders
%U http://arxiv.org/abs/2308.01317
%X Our approach, which we call Embeddings for Language/Image-aligned X-Rays, or ELIXR, leverages a language-aligned image encoder combined or grafted onto a fixed LLM, PaLM 2, to perform a broad range of tasks. We train this lightweight adapter architecture using images paired with corresponding free-text radiology reports from the MIMIC-CXR dataset. ELIXR achieved state-of-the-art performance on zero-shot chest X-ray (CXR) classification (mean AUC of 0.850 across 13 findings), data-efficient CXR classification (mean AUCs of 0.893 and 0.898 across five findings (atelectasis, cardiomegaly, consolidation, pleural effusion, and pulmonary edema) for 1% (~2,200 images) and 10% (~22,000 images) training data), and semantic search (0.76 normalized discounted cumulative gain (NDCG) across nineteen queries, including perfect retrieval on twelve of them). Compared to existing data-efficient methods including supervised contrastive learning (SupCon), ELIXR required two orders of magnitude less data to reach similar performance. ELIXR also showed promise on CXR vision-language tasks, demonstrating overall accuracies of 58.7% and 62.5% on visual question answering and report quality assurance tasks, respectively. These results suggest that ELIXR is a robust and versatile approach to CXR AI.

@misc{xu2023elixr,
  abstract = {Our approach, which we call Embeddings for Language/Image-aligned X-Rays, or ELIXR, leverages a language-aligned image encoder combined or grafted onto a fixed LLM, PaLM 2, to perform a broad range of tasks. We train this lightweight adapter architecture using images paired with corresponding free-text radiology reports from the MIMIC-CXR dataset. ELIXR achieved state-of-the-art performance on zero-shot chest X-ray (CXR) classification (mean AUC of 0.850 across 13 findings), data-efficient CXR classification (mean AUCs of 0.893 and 0.898 across five findings (atelectasis, cardiomegaly, consolidation, pleural effusion, and pulmonary edema) for 1% (~2,200 images) and 10% (~22,000 images) training data), and semantic search (0.76 normalized discounted cumulative gain (NDCG) across nineteen queries, including perfect retrieval on twelve of them). Compared to existing data-efficient methods including supervised contrastive learning (SupCon), ELIXR required two orders of magnitude less data to reach similar performance. ELIXR also showed promise on CXR vision-language tasks, demonstrating overall accuracies of 58.7% and 62.5% on visual question answering and report quality assurance tasks, respectively. These results suggest that ELIXR is a robust and versatile approach to CXR AI.},
  added-at = {2023-08-06T13:05:07.000+0200},
  author = {Xu, Shawn and Yang, Lin and Kelly, Christopher and Sieniek, Marcin and Kohlberger, Timo and Ma, Martin and Weng, Wei-Hung and Kiraly, Attila and Kazemzadeh, Sahar and Melamed, Zakkai and Park, Jungyeon and Strachan, Patricia and Liu, Yun and Lau, Chuck and Singh, Preeti and Chen, Christina and Etemadi, Mozziyar and Kalidindi, Sreenivasa Raju and Matias, Yossi and Chou, Katherine and Corrado, Greg S. and Shetty, Shravya and Tse, Daniel and Prabhakara, Shruthi and Golden, Daniel and Pilgrim, Rory and Eswaran, Krish and Sellergren, Andrew},
  biburl = {https://www.bibsonomy.org/bibtex/2f1a8e435d7770370456d4900c9650766/msn2708},
  description = {ELIXR: Towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders},
  interhash = {24bc17b6ea4108d33316acec30926d26},
  intrahash = {f1a8e435d7770370456d4900c9650766},
  keywords = {Elixr},
  note = {cite arxiv:2308.01317},
  timestamp = {2023-08-06T13:05:07.000+0200},
  title = {ELIXR: Towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders},
  url = {http://arxiv.org/abs/2308.01317},
  year = 2023
}
