Synopsis of Social Media Discussions

Many participants noted that only 5% of the studies reviewed used real patient data, highlighting the disconnect between research and clinical application. For instance, one post remarked on the need for evaluations that consider fairness and bias, while another mentioned the inadequacies in current methodologies. The tone of urgency and phrases like 'need broader evaluations' suggest a strong community drive towards improving AI applications in healthcare.

Agreement
Moderate agreement

Most posts express a general agreement with the article's findings regarding the need for improved evaluation in healthcare LLMs.

Interest
High level of interest

The discussion shows strong interest in the implications of the research, highlighting its relevance to ongoing debates in healthcare technology.

Engagement
High engagement

Many participants engage deeply, referencing specific data points from the study and suggesting areas for improvement.

Impact
High level of impact

Contributors view the study as having a significant impact on the future of LLM applications in healthcare, emphasizing potential changes in evaluation standards.

Social Mentions

YouTube: 2 Videos
Facebook: 2 Posts
Twitter: 117 Posts
Blogs: 5 Articles
News: 9 Articles

Metrics

Video Views: 219
Total Likes: 259
Extended Reach: 1,994,700
Social Features: 135

Top Social Media Posts

Posts referencing the article

Evaluating Large Language Models in Healthcare: Insights and Tools

This panel discussion focuses on evaluating large language models (LLMs) with frameworks and tools in healthcare. Key topics include a systematic review highlighting evaluation shortcomings and recommendations from expert panelists, aiming to augment the assessment of LLM applications in medical settings.

January 31, 2025

119 views


Evaluating Large Language Models in Health Care Applications

This video discusses the influence of large language models (LLMs) in health care productivity and their potential applications. We analyze a systematic review highlighting key components such as data type and evaluation metrics, revealing challenges in addressing fairness and bias in current methodologies.


  • Teresa Hartman
    @thartman2u (Twitter)

    RT @AMAEdHub: New from JN Learning: Testing and Evaluation of Health Care Applications of Large Language Models https://t.co/N8L2Dw4bVA

    February 23, 2025

    1

  • AMA Ed Hub™
    @AMAEdHub (Twitter)

    New from JN Learning: Testing and Evaluation of Health Care Applications of Large Language Models https://t.co/N8L2Dw4bVA

    February 23, 2025

    2

    1

  • Teresa Hartman
    @thartman2u (Twitter)

    RT @AMAEdHub: New today: Testing and Evaluation of Health Care Applications of Large Language Models https://t.co/T3DSTvzU4x #CME

    February 23, 2025

    2

  • AMA Ed Hub™
    @AMAEdHub (Twitter)

    New today: Testing and Evaluation of Health Care Applications of Large Language Models https://t.co/T3DSTvzU4x #CME

    February 23, 2025

    2

    2

  • Marco.Care
    @Marco_Care_AI (Twitter)

    RT @fedelosco:

    February 9, 2025

    4

  • MJGonzapelt
    @jgonzalezapelt (Twitter)

    RT @fedelosco:

    February 7, 2025

    4

  • Cinthia
    @cinthiavgauna (Twitter)

    RT @fedelosco:

    February 7, 2025

    4

  • Martín Angel
    @Martin_AngelMD (Twitter)

    RT @fedelosco:

    February 7, 2025

    4

  • FLoscoMD
    @fedelosco (Twitter)



    February 7, 2025

    13

    4

  • STITCHES Medicine - the Best of Medical Research
    @STITCHESMed (Twitter)

    Only 5% of studies evaluated large language models in healthcare using real patient care data, mostly focusing on medical knowledge assessments. by Bedi S, Liu Y (...) Shah NH et 16 al. in JAMA https://t.co/qGnsCScNuI #MedX #MedResearch

    February 6, 2025

  • T.kimura
    @jjcrazydiamond (Twitter)

    RT @JAMA_current: This study identifies inconsistent evaluation practices of large language models (LLMs) in health care, finding a lack of…

    February 2, 2025

    5

  • Manoj Mayogi Mishra
    @mayogisense (Twitter)

    RT @JAMA_current: This study identifies inconsistent evaluation practices of large language models (LLMs) in health care, finding a lack of…

    February 2, 2025

    5

  • Fiatopichan
    @Fiatopichan (Twitter)

    RT @JAMA_current: This study identifies inconsistent evaluation practices of large language models (LLMs) in health care, finding a lack of…

    February 2, 2025

    5

  • JAMA
    @JAMA_current (Twitter)

    This study identifies inconsistent evaluation practices of large language models (LLMs) in health care, finding a lack of standardized frameworks and limited use of real patient data. https://t.co/3CRQ4Cb5jd

    February 2, 2025

    12

    5

  • A.R. García
    @air_garcia (Twitter)

    This systematic review characterizes the current performance of LLM in evaluating clinical health care settings, including uniformity, thoroughness, and robustness and proposes a framework for their testing and evaluation across health care applications. https://t.co/OMQn79N0Fi]

    January 29, 2025

  • Salvador Pedraza
    @salvasapedraza (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models https://t.co/Aj9wsQ4tyN https://t.co/4ptRvoxsPp

    January 28, 2025

  • ForensicPsyMD
    @ForensicPsyMD (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models: A Systematic Review | Digital Health | JAMA | JAMA Network https://t.co/cRqAU7aReh

    January 28, 2025

  • Un1v3rs0 Z3r0
    @Un1v3rs0Z3r0 (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models https://t.co/5vqXJSF6dD

    January 28, 2025

  • Dr. Xs (Fuu)Artificial Life Intelligence The I
    @_x_ai_i (Twitter)

    RT @AdamRodmanMD: But we know how to test efficacy in medicine. Clinical trials are messy, more expensive than in silico studies, and requi…

    December 17, 2024

    4

  • Westyn Branch-Elliman, M.D., MMSc., FSHEA
    @wbranchelliman (Twitter)

    RT @AdamRodmanMD: But we know how to test efficacy in medicine. Clinical trials are messy, more expensive than in silico studies, and requi…

    December 17, 2024

    4

  • Dan Morgan
    @dr_dmorgan (Twitter)

    RT @AdamRodmanMD: But we know how to test efficacy in medicine. Clinical trials are messy, more expensive than in silico studies, and requi…

    December 17, 2024

    4

  • Josh Mandel, MD
    @JoshCMandel (Twitter)

    RT @AdamRodmanMD: But we know how to test efficacy in medicine. Clinical trials are messy, more expensive than in silico studies, and requi…

    December 17, 2024

    4

  • Adam Rodman
    @AdamRodmanMD (Twitter)

    But we know how to test efficacy in medicine. Clinical trials are messy, more expensive than in silico studies, and require multidisciplinary expertise. But they're still the right thing to do. Only 5% of LLM in medicine studies even use real data (https://t.co/gc2sl3VFYn)!!

    December 17, 2024

    23

    4

  • Dr M. Mahesh (ಮಹೇಶ್) (he/him/his)
    @mmahesh1 (Twitter)

    Interesting: "Existing evaluations of LLMs mostly focus on accuracy of ques answering for medical exams, without consideration of real patient care data. Dimensions such as fairness, bias, toxicity & deployment considerations received limited attention" https://t.co/b6TebwSlKE

    November 21, 2024

  • Xosé M Fernández
    @xosegb (Twitter)

    Testing and Evaluation of Health Care Applications of #LLM : A Systematic Review ⁦@JAMANetwork⁩ https://t.co/lcEb5XYaWj

    November 21, 2024

  • Stanford Department of Medicine
    @StanfordDeptMed (Twitter)

    From diagnostics to patient communication, large language models are transforming healthcare. This @JAMA_current review by #StanDOM's @drnigam, @niravrshah, Arnold Milstein & Michael Pfeffer, sheds light on their diverse applications & effectiveness. https://t.co/HD2KEwkvvp

    November 1, 2024

    2

  • Tony Shanks
    @alshanks (Twitter)

    The pace of AI in medical education is rapidly advancing. I appreciate summaries like this that show the gaps and where we can focus. https://t.co/H8dwGRp4om

    October 29, 2024

    1

  • Woojin Kim
    @woojinrad (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models

    October 27, 2024

    2

  • Dr. Xs (Fuu)Artificial Life Intelligence The I
    @_x_ai_i (Twitter)

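    RT @CeoImed: "Evaluation of Health Care Applications of Large Language Models" JAMA ・ Systematic review of 519 studies published from 2022 through February 2024 ・ Only 5% used real patient data for evaluation ・ Evaluation focused mainly on accuracy, and fairness…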

    October 23, 2024

    6

  • Yaron Einhorn
    @yaronoox (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models: A Systematic Review | Artificial Intelligence | JAMA | JAMA Network https://t.co/pxth5SMzu2

    October 23, 2024

  • ...
    @ppoHeisenberg (Twitter)

    RT @juanelosag: quantify biases, cover a broader range of tasks and specialties, and report standardiz…

    October 23, 2024

    1

  • Juan E Losa. Infectólogo. HUFA. URJC. Sandoval Sur
    @juanelosag (Twitter)

    quantify biases, cover a broader range of tasks and specialties, and report standardized performance metrics to enable large-scale implementation.” https://t.co/1bRFPXxnGN

    October 23, 2024

    1

    1

  • Supriyo SB Chatterjee
    @sbc111 (Twitter)

    Testing and Evaluation of #HealthCare Applications of Large Language Models @JAMAplusAI @JAMANetworkOpen #AI #HealthAI #LLM #TechHartford https://t.co/PHImDKf3Fx

    October 21, 2024

  • Srinivas Karri
    @xsrinikar (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models

    October 21, 2024

  • Srinivas Karri
    @xsrinikar (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models

    October 21, 2024

  • 葉隠れ
    @osanpochuudayo (Twitter)

    (2/2) Critical administrative tasks (prescribing, billing) were neglected (<1%), and bias assessment protocols were inadequate (15.8%). Bedi et al. Stanford, UCSF, et al. report in JAMA Oct 15, 2024 on doi:10.1001/jama.2024.21700 https://t.co/F4uAmqtbJo

    October 21, 2024

  • 葉隠れ
    @osanpochuudayo (Twitter)

    (1/2)A systematic review (n=519) revealed substantial methodological flaws in LLM healthcare evaluations. Authentic clinical data were scarce (5%), with question-answering dominating (84.2%). https://t.co/JJoXDwo0O7

    October 21, 2024

  • Kazu@精神科医 MD&PhD
    @peacewaffle (Twitter)

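    RT @CeoImed: "Evaluation of Health Care Applications of Large Language Models" JAMA ・ Systematic review of 519 studies published from 2022 through February 2024 ・ Only 5% used real patient data for evaluation ・ Evaluation focused mainly on accuracy, and fairness…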

    October 20, 2024

    6

  • うさきち@冬コミは型月
    @usakichiusa (Twitter)

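    RT @CeoImed: "Evaluation of Health Care Applications of Large Language Models" JAMA ・ Systematic review of 519 studies published from 2022 through February 2024 ・ Only 5% used real patient data for evaluation ・ Evaluation focused mainly on accuracy, and fairness…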

    October 20, 2024

    6

  • ただ/だた (pinmarch)
    @pinmarch_t (Twitter)

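    RT @CeoImed: "Evaluation of Health Care Applications of Large Language Models" JAMA ・ Systematic review of 519 studies published from 2022 through February 2024 ・ Only 5% used real patient data for evaluation ・ Evaluation focused mainly on accuracy, and fairness…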

    October 20, 2024

    6

  • at_ayeaye
    @at_ayeaye (Twitter)

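    RT @CeoImed: "Evaluation of Health Care Applications of Large Language Models" JAMA ・ Systematic review of 519 studies published from 2022 through February 2024 ・ Only 5% used real patient data for evaluation ・ Evaluation focused mainly on accuracy, and fairness…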

    October 20, 2024

    6

  • 河野健一 Kenichi Kono | 脳外科医 CEO|AI 医療 MBA|脳血管内手術支援AI
    @CeoImed (Twitter)

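    "Evaluation of Health Care Applications of Large Language Models" JAMA ・ Systematic review of 519 studies published from 2022 through February 2024 ・ Only 5% used real patient data for evaluation ・ Evaluation focused mainly on accuracy; fairness, bias, and toxicity have been little studied https://t.co/5wuW7A7dEP https://t.co/fRd9TgVDMR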

    October 20, 2024

    15

    6

  • Grupo Investigación Multidisciplinar Extremeño
    @GRIMEX_ (Twitter)

    https://t.co/jwSwM6swCg

    October 19, 2024

  • EXTREMADURA SALUDABLE
    @EXTREMADURASAL1 (Twitter)

    https://t.co/S7GP9HZ269

    October 19, 2024

  • Medical Research Library of Brooklyn
    @DMCLibraryBKLYN (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 18, 2024

    56

  • Dr. Gennadi Glinsky, MD, Ph.D.
    @gglinskii (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models. A Systematic Review. https://t.co/c0wYR1dN5G

    October 17, 2024

  • Josh Davis
    @joshp_davis (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Jesse Burk-Rafel
    @jbrafel (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Venkat C
    @chalamalasetti (Twitter)

    https://t.co/xnv4q7QTO2

    October 17, 2024

  • Oscar Camara
    @oscarcamararey (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Milton Tan
    @mtanichthys (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Arun Umesh Mahtani
    @ArunUMahtani (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Dmitrii (Dima) Smirnov
    @SmirnovDDD (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • EileenD6☮️
    @eileen_d6 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Adam Dunn
    @adamgdunn (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Leslie Vargas-Ramírez
    @lilo1278 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Leslie Vargas-Ramírez
    @lilo1278 (Twitter)

    RT @daforerog: Testing and Evaluation of Health Care Applications of Large Language Models: A Systematic Review https://t.co/438Kp6WosP

    October 17, 2024

    1

  • Diego Forero MD, PhD
    @daforerog (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models: A Systematic Review https://t.co/438Kp6WosP

    October 17, 2024

    2

    1

  • Nicholas Tatonetti
    @proftatonetti (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Abstream
    @abstreamme (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models #science #publication #research #publications https://t.co/AsAvv0eBpa

    October 17, 2024

  • Jason H. Moore, PhD
    @moorejh (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Nick_Zen
    @Nick_Zen (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • HAS-veille
    @HAS_veille (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Research Data MGMT & LIVING
    @fdmincoop (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Carlos KH Wong
    @CarlosWongHKU (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • SF
    @SofiaVi185 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Carlos Cmx
    @rulpogt (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Ryan Cello
    @ryan_c_cello (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Manish Sharma
    @msharmas (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Daisy Davis
    @daisy_davis2010 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Nicole Miller
    @veeh_2011 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • THEE Gregory Stewart
    @gstewtwo (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 17, 2024

    56

  • Suhana Bedi
    @BediSuhana42170 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Sergei Polevikov
    @AIHealthUncut (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Flappest
    @Flappest (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • ong beng hooi
    @ongbenghooi1 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Dr. Shashank Joshi
    @AskDrShashank (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Samantha_4JD
    @Samantha_July01 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Dr. Robert Glatter
    @DrRobertGlatter (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Lindvall Lab
    @lindvalllab (Twitter)

    Systematic review finds only 5% of #LLM studies in healthcare use real patient data. We need broader evaluations that address bias, fairness, and more diverse tasks. #AI #HealthCare #MedTwitter https://t.co/mFRB0BKnOD

    October 16, 2024

    2

  • JC Stanford
    @JCDarnestown (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Martin Michaelis
    @MartMichaelis (Twitter)

    #Evidence that more scrutiny, care, and caution is needed regarding the use of #AI in #healthcare. I assume, this is also true for most other areas... https://t.co/jNngY6nAkM

    October 16, 2024

  • Luis Eduardo Pino V
    @docpinoAI (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Josep M Garcia-Alamino, PhD (Oxonian)
    @JosepMGarcia75 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Edouard Lhomme
    @Edouard_Lhomme (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Ryan Flinn
    @RS_Flinn (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • strmdev
    @strmdev1 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Supriyo SB Chatterjee
    @sbc111 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Temerty Centre for AI in Medicine (T-CAIREM)
    @UofT_TCAIREM (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • George E. Dafoulas MD, MBA in HSM
    @GeorgeEDafoulas (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Biomedical Informatics Research
    @StanfordBMIR (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models https://t.co/KfwEUCsqsK

    October 16, 2024

  • Brad Wouters
    @bradwouters (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • priya joseph
    @ayirpelle (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Eldon Edwards
    @eldonredwards (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Manuel Ramos-Casals
    @ramos_casals (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Fco. Rojas
    @RadBark (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Dwan Turner ⛴️
    @DwanTurner (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Soo-Yong Shin
    @likesky3 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Shivam Vedak, MD MBA
    @ShivamVedakMD (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Takefumi Kimura
    @VulletForMy (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Colorful MD Phd
    @FanYang38636272 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • syawal™ シ
    @syawal (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • 어옌
    @sapiens202 (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Ram Sesha
    @OncoAI (Twitter)

    RT @EricTopol: Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam ht…

    October 16, 2024

    56

  • Eric Topol
    @EricTopol (Twitter)

    Of over 500 LLM #AI reports in healthcare, only 5% used real patient data. https://t.co/SAhfJxcdhz @JAMA_current @drnigam https://t.co/EXl5HYzd0w

    October 16, 2024

    154

    56

  • thetranscendedman
    @atranscendedman (Twitter)

    A review of 519 studies found only 5% used real patient data to evaluate Large Language Models (LLMs) in healthcare. Most focused on medical exams, few on admin tasks like writing prescriptions. Real data and broader evaluations needed. https://t.co/Ui46zI9dLD

    October 16, 2024

  • Xema Pérez
    @Xemadeyaka14 (Twitter)

    RT @emiliomonteb: Testing and Evaluation of Health Care Applications of Large Language ModelsA Systematic Review https://t.co/b7NBd05m1c…

    October 16, 2024

    1

  • Emilio Monte
    @emiliomonteb (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language ModelsA Systematic Review https://t.co/b7NBd05m1c #AI #IA #LLM https://t.co/IevhiBcVvV

    October 16, 2024

    2

    1

  • Amadeo Wals
    @AmadeoWals (Twitter)

    https://t.co/qSFwuEJS5G #RADONC #LLMs #AI

    October 16, 2024

  • Dr. Suchitra Kataria
    @Suchitrk (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models https://t.co/Bi0xVukb7m?

    October 16, 2024

  • Ryan Nipp, MD, MPH, MBA, FASCO
    @RyanNipp (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models: A Systematic Review. https://t.co/E4reSupeee @JAMA_current @JAMANetwork #ArtificialIntelligence #MedEd #MEDTECH #DigitalHealth @StanfordMed https://t.co/6XN7M8Qzhf

    October 15, 2024

    5

  • Yonemoto N
    @nyonenyone (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models https://t.co/APxR4xVXct

    October 15, 2024

  • Supriyo SB Chatterjee
    @sbc111 (Twitter)

    RT @JAMAplusAI: This study identifies inconsistent evaluation practices of large language models (LLMs) in health care, finding a lack of s…

    October 15, 2024

    1

  • JAMA+ AI
    @JAMAplusAI (Twitter)

    This study identifies inconsistent evaluation practices of large language models (LLMs) in health care, finding a lack of standardized frameworks and limited use of real patient data. https://t.co/XlRwt5Uweq https://t.co/xZAsuEnxIW

    October 15, 2024

    5

    1

  • Epic Plain
    @EpicPlain (Twitter)

    Testing and Evaluation of Health Care Applications of Large Language Models: A Systematic Review https://t.co/ULthWLsDZW #LLMs

    October 15, 2024

  • Temerty Centre for AI in Medicine (T-CAIREM)
    @UofT_TCAIREM (Twitter)

    RT @JAMA_current: This study identifies inconsistent evaluation practices of large language models (LLMs) in health care, finding a lack of…

    October 15, 2024

    2

  • JAMA
    @JAMA_current (Twitter)

    This study identifies inconsistent evaluation practices of large language models (LLMs) in health care, finding a lack of standardized frameworks and limited use of real patient data. https://t.co/O7NbeErlL3 https://t.co/zqkUcvCfkg

    October 15, 2024

    9

    2

Abstract Synopsis

  • Large language models (LLMs) show potential in health care but existing evaluation methods may not effectively highlight their best applications.
  • A systematic review analyzed studies published from January 2022 to February 2024, categorizing each along five components: data type, health care task, NLP/NLU tasks, evaluation dimensions, and medical specialty.
  • Of the 519 studies reviewed, only 5% used real patient data. Most focused on assessing medical knowledge and answering questions, with accuracy as the main evaluation metric, while dimensions such as fairness and bias were rarely addressed.