
Unless you've been living under a rock for the past year or so, you've seen the news rife with mentions of Artificial Intelligence (AI) and how iterative improvements in machine learning will disrupt whole industries. In fact, publicly available tools like OpenAI's ChatGPT can generate cogent, comprehensible, and articulate answers to questions posed in vernacular ranging from plain English to technical jargon. For those of us less familiar with programs like ChatGPT, Bard, and DALL-E, they are AI software programs designed to interact with users or to produce content the way a human would.
Based on large amounts of data, history, news, and even scientific publications fed to it by its programmers, AI can synthesize information and predict what a “correct” response would be. An AI can be broad and sweeping like ChatGPT, conversing like a (somewhat stilted) human, or it can be very narrow in scope, spitting out information and amalgamations of technical data, à la Lost in Space.
Much like people, AI can only filter and synthesize information based on what it has been taught. Naturally, data-driven professions have felt the creeping edge of worry that their jobs might be supplanted by a tireless machine that needs no pay and isn't subject to the frailties of human health. Given that the medical profession prides itself on data-driven care and best practices distilled from large amounts of empirical and experimental data, the natural next question is, “Can a machine replace me?” To wit, I have a radiologist acquaintance who is currently taking as much volume as he can right now because “when they replace the first one of us, it’s over.”
Testing the Waters
When I was introduced to ChatGPT in person, my first inclination was to ask it Google-like queries like “What's the square root of 693,427? Who was the fourth Roman Emperor? Why aren’t Kraft American Singles considered real cheese?” Realizing that the point was not so much to ask knowledge-based facts, I started asking philosophical questions like, “What would you recommend to a man who just lost his job? Which Baldwin is the better actor? How many licks does it take to get to the Tootsie Roll center of a Tootsie Pop?”
ChatGPT sometimes provides answers that are unintentionally humorous and idiosyncratic—kind of like that one person in every friend group (P.S. If you don’t know who that is, it’s you). After getting a few laughs (apparently engineering students from Purdue studied and published the answer to the Tootsie Pop question, which is 364), I asked, “What is the diagnosis for a high school football player who sustained a hit to the outside of the knee? He felt a pop and had a large amount of swelling above the kneecap, and now he feels a buckling and shifting sensation when he walks.”
I tried to ask the question the way a patient might ask the internet, and ChatGPT's answer is copied verbatim below.
“I am not a medical professional, but the symptoms you described could potentially be indicative of a serious knee injury, such as an anterior cruciate ligament (ACL) tear. However, a definitive diagnosis can only be made by a qualified healthcare professional after a thorough examination, including imaging tests like an MRI.
The pop sensation, significant swelling, and the feeling of buckling and shifting in the knee are commonly associated with ACL injuries. Other possible injuries that could cause similar symptoms include meniscus tears, collateral ligament injuries, or damage to other structures within the knee joint.”
It's pretty spot on; I described a reasonable mechanism and presentation for an ACL tear. I tried to make ChatGPT give me a percentage of certainty, but the algorithm is designed specifically not to do that and reiterated the need to be seen and examined. While I appreciate the deferral to qualified medical professionals, the question that needs to be asked is whether this AI could passably do my job as a non-surgical sports doctor.
More information here:
I Asked the AI Bot That’s Taking the Internet by Storm for Financial Advice — Here’s How It Went
Is Anybody Else Getting Nervous About an AI Bubble in the Stock Market?
Current Usage
Currently, AI is being developed at all levels—from providers to individual patients, insurance companies, telehealth practices, and hospital groups. There are articles on AI in the NEJM, the Future Healthcare Journal from the Royal College of Physicians, the Journal of Family Medicine and Primary Care, etc. While the current opinion is that AI is good for refining information and generating a list of possibilities, diagnosis and treatment are murkier waters.
For example, Docus.AI is a platform that combines AI with physician oversight. For individuals, it will let you upload labs, imaging, symptoms, health history, etc., and it will generate possibilities based on current symptoms or tell you what screening may be indicated. For docs, it basically functions as a targeted differential generator and suggests treatments, more questions to ask, and further workup.
Our current EMR, ironically monikered “Modernizing Medicine,” is developing an AI scribe tool to help with things like the History of Present Illness, and it might find its way into transposing the clinical exam. Currently, it’s being fed data (with patient consent) to see how providers word their histories and exams in order to suggest possible next steps in what you’re going to chart. This is much like Google’s AI tools in its G Suite, which take word and phrase inputs and predict your next several words.
There are also AI “solutions” for employers and insurance providers. You can bet your bottom dollar that insurers are going to feed data to custom AIs and use them to synthesize questions for doctors about the care they provide to patients.
“Dr. Smith, we see that on patient X you ordered an MRI of the knee after dictating these five clinical factors. In 72% of your previous patients, you began with physical therapy prior to ordering this imaging. Before we authorize this imaging, please send your clinical decision-making to our Combined, Reconciled Analytics of Providers (CRAP) tool.”
Limitations of AI
As it stands, AI can’t introspect. It will simply generate the outcome or answer it feels is most correct given the information provided. I asked ChatGPT several clinical scenarios with fairly sparse clinical history (and no exam), and it generated the most common causes of pain or problems in those areas. It tellingly did not ask for more information.
It can also be intentionally misled. At their heart, all AIs are software programs cycling through information given to them by the user and reconciling it against the information they’ve internalized. Even with a set of fairly objective and verifiable criteria, like the mechanism, history, and exam findings for a pulmonary embolism, someone could deliberately input the “right” symptoms (or simply misunderstand their own symptoms) and lead the AI to an incorrect conclusion.
I asked ChatGPT about its own limitations, and it appeared to corroborate my suspicions.
AI is just like any other technology out there in that it creates job opportunities in some regions or sectors and eliminates them in others. Will some specialties lose out? Assuredly. I can see specialties like pathology becoming more niche, a future where there are more intraoperative biopsies that are time sensitive and fewer cases in which time is less of the essence. I can see something similar happening to radiology where diagnostic radiologists lose out but interventional sees more volume.
I doubt anyone with a hands-on career will lose their job in the near future—good news for patient care techs, phlebotomists, surgeons, and physical therapists. Medical billing and coding might see a decrement in job numbers given that an intelligent AI can be taught to search out and denote phrasing and information required by insurers to bill a certain code. AI may also suggest changing verbiage to better suit a certain level of billing or CPT code as well as warn providers that an order, like advanced imaging, may be denied based on what is documented in the note.
I’m in a fairly safe specialty. Patients still struggle to put feelings and sensations into words, and given that a physical exam is an integral part of my interaction with my patients, I don’t see that going away. Will I hate it when some AI-supercharged insurer denies my request for an MRI based on some excerpt it’s read in the Journal of Applied Forces in Human Systems out of Bangladesh? Absolutely. Will it be any more malevolent than the current cadre of claim-denying bean counters employed in droves by payors? Hopefully less, much less.
Crystal Ball Prediction
Knowing how much Dr. Jim Dahle likes crystal ball statements, I’ll submit these predictions for reflection in a few years’ time. I anticipate that EMRs will incorporate AI as a basic integration into their software in a few years. This is not out of any desire to make physicians more efficient or patients’ care more accurate but because EMRs are glorified billing software, and the companies behind them are remunerated as a function of revenue generated or codes billed, either directly like Athenahealth or indirectly via the salespeople touting “better” billing.
Second, I anticipate that payors will ask AI about the most expensive and “low yield” interventions and imaging studies that physicians order, and they will apply pressure at even the practice level to lower costs. How else will Andrew Witty make another $23.5 million in 2024?
Last, I do anticipate that physicians’ diagnoses will become at least somewhat more accurate. The unfortunate siloing of physicians into ever more niche practices means that our depth outside our subspecialty shallows significantly without regular, intentional education. To wit, while my subspecialty is sports medicine by way of family medicine, I have begun to see shoulder pain as musculoskeletal, radicular, or vascular nearly exclusively. I can’t say I’ve actively entertained referred gallbladder pain in the differential for some time unless someone mentioned stomach pain as well.
AI could be that bit of perspective needed to keep our differentials broad and our egos in check. I’m hoping that at least there will be fewer errors for malpractice attorneys to bite on!
Are you worried about AI and what will become of your job? Or do you think the potential greatness of the technology overrides those concerns? How else can AI be helpful (or harmful) to your career?
Thank you for this interesting and timely article. I’ve wondered how AI will impact various medical specialties, including psychiatry. For simple starters, some workplaces have streamlined chart notes with autofill for lab results and diagnoses, which has become mandatory in some places. While helpful for saving time and energy on something like entering lab results, it can also be messy at times, with a host of pre-filled inaccurate or vague diagnoses. I wonder if or when it will impact clinical impression notes. Telepsychiatry has been argued to be efficient for some, but it isn’t the same as an actual face-to-face human encounter, in my opinion, especially in corrections. I also wonder whether it will turn people into machines in some form. I’d like to think that we are too unique and human for this to happen, but I wonder about the long term. On the flip side, the projected impact on criminology has been well known for some time.
Can you share any information or data about ethics committees in place at medical institutions in the US as safeguards, specifically in regard to AI?
thanks again
Thanks for the response. While I’d like to agree that humans are too nuanced for a “computer” to do most of the legwork, it’s also true that the same patterns and heuristics we’re taught in training can be fed into a computer. The human component is a little tougher, particularly when data is being withheld or deliberately obfuscated. I think that for specialties which require more “human touch,” AI will be tougher to fit into the practice model.
Regarding ethical considerations, the AMA has a four-page, handout-style PDF that it published more as a press release than an actual set of guideposts.
https://www.ama-assn.org/system/files/2019-08/ai-2018-board-policy-summary.pdf
In true government fashion, CMS and HHS have published 100+ page documents getting into some more specifics, complete with size 10 font and MS Paint-style graphics. In particular, the CMS document reads more like an internal piece discussing how the agency plans to implement AI. Here are those:
https://www.hhs.gov/sites/default/files/hhs-trustworthy-ai-playbook.pdf
https://ai.cms.gov/assets/CMS_AI_Playbook.pdf
I thought I posted a reply, but it didn’t show up. So if this answer is here twice, I apologize.
Thank you for the kind words. The human element (intentional obfuscation, deception, withholding information, hesitancy, mistrust, etc.) is going to be the hardest thing to pin down. Pathways, surgical approaches, heuristics, and diagnostic information (DSM-5, Harrison’s Principles of Internal Medicine, Gray’s Anatomy, and Campbell’s Operative Orthopaedics) are just a matter of inputs.
There are three things I’ve come across regarding the ethics of AI in medicine. Tellingly, the AMA has a four-page PDF, almost like a press release, without much detail. The CMS and HHS documents are both 100+ pagers that, size 10 font included, would practically bludgeon the reader into submission. Most of it has to do with implementation, particularly compliance with laws and minimization of bias. I’ll link those below. If you’d like an ironic chuckle, pay for the expensive version of ChatGPT and then input both the CMS and HHS documents, asking for an executive summary of 20 pages or less.
Cheers!
https://www.ama-assn.org/system/files/2019-08/ai-2018-board-policy-summary.pdf
https://ai.cms.gov/assets/CMS_AI_Playbook.pdf
https://www.hhs.gov/sites/default/files/hhs-trustworthy-ai-playbook.pdf
Thanks for these links! I skimmed through the long ones as well. You’re right. Very technical. Safety checks also seemed unclear to me. It’s interesting how the AMA calls AI ‘augmented’ intelligence as opposed to Fidelity’s ‘disruptors.’
I have to note that you are taking credit for a post when AI did maybe half the text for you 😉
For sure! It reminded me of having fellows, ha!
As a mid-late career radiologist, I am just hoping to see the day when PowerScribe and Dragon can correctly predict if I mean “524” or “520 for”.
Nothing like having optimistic dreams and hopes 😂😉 My dictation software routinely transcribes my op notes to include the placement of 228 Fr chest tubes instead of two 28 Fr chest tubes.
Exactly. AI is just a statistical algorithm that can perform some tasks very well, but it is also very dumb in that it lacks common sense: judgment, thoughtfulness, and reason, qualities which cannot be programmed. That is why AI is prone to strange errors that humans would never make. It is another great tool which needs human supervision, and it will absolutely change the way we all practice medicine, but it will not independently replace physicians. Radiologists need not worry.
My partners are raving about Open Evidence https://www.openevidence.com/ and think it might replace UpToDate eventually as a reference.
Caveat emptor.
The document pool that OpenEvidence claims it is retrieving information from is the 37 million “journal articles” in PubMed. In reality, PubMed contains 37 million citations (not articles) – and the majority of those citations are to articles that are behind publisher paywalls and inaccessible without a subscription. So, OpenEvidence is ingesting article abstracts, only. There is a significant body of evidence regarding the inadvisability of practicing medicine “off of the abstract”!
The document pool excludes valid information sources, such as clinical guidelines, FDA recalls, and CDC’s MMWR, etc. that are not discoverable in PubMed.
The OpenEvidence algorithm attempts to look at the “quality of the input source” when determining how much weight to give any individual published study. In practice, this means that OpenEvidence looks at the Journal Impact Factor of the publication in which the article was published. Articles published in “relatively high impact” journals are judged to be of superior quality. This is a deeply flawed approach! The Journal Impact Factor is in no way a valid proxy measure for the quality of any individual article. (See: Wakefield’s debunked vaccine/autism study, which was published in The Lancet.) In that vein, OpenEvidence is unable to identify and flag retracted articles.
Given these and other limitations, on its own OpenEvidence is unlikely to supplant UpToDate. But Elsevier has acquired OpenEvidence’s AI technology and will use it to power Elsevier’s forthcoming clinical decision support tool: ClinicalKeyAI.
I would advise any doc or doc-in-training to use OpenEvidence as a tool to find sources… but then go to the source(s) directly and remain skeptical of, or ignore, the AI-generated summary recommendations.
Thanks for the additional details. Kind of illustrates the point that there is a lot of info on the internet but that you still need 4 years of college, 4 years of med school, 3-7 years of residency, and 10 years in practice to interpret it well.
I saw an article recently on how well AI is doing with financial info gleaned from the internet. Something like 43% of recommendations were wrong. Hopefully, OpenEvidence and its peers will never be that bad and will continually get better.
57% win rate, thx for the tip WCI. Where do I sign up? If I yolo a martingale call strategy on meme stonks with this win rate, it’s like free money!!
If nurses with online degrees can replace physicians and practice independently, why couldn’t a computer? I have a sneaking suspicion that a good number of patients I get consulted on from the ER have not been seen by the provider. The RN gets a chief complaint, the provider orders the algorithmic workup (anybody with a belly gets a CT), and then I am read a CT report, sometimes with mispronounced words…I bet AI could do that.
Really? Ever worked a shift in the ED? Comments like this make me think maybe EM should be a mandatory med school rotation.
Not seeing the patient and then billing for their care is not only unprofessional, but unethical and fraudulent.
Sometimes I call consultants about CT findings that I don’t know what to do with. Take it as a compliment. I’m calling you because I think you know more than I do about that particular subject. I don’t know your specialty, but there is probably some sort of medical problem that emergency physicians typically know more about than you do. Trust me when I say if hospital admin thought an ED could be profitably and competently run without docs it would be.
In the same vein, I always appreciate a good urgent care follow-up with four pages of vaccination history, ill-fitting DME, and the diagnosis of “leg pain” from the practitioner. Usually followed by “the person at the urgent care said they couldn’t tell if it was fractured on the x-ray because there was too much swelling.”
Having said that, I’m positive I’ve sent my fair share of vague referrals to outside specialties who must think that I’m singularly obtuse.
Hospitals are rapidly turning into triage for just radiology, which is why hospitals are currently shutting down after abusing the most important specialty in medicine for decades. It’s sad but true. Check a box while the Hospital System prints money on their scam lobbying to abuse labor.
The future isn’t bright in medicine. DEI just made for more box checkers, while systems got paid more and trapped more physicians in their systems nonsense with admins sucking the rest of the taxpayer fake world reserve currency.
I think I would feel depressed if that was the way I viewed medicine, the world, and our currency. Do you ever feel that way?
This is one of the things I’ve learned about life and in recent years, why so few people are able to think critically, JD. My highest value is truth, honesty, and security in seeking those. Insecurity leads to corrupted thinking, groupthink, mindless following, etc.
What’s happening is happening regardless of how we feel about it. Adapt. You think the money printers, politicians and corporations care about you? They want you to believe the lie.
I’m happy to say a lot are waking up, but sadly, a lot have been locked out because they don’t have the ability to save and the middle class took another huge hit during the covscam.
“Assuredly. I can see specialties like pathology becoming more niche, a future where there are more intraoperative biopsies that are time sensitive and fewer cases in which time is less of the essence.”
What does this even mean? What intraoperative consultations are NOT time sensitive? That’s the whole point of a frozen section, touch prep, etc. – to be as fast as possible.
Believe me, when we call back to the OR and the surgeon has already left, we are not happy. That’s an inappropriate use of pathology services.
“Cases where time is not of the essence” refers to pathology cases which aren’t intra-op. For example, the growth that’s been there for 5 years that the local dermatologist shaves off and sends to you. Hope that helps.
DS
No need to be condescending. Still think you could have written it more clearly.
Perhaps, “There may be increased utilization of intraprocedural consultations where diagnoses are rendered within minutes, and less reliance on routine processing.”
Anyway, I don’t see pathologists being replaced by AI. Our roles will certainly change though. I’d love for AI to be able to help triage cases and do the boring, tedious tasks like counting mitoses. Bring it on. Why isn’t it already mainstream if it’s as good as they say??
I didn’t think it was condescending when I typed that, nor do I honestly now. But if offense was taken, rest assured it wasn’t meant.
To wit, while I’m sure I could always be more clear, the phrasing I used also seemed to me fairly clear. It seems my writing style doesn’t translate to everyone well.
Yes, the less nuance required (maybe that’s the art of medicine), the more I think AI will be of use. The biggest questions regarding AI, outside of clinical judgment or knowledge, are going to be medicolegal issues and issues of bias. Since medicine is human, and the textbooks are written by humans, we have an element of bias in medicine. That, and errors of omission (rather than commission, which are easier for a person to catch and remedy), are big roadblocks that will slow implementation in actual clinical decision-making.
Best,
DS
I was skeptical at first, but in the months I have been using it, OpenEvidence has changed my practice.
It is great for surface-level information that I ‘know but have forgotten.’ You have to learn how to ask questions, and it isn’t a tool trainees should use because you absolutely have to have a fund of knowledge to stay out of trouble.
But it is incredibly helpful for things like drug interactions/pharmacodynamics and kinetics and quickly accessing society guidelines…the sort of factoids that are in the back of your head but you can’t quite put your finger on.
I fed it almost the entire MKSAP question bank, and for the questions it could process, it was spot on. For example: Does this pneumothorax require a chest tube? Do we still give laxatives for battery ingestions? What is the general workup for adrenal insufficiency? What are the current recs if a central line is found to have a thrombosis on it? Can this or that drug be crushed and put down an NG tube? Yada yada.
Every doc should try it for a few months, if for nothing else other than to learn how to begin communicating with these sorts of programs. I agree with the crystal ball projection: EPIC and admin will eat this stuff up. Expect more pop-ups!
But it will find a place in an internist’s daily practice.
In case anyone is looking at this, a pre-publication article just came out comparing AI-augmented docs and docs using traditional resources in FM, ER, IM.
https://www.medrxiv.org/content/10.1101/2024.08.05.24311485v1
I think the key is to have AI augment the doc, not replace the doc. No surprise that helps.