Same Symptoms, Different Care: How Medical AI Encodes Inequality

The promise was straightforward enough. Large language models, trained on the sum total of medical literature, would help emergency physicians triage patients faster, assist radiologists in catching what the human eye missed, and give overwhelmed clinicians a second opinion when the waiting room was full and the clock was running. The reality, according to a growing body of peer-reviewed research, is considerably more uncomfortable. The most capable AI systems available today do not simply reflect the biases embedded in their training data. They amplify them, sometimes dramatically, and they do so in clinical contexts where the consequences land on real human bodies.
In September 2025, a team of researchers led by Mahmud Omar and Eyal Klang at the Icahn School of Medicine at Mount Sinai posted a preprint on medRxiv that tested OpenAI's GPT-5 across 500 physician-validated emergency department vignettes. Each case was replayed 32 times, with the only variable being the sociodemographic label attached to the patient: Black, white, low-income, high-income, LGBTQIA+, unhoused, and so on. The clinical details remained identical. The model's recommendations did not.
GPT-5 showed no improvement in sociodemographic-linked decision variation compared with its predecessor, GPT-4o. On several measures, it was worse. The model assigned higher urgency and recommended less advanced testing for historically marginalised groups. Most striking was the mental health screening disparity: several LGBTQIA+ labels were flagged for mental health evaluation in 100 per cent of cases, compared with roughly 41 to 73 per cent for comparable demographic groups under GPT-4o. The clinical presentation was the same. The only thing that changed was who the patient was described as being.
This is not a theoretical problem. It is a design problem, a procurement problem, and increasingly a legal problem. And it raises a question that hospitals, insurers, and diagnostic tool developers have been remarkably slow to answer: if the most advanced AI model on the market still encodes the biases of the data it was trained on, what exactly are institutions assuming when they plug these systems into patient care?
The Evidence Is Not Subtle
The Mount Sinai findings did not emerge from a vacuum. They are the latest in a pattern of research that has been building for years, each study confirming what the last one suggested and what the next one will almost certainly reinforce.
The same research team published a broader companion study in Nature Medicine in 2025, evaluating nine large language models across more than 1.7 million model-generated outputs from 1,000 emergency department cases (500 real, 500 synthetic). Each case was presented in 32 variations, covering 31 sociodemographic groups plus a control, while clinical details were held constant. Cases labelled as Black, unhoused, or LGBTQIA+ were more frequently directed toward urgent care, invasive interventions, or mental health evaluations. Certain LGBTQIA+ subgroups were recommended mental health assessments approximately six to seven times more often than was clinically indicated. The bias was not confined to one model or one developer. It was a property of the category.
In 2024, Travis Zack and colleagues published a model evaluation study in The Lancet Digital Health examining GPT-4's behaviour across clinical applications including medical education, diagnostic reasoning, clinical plan generation, and subjective patient assessment. The results were damning. GPT-4 failed to model the demographic diversity of medical conditions, instead producing clinical vignettes that stereotyped demographic presentations. When generating differential diagnoses, the model was more likely to include diagnoses that stereotyped certain races, ethnicities, and genders. It exaggerated known demographic prevalence differences in 89 per cent of diseases tested. Assessment and treatment plans showed significant associations between demographic attributes and recommendations for more expensive procedures, as well as measurable differences in how patients were perceived. For 23 per cent of cases, GPT-4 produced significantly different patient perception responses based solely on gender or race and ethnicity.
The broader research landscape tells a consistent story. A systematic review published in 2025 in the International Journal for Equity in Health, encompassing 24 studies evaluating demographic disparities in medical large language models, found that 22 of those studies, or 91.7 per cent, identified biases. Gender bias was the most prevalent, reported in 15 of 16 studies examining it (93.7 per cent). Racial or ethnic biases appeared in 10 of 11 studies (90.9 per cent). These are not edge cases. They are the norm.
And the problem extends well beyond language models. In dermatology, AI models trained primarily on lighter skin tones have consistently shown lower diagnostic performance for lesions on darker skin. A 2025 study in the Journal of the European Academy of Dermatology and Venereology found that among 4,000 AI-generated dermatological images, only 10.2 per cent depicted dark skin, and just 15 per cent accurately represented the intended condition. Meanwhile, analyses of dermatology textbooks used to train both human clinicians and AI systems have shown that images of dark skin make up as little as 4 to 18 per cent of the total. A 2022 study published in Science Advances confirmed that AI diagnostic performance for dermatological conditions was measurably worse on darker skin tones, a disparity directly traceable to training data composition.
The consequences are not abstract. Individuals with darker skin tones who develop melanoma are more likely to present with advanced-stage disease and experience lower survival rates. An AI system that performs poorly on these patients does not merely fail a technical benchmark. It compounds an existing disparity. And a 2024 study from Northwestern University found that even when AI tools themselves were calibrated for fairness, the interaction between physicians and AI-assisted diagnosis actually widened the accuracy gap between patients with light and dark skin tones, suggesting that the problem cannot be solved at the algorithm level alone.
When Machines Hallucinate in the Emergency Room
Bias is not the only vulnerability. In August 2025, a study published in Communications Medicine, a Nature Portfolio journal, tested six leading large language models with 300 clinician-designed vignettes, each containing a single fabricated element: a fake lab value, a nonexistent sign, or an invented disease. The results were striking. The models repeated or elaborated on the planted error in up to 83 per cent of cases. A simple mitigation prompt halved the overall hallucination rate, from a mean of 66 per cent across all models to 44 per cent. For the best-performing model in the study, GPT-4o, rates declined from 53 per cent to 23 per cent. Temperature adjustments, often proposed as a fix for hallucination, offered no significant improvement. Shorter vignettes showed slightly higher odds of hallucination.
For GPT-5 specifically, the Mount Sinai preprint found that its unmitigated adversarial hallucination rate was higher than that observed for GPT-4o. The same mitigation technique achieved a lower rate than before, meaning the baseline risk was worse even as the ceiling for improvement was slightly better.
The clinical implications are severe. If a language model is deployed as a clinical decision support tool and a patient's record contains an erroneous data point, whether through transcription error, system glitch, or adversarial input, the model is more likely to incorporate that error into its reasoning than to flag it as anomalous. It will confabulate around the mistake, generating plausible-sounding but clinically dangerous recommendations. The model does not know what it does not know, and it cannot distinguish between a real lab result and a fabricated one.
This is not a bug that can be patched with a software update. It is a structural property of how these models process information. They are optimised to produce coherent, contextually appropriate text, not to distinguish between real clinical findings and fabricated ones. The distinction matters enormously when the output influences whether a patient receives a chest X-ray or is sent home.
Who Bears the Cost
The populations most affected by AI bias in healthcare are, with grim predictability, those who already face the greatest barriers to adequate care. Racial minorities, women, elderly patients, LGBTQIA+ individuals, people experiencing homelessness, and low-income populations appear repeatedly in the literature as groups for whom AI systems produce systematically different, and often inferior, clinical recommendations.
The Mount Sinai study found a clear socioeconomic gradient in testing recommendations. GPT-5 directed less advanced diagnostic testing toward lower-income groups, with a negative 7.0 per cent deviation for low-income patients and a negative 6.8 per cent deviation for middle-income patients, while high-income patients received a positive 2.2 per cent deviation. Same symptoms, different workups, determined entirely by a label the model should have been ignoring.
The pulse oximetry debacle offers a useful precedent for understanding how bias in medical technology compounds racial health disparities. Research published in the New England Journal of Medicine demonstrated that pulse oximeters systematically overestimated blood oxygen levels in Black patients, with the frequency of occult hypoxaemia that went undetected being three times greater among Black patients compared with white patients. During the COVID-19 pandemic, this meant Black patients were less likely to receive supplemental oxygen when they needed it. The FDA released new draft guidance in January 2025 with updated testing standards, recommending a minimum of 24 subjects from across the Monk Skin Tone scale for clinical studies. But the damage from years of deployment with known racial bias had already been done. As Health Affairs Forefront noted in January 2025, the imperative to develop cross-racial pulse oximeters was “overdue” by any reasonable measure.
The pattern is consistent: a technology is developed, tested primarily on populations that do not represent the full range of patients who will encounter it, deployed at scale, and then studied retrospectively when the harm becomes impossible to ignore. AI in healthcare is following this trajectory with remarkable fidelity.
Sepsis prediction offers another cautionary tale. Epic Systems deployed its widely used Epic Sepsis Model across hundreds of hospitals. When researchers at Michigan Medicine analysed roughly 38,500 hospitalisations, they found the algorithm missed two-thirds of sepsis patients and generated numerous false alerts. A 2025 study published in the American Journal of Bioethics highlighted that social determinants of health data, which disproportionately affect minority and low-income populations, were notoriously underrepresented in the electronic health record data used to train such models, with only 3 per cent of sentences in examined training datasets containing any mention of social determinants. The algorithm did not account for what it could not see, and what it could not see was shaped by who had historically been rendered invisible in medical data systems.
The Institutional Wager
When a hospital system integrates AI into its clinical workflows, it is making a bet. The bet is that the efficiency gains, the reduced clinician workload, and the potential for catching diagnoses that might otherwise be missed will outweigh the risks of systematic error. It is a bet that the tool will perform roughly as well for all patients, or at least that any disparities will be caught by the human clinicians who remain in the loop.
Both assumptions are questionable.
Epic Systems, which commands 42.3 per cent of the acute care electronic health record market in the United States with over 305 million patient records, has rolled out generative AI enhancements for clinical messaging, charting, and predictive modelling. By 2025, the company reported between 160 and 200 active AI projects, with over 150 AI features in development for 2026, including native AI-assisted charting tools, new AI assistants, and advanced predictive models. In February 2026, Epic launched AI Charting, an ambient scribe feature that listens to patient visits and automatically drafts clinical notes and orders. Oracle Health, following its acquisition of Cerner, debuted an entirely new AI-powered EHR in 2025, featuring a clinical AI agent that drafts documentation, proposes lab tests and follow-up visits, and automates coding. The agent is now live across more than 30 medical specialities and has reportedly reduced physician documentation time by nearly 30 per cent.
The efficiency argument is real. But efficiency and equity are not the same thing. When these systems produce different outputs based on demographic characteristics, as the peer-reviewed evidence consistently shows they do, the “human in the loop” defence becomes critical. It also becomes fragile. A clinician reviewing AI-generated notes under time pressure, in a system designed to reduce their workload, is not in an ideal position to catch the subtle ways in which the model's recommendations may have been shaped by the patient's race, gender, or income level rather than their clinical presentation.
The assumption that humans will catch AI errors is further undermined by automation bias, the well-documented tendency for people to defer to automated systems, particularly when those systems present their outputs with confidence and fluency. A November 2024 study examining pathology experts found that AI integration, while improving overall diagnostic performance, resulted in a 7 per cent automation bias rate where initially correct evaluations were overturned by erroneous AI advice. A separate study of gastroenterologists using AI tools found measurable deskilling over time: clinicians became less proficient at identifying polyps independently after a period of AI-assisted practice. A large language model does not hedge. It does not say “I am less certain about this recommendation because the patient is Black.” It produces a clean, authoritative-sounding clinical note, and the bias is invisible unless someone is specifically looking for it.
The Insurance Question
The integration of AI into healthcare is not limited to clinical decision-making. Insurers have been among the most aggressive adopters, and the consequences are already being litigated.
UnitedHealth Group, the largest health insurer in the United States, is facing a class-action lawsuit alleging that its AI tool, nH Predict, developed by its subsidiary naviHealth (acquired in 2020 for over one billion dollars), was used to systematically deny medically necessary coverage for post-acute care. The plaintiffs, who include Medicare Advantage policyholders, allege that the algorithm superseded physician judgment and had a 90 per cent error rate, meaning nine of ten appealed denials were ultimately reversed.
In February 2025, a federal court denied UnitedHealth's motion to dismiss, allowing breach of contract and good faith claims to proceed. The court noted that the case turned on whether UnitedHealth had violated its own policy language, which stated that coverage decisions would be made by clinical staff or physicians, not by an algorithm. A judge subsequently ordered UnitedHealth to produce tens of thousands of internal documents related to the algorithm's deployment by April 2025.
This case is significant not only for its specific allegations but for the structural question it raises. When an insurer deploys an AI system to make coverage decisions, and that system denies care at scale, who is accountable? The algorithm's developers? The insurer's management? The clinicians whose judgment the algorithm overrode? The regulatory framework has no clear answer, and in the absence of clarity, the cost falls on the patients who are denied coverage and must navigate an appeals process that many, particularly elderly and low-income individuals, are ill-equipped to pursue. The asymmetry is stark: the insurer benefits from the speed and scale of algorithmic denial, while the patient bears the burden of proving, one appeal at a time, that the machine was wrong.
The Regulatory Vacuum
Regulatory bodies are aware of the problem. Their responses have been uneven at best.
The United States Food and Drug Administration has authorised over 1,250 AI-enabled medical devices as of July 2025, up from 950 in August 2024. The pace of authorisation is accelerating even as the evidence of bias accumulates. The agency published draft guidance in January 2025 on lifecycle management for AI-enabled devices, introducing the concept of Predetermined Change Control Plans, which allow developers to obtain pre-approval for planned algorithmic updates. This is a meaningful step toward continuous monitoring. But the guidance focuses primarily on safety and effectiveness in technical terms, with limited attention to the question of whether a device performs equitably across demographic groups.
In June 2025, a report published in PLOS Digital Health, authored by researchers from the University of Toronto, MIT, and Harvard, laid bare the scale of the regulatory gap. Titled “The Illusion of Safety,” the report found that many AI-enabled tools were entering clinical use without rigorous evaluation or meaningful public scrutiny. Critical details such as testing procedures, validation cohorts, and bias mitigation strategies were often missing from approval submissions. The authors identified inconsistencies in how the FDA categorises and approves these technologies, and noted that AI's continuous learning capabilities introduce unique risks: algorithms evolve beyond their initial validation, potentially leading to performance degradation and biased outcomes that the current regulatory framework is not designed to detect.
In January 2026, the FDA released further guidance that actually reduced oversight of certain low-risk digital health products, including AI-enabled software and clinical decision support tools. The reasoning was that lighter regulation would encourage innovation. The concern is that it will also encourage deployment without adequate bias testing. The tension between promoting innovation and protecting patients is not new in medical device regulation, but the speed at which AI tools are proliferating makes the stakes unusually high.
The European Union has taken a more structured approach. Under the EU AI Act, which began phased implementation in August 2025, AI systems used as safety components in medical devices are classified as high-risk and subject to stringent requirements: risk management systems, technical documentation, training data governance, transparency, human oversight, and post-market monitoring. Full compliance for high-risk AI systems in healthcare is required by August 2027. The framework is more comprehensive than its American counterpart, but enforcement mechanisms remain untested, and the practical challenge of auditing AI systems for demographic bias at scale is formidable. The European Commission is expected to issue guidelines on practical implementation of high-risk classification by February 2026, including examples of what constitutes high-risk and non-high-risk use cases.
The World Health Organisation released guidance in January 2024 on the ethics and governance of large multimodal models in healthcare, outlining over 40 recommendations organised around six principles: protecting autonomy, promoting well-being and safety, ensuring transparency and explainability, fostering responsibility and accountability, ensuring inclusiveness and equity, and promoting responsive and sustainable AI. The principles are sound. Whether they translate into enforceable standards is another matter entirely. The WHO's Global Initiative on Artificial Intelligence for Health has been working to advance governance frameworks particularly in low- and middle-income countries, where the regulatory infrastructure to evaluate AI tools may be even less developed than in the United States or Europe.
The gap between what regulators recognise as a problem and what they are prepared to do about it remains wide. And in that gap, hospitals and insurers continue to deploy systems whose bias profiles have been documented in peer-reviewed literature but not addressed in procurement requirements.
Accountability Without a Framework
The liability question is perhaps the most unsettled aspect of AI in healthcare. Current legal frameworks were not designed for systems that learn, change, and produce different outputs for different patients based on patterns in training data that no human selected or reviewed.
If an AI clinical decision support tool recommends a less aggressive workup for a Black patient than for a white patient with identical symptoms, and the Black patient's condition is missed, who is liable? The developer who trained the model? The hospital that purchased and deployed it? The clinician who accepted the recommendation without questioning it? Under existing product liability regimes, device manufacturers are often shielded, and the burden tends to fall on clinicians and institutions. But clinicians did not design the algorithm, may not understand its internal workings, and in many cases were not consulted about the decision to deploy it.
Professional medical societies have generally maintained that clinicians retain ultimate responsibility for patient care, regardless of the tools they use. This position is legally and ethically coherent, but it places an extraordinary burden on individual practitioners to detect and override biases that are, by design, invisible in the model's outputs. It also creates a perverse incentive structure: the institutions that benefit from AI efficiency (reduced labour costs, faster throughput, fewer staff) externalise the liability risk to frontline clinicians who had no say in the technology's selection or implementation.
New legislation has been proposed in the United States to clarify AI liability in healthcare, but none has yet been enacted. The result is a regulatory and legal environment in which the technology is advancing faster than the frameworks meant to govern it, with patients and clinicians left to absorb the consequences of that mismatch.
What Meaningful Reform Requires
The research community has not merely identified the problem. It has outlined what solutions would look like. The challenge is that those solutions require effort, money, and institutional will that the current market incentives do not reliably produce.
First, training data must be representative. The persistent underrepresentation of dark-skinned patients in dermatological datasets, of women in cardiovascular research, and of LGBTQIA+ individuals in clinical trial data is not a new problem. But when that data is used to train AI systems that are then deployed at scale, the bias is industrialised. Studies have demonstrated that fine-tuning AI models on diverse datasets closes performance gaps between demographic groups. The data exists, or could be collected. The question is whether developers and institutions are willing to invest in obtaining it.
Second, pre-deployment bias auditing must become mandatory, not optional. The evidence that AI systems produce systematically different outputs based on demographic labels is overwhelming. Yet there is no requirement in the United States that an AI clinical tool be tested for demographic equity before it is integrated into a hospital's workflow. The EU AI Act moves in this direction with its training data governance and risk management requirements for high-risk systems, but enforcement remains a future proposition.
Third, post-deployment monitoring must be continuous and transparent. The FDA's introduction of Predetermined Change Control Plans is a step toward lifecycle accountability, but the focus remains on technical safety rather than equitable performance. An AI system that performs well on average but poorly for specific subpopulations is not safe for those subpopulations, and average performance metrics can obscure the disparity. The “Illusion of Safety” report's finding that the FDA's current framework is ill-equipped to monitor post-approval algorithmic drift makes this point with particular force.
Fourth, procurement processes must include bias testing as a criterion. Hospitals that would never purchase a pharmaceutical product without evidence of efficacy across demographic groups are integrating AI tools with no comparable requirement. The Mount Sinai research provides a template: test the system across sociodemographic labels, measure the variation, and make the results public before deployment. If a model produces different triage recommendations for patients labelled as low-income versus high-income, that information should be available to every hospital considering its adoption.
Fifth, liability frameworks must be updated. If AI systems are going to influence clinical decisions, the legal structures governing those decisions must account for the technology's role. This means clearer allocation of responsibility between developers, deployers, and users, and it means creating mechanisms for patients to seek redress when biased AI contributes to harm. The UnitedHealth litigation may ultimately push courts to establish precedents, but waiting for case law to fill a regulatory void is not a strategy; it is an abdication.
Finally, transparency must become the default. Patients have a right to know when AI has influenced their care, what role it played, and whether the system has been tested for bias relevant to their demographic group. This is not merely an ethical aspiration. In an era when AI-generated clinical notes may shape everything from triage decisions to insurance coverage, it is a basic requirement of informed consent. The WHO's guidance on transparency and explainability points in this direction, but voluntary principles are no substitute for binding obligations.
The Stakes Are Not Abstract
The title of the Mount Sinai medRxiv preprint captures the situation with precision: “New Model, Old Risks.” GPT-5 is, by most technical measures, a more capable system than its predecessors. It is also, by the evidence of this study, no less biased. The assumption that capability and fairness would advance in parallel has not been borne out. And the assumption that human oversight will compensate for algorithmic bias is not supported by what we know about how clinicians interact with automated systems under real-world conditions.
The institutions deploying these tools are making a calculation. They are betting that the benefits will outweigh the harms, that the efficiencies will justify the risks, and that the populations most likely to be harmed by biased AI are the same populations least likely to have the resources to hold anyone accountable.
That calculation may prove correct in the short term. In the longer term, it is the kind of institutional wager that generates class-action lawsuits, regulatory backlash, and, most importantly, measurable harm to patients who came to the healthcare system seeking help and received instead the outputs of a machine that treated their identity as a clinical variable.
The question is not whether AI will be integrated into healthcare. That integration is already underway, at scale, across the world's largest health systems. The question is whether the institutions driving that integration will treat equity as a design requirement or as an afterthought. The research is clear on what the problem is and how severe it remains. The gap between what we know and what we are willing to do about it is where the harm lives.
References
Omar, M., Agbareia, R., Apakama, D.U., Horowitz, C.R., Freeman, R., Charney, A.W., Nadkarni, G.N., and Klang, E. “New Model, Old Risks? Sociodemographic Bias and Adversarial Hallucinations Vulnerability in GPT-5.” medRxiv, September 2025. DOI: 10.1101/2025.09.19.25336180.
Omar, M., Klang, E., et al. “Sociodemographic biases in medical decision making by large language models.” Nature Medicine, 2025. DOI: 10.1038/s41591-025-03626-6.
Zack, T., et al. “Assessing the potential of GPT-4 to perpetuate racial and gender biases in health care: a model evaluation study.” The Lancet Digital Health, January 2024. DOI: 10.1016/S2589-7500(23)00225-X.
“Multi-model assurance analysis showing large language models are highly vulnerable to adversarial hallucination attacks during clinical decision support.” Communications Medicine (Nature Portfolio), August 2025. DOI: 10.1038/s43856-025-01021-3.
“Evaluating and addressing demographic disparities in medical large language models: a systematic review.” International Journal for Equity in Health, Springer Nature, 2025. DOI: 10.1186/s12939-025-02419-0.
“Sociodemographic bias in clinical machine learning models: a scoping review of algorithmic bias instances and mechanisms.” Journal of Clinical Epidemiology, 2024. DOI: 10.1016/j.jclinepi.2024.111422.
Joerg, et al. “AI-generated dermatologic images show deficient skin tone diversity and poor diagnostic accuracy: An experimental study.” Journal of the European Academy of Dermatology and Venereology, 2025. DOI: 10.1111/jdv.20849.
“Disparities in dermatology AI performance on a diverse, curated clinical image set.” Science Advances, 2022. DOI: 10.1126/sciadv.abq6147.
Sjoding, M.W., et al. “Racial Bias in Pulse Oximetry Measurement.” New England Journal of Medicine, 2020. DOI: 10.1056/NEJMc2029240.
“The Overdue Imperative of Cross-Racial Pulse Oximeters.” Health Affairs Forefront, January 2025.
“Bias in medical AI: Implications for clinical decision-making.” PMC, 2024. PMCID: PMC11542778.
“The Algorithmic Divide: A Systematic Review on AI-Driven Racial Disparities in Healthcare.” PubMed, 2024. PMID: 39695057.
“The illusion of safety: A report to the FDA on AI healthcare product approvals.” PLOS Digital Health, June 2025. DOI: 10.1371/journal.pdig.0000866.
Estate of Gene B. Lokken et al. v. UnitedHealth Group, Inc. et al. Federal court ruling, February 2025. Georgetown Health Care Litigation Tracker.
U.S. Food and Drug Administration. “Artificial Intelligence-Enabled Device Software Functions: Lifecycle Management and Marketing Submission Recommendations.” Draft Guidance, January 2025.
U.S. Food and Drug Administration. “Artificial Intelligence and Machine Learning in Software as a Medical Device.” FDA AI/ML Device Database, July 2025.
European Commission. “EU AI Act: Regulatory Framework for Artificial Intelligence.” Phased implementation beginning August 2025, with full high-risk compliance required by August 2027.
World Health Organisation. “Ethics and governance of artificial intelligence for health: Guidance on large multi-modal models.” January 2024. ISBN: 9789240084759.
“Bias recognition and mitigation strategies in artificial intelligence healthcare applications.” npj Digital Medicine, 2025. DOI: 10.1038/s41746-025-01503-7.
“Automation Bias in AI-Assisted Medical Decision-Making under Time Pressure in Computational Pathology.” arXiv, November 2024. arXiv:2411.00998.
“Exploring the risks of automation bias in healthcare artificial intelligence applications: A Bowtie analysis.” ScienceDirect, 2024. DOI: 10.1016/j.caeai.2024.100241.
“Mitigating Bias in Machine Learning Models with Ethics-Based Initiatives: The Case of Sepsis.” American Journal of Bioethics, 2025. DOI: 10.1080/15265161.2025.2497971.
Wong, A., et al. “External Validation of a Widely Implemented Proprietary Sepsis Prediction Model in Hospitalized Patients.” JAMA Internal Medicine, 2021. (Epic Sepsis Model evaluation at Michigan Medicine.)
Epic Systems. AI Charting and generative AI clinical tools deployment, February 2026. Epic Newsroom.
Oracle Health. Clinical AI Agent deployment across 30+ medical specialities, 2025. Oracle Health press materials.
“Gender and racial bias unveiled: clinical artificial intelligence (AI) and machine learning (ML) algorithms are fanning the flames of inequity.” Oxford Open Digital Health, 2025. DOI: 10.1093/oodh/oqaf027.

Tim Green UK-based Systems Theorist & Independent Technology Writer
Tim explores the intersections of artificial intelligence, decentralised cognition, and posthuman ethics. His work, published at smarterarticles.co.uk, challenges dominant narratives of technological progress while proposing interdisciplinary frameworks for collective intelligence and digital stewardship.
His writing has been featured on Ground News and shared by independent researchers across both academic and technological communities.
ORCID: 0009-0002-0156-9795 Email: tim@smarterarticles.co.uk








