Empirical Priors

Research

Work at the intersection of clinical medicine and machine learning, spanning NLP and large language models in healthcare, clinical informatics, computational biology, and health policy.

Publications

2025

Large language models in oncology: a review

BMJ Oncology, 4:e000759

Chen D, Parsa R, Swanson K, Nunez J-J, Critch A, Bitterman DS, Liu F-F, Raman S

Large language models (LLMs) demonstrate emergent human-like capabilities in natural language processing, presenting opportunities for healthcare integration. In oncology, these systems may support clinical decision-making, enhance patient care, and accelerate research through synthesis of complex multimodal data. The review examines clinician-facing applications including clinical decision support and automated data extraction from EHRs and literature, as well as patient-facing applications for cancer information dissemination and psychosocial support. Significant limitations remain (hallucinations, poor generalization across populations, ethical concerns, and integration challenges), and the authors propose incorporating LLMs within compound AI systems to augment, not replace, oncology practice.

LLM, Oncology, Clinical AI, Review

2024

Biomedical text readability after hypernym substitution with fine-tuned large language models

PLOS Digital Health, 3(4)

Swanson K, He S, Calvano J, Chen D, Telvizian T, Jiang L, Chong P, Schwell J, Mak G, Lee J

How do we simplify complex biomedical text to improve patient comprehension? We fine-tuned three large language model variants (a seq2seq + biomedical NER pipeline, T5, and GPT-J-6B) to replace intricate medical terminology with related hypernyms (broader umbrella terms). Evaluating 1,000 medical definitions from the NLM's UMLS using four readability metrics and two sentence complexity measures, all three approaches improved readability, with grade-level scores moving from collegiate down to high-school reading level. GPT-J-6B led on sentence complexity; T5 led on human-rated accuracy and fidelity. The work demonstrates that fine-tuned open-access LLMs can simplify biomedical terminology while preserving meaning and structure.
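As a rough illustration of what a grade-level readability score measures, the sketch below computes the Flesch-Kincaid grade level of a text. It rests on several assumptions: the summary above does not name the four readability metrics used, so Flesch-Kincaid is chosen only as a familiar example; the syllable counter is a crude vowel-group heuristic; and the two example sentences are invented for this sketch, not taken from UMLS or from model output.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic: one syllable per run of consecutive vowels, minimum 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Hypothetical before/after pair, written for this sketch only.
original = "Nephrolithiasis is the formation of calculi within the renal pelvis."
simplified = "A kidney stone is a hard lump that forms in the kidney."

print(round(flesch_kincaid_grade(original), 1))
print(round(flesch_kincaid_grade(simplified), 1))
```

A lower score for the simplified sentence reflects shorter words and fewer syllables per word, which is the kind of shift a move from collegiate to high-school reading level describes.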

NLP, LLM, Health Literacy, Clinical Informatics

2023

Effect of Recent Abortion Legislation on Twitter User Engagement, Sentiment, and Expressions of Trust in Clinicians and Privacy of Health Information

Journal of Medical Internet Research, e46655

Swanson K, Ravi A, Saleh S, Weia B, Pleasants E, Arvisais-Anhalt S

We examined how the Supreme Court's Dobbs v. Jackson Women's Health Organization decision affected public discourse on Twitter regarding abortion, healthcare, and health information privacy. Using tweets from January 2020 through October 2022, we applied LDA topic modeling, RoBERTa sentiment analysis, and Word2Vec embeddings. After the draft decision leaked, we observed a transient 576.86% increase in daily users tweeting on these topics (95% CI 545.34% to 607.92%; P<.001), a sustained 19.81% drop in average sentiment (95% CI -22.98% to -16.59%; P=.001), and a decreased latent-dimension association with trust across most clinician- and health-information-related terms. The ruling's impacts extend beyond restricting access; they appear to undermine patient-clinician relationships and confidence in health information privacy.

NLP, Reproductive Health, Health Policy, Public Health

2023

Paging the Clinical Informatics Community: Respond STAT to Dobbs v. Jackson's Women's Health Organization

Applied Clinical Informatics, 14(1):164–171

Arvisais-Anhalt S, Ravi A, Weia B, Swanson K, et al.

An invited multi-author editorial arguing that the clinical informatics community must respond urgently to the implications of the 2022 Supreme Court decision overturning federal abortion protections. Modern medical practice is inextricably linked to health IT, creating significant intersections between informatics expertise and reproductive healthcare challenges. The piece identifies eight key areas for action, including strengthening privacy protections for reproductive health data, addressing HIPAA's limits, reducing data-sharing vulnerabilities across state lines, optimizing medical documentation, and extending privacy safeguards to non-covered entities.

Clinical Informatics, Health Policy, Reproductive Health, Privacy

2022

Website Usability Analysis of U.S. Military Residency Programs

Military Medicine

Chong P, Grob P, DiMattia G, Calvano J, Swanson K, He S, Gubler KD, LaPorta A

We evaluated U.S. military residency program websites using a standardized usability framework. Across 96 ACGME-accredited programs and 106 websites, we scored four categories: accessibility, marketing, content quality, and technology. Technology was the highest-ranked category (mean 0.749 ± 0.039); marketing (0.414 ± 0.054) and content quality (0.428 ± 0.229) were the weakest. Automated analysis showed that military residency websites had more external than internal backlinks and no social media backlinks, with the Army significantly behind the Navy on external backlinking. Improving website usability, particularly marketing and content quality, would meaningfully enhance how these programs communicate with prospective applicants.

Medical Education, Web Usability, Health Informatics

2021

Pegylated Liposomal Doxorubicin Induced Nephrotic Syndrome and Renal Thrombotic Microangiopathy: A Case Report and Review

Clinical Oncology Case Reports

Swanson K, Barry D, Telvizian T, O'Connor C

Anthracyclines such as pegylated liposomal doxorubicin (PLD) are well known for causing cardiotoxicity. Other adverse effects, such as nephrotic syndrome and drug-induced thrombotic microangiopathy (DITMA), have been described in the literature but are less well recognized in clinical practice. A 76-year-old woman with metastatic breast cancer receiving PLD presented with generalized weakness and anasarca. She was found to have acute renal failure and nephrotic syndrome. After extensive workup, she was diagnosed with thrombotic microangiopathy on renal biopsy. The diagnosis of PLD-induced DITMA was made after a thorough workup ruled out other etiologies. Literature since the 1950s shows an association between anthracyclines and nephrotoxicity, specifically the development of nephrotic syndrome. However, large-scale comprehensive review articles on DITMA did not include anthracyclines until May 2020. Moreover, commonly accessed physician resources, including UpToDate, LexiComp, NCCN, NCI, and SEER, lack published data on PLD adverse effects and on renal dose adjustments, which makes the diagnosis difficult. PLD-associated nephrotoxicity and DITMA need to be addressed in clinical practice. In patients being treated with anthracyclines, we suggest close monitoring of renal function for early identification and intervention. We propose a large-scale multi-institutional analysis to further validate these suggestions.

Oncology, Nephrology, Case Report, Drug Toxicity

2019

Risk Adjustment Is Necessary in Value-Based Payment Models for Arthroplasty for Oncology Patients

Journal of Arthroplasty, 34(4):626–631

Tan TL, Courtney PM, Shohat N, Brown SA, Swanson KE, Abraham JA

We compared resource utilization between cancer patients and patients with fractures or osteoarthritis undergoing hip arthroplasty, using surgical quality data from 2013–2016 (296 oncology cases vs. 96,480 osteoarthritis and 13,406 fracture cases). Oncology patients had longer mean operative time (155.7 vs. 82.9 vs. 91.0 minutes; P<.001) and length of stay (9.0 vs. 7.2 vs. 2.6 days; P<.001), and after adjustment for demographics and comorbidities, substantially elevated risks of readmission and repeat surgery. Without appropriate risk adjustment, value-based payment models risk discouraging treatment of these higher-resource patients.

Orthopedics, Health Policy, Oncology, Value-Based Care

2015

Age-driven modulation of tRNA-derived fragments in Drosophila and their potential targets

Biology Direct, 10(1):51

Karaiskos S, Naqvi AS, Swanson KE, Grigoriev A

We investigated transfer-RNA-derived fragments (tRFs) in Drosophila melanogaster and revealed structural and functional parallels to microRNAs. tRFs map to over 100 nuclear and mitochondrial tRNA genes across all 20 amino acids, display distinct isoforms preferentially originating from the 5′ or 3′ ends of precursor tRNAs, and contain short "seed" sequences matching conserved regions in 3′ UTRs across 12 Drosophila genomes. tRFs are loaded into Ago1 and Ago2 proteins, with both expression and Ago2 loading changing significantly with age. Gene-ontology analysis suggests tRF targets are enriched for genes involved in neuronal function and development, pointing to a regulatory role in aging-related processes and brain activity.
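To make the idea of a "seed" match concrete, here is a minimal sketch of scanning a 3′ UTR for sites complementary to a fragment's seed. It is an illustration under stated assumptions, not the paper's pipeline: the seed is taken as nucleotides 2-8 of the fragment (the usual miRNA convention), a site is defined as an exact reverse-complement match, and both sequences are hypothetical.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string over A, C, G, U."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(fragment: str, utr: str, start: int = 1, length: int = 7) -> list[int]:
    """Return 0-based UTR positions where the seed's reverse complement occurs.

    ASSUMPTION: the seed is fragment[start:start + length], i.e. nucleotides
    2-8 by default, following the miRNA convention.
    """
    site = reverse_complement(fragment[start:start + length])
    positions = []
    pos = utr.find(site)
    while pos != -1:
        positions.append(pos)
        pos = utr.find(site, pos + 1)  # step by 1 so overlapping sites are found
    return positions


# HYPOTHETICAL sequences, invented for this sketch.
fragment = "GCAUUGGUGGUUCAGUGGUAGA"
utr = "AAUUACCAAUGCCGUACCAAUGUU"

print(seed_sites(fragment, utr))
```

A conservation analysis like the one described would then ask whether such sites recur at aligned positions across the 12 genomes, which this sketch does not attempt.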

Computational Biology, Genomics, Aging, Small RNA

Posters & Presentations

2023

Twitter Sentiment Analysis Of Epidural Vs Non-Epidural Analgesia During Parturition

ASA Annual Meeting, San Francisco, CA

Calvano J, Schwell J, Chong P, Petersen T, Bui E, Zapf M, Swanson K

No studies have explored patient sentiment around healthcare interventions, although there have been calls for such work. We developed a sentiment-analysis methodology tailored to healthcare interventions and applied it to tweets about epidural analgesia during childbirth versus non-epidural birth. Using snscrape, we collected 717,000 tweets from January 2007 through January 2023, pre-processed them with Latent Dirichlet Allocation (best model: coherence 0.502, perplexity -7.800), and manually removed tweets containing "surgery," "stimulator," "steroid," or "block." Further data cleansing and grouping into epidural vs. natural birth were performed via the GPT-4 API, leaving 392,944 relevant tweets; 73,000 of these (18.6%) underwent dual sentiment analysis using both RoBERTa (twitter-roberta-base-sentiment-latest) and GPT-4. Chi-square analyses revealed statistically significant differences in sentiment proportions across both models (p ≪ 0.0001). RoBERTa showed a relatively equal sentiment distribution for natural birth and a higher proportion of negative sentiment for epidural use; GPT-4 showed comparatively high positive sentiment for natural birth and roughly balanced positive and negative scores for epidural use. This methodology may help anesthesiology providers evaluate their interventions and gain insight into patient perspectives and concerns.
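The chi-square comparison of sentiment proportions can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the poster's analysis code: the counts below are hypothetical, and the p-value shortcut `exp(-x/2)` is exact only for 2 degrees of freedom, which is what a table of two groups by three sentiment classes gives.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and sentiment were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# HYPOTHETICAL counts per class (negative, neutral, positive); not the poster's data.
counts = [
    [450, 300, 250],  # epidural tweets
    [330, 340, 330],  # natural-birth tweets
]

stat = chi_square_statistic(counts)
dof = (len(counts) - 1) * (len(counts[0]) - 1)
# Survival function of the chi-square distribution; this closed form holds only for dof == 2.
p_value = math.exp(-stat / 2)

print(f"chi2={stat:.2f}, dof={dof}, p={p_value:.2e}")
```

For tables of other shapes, a statistics library would be needed to turn the statistic into a p-value.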

NLP, Anesthesiology, Obstetrics, Sentiment Analysis

2022

Hypernym Substitution for the Simplification of Biomedical Definitions

TextXD Conference, UC Berkeley

Swanson K, He S, Calvano J, Chen D, Telvizian T, Schwell J, Jiang L, Chong P

A pipeline that bridges medical-record availability (e.g. OpenNotes) and patient comprehension via hypernym substitution. Three approaches were compared: (1) NLP plus programmatic rules using a seq2seq model and biomedical NER, (2) a fine-tuned T5 model, and (3) a fine-tuned GPT-J model, each trained with biomedical definitions paired with gold-standard hypernym-substituted versions. On 1,000 UMLS biomedical definitions, the NLP+programmatic and GPT approaches reduced average grade level from collegiate down to early-high-school. The conference precursor to the 2024 PLOS Digital Health paper.

NLP, LLM, Health Literacy, Clinical Informatics

2021

Lost in Translation: DoctorLingo Aims to Improve Doctor-Patient Communication Discourse

AMIA 2021 Virtual Clinical Informatics Conference

Calvano J, Swanson K, Beach J, Carlson M, Shen C, Malone M, Lai D, He S

Approximately 36% of US adults have limited health literacy, which is linked to poorer decision-making, therapeutic outcomes, and adherence. The Federal Open Notes Act now gives patients direct, unfiltered access to their medical notes, and patients consistently report jargon as the biggest comprehension barrier. We propose DoctorLingo (DL): a Progressive Web Application (PWA) that dejargonizes medical language into plain English, designed around usability, accessibility, and readability. The system uses a GraphQL API, a custom medical corpus of ~14 million terms, syntactic/semantic sentence-relationship analysis, and named entity recognition to interpret medical notes. The PWA is disability-friendly, with hundreds of ARIA attributes, and supports crowdsourced human-in-the-loop expansion of the NLP corpus. As of April 2021, DL had 6,000 monthly users across 46 countries (78% on mobile) and an average term reading level of 8th grade, and it ranked #1 on Google for 8 keywords, with 195 keywords in the top 20. DL serves as an automated medical-jargon translation tool to enhance spoken and written interaction between clinicians and lay people, with potential applications in drug prospectus comprehension, therapeutic adherence, and supplementing other medical information sources.

Clinical Informatics, NLP, Health Literacy, Patient Communication

2019

Risk Adjustment Is Necessary in Value-Based Payment Models for Arthroplasty for Oncology Patients

AAOS 2019 Annual Meeting, Las Vegas, NV

Tan TL, Courtney PM, Shohat N, Brown SA, Swanson KE, Abraham JA

Poster version of the Journal of Arthroplasty paper. After adjusting for demographics and comorbidities, oncology patients undergoing arthroplasty had significantly longer operative times and length of stay and higher rates of readmission and reoperation than osteoarthritis or fracture patients, supporting the need for risk-adjustment in bundled-payment models.

Orthopedics, Health Policy, Oncology, Value-Based Care

Master's Thesis

2015

MicroRNA Discovery in Belgica antarctica: MicroRNA Loci Relocation across Taxa from Duplication

Master's Thesis, Rutgers University-Camden / CCIB

Swanson KE (Advisor: Grigoriev A)

Small non-coding RNAs are a diverse class of molecules with wide biological importance, including regulatory roles, implications for evolution, and possible medical therapeutics. This thesis uses and expands upon current methodologies of computational discovery, sequencing analysis, and visualization for non-coding RNA (particularly microRNA, or miRNA) in the Antarctic midge Belgica antarctica and Drosophila melanogaster. These methods, combined with the unique properties of B. antarctica's genome, lead to discoveries of evolutionary and functional importance, especially for a class of miRNA called mirtrons: mirtrons within B. antarctica can relocate to alternative gene loci or be lost from their host gene. These relocation and loss events are inferred from computational discovery and predictions, are supported by examples in the literature across a wide range of taxa, and suggest a re-examination of the mechanisms that birth miRNA, specifically in terms of evolutionary duplication events. The thesis also describes and expands upon Genome Navigator, a tool for the in silico visualization of small non-coding RNA sequencing data; new functionalities were used to elucidate biogenesis properties of tRNA-derived fragments (tRFs), which strikingly resemble the canonical biogenesis cleavage patterns of miRNA.

Computational Biology, Genomics, microRNA, Evolutionary Biology