“LLMs are not yet able to address interoperability challenges when it comes to structuring unstructured text into standardized medical terminology,” says Dr. Carina Vorisek, digital health innovation enthusiast at the Berlin Institute of Health and a medical informatics expert focused on building AI-ready, interoperable health systems. In our conversation, we explore data standards, AI in medicine, and everything in between.
The healthcare sector is eager to adopt AI, yet many systems still struggle with basic data interoperability. In your view, how realistic is widespread AI adoption without first solving these interoperability challenges?
Interoperability is foundational for the meaningful and safe adoption of AI in healthcare. Right now, most AI applications I’ve encountered are siloed – developed and deployed within individual institutions. In a survey we conducted among AI developers, the majority reported working with imaging data from just one institution. That’s highly problematic: AI models trained in such limited environments cannot generalize safely across diverse patient populations. Imaging is often chosen because it’s relatively standardized through formats like DICOM.
But when it comes to textual or structured clinical data – like diagnoses, procedures, or lab results – we face much greater fragmentation. Even though international standards like SNOMED exist, many systems still rely on proprietary codes or free text, making data exchange and AI integration extremely difficult.
During my time at a university hospital in the OB/GYN department (editor’s note: Obstetrics and Gynecology), I had to use seven different IT systems within that one department. That’s a clear example of how fragmented our infrastructure is. For AI to deliver real value, whether through clinical decision support or predictive analytics, it needs access to interoperable data from across systems.
You’ve worked hands-on with standards like FHIR and SNOMED. Which data standards do you believe hold the most promise for truly interoperable healthcare, and what practical steps would accelerate their global implementation?
I believe FHIR and SNOMED offer the strongest foundation for truly interoperable healthcare. FHIR is flexible enough to work across many different healthcare settings, which is essential because if a standard is too strict, it simply won't be usable in real-world care. At the same time, FHIR still allows us to structure data while leaving room for the free-text notes that are so common in clinical practice.
SNOMED is especially powerful because it includes over 360,000 medical concepts, allowing for very detailed and accurate recording of medical information. That level of detail is key if we want to use data effectively for research, AI, or personalized care. In comparison, systems like ICD-10 were mainly designed for billing and only cover a limited number of conditions, though ICD-11 is a big step forward with around 85,000 items.
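To make this concrete, here is a minimal, purely illustrative sketch (in Python, with made-up values) of how a FHIR Condition resource can carry both a SNOMED CT code and the kind of free-text notes clinicians actually write – the structured and unstructured parts living side by side:

```python
# Illustrative only: a FHIR R4 Condition resource sketched as a Python dict.
# The patient reference, SNOMED CT code, and note text are example values.
condition = {
    "resourceType": "Condition",
    "subject": {"reference": "Patient/example"},
    "code": {
        "coding": [
            {
                "system": "http://snomed.info/sct",
                "code": "73211009",            # SNOMED CT concept: Diabetes mellitus
                "display": "Diabetes mellitus",
            }
        ],
        "text": "Type 2 diabetes, diet-controlled",   # human-readable label
    },
    "note": [
        # Free text remains possible alongside the coded entry.
        {"text": "Patient reports improved glucose control since the last visit."}
    ],
}
```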
To support global use of these standards, we need to create real incentives – for example, financial ones – to adopt them.
During your time at Boston Children’s Hospital and Harvard Medical School, what lessons did you take away about the intersection of translational medicine, digital innovation, and artificial intelligence? How could those insights be adapted in Europe?
My time in Boston changed my perspective on how to look at the healthcare system as a doctor. Before, I had a very clinical way of thinking about medicine. But if you want to be innovative, there's so much more to consider. There's medical data – you need to understand how it is collected, analyzed, and reported. Then there are technological advances, so you also need to understand which manual tasks technology can replace and which use cases in medicine can truly benefit from AI and digital tools.
To me, digital health sits at the intersection of all these fields. And all stakeholders, especially patients and doctors, who are often left out, must be part of the conversation to drive meaningful innovation.
One major lesson was the close involvement of clinicians in research. We had weekly meetings where clinicians critically evaluated ongoing research projects, ensuring they stayed aligned with practical clinical needs. In Germany, I see a huge separation between research and clinical practice, which means many good ideas never make it to patients. We need better structures to combine clinical work and research, like medical informatics residencies, to empower clinicians to shape innovation while remaining in patient care.
Another key takeaway was the tight collaboration between research and industry. In Germany, research and industry often operate in silos. To change that, we need career paths that allow professionals to move between both areas. Right now, it’s often either research or industry, which limits collaboration and slows progress.
There’s a saying in AI: Garbage in, garbage out. Given the reality of inconsistent or low-quality medical data, do you believe it’s responsible to deploy AI in clinical settings before solving these quality issues?
Here, I’m definitely influenced by my time in the U.S., where the mindset is more: let’s just do it. There’s incredible technology available, and at the same time, we face many challenges in healthcare that digital tools could help solve. So, it would be a shame not to use AI simply because the data isn't perfect yet.
That said, we absolutely have a responsibility to improve data quality as much as possible. We must collect data representing the entire population, not just white men, and implement international standards to ensure interoperability. We also need open conversations and education around bias, risks, and AI limitations in clinical use.
Of course, medicine needs to remain in the hands of healthcare professionals. However, AI can be a powerful support tool, not only for doctors but also for nurses, caregivers, and patients to improve outcomes and well-being.
From your perspective as a physician and researcher, how mature are current AI models when it comes to supporting real-world clinical decision-making, especially in fields like women’s health?
As a physician, I would say that current AI models still feel very immature when it comes to real-world clinical decision-making, especially when I compare what I see in hospitals to the technological advancements outside of clinical settings. As a researcher, in contrast, I feel we’ve made tremendous progress. Especially in Germany, we have powerful algorithms and promising AI systems, but what’s missing is the real-world infrastructure to actually implement them in care.
When it comes to women’s health, things feel even further behind. We still lack innovation in this field, and even more concerning, we lack sufficient data on women in medicine. This creates serious risks when applying AI, because models trained mostly on male data may not be safe or effective for women.
At the same time, the potential for AI to improve women’s health is enormous. For example, we could use AI to reanalyze historical clinical studies that excluded or underrepresented women. To realize that potential, we need to support the collection of female health data across all life stages by funding research, encouraging female-led startups, and addressing structural barriers like childcare that prevent women from participating in clinical studies. Most importantly, we need to educate all stakeholders – clinicians, researchers, developers – about the importance of inclusive data if we want AI to truly support better health outcomes for everyone.
Large language models, like GPT, can analyze unstructured medical texts. Could their ability to work with unstructured data ease interoperability challenges, or might they mask underlying data quality problems instead?
I get this question a lot, and honestly, there isn’t much solid literature on it yet. It’s something that really needs further investigation. We recently looked into how LLMs could support the coding of medical terms to SNOMED, for example. The results were quite unsatisfying, which reaffirmed the continued need for specialized coders. So, at the moment, LLMs are not able to address interoperability challenges when it comes to structuring unstructured text into standardized medical terminology.
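To give a sense of what such a task looks like in practice – this is only a hypothetical sketch, not the setup of the study mentioned above, and query_llm is a placeholder for whatever model API one would actually use – mapping a free-text finding to a SNOMED CT concept with an LLM might look roughly like this:

```python
import json

def query_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call; any provider could be plugged in here."""
    raise NotImplementedError("connect this to a concrete model or API")

def suggest_snomed_concept(free_text: str) -> dict:
    """Ask an LLM to propose a SNOMED CT concept for a free-text clinical finding.

    The suggestion would still have to be validated against the official
    terminology by a specialized coder – exactly the gap described above.
    """
    prompt = (
        "Map the following clinical finding to the single most specific "
        "SNOMED CT concept. Respond as JSON with the keys 'code' and 'display'.\n\n"
        f"Finding: {free_text}"
    )
    return json.loads(query_llm(prompt))

# Hypothetical usage:
# suggest_snomed_concept("persistent dry cough for three weeks")
```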
That said, it would be great if AI could support interoperability tasks – they are time-consuming and labor-intensive. But currently, we can’t rely on it for that purpose.
I see potential in using LLMs to help process the vast amount of free-text documentation in healthcare systems. That might be the most promising path forward, since there will always be free text in medicine. Ideally, we’ll have a hybrid approach: parts of the data encoded with medical terminologies, and parts processed by LLMs.
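A hybrid pipeline of that kind could, very roughly, look like the following sketch – again purely illustrative, with summarize_with_llm standing in for any concrete model call and the Condition structure borrowed from the earlier example:

```python
def summarize_with_llm(text: str) -> str:
    """Hypothetical stand-in for an LLM call that condenses free-text notes."""
    raise NotImplementedError("connect this to a concrete model or API")

def process_condition(condition: dict) -> dict:
    """Split a FHIR Condition into coded data used as-is and free text
    handed to an LLM – the hybrid approach described above."""
    coded = [
        c
        for c in condition.get("code", {}).get("coding", [])
        if c.get("system") == "http://snomed.info/sct"  # keep coded entries directly
    ]
    notes = [n["text"] for n in condition.get("note", []) if "text" in n]
    return {
        "structured": coded,  # usable without any transformation
        "free_text_summaries": [summarize_with_llm(n) for n in notes],
    }
```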
Of course, we should always aim to maximize the amount of structured, high-quality, and well-defined data. Structured data remains far more reliable than free text or proprietary formats, especially when we want to ensure consistent and safe use in healthcare.
What is the most critical shift needed, technical or cultural, for healthcare systems to fully realize equitable, AI-driven, and evidence-based digital health solutions?
The most critical shift we need is a cultural one: as a society, we must recognize the value of data sharing, not just for individual care, but for the collective good. Once we embrace that mindset, we can move forward and reshape our healthcare system into one that is more equitable, efficient, and genuinely innovative.
What worries you most about the current development of AI in healthcare?
What worries me most is that it's not happening fast enough. We have the tools, the knowledge, and the potential, but we’re not using them in clinical practice. We need to move to implementation. Let’s go!