We don’t start with AI, but with clinical problems to solve

Mon 25 August 2025
CHATGPT
Interview

Clínica Alemana de Santiago (Chile) has integrated large language models (LLMs) into its electronic health record (EHR) system to support clinical tasks like summarizing records, generating reports, and improving patient communication. Dr. Alejandro Mauro, Chief of Digital Transformation, explains what is critical for safe and effective use of LLMs.

Clínica Alemana has introduced large language models into its electronic health record system. How exactly is this integration working?

In November 2022, ChatGPT was released, surprising the world with its new language-handling capabilities. Those of us who had already explored and “tested without implementing” previous language models, such as BERT, RoBERTa, or medicalBERT, were aware of their potential but also of their limitations and the difficulty of incorporating them safely into a medical environment.

Incorporating ChatGPT into the EHR

The emergence of ChatGPT marked a turning point, adding what previous models lacked: retraining with human feedback, known as Reinforcement Learning from Human Feedback (RLHF).

In December 2022, we began to think of use cases to integrate it into our Electronic Health Record (EHR), developing clinical cases and running initial tests. At that time, it was not technically feasible to incorporate ChatGPT directly into the EHR, but we trusted that technology would advance. The opportunity came when Microsoft enabled the secure execution of OpenAI models in a private cloud infrastructure on Azure, which allowed us to accelerate the project.

As with any innovation initiative, this involved significant risk, and we lacked practical experience implementing LLMs within the EHR. Therefore, we decided to start development with a solid governance and control system, allowing us to create, edit, publish, and maintain prompts, as well as define precisely where in the EHR their use would be authorized.
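The governance layer described above can be illustrated with a minimal sketch. It is not AlemanaGPT's actual design: the class names, the draft/published states, and the location identifiers such as "outpatient_note" are assumptions made for illustration. It shows the two controls mentioned in the answer, a prompt lifecycle (create, edit, publish) and a check on where in the EHR a prompt may be used.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    PUBLISHED = "published"


@dataclass
class Prompt:
    name: str
    text: str
    allowed_locations: set[str]  # hypothetical EHR screen identifiers
    status: Status = Status.DRAFT
    history: list[str] = field(default_factory=list)  # superseded texts


class PromptManager:
    """Hypothetical registry: create, edit, publish, and authorize prompts."""

    def __init__(self) -> None:
        self._prompts: dict[str, Prompt] = {}

    def create(self, name: str, text: str, allowed_locations: set[str]) -> Prompt:
        if name in self._prompts:
            raise ValueError(f"prompt {name!r} already exists")
        prompt = Prompt(name, text, set(allowed_locations))
        self._prompts[name] = prompt
        return prompt

    def edit(self, name: str, new_text: str) -> None:
        prompt = self._prompts[name]
        prompt.history.append(prompt.text)  # keep the old version for audit
        prompt.text = new_text
        prompt.status = Status.DRAFT  # an edited prompt must be re-published

    def publish(self, name: str) -> None:
        self._prompts[name].status = Status.PUBLISHED

    def resolve(self, name: str, location: str) -> str:
        """Return the prompt text only if it is published and authorized here."""
        prompt = self._prompts[name]
        if prompt.status is not Status.PUBLISHED:
            raise PermissionError(f"prompt {name!r} is not published")
        if location not in prompt.allowed_locations:
            raise PermissionError(f"prompt {name!r} is not authorized in {location!r}")
        return prompt.text


manager = PromptManager()
manager.create("summary-dermatology", "Summarize the skin findings.", {"outpatient_note"})
manager.publish("summary-dermatology")
print(manager.resolve("summary-dermatology", "outpatient_note"))
```

In this sketch an edit sends a prompt back to draft, so a changed prompt cannot reach clinicians until someone publishes it again. That is one possible policy, not necessarily the one the clinic uses.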

AlemanaGPT

Thus, AlemanaGPT was born in April 2023. We prioritized the project, first building the prompt manager, then the EHR modules for invocation, and finally deploying it to production in October 2023, almost a year after the release of ChatGPT.

The first functionalities included:

  • EHR summarization. Those of us who develop EHRs know that, despite years of designing robust and well-organized systems, we often end up with platforms that are impractical to review under the real-time demands of clinical practice. ChatGPT’s arrival offered the possibility of solving this historical problem: turning hundreds of pages of structured and unstructured clinical information into a clear, useful summary, allowing professionals to make the most of the 20–30 minutes of a consultation.
  • Medical report generation automates documents requested by insurance companies, educational institutions, or other entities, based on a clinical note or the entire patient history, replacing what clinicians see as “administrative” tasks.
  • Question creation for residents assists educators in generating questions based on clinical cases to facilitate teaching and reduce the creative workload of the instructor.
  • Patient handouts help adapt technical medical language into plain Spanish understandable to non-medical individuals.
  • Improvement of clinical notes aims to expand and refine brief medical text into a more complete, clear, and grammatically correct version.
  • Treatment plan suggestions was an experimental feature, withdrawn after three months because the risks outweighed the benefits.

All these tools are customized by specialty and work area and tailored to the specific needs of each discipline. For example, a summary prompt for a dermatologist is not the same as one for a surgeon.

Can you please describe what was critical to make LLMs easy and safe to use?

Mainly, secure availability through private clouds that fully comply with legal requirements, both Chilean regulations and international standards. Many people, lacking technical knowledge, are unaware that by using public tools hosted on shared infrastructure from companies like Meta, OpenAI, or Google, they may be exposing sensitive patient data and potentially violating patients’ rights.

Those of us who are experts understand how to implement emerging technologies safely and under strict regulatory frameworks. That is why, when Microsoft, one of the cloud providers we work with, offered to help us develop the project in our private cloud, we didn’t hesitate and immediately began presenting the proposal to the Medical and Legal Departments.

How did you approach the adoption of LLMs, from initial steps to evaluation and ongoing system maintenance?

We conceive our EHR as a living organism that must grow and evolve; any system that does not evolve is destined to disappear. For us, this project was just another step in that evolution.

We usually release EHR improvements without major announcements, simply making them available to everyone or to certain users. It is valuable to see who notices the changes and tries them out. That’s how this project started: we added the AlemanaGPT icon in different places in the EHR and observed user behaviour, noting who clicked, which features were most attractive, who never used it, who tried it once, and who used it regularly.

From that information, we identified three groups to engage with to understand their experience, perceptions, and improvement opportunities. The feedback allowed us to optimize prompts, adjust specialty adaptations, and refine functionalities.
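The grouping step can be sketched as follows, assuming the three groups correspond to the usage patterns mentioned (never used, tried once, used regularly). The function name, the shape of the invocation log, and the `regular_min` threshold are hypothetical. The interview does not say how "regular" use was defined.

```python
from collections import Counter


def segment_users(all_users, invocations, regular_min=2):
    """Split users into never / tried / regular by number of invocations.

    all_users:    every user who could see the feature
    invocations:  one user id per recorded click on the feature
    regular_min:  assumed minimum number of uses that counts as "regular"
    """
    counts = Counter(invocations)
    groups = {"never": [], "tried": [], "regular": []}
    for user in all_users:
        n = counts[user]  # Counter returns 0 for users with no clicks
        if n == 0:
            groups["never"].append(user)
        elif n < regular_min:
            groups["tried"].append(user)
        else:
            groups["regular"].append(user)
    return groups


users = ["ana", "luis", "marta"]
log = ["luis", "marta", "marta", "marta"]
print(segment_users(users, log))
```

Listing every user who had access, not only those who appear in the log, is what makes the "never used it" group visible.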

AlemanaGPT is a living project, with weekly changes. This involves regular meetings with clinical teams for feedback, continuous prompt improvements, and adjustments to the language models used. We initially focused on GPT-3.5 and now utilize various models from OpenAI, Anthropic, and Google, depending on the task.

Even before LLMs, the clinic had implemented various algorithms into clinical workflows. Could you share some examples of AI tools that have had the most positive impact?

Since 2018, after a trip to Israel – home to numerous health AI models – we have been implementing AI algorithms to gain experience and assess their applicability in our cultural and clinical context. Each year, we test 2–4 commercial models with certifications that we can validate as safe for controlled trials. The usual process is to test them for six months and, based on results, decide whether to incorporate them permanently or discard them. From these years, I highlight three key projects:

  • RapidAI is a set of models that identify salvageable brain tissue after a stroke, enabling treatments that would be impossible without this support. It has benefited hundreds of stroke patients at Clínica Alemana, significantly reducing sequelae.
  • LimbusAI has been adopted in our radiotherapy centre to contour organs on CT scans so that they are not irradiated, a step essential for safe treatment planning that used to take up to three hours. With this AI algorithm, the process was reduced to just two minutes, optimizing time and resources.
  • DeepC. One of the challenges of incorporating AI is that each algorithm involves a lengthy project with configurations, contracts, and parameterization. DeepC, a sort of AI model marketplace, allows us to implement and test new models in a few clicks, drastically reducing setup times and speeding up trials.

In adopting cutting-edge technologies, what key lessons have you learned that could benefit other healthcare organizations?

The main lesson is that no technology solves all cases or is perfect. Many seem revolutionary and promise to change the game, but it is implementation that reveals their real scope and limitations.

The advantage of having the maturity to test quickly is that we can also discard quickly. We evaluate many technologies, but only a few remain in production: those that deliver real, sustained value.

Our most reliable metric is simple: do healthcare professionals use it? If yes, it adds value; if they use it little or not at all, we know we must look elsewhere.

The next stage of AI development involves AI agents, and you’ve already developed a modular multi-agent platform. Could you describe how this system works?

In 2024, publications on AI agents and Retrieval-Augmented Generation (RAG) proliferated, with promising proposals. We decided to look for a technology partner to help develop an agent for our EHR.

Working with a more experienced team, we discovered the scope should not be limited to the EHR but extended to all clinic systems: patient portal, website, billing, and more. Thus was born the Clínica Alemana Agent System, launched in production in March 2025.

This system allows us to easily create and connect agents to our information systems, offering a centralized, modular platform to deliver value in any software, whether proprietary or commercial. Agents can incorporate different “tools” that securely access clinic data, transform and analyze it through LLMs.
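The idea of agents that combine data-access "tools" with an LLM can be illustrated with a small sketch. It is an assumption-laden illustration, not the Clínica Alemana Agent System. The `Agent` class, the tool name "labs", and the stand-in functions `fetch_lab_results` and `fake_llm` are all hypothetical, and no real model or clinical system is called. To keep it short, the caller chooses the tool here, whereas agent frameworks often let the model choose.

```python
from typing import Callable, Dict

Tool = Callable[[str], str]  # takes a patient id, returns text from a clinic system
LLM = Callable[[str], str]   # takes a prompt, returns the model's answer


class Agent:
    """Hypothetical agent: fetches data through a registered tool, then asks an LLM."""

    def __init__(self, name: str, llm: LLM) -> None:
        self.name = name
        self._llm = llm
        self._tools: Dict[str, Tool] = {}

    def register_tool(self, tool_name: str, tool: Tool) -> None:
        self._tools[tool_name] = tool

    def run(self, tool_name: str, patient_id: str, instruction: str) -> str:
        # Only tools registered on this agent can touch clinic data.
        if tool_name not in self._tools:
            raise KeyError(f"agent {self.name!r} has no tool {tool_name!r}")
        data = self._tools[tool_name](patient_id)
        return self._llm(f"{instruction}\n\n{data}")


def fetch_lab_results(patient_id: str) -> str:
    # Stand-in for a secure query against the EHR.
    return f"Lab results for patient {patient_id}: (placeholder)"


def fake_llm(prompt: str) -> str:
    # Stand-in for a call to a privately hosted model.
    return f"[model output for {len(prompt)} characters of input]"


ehr_agent = Agent("ehr", fake_llm)
ehr_agent.register_tool("labs", fetch_lab_results)
print(ehr_agent.run("labs", "123", "List any abnormal values."))
```

Because tools are registered per agent, the same platform could expose different data to an EHR agent and to a billing agent, which fits the modular approach described above.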

The first three implementations are planned for the coming months:

  • EHR agent that allows professionals to “chat” with a patient’s medical record, asking specific questions. The agent searches the entire record, including test results, and returns the information in the requested format.
  • Secure AI portal for professionals to prevent the risky practice of uploading identifiable clinical data to public systems, offering a “private ChatGPT” within our cloud.
  • Clinical protocol agent that allows querying protocols via chat, even across hundreds of PDFs, and that, in a later stage, will integrate with the EHR to automatically check compliance.

Clínica Alemana de Santiago, Chile (Photo: www.fotografiaaerea.ci)

You currently serve as Chief Digital Transformation Physician and previously held the role of Chief Information Officer. How do these two roles differ?

Medical informatics specialization is rare in Latin America, with few training centres. The belief that “anyone can talk about technology and lead projects in this discipline” has led to many failed initiatives, for reasons well-documented in scientific literature.

It is like putting a dermatologist in charge of an ICU and expecting mortality not to rise. Each area of medicine requires expertise, and health informatics is no exception. It has its own research, literature, and body of knowledge.

In 2012, Clínica Alemana tasked me with leading the development and evolution of its information systems. I created the Biomedical Informatics Department, a medical department on par with Orthopedics, Pediatrics, or General Surgery. I led it from 2012 to 2022, then became Chief of Digital Transformation, focused on cutting-edge projects.

Currently, I split my time between reading papers, taking courses, exploring new technologies, and executing selected innovative projects.

How do you balance the adoption of technologies with proven clinical benefits, the need to experiment with breakthrough innovations, avoiding the hype around AI, and ensuring compliance with evolving regulations?

We are deeply sceptical and evidence-driven, but also aware that health informatics is a socio-technical discipline where organizational culture plays a decisive role.

We evaluate technologies objectively and methodically, first identifying risks of experimental use. This includes reviewing validation studies, FDA/CE certifications, and technical documentation.

We then conduct pilot tests with professionals to gather direct impressions, followed by larger trials before enabling solutions for all users. Finally, we monitor impact – usage, frequency, context, and real value delivered.