
Transforming prospective risk adjustment with agentic AI

November 20, 2025

Written by: Paul Burke, Chief Product Officer, Reveleer

Key Takeaways: Why the Hybrid AI Approach Wins
  • Pure GenAI is unreliable at scale: Generative AI alone lacks the control, transparency, and clinical reasoning needed for reliable prospective risk adjustment.
  • The Agentic Hybrid Solution: Reveleer's architecture splits the task: Generative AI extracts and validates raw evidence, while a clinician-owned, rules-based library handles deterministic clinical reasoning.
  • Audit-Ready Transparency: The solution provides a traceable evidence graph for every suspect, ensuring compliance and easy audit defense, unlike "black box" models.
  • Operational Efficiency: This hybrid approach reduces "noise" (false positives) by up to 50%, improves capture rates, and allows for rapid, auditable responses to guideline changes.

The future of risk adjustment and quality gap closure is being shaped by smarter technology that improves how we identify patient risks, ensure accurate reimbursements, and drive better health outcomes. Success depends on how well these tools perform in real-world settings to deliver practical, reliable results.

To achieve this, organizations must invest in clinically sound approaches that accurately predict conditions and coding opportunities. This helps streamline operations across payers, providers, and vendors. Strong collaboration between technology providers and healthcare organizations is essential to unlock these benefits.

In value-based care, Generative AI (GenAI) is gaining attention for its potential to transform risk adjustment. By combining natural language processing (NLP) and data synthesis, GenAI has demonstrated in specific use cases that it can assemble a complete picture of a patient's health, helping providers close specific care gaps without disrupting their workflow. Many theorize that GenAI will be able to automatically detect at-risk conditions, suggest diagnosis codes, and flag inconsistencies, all in real time. It is a great premise, but GenAI on its own cannot do that accurately and reliably. In one-off demos on small sample data and narrow use cases, the technology has shown potential. At scale, however, it is unlikely to manage the complexities of healthcare without human-in-the-loop clinical expertise supplying the corpus of facts and formulas needed to deliver the right results.

After years of experimentation, we don't believe that GenAI alone can be the full solution. First, it cannot offer the level of control and transparency needed to deliver suspecting at the point of care in ways that can be continuously micro-adjusted. Second, it cannot perform clinical reasoning reliably at scale on complex clinical diagnoses. Think of GenAI as a toddler: it can have moments of genius, but you cannot depend on it.

What GenAI can do well is interpret unstructured clinical language and extract discoverable clinical evidence objects, which can then be compared against a new generation of NLP models that understand the same unstructured language. With this unique combination, a hybrid solution can automatically detect at-risk conditions, suggest diagnosis codes, flag inconsistencies, and provide a reliable, transparent, non-AI clinical reasoning layer.

The real breakthrough lies in how it’s integrated into a broader cognitive system—one that combines clinical expertise, prompt engineering, and AI-driven workflows. This enables timely, accurate predictions and recommendations that support a shift toward proactive, prospective risk management.

Ultimately, the key to progress is not just the technology itself, but how thoughtfully it’s applied to improve service, scale operations, and reduce costs.

Our agentic-based suspecting solution

Reveleer has been on a multi-year journey to bring generative suspecting to life by methodically delivering this bottom-up, clinical-first model. Our solution is an LLM-agnostic cognitive architecture built to support a widening array of prospective suspecting use cases. This proprietary AI-powered platform replaces our legacy NLP system for prospective suspecting with a more accurate, scalable, and cost-effective solution.

This new solution comprises two core services that power the intelligent prospective risk and quality workflows before, during, and after the point of care:

  • Evidence Extraction Service – extracts structured and unstructured data from diverse clinical chart formats—including PDFs, claims, and FHIR files. It interprets complex conditions, identifies relevant chronic diagnoses, and discerns family history, even from checkboxes.
  • Prospective Suspecting Engine – this rules-based engine applies over 3,300 clinical rules (including Rx and curated logic) to identify clinically relevant conditions. It supports multiple HCC models (V28, Rx-V08, ESRD-V24) and can be extended to quality models.

The real impact of GenAI suspecting for our customers

Generative AI at the point of care is transforming clinical workflows by automating documentation, enhancing record accuracy, and allowing clinicians to focus on the patient while maintaining oversight of their notes. It enables smarter, faster identification of undiagnosed conditions for risk adjustment, driving both better financial and clinical outcomes in value-based care environments. Our new cognitive architecture delivers on all of those fronts: it supports real-time identification of chronic conditions and has been proven with our customers to reduce noise by 50%. Benefits include:

  • Higher capture rates: Our new hybrid architecture reduces noise by 50% while maintaining a similar match rate.
  • Compliance and accuracy: Automated identification and suggestion features reduce human error, support accurate documentation, and ensure compliance with CMS requirements.
  • Holistic patient view: Provides a comprehensive, real-time picture of patient risks, enabling more tailored and effective care interventions.

It is important to remember that for each customer, the patient population, data set, and value-based care priorities are different. As we roll out this new system, we ensure that our solution matches or exceeds legacy performance benchmarks, extracts complete evidence types, meets cost targets and customer expectations, and delivers an intuitive UI with traceable evidence. We operate as strategic partners to continuously optimize the technology to meet the requirements and conditions of our customers’ environment. We fully embrace a continuous improvement and optimization mindset to capture and support all of the unique clinical scenarios of each population.

Prospective suspecting with cognitive architectures

Our new technology is now available for prospective suspecting: the process of identifying potential chronic conditions or care gaps before a patient encounter, enabling proactive clinical action within the delivery workflow. Integrating at the point of care means surfacing these insights both in admin portals and directly in the integrated EMR. Our solution is helping providers identify chronic conditions before they impact care or reimbursement. Key use cases include:

  • Suspecting for chronic conditions using NLP across multiple chart files.
  • Reducing noise in HCC suspecting by over 50%.

It integrates into the clinical workflow through our prospective risk adjustment solutions in the following ways:

  1. Chart ingestion: It receives chart files in various formats.
  2. Evidence extraction: NLP processes the content to extract relevant clinical indicators.
  3. Rule application: The suspecting engine applies clinical rules to identify potential diagnoses.
  4. Output delivery: Results are surfaced in our Prospective Risk Adjustment UI with evidence links and model support.
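The four steps above can be chained as a simple pipeline. This is a hedged sketch under stated assumptions: the function names are illustrative, a keyword match stands in for the real extraction models, and the output step is stubbed.

```python
def ingest_chart(raw_bytes: bytes) -> str:
    """Step 1: accept a chart file and return its text content.
    (Real ingestion would branch on PDF, CDA, HL7, FHIR, etc.)"""
    return raw_bytes.decode("utf-8", errors="ignore")


def extract_evidence(text: str) -> list[dict]:
    """Step 2: NLP pass that pulls clinical indicators from the text.
    A trivial keyword match stands in for the real extraction model."""
    findings = []
    if "metformin" in text.lower():
        findings.append({"kind": "medication", "name": "metformin"})
    return findings


def apply_rules(evidence: list[dict]) -> list[str]:
    """Step 3: deterministic clinical rules over the extracted evidence."""
    suspects = []
    if any(e["kind"] == "medication" and e["name"] == "metformin"
           for e in evidence):
        suspects.append("possible diabetes (HCC suspect)")
    return suspects


def deliver_output(suspects: list[str]) -> str:
    """Step 4: surface results with evidence links in the UI (stubbed)."""
    return "; ".join(suspects) if suspects else "no suspects"


chart = b"Patient continues metformin 500 mg twice daily."
print(deliver_output(apply_rules(extract_evidence(ingest_chart(chart)))))
```

The point of the shape, not the toy logic, is that each stage has a typed boundary, so extraction quality and rule quality can be measured and tuned separately.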

How agentic suspecting differs from previous methods

In a legacy extraction pipeline, the approach most companies use today, charts go through a HIPAA-eligible natural language processing (NLP) service that uses machine learning to extract unstructured medical information from clinical text, such as diagnoses, medications, symptoms, and tests. The system emits a flat list of entities and confidence scores, then layers ontologies and custom statistical and data-science models on top to eliminate noise and tie those entities to a single purpose, such as retrospective coding or quality-gap checks.

Because the output is loosely structured, every new use case (prospective risk, a new quality measure, specialty logic, etc.) requires building a fresh set of filter models. There is no common evidence model that the rest of the organization can reuse.

With an agentic pipeline, our solution optimizes the current workflow to reduce abrasion and improve efficiency and outcomes. First, any CDA, HL7, PDF, or scanned page is converted into a lossless JSON structure, with OCR applied where needed. An LLM agent then extracts every data point ("Clinical Indicator") our clinical rule library cares about (medications, labs, tests, procedures, symptoms, disease mentions, values, units, dates, negations, and standard codes) and writes it into a structured record. A second LLM agent validates those findings, checking timing, duplicate mentions, "rule-out" language, clinical plausibility, and specificity. Anything that passes becomes part of a validated evidence graph tied to exact page-and-line locations in the source chart. Downstream reasoning lives in a generic, algebra-style formulas engine.

With this initial process complete, the same evidence graph can now drive prospective suspecting, HEDIS quality gap closure, retrospective MEAT/TAMPER audits, or any future rule family we add.
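A single node in such an evidence graph might look like the following. The field names and schema here are illustrative assumptions, not a published Reveleer format; the point is that each validated finding carries its value, context, and exact provenance in the source chart.

```python
import json

# Illustrative evidence-graph node: one validated clinical indicator,
# tied back to the exact page and line of the source chart.
evidence_node = {
    "indicator": "troponin I",
    "kind": "lab",
    "value": 0.42,
    "unit": "ng/mL",
    "date": "2025-03-14",
    "negated": False,
    "provenance": {"document": "chart_0017.pdf", "page": 4, "line": 22},
    "validation": {
        "duplicate": False,        # not a repeated mention
        "rule_out_language": False,  # not phrased as "rule out MI"
        "plausible": True,         # value and units are clinically sensible
    },
}

# Any downstream rule family (prospective risk, HEDIS, MEAT/TAMPER)
# consumes nodes like this without re-extracting the chart.
print(json.dumps(evidence_node["provenance"]))
```

Keeping provenance on every node is what makes each suspect auditable down to the page-and-line level.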

Why this matters for our customers

Our new hybrid architecture separates evidence generation from clinical reasoning, creating a single, traceable evidence graph and a modular formulas engine. This enables us to move faster, serve more customer needs from the same data, and lead the market with a scalable, next-generation prospective risk adjustment solution with the following benefits:

  • Reusable evidence layer. The GenAI work is done once per chart. Every current and future use case draws on the same high-quality evidence.
  • Faster product expansion. Because reasoning is handled by interchangeable formulas, new rule families can launch in weeks without new extraction code or extra data-science projects.
  • Quick response to guideline changes. Clinicians edit a rule or prompt in the admin UI, publish, and the system redeploys automatically with full audit history without the need to rebuild dictionaries or retrain models.
  • Clear evaluation dashboards. Performance and results are tracked separately for the evidence layer (precision, recall, noise rate) and for each rule model (HCC/ICD match rate, RAF accuracy, quality-gap closure, etc.). Results are versioned, grouped by disease, and shared with customers to show continuous quality improvement.
  • Audit-ready transparency. Every suspect includes the supporting evidence chain, satisfying payer, provider, and regulatory reviewers.

Uniquely capable of handling complex, multi-factor scenarios

Take the 65-year-old diabetic with chronic kidney disease, chest pain, elevated troponins, prior coronary disease, and a history of pulmonary embolism. Our evidence layer captures each fact with context—dates, units, negations—and feeds it into a multi-clause formula that mirrors how a cardiologist thinks: (acute MI criteria met AND CKD stage ≥ 3) OR (chest-pain mention + rising troponin + CAD history).

Because the logic is explicit, we can test, tune, and prove its performance on every future release of our software. An LLM that reasons internally may get this right today and wrong after the next fine-tune, and there is no way to know why. It's a black box.
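Because the logic is explicit, the cardiology example above can be written as an ordinary boolean formula. This is a sketch with illustrative predicate names; in production each input would be derived from the validated evidence graph rather than passed by hand.

```python
def cardiology_suspect(
    acute_mi_criteria_met: bool,
    ckd_stage: int,
    chest_pain_mentioned: bool,
    troponin_rising: bool,
    cad_history: bool,
) -> bool:
    """Mirror of the multi-clause formula from the text:
    (acute MI criteria met AND CKD stage >= 3)
    OR (chest-pain mention AND rising troponin AND CAD history)."""
    return (acute_mi_criteria_met and ckd_stage >= 3) or (
        chest_pain_mentioned and troponin_rising and cad_history
    )


# The 65-year-old patient from the example: chest pain, rising troponins,
# and prior coronary disease, so the second clause fires.
print(cardiology_suspect(False, 3, True, True, True))  # True
```

Because every clause is named and inspectable, the formula can be unit-tested on each release, which is exactly what an LLM's internal reasoning does not allow.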

How our approach compares

Some newer GenAI-focused healthcare technology vendors are betting that one large language model can read a chart, reason about everything it finds, and immediately declare whether the case meets a given clinical rule. As a demonstration of potential, the concept is remarkable: there are no clinician-built rules and no explicit logic. You just ask the model and it answers. It is a phenomenal vision, but when the system has to operate at scale in real payer and provider workflows, that approach is likely to hallucinate and misrepresent, because a monolithic model cannot address the condition, provider, and plan nuances that exist in healthcare. Our approach takes those key healthcare market dimensions into account, with the following benefits:

  • Consistency and control: Every time an LLM is retrained or upgraded, its internal reasoning shifts in subtle ways. That means the same chart can produce a different suspect list from one release to the next, even if no guideline has changed. In the Reveleer solution, the “reasoning” step is captured in deterministic, algebra-style formulas that our clinical team owns. We can guarantee that a diabetes formula behaves the same tomorrow as it did yesterday unless we intentionally change it.
  • Explainability: Health plan users regularly ask, “Why did you flag this member?” and “Can you tighten the logic here?” A black box model can only offer a probability score or a generated paragraph. Our solution returns the exact evidence chain (page, line, value) plus the formula that evaluated that evidence. If a customer wants a tweak—say, a longer look-back window for chronic kidney disease—the formula can be edited in the admin UI.
  • Ethical and regulatory risk: When the model itself decides the clinical threshold, any hidden bias or drift becomes a patient-safety issue. Auditors will ask for the rule book; a generative model has no rule book. Our approach keeps humans “in the loop.” LLMs gather and validate facts, but a human-readable rule reviewed and approved by clinicians makes the final call.
  • Flexibility without rework: Because of the same variable-reasoning issues as LLMs evolve, competitors must retrain or prompt-engineer the whole model whenever they want a new use case. Our solution just writes a new formula that consumes the existing evidence graph. Prospective risk, HEDIS gap closure, and retrospective MEAT/TAMPER audits all share the same extracted facts.
  • Performance accountability: We track performance separately for the evidence layer (precision, recall, noise rate) and for every formula (HCC/ICD match rate, RAF uplift, quality-gap closure). Those dashboards are visible to customers. A model-only system has only end-to-end accuracy; if results slip, no one can tell whether the issue is extraction, reasoning, or both.

The bottom line  

A single model in isolation is powerful, but it simply cannot address the nuances of healthcare. Payers and providers need repeatable, explainable, transparent, and adjustable logic they can trust. Reveleer's new hybrid AI technology splits the problem into two clear responsibilities: (1) LLMs collect and validate the raw evidence, and (2) clinician-owned formulas decide what that evidence means. This hybrid design enables our solution to improve coverage rapidly while preserving the auditability and clinical confidence the market demands, a guarantee a pure "model does it all" approach cannot make at enterprise scale.

About the Authors

Paul Burke, Chief Product Officer, Reveleer

With over 25 years of experience in digital product innovation, Paul is a leader in developing technology-driven solutions that bridge gaps in healthcare. As Chief Product Officer at Reveleer, Paul leads our product vision and strategy, ensuring the company responds to the evolving needs of healthcare organizations embracing value-based care.

Julien Brinas, SVP, AI & Technology, Reveleer

As an entrepreneur and engineering leader focused on leveraging AI to transform complex clinical data for healthcare organizations, Julien brings more than 20 years of experience building and running software across industries from banking to healthcare. As co-founder of MDPortals, a prospective suspecting platform acquired by Reveleer, he helped pioneer point-of-care intelligence for risk adjustment. Today, as SVP of AI & Technology at Reveleer, he leads the company’s next generation of AI for prospective suspecting, which helps payers and risk-bearing providers identify diagnosis gaps earlier, improve RAF accuracy, and scale value-based care programs.