Building Decision Agents with LLMs & Machine Learning Models

IBM Technology

16,445 views • 1 month ago

Video Summary

Decision agents are crucial for agentic AI tackling complex problems, but large language models (LLMs) are ill-suited for this role due to inconsistency, opacity, and inability to process historical data. Instead, established technologies like decision platforms and business rules management systems offer the necessary consistency, transparency, and agility. These platforms enable domain experts to manage decision logic through low-code environments and integrate analytical insights for more precise outcomes.

These decision platforms provide robust tools for developing and managing decision logic, including specialized editors and repositories for version control. They facilitate thorough validation and testing of rules, ensuring logic completeness and accuracy. Furthermore, simulation tools allow for impact analysis of potential rule changes before deployment, leading to more reliable decision-making processes.

For probabilistic components of decisions, machine learning platforms are employed to build predictive models from historical data, generating insights on factors like fraud risk or creditworthiness. These models are deployed as endpoints and consumed by decision agents. While LLMs aren't primary decision-making tools, they can enhance data ingestion from documents and provide human-readable explanations of complex decision logs, bridging the gap between system logic and human understanding.

Short Highlights

Large language models (LLMs) are not suitable for decision agents due to inconsistency, lack of transparency, and poor handling of historical data.
Decision platforms and business rules management systems are recommended for building decision agents, offering consistency, transparency, agility, and the ability to integrate domain knowledge.
These platforms include editors, repositories for version control, validation tools, and simulation capabilities to ensure accurate and robust decision logic.
Machine learning platforms are used to create predictive models for probabilistic decision components, which are then integrated into the agentic framework.
LLMs can enhance decision agents by aiding in data ingestion from documents and generating explanations for complex decision logs.

Key Details

The Limitations of Large Language Models for Decision Agents [00:15]

Large language models (LLMs) are known for inconsistency: the same request can produce different answers from one day to the next.
They are notoriously a "black box," making it difficult to explain why a certain decision was made, which is crucial in many business contexts (e.g., loan or job rejections).
LLMs struggle to process historical data and turn it into analytical insight, a key requirement for informed business decisions.
These limitations make LLMs unsuitable for building autonomous decision agents needed for complex problem-solving in agentic AI.

The challenge is that if you have a complex decision that you need to make autonomously, and if you're building agentic AI, you're going to need decisions to be made autonomously. Ah, but these decisions are not a great fit for large language models.

The Advantages of Decision Platforms and Business Rules Management Systems [01:42]

Decision platforms and business rules management systems are well-established automation technologies providing key value propositions for decision agents.
They offer ruthless consistency, ensuring the same decision is made the same way every time for every customer.
They provide complete control over how decisions are made, with transparent logs and formal definitions of the steps and rules followed.
Agility is a significant benefit, allowing for quick responses to changes in competitors, markets, or regulations without lengthy retraining processes.
These systems facilitate engagement with domain experts through low-code environments, allowing them to manage decision agent behavior programmatically.
They enable the embedding of analytical insights derived from historical data to improve decision precision and accuracy.

First and foremost, we're going to get consistency. So if I use one of these platforms, it's going to make the same decision the same way every time.

Requirements for Decision Agents [02:00]

Consistency: Decisions must be made identically every time, ensuring fairness and predictability.
Transparency: The logic and steps behind a decision must be explainable and auditable.
Agility: The ability to quickly adapt decision-making processes to changing external factors like market shifts or new regulations.
Domain Knowledge Integration: Easy incorporation of expertise from individuals with deep understanding of the business area.
Low-Code Environment: A user-friendly interface that allows non-programmers to manage and define decision logic.
Embedding Analytics: The capability to integrate analytical insights from data to enhance decision accuracy.

Why LLMs Fall Short on Decision Agent Requirements [04:03]

LLMs are inherently inconsistent, with their variation and randomness being a core feature, not a bug, making them difficult to control for consistent decision-making.
They are opaque, making it hard to explain decision processes, which erodes confidence when interacting with customers or regulators.
Changing LLM behavior can be difficult without retraining, unlike explicitly coded business rules.
LLMs are not good at processing structured data or building predictive models from historical data to improve decision precision.

They are definitely not transparent, right? They're very opaque about how they did things. Even attempts to get them to explain themselves are problematic.

Building Stateless, Side-Effect-Free Decision Agents [06:27]

Decision agents need to be stateless, meaning they respond based solely on the data provided at the moment, without retaining past states.
They must be side-effect-free, meaning they only make decisions and do not perform actions that alter external systems.
Statelessness allows workflow agents to manage state and gather data, keeping decision agents simpler and more scalable.
Side-effect-free design enables reuse of decision agents across multiple workflows and contexts (e.g., loan origination, customer communication).
This separation of decision-making from action improves modularity and reusability.

So what does stateless mean? It means that you want them just to respond to whatever data they're given at the moment they're given the data.
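
The two properties above can be sketched as a pure function. This is a minimal illustration, not the video's implementation: the `LoanRequest` fields, the thresholds, and the rule names are all hypothetical. The function reads nothing but its argument and writes nowhere, so any workflow can call it and the same input always yields the same decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoanRequest:
    # Hypothetical input fields; a workflow agent gathers them beforehand.
    credit_score: int
    annual_income: float
    requested_amount: float


@dataclass(frozen=True)
class Decision:
    outcome: str               # "approve", "refer", or "decline"
    reasons: tuple[str, ...]   # names of the rules that fired, for the audit log


def decide_loan(request: LoanRequest) -> Decision:
    """Return a decision from the request alone: no stored state, no I/O."""
    # Thresholds below are made up for illustration.
    if request.credit_score < 580:
        return Decision("decline", ("credit_score_below_580",))
    if request.requested_amount > 0.5 * request.annual_income:
        return Decision("refer", ("amount_exceeds_half_of_income",))
    return Decision("approve", ("all_checks_passed",))


request = LoanRequest(credit_score=700, annual_income=80_000, requested_amount=20_000)
print(decide_loan(request))
```

Acting on the result (sending a letter, opening an account) would stay with the calling workflow, which is what lets the same decision be reused in both loan origination and customer communication.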

Components of a Decision Platform [08:19]

Decision platforms are software stacks designed to build decision services, which are then wrapped into decision agents.
They typically include editors, such as a technical IDE and a low-code editor, for writing business rules and decision logic.
A single repository is linked to these editors, often with specialized features for business rules, including version control and branching.
Validation tools analyze the rules in the repository to check for completeness, overlapping ranges, and other logical issues, ensuring more robust logic.
Testing tools, ranging from simple JSON interfaces to sophisticated test suites, allow for rigorous validation of decision logic against expected outcomes.
Impact tools (or simulation tools) allow for running simulations on historical data to assess the business impact of rule changes before deployment.
A deployment engine deploys the validated rules as a service, which can then be exposed as an agent via protocols like MCP (Model Context Protocol).
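
The completeness and overlapping-range checks mentioned above can be illustrated with a small sketch. It assumes a simplified rule shape that is not from the video: each rule covers a half-open `[start, end)` range of one numeric input, and all ranges lie within `[low, high)`. Real platforms analyze much richer rule structures.

```python
def check_ranges(rules, low, high):
    """Report gaps and overlaps among half-open [start, end) rule ranges."""
    problems = []
    cursor = low  # every value below cursor is already covered by some rule
    for start, end, _outcome in sorted(rules):
        if start > cursor:
            problems.append(("gap", cursor, start))
        elif start < cursor:
            problems.append(("overlap", start, min(cursor, end)))
        cursor = max(cursor, end)
    if cursor < high:
        problems.append(("gap", cursor, high))
    return problems


# Hypothetical credit-score rules with one overlap and one gap.
rules = [
    (300, 580, "decline"),
    (580, 670, "refer"),
    (650, 700, "refer"),
    (720, 851, "approve"),
]
print(check_ranges(rules, low=300, high=851))
```

Because the check runs over rule definitions rather than live traffic, it can flag uncovered or doubly covered inputs before anything is deployed.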

Once it's in this repository, and because it's a decision platform focused only on decision-making logic that is stateless and side-effect-free, you can do a lot more testing and validation of the logic.
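
Impact analysis of the kind described in this section can be sketched as replaying historical cases through two versions of the rules and comparing the outcomes. The rule functions, the cutoffs, and the five-record history here are hypothetical stand-ins, not data from the video.

```python
from collections import Counter


def current_rules(applicant):
    # Hypothetical version in production today.
    return "approve" if applicant["credit_score"] >= 680 else "decline"


def proposed_rules(applicant):
    # Hypothetical change under evaluation: a lower cutoff.
    return "approve" if applicant["credit_score"] >= 650 else "decline"


def simulate(rules, history):
    """Count the outcomes a rule set would have produced on past cases."""
    return Counter(rules(applicant) for applicant in history)


history = [{"credit_score": score} for score in (610, 655, 670, 690, 720)]
before = simulate(current_rules, history)
after = simulate(proposed_rules, history)
changed = sum(current_rules(a) != proposed_rules(a) for a in history)
print(before, after, changed)
```

This only works because the rules are stateless and side-effect-free: replaying old cases cannot trigger real actions or depend on hidden state.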

Managing Rule Changes and Deployments [13:19]

Decision platforms are adept at handling rule changes by updating the engine and managing in-flight transactions without breaking them.
Rules can be packaged as decision services, reused across multiple services, and deployed.
These services are then wrapped as agents and exposed within an agentic framework.
The repository manages versions, allowing for updates and rollbacks of rules.

So what this lets me do is it lets me build these rules, build these decision agents in a very robust way, and then deploy them as a service that I can then use to support my agentic framework by exposing them as agents.
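
The versioning, rollback, and in-flight handling described in this section might look roughly like the following. `RuleRepository` and its methods are hypothetical names for illustration, not a real product's API, and pinning an in-flight case to the version it started with is one assumed way of not breaking it.

```python
class RuleRepository:
    """Keeps every published rule version so any of them can be reactivated."""

    def __init__(self):
        self._versions = []  # rule functions; list index + 1 is the version number
        self._active = 0     # 0 means nothing has been published yet

    def publish(self, rules):
        self._versions.append(rules)
        self._active = len(self._versions)
        return self._active

    def rollback(self, version):
        if not 1 <= version <= len(self._versions):
            raise ValueError(f"unknown version {version}")
        self._active = version

    def active_version(self):
        return self._active

    def decide(self, data, version=None):
        """Evaluate with a pinned version, or the active one if none is given."""
        chosen = version or self._active
        if not 1 <= chosen <= len(self._versions):
            raise ValueError(f"unknown version {chosen}")
        return self._versions[chosen - 1](data)


repo = RuleRepository()
v1 = repo.publish(lambda d: "approve" if d["credit_score"] >= 680 else "decline")
v2 = repo.publish(lambda d: "approve" if d["credit_score"] >= 650 else "decline")
applicant = {"credit_score": 660}
print(repo.decide(applicant))              # uses the active version (v2)
print(repo.decide(applicant, version=v1))  # an in-flight case pinned to v1
repo.rollback(v1)                          # reactivate v1; v2 stays stored
```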

Incorporating Probabilistic Elements with Machine Learning [14:00]

Many decisions involve probabilistic elements, often built using predictive analytics and machine learning on historical data.
Machine learning platforms are used to build models for predictions like fraud likelihood, credit risk, or payoff risk.
These ML models are deployed as endpoints that can be consumed by decision agents.
The process involves data merging, feature engineering (creating predictive characteristics), and running various ML algorithms (neural networks, regression, decision trees) to find patterns and make predictions.
ML can be supervised (trained on historical cases with known outcomes, typically guided by a data scientist) or unsupervised (discovering patterns in the data without labeled outcomes).
These deployed endpoints run algorithms trained on historical data to calculate scores or risks in real-time.

These are probabilistic elements that are typically built using predictive analytics, machine learning from my historical data.
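
A rough sketch of how a decision agent might consume such a score follows. The coefficients, feature names, and thresholds are invented for illustration. In practice `fraud_score` would be a call to a deployed model endpoint, not a local formula.

```python
import math

# Hypothetical coefficients standing in for a model trained on historical data.
INTERCEPT = -4.0
WEIGHTS = {
    "chargebacks_last_year": 0.9,
    "account_age_years": -0.3,
    "amount_thousands": 0.2,
}


def fraud_score(features):
    """Stand-in for a deployed model endpoint: returns a probability in (0, 1)."""
    z = INTERCEPT + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def decide_payment(features, score_fn=fraud_score):
    """Deterministic rules applied on top of a probabilistic score."""
    score = score_fn(features)
    if score >= 0.5:
        return "block"
    if score >= 0.1:
        return "review"
    return "allow"


payment = {"chargebacks_last_year": 0, "account_age_years": 5, "amount_thousands": 1}
print(decide_payment(payment))
```

The model supplies the probability, while the thresholds that turn it into an action remain explicit rules that can be changed without retraining.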

Enhancing Decision Agents with Large Language Models [18:30]

Decision platforms and machine learning platforms, while distinct from LLMs, can be enhanced by them.
Data Ingestion: LLMs excel at extracting necessary data from various sources like documents, brochures, or recorded conversations, making it easier to supply data to decision agents.
Explanation of Results: LLMs can take detailed logs of how a decision was made and translate them into human-readable explanations for call center representatives or customers.
This integration makes interaction with decision agents more fluid, both for inputting data and understanding outcomes.

So one of the other use cases for LLMs is to take this log data and turn it into an explanation.
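
The explanation use case can be sketched as a thin wrapper around whatever text-generation model is available. The log fields and prompt wording are hypothetical, and `generate` is an assumed callable rather than any specific vendor API. The decision is already made when the LLM is invoked, so the model only rephrases the log.

```python
import json


def build_explanation_prompt(decision_log):
    """Turn a structured decision log into a prompt for a text-generation model."""
    return (
        "Explain the following automated decision to a customer in plain language. "
        "Use only the facts in the log and do not change the outcome.\n\n"
        + json.dumps(decision_log, indent=2, sort_keys=True)
    )


def explain(decision_log, generate):
    """`generate` is any callable mapping a prompt string to generated text."""
    return generate(build_explanation_prompt(decision_log))


# Hypothetical log as a decision service might emit it.
log = {
    "outcome": "decline",
    "rules_fired": ["credit_score_below_580"],
    "inputs": {"credit_score": 540},
}
# A stub stands in for the model so the sketch runs without any external service.
print(explain(log, generate=lambda prompt: f"[model output for {len(prompt)} chars]"))
```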

Learning and Improvement in Agents [20:29]

Analytic Agents: Unsupervised ML agents can learn automatically by updating themselves with new data, while others may require data scientists to periodically retrain models based on new analysis.
Decision Agents: These agents do not "learn" in the same way; their behavior is intentionally fixed. Improvement comes through experimentation with different versions of rules.
A/B Testing/Champion Challenger: Multiple versions of rules can be coded, and the system can run comparisons to determine which performs better.
Framework-Level Learning: For complex decisions with long feedback loops (like loan repayment), learning requires a systematic process. This involves logging decision parameters and outcomes over time, then conducting detailed analysis to inform future improvements to the overall agentic AI framework.
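
A champion/challenger comparison along these lines could be sketched as follows. The 20% challenger share, the hash-based routing, and the outcome encoding are assumptions for illustration. Hashing the customer ID keeps the assignment deterministic, so a given customer always sees the same version of the rules.

```python
import hashlib


def assign_variant(customer_id, challenger_share=0.2):
    """Deterministically route a fixed share of customers to the challenger rules."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return "challenger" if bucket < challenger_share else "champion"


def compare(outcome_log):
    """Average the logged outcome (e.g. 1 = repaid, 0 = defaulted) per variant."""
    totals = {}
    for variant, outcome in outcome_log:
        count, total = totals.get(variant, (0, 0))
        totals[variant] = (count + 1, total + outcome)
    return {variant: total / count for variant, (count, total) in totals.items()}


# Hypothetical outcomes logged once the feedback loop has closed.
outcome_log = [("champion", 1), ("champion", 0), ("challenger", 1)]
print(assign_variant("customer-42"))
print(compare(outcome_log))
```

For decisions with long feedback loops, the outcome log fills in months after the decision, which is why the variant and decision parameters need to be recorded at decision time.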