#316 Enterprise AI Agents with Jun Qian, VP of Generative AI Services at Oracle

DataCamp

441 views 2 months ago

Video Summary

The conversation explores the rapid evolution and enterprise adoption of AI agents, emphasizing that while the underlying models are abundant, the "secret sauce" lies in their agentic features, such as the integration of web search and memory. The speaker posits that future iterations, like GPT-5, will likely be agents themselves, envisioning a future with personalized digital companions. The discussion highlights the challenges and opportunities in making organizations "AI agent ready," focusing on managing corporate data, governing AI agents in production, and the crucial role of Retrieval Augmented Generation (RAG) systems.

A key focus is the leap from traditional chatbots to RAG-based systems, explaining how large language models have overcome previous limitations in natural language understanding and context, while the ongoing challenge remains in maintaining and updating knowledge bases. The speaker offers practical advice for enterprise adoption, suggesting that leveraging existing enterprise search systems and data management practices is a viable way to build RAG capabilities quickly. This approach allows organizations to add a RAG layer on top of their current infrastructure without a complete rebuild, enabling faster adoption and tangible results, particularly in areas like customer support.

The discussion also delves into the practicalities of enterprise AI agent implementation, including the critical need for robust data management, security, and governance. It stresses the importance of collaboration between technical teams, legal, and compliance departments to navigate the tension between speed and control. The speaker encourages an iterative approach to development, starting with simpler use cases like support chatbots and gradually advancing to more complex agents, while emphasizing continuous learning and broad-minded exploration of AI's potential across various domains, from coding assistance to future advancements in robotics and computing infrastructure.

Short Highlights

  • AI agents are distinguished by their agentic features and capabilities like web search and memory, moving beyond foundational models.
  • Retrieval Augmented Generation (RAG) systems are crucial for enterprise AI, enhancing traditional chatbots by integrating large language models with updated knowledge bases.
  • Effective data management and governance are paramount for AI agent integration, requiring collaboration between technical, legal, and compliance teams.
  • Adopting AI agents is an iterative process, with RAG systems and support chatbots serving as good starting points for tangible results.
  • Future AI advancements include more powerful computing infrastructure (e.g., 1 million GPUs), enhanced robotics, autonomous driving, and a fundamental rethinking of computing systems.

Key Details

The Evolution of AI Agents and Chatbots [00:00]

  • The distinction of AI agents lies in their agentic features, not just the underlying foundation models.
  • ChatGPT is cited as an example of a current AI agent, with its success attributed to its agentic capabilities.
  • Future AI development, such as GPT-5, is predicted to be more agent-centric than model-centric, envisioning personalized digital companions.
  • Key features that differentiate advanced AI agents include web search and memory capabilities.
  • The ability to recall past interactions and store user preferences is a significant factor in an agent's relevance.

The speaker argues that the true innovation in AI agents stems from their agentic features, going beyond the foundation models themselves. ChatGPT is presented as a prime example, with its integrated memory and search functionalities being its "secret sauce." This leads to the prediction that future AI, like GPT-5, will likely evolve into more sophisticated agents rather than just sophisticated models, paving the way for personalized digital companions.

What makes ChatGPT different is its agentic features, right? Think about it. I'm pretty sure you use chatbots all the time, but what makes ChatGPT different? I think that's the secret sauce.

Bridging the Enterprise AI Chasm [01:06]

  • There is a significant gap between declaring a company's commitment to AI agents and realizing actual value, especially in enterprise settings.
  • Key challenges for enterprise adoption include making organizations AI agent-ready, managing corporate data for AI agents, and governing these agents in production.
  • The guest, Jun Qian, has extensive experience in developing high-profile AI products, including Microsoft Power Virtual Agents.

This section introduces the core problem: the difficulty enterprises face in translating AI agent ambitions into tangible business value. It outlines the critical areas that need to be addressed: preparing the organization, handling data, and establishing governance. The introduction of the guest, Jun Qian, highlights his relevant expertise in bringing AI solutions to the enterprise market.

AI agents are the hottest of hot topics, but there's a big leap from your CEO declaring your company to be all-in on AI agents to actually getting some value from them, especially in an enterprise setting.

The Distinctiveness of ChatGPT as an Agent [02:16]

  • ChatGPT is considered the "hottest topic" in AI and is now being categorized as an AI agent.
  • While many possess access to open models, ChatGPT's user experience stands out due to its agentic features.
  • Early versions of ChatGPT lacked features like web search, memory, and code generation, which have since been integrated.
  • The "secret sauce" of ChatGPT lies in these agentic features that differentiate it from other models.
  • A prediction is made that GPT-5 will be an agent, not just a model.

The speaker emphasizes that ChatGPT's current prominence stems from its unique agentic capabilities, distinguishing it from other available AI models. Features like web search and memory have been crucial additions that enhance its functionality as an agent. The speaker forecasts that future iterations, such as GPT-5, will be designed as agents rather than standalone models.

The ChatGPT experience is quite different from a lot of agents you saw, right? Think about it, and go back 12 months, even more than 12 months, to when ChatGPT first started: it was missing a lot of features, for example web search features, memory features, code generation features. Now it actually generates code and runs the code. So think about it.

The Importance of Memory in AI Agents [04:01]

  • Memory is identified as a critical component for AI agents, enabling them to retain context from conversations.
  • OpenAI's introduction of memory features in ChatGPT is highlighted, with users able to see past interactions in their settings.
  • A "secret command" allows users to view all stored information about their interactions with ChatGPT.
  • This memory function contributes significantly to ChatGPT's ability to answer a wide range of user questions effectively.
  • The accuracy of what the AI remembers is crucial, as evidenced by a humorous anecdote about the AI misinterpreting the user's interest in ultra running.

Memory is presented as a key differentiator for AI agents, allowing them to build upon previous interactions and understand context. The speaker points to ChatGPT's memory feature as a prime example, enabling more personalized and relevant responses. However, it's also noted that the AI's memory needs to be accurate to avoid misinterpretations.

Memory is also very important. You probably noticed that OpenAI launched what they call memory features not long ago, three to six months ago; however, they were actually using these memory features more than six months before that. If you go to your ChatGPT settings, you can actually see what it memorized from all the conversations you had with ChatGPT. There is also another secret command you can type, called "bio": if you just type "bio" into ChatGPT, you can see all the information it saved for you. I think that's one of the secrets of why ChatGPT is so relevant to a lot of the questions you have. That's the agent, right? So memory, to me, is a key component of agents.
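As a rough illustration of the memory mechanism described above, here is a minimal sketch assuming a simple fact-list design. The `MemoryStore` class and the context format are illustrative only; OpenAI's actual memory implementation is not public.

```python
class MemoryStore:
    """Stores short facts learned about a user across conversations."""

    def __init__(self):
        self.facts = []

    def remember(self, fact):
        # Deduplicate so repeated mentions don't bloat the prompt.
        if fact not in self.facts:
            self.facts.append(fact)

    def as_context(self):
        # Remembered facts are prepended to each request, mirroring how a
        # "bio"-style memory is injected into the model's context window.
        if not self.facts:
            return ""
        return "Known about the user:\n" + "\n".join(f"- {f}" for f in self.facts)


memory = MemoryStore()
memory.remember("Runs ultramarathons")
memory.remember("Prefers concise answers")
print(memory.as_context())
```

The anecdote about the AI misremembering the host's interest in ultra running shows why such a store needs curation: whatever lands in `facts` is treated as ground truth on every later turn.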

Understanding Retrieval Augmented Generation (RAG) [05:17]

  • RAG systems are explained as a way to improve the accuracy and relevance of AI responses by grounding them in specific data.
  • The popularity of RAG is contrasted with the limitations of chatbots built eight years prior, which struggled with follow-up questions and knowledge base maintenance.
  • Traditional chatbots relied on intent classification and hardcoded answers, leading to a lack of contextual understanding and difficulty in updating information.
  • Large Language Models (LLMs) have significantly improved natural language understanding, eliminating the need for explicit intent classification and better handling follow-up questions.
  • The challenge of keeping the RAG knowledge base updated remains a critical aspect of RAG implementation.

The speaker clarifies that RAG systems are a significant advancement over older chatbot technologies. Previous chatbots struggled with context and required constant manual updates to their knowledge bases. LLMs have largely solved the natural language understanding problem, but maintaining the currency of the RAG system's knowledge base is still a significant challenge.

I think everybody knows RAG, and we don't need to explain what RAG is. However, why is RAG so popular? If we ask this question, for example, from your perspective, why do you think RAG suddenly got so popular? Remember, about eight years back we had chatbots, and everyone was building chatbots. But now everyone is building RAG-based chatbots. So what is the difference between the chatbot we built eight years ago and RAG now?

The Practicality of RAG for Enterprise Support [10:02]

  • RAG systems can be implemented relatively simply by layering them on top of existing open search databases.
  • This allows teams to build useful RAG systems even with basic knowledge and existing data.
  • A successful use case is a RAG-based chatbot for internal support, which can answer employee questions about IT issues without human intervention.
  • This implementation can significantly reduce support effort and costs within an enterprise.
  • Implementing RAG for support is considered a "low-hanging fruit" that can yield immediate, dramatic results.

The speaker highlights the accessibility of RAG implementation, particularly for enterprise support. By leveraging existing search databases, organizations can quickly build functional RAG systems that provide immediate value. The example of an internal support chatbot demonstrates how RAG can automate responses to common issues, leading to significant efficiency gains and cost savings.

But I think even a very simple implementation of RAG in the support system can save a lot of support effort in any enterprise. That's kind of a low-hanging fruit: build a RAG system for support. That is a very basic starting point, and you can achieve dramatic results immediately.
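The support use case above can be sketched as a thin RAG layer over an existing search call. Everything here is a stand-in: `search_knowledge_base` mimics an enterprise search system with naive keyword matching, and `call_llm` stubs out whichever LLM API would actually be used.

```python
KNOWLEDGE_BASE = [
    "To reset your VPN password, open the IT portal and choose 'Reset VPN'.",
    "Expense reports are due by the 5th of each month.",
]

def search_knowledge_base(query, k=2):
    # Stand-in for an existing enterprise search call (e.g. an OpenSearch
    # index); here, a naive keyword-overlap score.
    scored = [(sum(w in doc.lower() for w in query.lower().split()), doc)
              for doc in KNOWLEDGE_BASE]
    return [doc for score, doc in sorted(scored, reverse=True)[:k] if score > 0]

def call_llm(prompt):
    # Stand-in for a real LLM call; echoes the prompt so the sketch runs.
    return f"[LLM answer grounded in]\n{prompt}"

def answer(question):
    # The RAG layer: retrieve relevant passages, then generate an answer
    # grounded only in those passages.
    passages = search_knowledge_base(question)
    context = "\n".join(passages)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)

print(answer("How do I reset my VPN password?"))
```

The point of the sketch is the shape: the knowledge base and its refresh pipeline stay untouched, and only the thin `answer` layer is new.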

Data and Knowledge Management for AI Agents [12:11]

  • Managing and governing data for AI agents is identified as a significant challenge, linking back to traditional enterprise search problems.
  • Organizations with robust existing enterprise search systems and regularly updated knowledge bases are well-positioned to build RAG layers.
  • Leveraging existing knowledge base systems, such as those using open search, is more efficient than rebuilding them, especially with vector databases.
  • The key is to add a RAG layer on top of the existing infrastructure, allowing for query rewriting, reranking, and generation.
  • Existing data refreshing and management systems can often be retained, avoiding the need for complete overhauls.

The discussion emphasizes that effective AI agent implementation hinges on solid data and knowledge management, similar to enterprise search challenges. The advice is to build upon existing, well-maintained knowledge bases rather than starting from scratch. By adding a RAG layer to existing systems, organizations can achieve robust AI agent capabilities without extensive re-engineering.

This actually goes back to the traditional enterprise search problem. Let's say you forget RAG, and you just want to build enterprise search. Search needs the same thing: you want the latest information.
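The "RAG layer on top of existing infrastructure" idea can be sketched as three composable stages over an unchanged search backend. The stage implementations below are illustrative placeholders: a real system would typically use an LLM for query rewriting and a cross-encoder or the search engine's own scoring for reranking.

```python
def rewrite_query(query, history):
    # Resolve follow-up phrasing using conversation history so the
    # underlying search engine sees a self-contained query.
    if history and query.lower().startswith(("what about", "and")):
        return f"{history[-1]} {query}"
    return query

def rerank(query, results):
    # Reorder raw search hits by a relevance score; here, term overlap.
    terms = set(query.lower().split())
    return sorted(results, key=lambda r: -len(terms & set(r.lower().split())))

def rag_layer(query, history, search_fn, generate_fn):
    # The layer composes rewrite -> search -> rerank -> generate, leaving
    # the existing search_fn (and its data refresh pipeline) untouched.
    q = rewrite_query(query, history)
    hits = rerank(q, search_fn(q))
    return generate_fn(q, hits)


print(rewrite_query("what about Macs?", ["VPN setup on Windows"]))
```

Because `search_fn` and `generate_fn` are injected, the same layer works over any existing index without a rebuild, which is the speaker's core recommendation.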

Iterative Development and Challenges in RAG Implementation [15:14]

  • An initial, straightforward RAG implementation might involve simple query rewriting and generation, but practical challenges soon emerge.
  • A common issue is that search results might return entire documents, which are not ideal for AI processing.
  • This necessitates changes to the data ingestion pipeline, such as chunking documents into smaller pieces for better embedding and retrieval.
  • The process of modifying existing knowledge management and data ingestion systems is a crucial second phase of RAG implementation.
  • Further complexities arise with multimodality, requiring ingestion and generation of data beyond text, such as images and graphs.

The iterative nature of RAG development is highlighted, starting with simple implementations and evolving as challenges arise. A key issue is the need for smarter data chunking and ingestion to ensure the AI receives relevant information. The progression moves from basic text processing to addressing multimodality, requiring more sophisticated data handling.

The search system returns the whole document, which is not useful for the further processing. Once we get the results back, we want to do the reranking, for example, to measure which result is bad. However, if you just return the whole document back, that isn't helpful.
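The ingestion change described above can be sketched as a simple overlapping word-window splitter, so that embedding and retrieval operate on passages rather than whole documents. The chunk size and overlap values are illustrative defaults, not recommendations from the talk.

```python
def chunk_document(text, chunk_size=200, overlap=50):
    """Split text into overlapping chunks of roughly chunk_size words.

    Overlap keeps context that straddles a chunk boundary retrievable
    from at least one chunk.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunk = words[start:start + chunk_size]
        chunks.append(" ".join(chunk))
        # Stop once the final words are covered to avoid tiny tail chunks.
        if start + chunk_size >= len(words):
            break
    return chunks
```

In a real pipeline each chunk, not each document, is embedded and indexed; retrieval then returns passages small enough to rerank and to fit the model's context window.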

Talent and Metrics for AI Agents [18:31]

  • Implementing AI agents typically requires a collaboration between data scientists and engineering teams.
  • Data scientists are crucial for measuring end-to-end accuracy and results, preparing sample questions, and creating evaluation datasets.
  • Key metrics for evaluating AI agents include "deflection rate" in support scenarios, indicating the percentage of queries resolved without human intervention.
  • Historically, a 20% deflection rate was considered good; modern RAG systems can achieve 70-80% deflection rates.
  • Other signals for performance include users not clicking "talk to human" buttons and clicking on provided links, indicating user satisfaction and information utility.

The successful deployment of AI agents relies on a blend of engineering and data science expertise. Data scientists play a vital role in establishing metrics, particularly deflection rates in support contexts, which have seen significant improvements with RAG systems. User interaction signals, like avoiding human escalation, also serve as important indicators of agent effectiveness.

One key measure is called deflection rate. This was very standard across the industry even eight years ago, when I was at Microsoft. The deflection rate is the share of questions the bot can answer without needing human involvement. When we started, a deflection rate of 20% was pretty good: essentially, the bot could answer 20% of questions accurately without talking to a human. That was considered great, even awesome, years ago when we started. But nowadays...
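The deflection-rate metric can be computed directly from support logs. The ticket schema below is an assumed minimal format for illustration.

```python
def deflection_rate(tickets):
    """Share of support queries resolved without human handoff.

    tickets: list of dicts with a boolean 'escalated_to_human' field.
    """
    if not tickets:
        return 0.0
    deflected = sum(1 for t in tickets if not t["escalated_to_human"])
    return deflected / len(tickets)


tickets = [
    {"escalated_to_human": False},
    {"escalated_to_human": False},
    {"escalated_to_human": False},
    {"escalated_to_human": True},
]
print(f"Deflection rate: {deflection_rate(tickets):.0%}")  # prints "Deflection rate: 75%"
```

The softer signals mentioned alongside it (users not clicking "talk to human", users clicking provided links) would be tracked the same way, as per-session boolean fields aggregated into rates.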

Enterprise Data Privacy and Security Concerns [23:10]

  • Data privacy and security are top-of-mind concerns for organizations adopting AI agents.
  • Existing enterprise systems often have strong data privacy and security protocols, which can be leveraged for AI solutions.
  • Integrating AI agents with existing entitlement and authorization systems is a key challenge to ensure data is accessed only by authorized users.
  • Careful design is needed to avoid indexing all knowledge base data indiscriminately, which could lead to unauthorized access.
  • Collaboration with security, legal, and compliance teams is essential to ensure AI solutions are fully compliant and secure.

Data privacy and security are highlighted as paramount concerns in enterprise AI adoption. The speaker stresses that existing, robust security measures within enterprises should be integrated into AI agent design. This includes ensuring proper authorization and entitlement, preventing broad data indexing, and fostering close collaboration with security and legal departments to maintain compliance.

Data privacy and data security are sort of top of mind in terms of the challenges. Can you talk me through what people should be worried about with these things?
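The entitlement integration described above can be sketched as a post-retrieval filter that drops any hit the user is not authorized to see before it ever reaches the model, instead of indexing everything for everyone. The ACL representation (a per-document group list) is an assumption for illustration.

```python
def authorized_results(user_groups, results):
    """Keep only documents whose ACL intersects the user's groups."""
    return [r for r in results if set(r["allowed_groups"]) & set(user_groups)]


docs = [
    {"text": "Public IT FAQ", "allowed_groups": ["everyone"]},
    {"text": "Executive compensation plan", "allowed_groups": ["hr-leadership"]},
]
# An engineer sees only the public document; the model never receives the rest.
print(authorized_results(["everyone", "engineering"], docs))
```

Filtering before generation matters because anything placed in the model's context can surface in its answer; enforcement after generation is too late.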

Balancing Governance and Speed in AI Adoption [26:30]

  • The tension between the need for governance and the desire for rapid AI adoption is an ongoing challenge.
  • Educating compliance and legal teams about how AI and large language models work is crucial for building partnership and trust.
  • Understanding AI models as software that can be controlled and monitored helps demystify them for non-technical stakeholders.
  • As legal and compliance teams gain knowledge, they can become more insightful and even help accelerate AI projects.
  • Strict compliance with regulations, such as EU laws regarding multimodality, must be adhered to, while other areas can be streamlined through understanding and collaboration.

Addressing the conflict between AI's rapid pace and the slower, risk-averse nature of governance is framed as an opportunity. Education is key, helping compliance and legal teams understand AI as controllable software. This understanding fosters collaboration, enabling legal and compliance departments to become partners in accelerating AI initiatives while ensuring adherence to necessary regulations.

Yes, exactly. I think that's an ongoing challenge. However, the challenge is also an opportunity. The first question is why compliance says no; then you have to dial back. I think the point is that you need to educate the team. For example, two years ago, when we started, a lot of our compliance and legal colleagues didn't really understand how large language models work.

Key Use Cases and Future of AI Agents [32:12]

  • RAG systems are recommended as a strong starting point for organizations looking to adopt AI agents, particularly for support functions.
  • Leveraging existing knowledge bases to build RAG systems can yield tangible results quickly.
  • Ambitious projects aiming for "AI employees" are noted, but simpler, impactful use cases are more immediately achievable.
  • Code agents are highlighted as a significant and productive area of AI development, transforming software development processes.
  • There's a potential for AI to revolutionize fundamental computing systems, including operating systems and command-line interfaces, making them more natural language-driven.

The discussion identifies RAG systems, especially for support chatbots, as an excellent entry point for enterprises into AI agents due to their tangible results. The speaker also points to code agents as a highly productive area that is already transforming software development. Looking further ahead, there's a vision of AI fundamentally altering computing infrastructure, moving towards more intuitive, natural language-based interactions with operating systems and tools.

Oh yeah, I think RAG is actually a really good use case, because it solves a lot of problems to start with. If you already have a knowledge base built today, for example some internal search systems, and you build a RAG system on top of it, that's very tangible: you can generate very tangible results.

Rethinking Computing and AI's Scientific Frontier [40:21]

  • The speaker, with 30 years of experience in computer science, has seen AI evolve from early expert systems to today's large language models.
  • Modern LLMs are described as vastly more capable "expert systems" than anything imaginable decades ago.
  • There is immense potential for AI to revolutionize computing systems, including operating systems and software development tools, making them more intuitive and efficient.
  • Despite advancements, fundamental scientific breakthroughs in AI, such as in reinforcement learning, are still needed.
  • Current AI development is seen as just the beginning, with significant opportunities for new algorithms and scientific advancements.

Reflecting on a long career in AI, the speaker contrasts early, limited "expert systems" with the current power of LLMs. The current era is viewed as a realization of long-held AI visions. While acknowledging the progress, the speaker also emphasizes that fundamental scientific advancements are still crucial, with much potential for future innovation in algorithms and AI research.

Essentially, any large language model today is a super smart expert system we could not even imagine building 30 years ago. Thirty years ago, building an expert system like any of these large models today would have been an impossible task.

The Future of Physical AI and Computing Infrastructure [42:36]

  • Physical AI, or robotics, is seen as the next major wave, with personal robots and autonomous driving becoming realities.
  • The rapid development of computing infrastructure, particularly GPU clusters, is enabling unprecedented AI capabilities.
  • Projects like "Stargate" represent massive GPU clusters (approaching 500,000 H100 GPUs), dramatically increasing computational power.
  • This increased compute power accelerates innovation by enabling faster iteration cycles for AI model development and experimentation.
  • The acceleration of iteration speed, rather than just larger models, is a key benefit of advanced computing infrastructure.

The emergence of physical AI, including robotics and autonomous driving, is discussed as a significant trend, driven by exponential growth in computing power, exemplified by massive GPU clusters. This infrastructure advancement is crucial for accelerating the iteration speed of AI development, leading to faster innovation across agents and foundation models.

I think that's clearly the trend: we will all have our digital companions, physically. Today we already have digital companions like ChatGPT. However, in five years, I would assume we will all have our little cute robots with us, and we will all have autonomous driving.

Recommendations for Staying Informed in AI [53:35]

  • Staying informed involves following leading AI researchers and organizations, as well as engaging with podcasts and diverse sources.
  • Open-mindedness and continuous learning from various perspectives are crucial for developing one's own judgment in the AI field.
  • It's important not to be bound by a single person's or company's viewpoint, but to gather insights from a wide range of sources.
  • The rapid pace of AI development means there's always new information and perspectives to absorb.

The speaker advises a strategy of broad engagement and open-mindedness for staying current in the rapidly evolving AI landscape. This includes following key figures and organizations, consuming diverse content like podcasts, and actively synthesizing information to form independent judgments rather than adhering to a single source.

My recommendation is just to listen and read as much as you can, and don't be bound by any one of them. The more you learn, and the longer you learn, the more you can build your own judgment. Don't be bound by a single person, for example, "I will only follow this person's blog or suggestions." Be very open-minded and try to learn from different companies and different people: podcasts, YouTube, anything you can learn from. That's my suggestion.
