Cohere Founder, Nick Frosst: How To Compete with OpenAI & Anthropic, and Sam Altman’s AI Disservice

20VC with Harry Stebbings

2,459 views • 3 months ago

Video Summary

The discussion focuses on the practical applications and development of AI, particularly large language models (LLMs), and contrasts consumer-focused AI with enterprise solutions. The speaker emphasizes that while LLMs are powerful, their true value lies in augmenting human capabilities in the workplace rather than replacing humans entirely. The conversation also touches upon the limitations of current AI benchmarks, the ongoing debate between open and closed AI models, and the societal implications of AI adoption, including its potential impact on income inequality and the workforce.

A key point of contention is the hype surrounding AGI and existential threats, which the speaker believes detracts from addressing more immediate and tangible issues. The role of enterprise AI, exemplified by Cohere's focus on business applications and customized solutions, is highlighted as a critical area for LLM development. The conversation also explores the challenges and strategies of competing in the AI landscape, the importance of efficiency in model training and deployment, and the evolving nature of human-computer interaction driven by language.

Ultimately, the discussion underscores the need for responsible AI development and deployment, emphasizing the importance of policy and thoughtful integration to ensure AI benefits society broadly. The speaker expresses a more nuanced view of technological progress, acknowledging both the transformative potential of AI and the historical patterns of societal adaptation to new technologies, while also cautioning against the misleading rhetoric that often surrounds the field.

Short Highlights

Enterprise AI is focused on augmenting human work, not replacing it.
Current AI benchmarks are not always accurate reflections of real-world utility.
The hype around AGI and existential threats is misleading and detracts from real issues.
Data quality and efficient training are critical for LLM development.
AI's impact on income inequality depends heavily on policy decisions.

Key Details

The Misleading Hype Around AGI and Existential Threats [0:00]

Sam Altman's predictions about AGI being close are seen as inaccurate and a disservice to the technology.
The speaker criticizes the idea of AI posing an imminent existential threat as academically disingenuous and unhelpful for understanding the technology's real potential harms.
The focus on such threats distracts from more practical concerns like income inequality and workforce changes.

"I don't think Sam Altman has done a service to the world by talking about how close AGI is. I think he has made several predictions now that are wrong and that were obviously wrong at the time he made them."

Lessons from Working with Geoffrey Hinton [1:30]

Geoffrey Hinton approached research creatively and playfully, often using physical analogies to discuss algorithms.
This approach fostered curiosity and intuition, contrasting with a purely equation-driven methodology.

"I learned everything I know about research from those... I was very surprised at how creatively and playfully he approaches research."

Google's Position and the Transformer Architecture [3:57]

The transformer architecture, invented at Google, was not quickly commercialized or scaled within the company.
Many individuals who worked on the transformer left Google to continue developing it elsewhere.
DeepMind, now subsumed into Google, continues to produce good work and products.

Cohere's Focus on Enterprise AI [6:57]

Cohere is a foundational model company specifically targeting the enterprise sector.
Unlike consumer-focused models, Cohere's models are trained for enterprise tool use, integrating with business data and APIs.
The training data and objectives differ; the goal is to augment workplace productivity, not conversational engagement.

"We're a foundational model company – like those other two. So we build foundational models, we build language models."

Data as a Bottleneck and Synthetic Data [11:21]

Data remains a bottleneck, though synthetic data has significantly improved models.
Real-world data is still necessary to initiate the synthetic data generation process.
High-quality data access is a persistent challenge in AI development.

Compute, Algorithms, and Data: The Bottlenecks [13:54]

Algorithm changes have been minimal; the transformer architecture from 2017 is still widely used.
The primary bottleneck is acquiring and generating high-quality data, both real and synthetic.
Current LLM training involves multiple steps, including base modeling and reinforcement learning from human feedback.

The Plateauing of Scaling Laws and Compute [15:17]

The speaker questions the notion that simply throwing more compute at problems guarantees exponential progress, citing GPT-5 as potentially worse than GPT-4 due to cumbersome model selection.
Progress in AI is seen as shifting from pure scaling to more nuanced modeling and product work, including building better connectors and ensuring data security.
The focus is shifting towards enabling practical, multi-step enterprise tasks rather than general AGI.

"Do I think just throwing more compute like some people are thinking there's a plateau? Do I think there's more compute?"

Defining and Achieving AGI [21:07]

The speaker loosely defines AGI as a computer that people treat like a person and expect to behave like one.
The speaker believes current technology does not yet meet this definition, as people do not treat language models as they treat humans.

The Role of Enterprise vs. Consumer AI [27:49]

Specializing models for specific use cases, like enterprise tool use, is crucial for product quality.
LLMs generalize well, but not as well as a highly specialized model does on its own task.
The trend is not toward a fully unbundled world of single-purpose models, but toward models that are generally good at a task and are then refined for it.

The Utility of Benchmarks vs. Real-World Application [34:14]

Benchmarks like LM1B, HellaSwag, and AIME are criticized for not reflecting real-world workplace needs.
Models are often trained on these benchmarks, leading to gamification rather than true utility.
The focus should be on whether a model works for a customer in production and delivers ROI, not benchmark scores.

"Do I think they're all [benchmarks are BS]? Um, I think taken like I think I think it's interesting. I don't know. There's there's good scientific work in some of them."

Model Evolution vs. Hardware Progression [39:03]

While models iterate rapidly, the core transformer architecture remains largely the same.
Training these models still requires significant time and resources, even with advancements in hardware.
The fundamental technology is still sequence modeling, with improvements coming from training methodologies.

The War for Talent and Compensation in AI [43:52]

High salaries for AI researchers are a reality, reflecting the industry's impact and the demanding nature of the work.
The speaker acknowledges the value brilliant people bring to the field and the importance of fair compensation.
Concerns exist about misleading headlines regarding compensation, but the value of skilled individuals is undeniable.

"I know that there are people that are bringing that much value. Um, and I know there's lots of brilliant people and I know that this is a really it's really demanding work."

The Hype vs. Reality of AI's Impact [48:48]

The hype around AI, particularly around AGI and the idea of a future without work, is misleading and potentially damaging.
The technology is transformative, but the discourse needs to focus on real-world applications and implications.
The speaker expresses concern about the potential for AI to exacerbate income inequality without proper policy.

AI's Impact on Income Inequality and Policy [56:57]

AI's effect on income inequality is contingent on policy decisions.
Historical parallels with the industrial revolution suggest that technology can lead to societal shifts, necessitating new labor policies and worker protections.
The goal should be to ensure AI augments humans and facilitates a just transition for the workforce.

Open vs. Closed AI Models and Sovereignty [1:03:32]

Cohere's approach is to release weights for non-commercial use, balancing open access with commercial viability.
The speaker suggests this middle ground is a strategic sweet spot for businesses in the AI landscape.
Geopolitical factors and the desire for technological sovereignty may influence the adoption of models from specific regions.

"I think it's a good idea for countries to have infrastructure within their countries."

The Evolution of Prompting and User Interaction [1:13:09]

Prompting as a specialized skill is likely to diminish as LLMs become more intuitive and aligned with human expectations.
Understanding how language models work and their capabilities will remain crucial, similar to understanding how computers or telephones function.
The focus is shifting from "tricking" models to having natural, iterative conversations.

The Fundraising Journey and Efficiency at Cohere [1:19:40]

Fundraising has evolved from explaining basic concepts to discussing specific customer use cases.
Cohere prioritizes efficient training, enabling models to run on fewer GPUs, which is crucial for enterprise deployment.
The company has spent significantly less on foundational models than some competitors, which reflects its focus on efficiency.

"We train very efficiently, right? We train models – we train efficient models."

Competition and Enterprise Focus [1:25:58]

Cohere's singular focus on enterprise differentiates it from consumer-oriented AI companies.
The value proposition for enterprises is in practical ROI and deployment, rather than consumer subscriptions.
The speaker believes the most value from LLMs will be in work and enterprise applications.

The Role of Forward-Deployed Engineers and Enterprise Sales [1:30:29]

Forward-deployed engineers are essential for helping enterprises integrate and derive value from AI technology.
This approach acknowledges that enterprise technology often requires customization and support.
While enterprise revenue growth may be slower, it offers higher quality and stickiness.

Building a Generational Company and Human Values [1:36:30]

The desire to build a "generational company" stems from a human need to create something lasting and meaningful.
This ambition is independent of personal legacy and is about contributing to something larger than oneself.
The speaker reflects on past technological optimism and the importance of grounding expectations in reality.

"I mean, time scale generations – I don't mean like my generations, you know. I mean the idea of building something that is there for a long time."

Disagreements and the Importance of Product Building [1:44:57]

Disagreements with co-founders are typically on low-level business operations and API design, not fundamental vision.
The most important aspect of the business is building a product and solving customer problems, rather than public storytelling.

Sovereignty and Geopolitical Influences in AI [1:52:06]

Technological sovereignty matters: countries should develop their own AI capabilities, much as they maintain their own infrastructure.
The dominance of Silicon Valley in tech development has led to a desire for more localized and culturally fluent AI solutions.
Being Canadian is seen as an asset, offering an alternative to US-centric tech companies, especially given geopolitical shifts.

The Future of Input Devices and Human Connection [1:59:05]

Language will become a more central input method for computers, though graphical interfaces will still have their place.
The speaker is concerned about technology potentially disconnecting people and contributing to loneliness, emphasizing the need for tech to foster engagement.
Historical precedents show that each generation often perceives societal decline, but the current concerns about disconnection are valid and require thoughtful technological solutions.

"I think a lot of people are feeling that and I think a lot of people are feeling that because of the because they're experiencing what you're describing statistically in their personal lives."

Regulating AI and Avoiding Misguided Approaches [1:17:03]

Poorly informed regulation, based on a misunderstanding of AI as "digital gods," could stifle development.
Focusing on specific, gameable benchmarks for regulation is seen as a misstep.
The speaker advocates for regulations that understand the nuances of AI technology and its potential for misuse.

Predictions and Tools for Productivity [1:20:36]

A bold prediction for 2026 is that users will be able to seamlessly file expenses using AI, integrating various data sources and policies.
Tools like Whisper and Cohere's own product, North, are highlighted for improving productivity.
Cursor is noted as a valuable coding application, though cost increases for such tools are a business concern.

Personal Traits and Learning from Mistakes [1:23:47]

Curiosity and a contrarian nature are both assets and hindrances, enabling groundbreaking ideas but also leading to occasional errors.
The speaker's early optimism that technology would bring steady, monotonic human improvement proved inaccurate.
The efficiency of reinforcement learning from human feedback was underestimated.

The Role of Founders and Storytelling [1:49:51]

Founders of foundational model companies often act as public spokespeople, espousing the organization's views.
While consumer-focused companies have a strong motivation for public storytelling, Cohere's enterprise focus makes it less critical.
Building a product and serving customers are prioritized over extensive public relations efforts.

Cultural Value and Labor Market Dynamics [1:33:19]

Societal value and compensation are often tied to the perceived difficulty, uniqueness, and impact of a skill.
The speaker acknowledges that work with broader systemic impact is valued more highly, but also notes that all work requires skill and effort.
The analogy of a cook versus an AI researcher highlights differences in economic valuation, though not necessarily the inherent value of the labor itself.

Competition with China and Model Capabilities [1:18:56]

While China is producing good models, they have not yet surpassed leading US models in overall capability.
The rapid release of multiple models from China indicates a strong and growing AI sector.
The speaker is not explicitly worried about China's AI capabilities but acknowledges their development.