How to Make Claude Code Your AI Engineering Team

Y Combinator

76,540 views 3 days ago

Video Summary

The video introduces a new era of software development powered by AI agents, dubbed the "agent era," in which building with AI tools resembles human teamwork with defined roles and processes. The creator, Garry, a seasoned engineer and Y Combinator president, describes how AI's potential drew him back into coding extensively. He highlights GStack, an open-source project he developed that turns Claude Code into an AI engineering team. A key feature demonstrated is the "Office Hours" skill, a distilled version of YC's founder coaching that uses six forcing questions to refine startup ideas and validate demand and business models. The worked example is a tax app that finds 1099s in Gmail. Through it, the AI contributes strategic product thinking as well as data aggregation: the idea moves from simple document aggregation to lead generation for tax preparers, potentially increasing revenue by 10x.

A central idea in the video is the "thin harness, fat skills" approach, in which GStack provides a minimal framework that hosts specialized AI skills. The system can carry out complex tasks such as browser automation, code generation, adversarial review of design documents, and visual brainstorming for UI design, which positions AI as a partner across the entire software development lifecycle and not only as a tool for writing code. The video also notes that GStack garnered more GitHub stars than Ruby on Rails in just three weeks.

Short Highlights

  • A new era of software development, the "agent era," is upon us, where AI agents work collaboratively like a human team.
  • GStack, an open-source project by YC's Garry, turns Claude Code into an AI engineering team with specialized "skills."
  • The "Office Hours" skill, inspired by YC's founder coaching, uses six forcing questions to reframe and validate startup ideas.
  • The demo builds a tax app that finds 1099s in Gmail, showing how the AI identifies user pain points and develops a business model that goes beyond mere data aggregation.
  • GStack supports advanced features like AI browser automation, code generation, adversarial review of design documents, and visual brainstorming for UI design.
  • The project has achieved significant traction, accumulating over 70,000 GitHub stars.
  • A key technique is the "thin harness, fat skills" approach, enabling AI agents to perform complex tasks.

Key Details

The Dawn of the Agent Era [00:43]

  • We are entering a new era of building software, termed the "agent era."
  • The most effective way to leverage AI agents for substantial work is by treating them as a team with defined roles and processes, mirroring human collaboration.
  • GStack was created to embody this approach and has quickly gained significant traction on GitHub.

"It turns out the way to get agents to do real work is the same way humans have always done it: as a team, with roles, with process, with review."

Personal Journey with AI Coding [01:11]

  • The speaker has coded more in the past two months than in all of 2013, marking a return to intense engineering work.
  • This renewed focus was sparked by exploring AI coding tools like Claude Code, influenced by peers who reported no longer writing code manually.
  • The speaker found himself "completely hooked" by the capabilities of these AI tools.

"I've coded more in the past two months than I did in all of 2013, which is the last time I worked really, really hard as an engineer."

The Limitations of Out-of-the-Box Models [01:52]

  • AI models, by default, can "wander" and lack deep understanding of specific data, leading to guesswork.
  • This guesswork can result in plausible-looking code that contains silent, critical bugs.
  • The primary bottleneck is not the AI model's intelligence, but rather the setup and guidance provided to it.

"Out of the box, the model wanders. It doesn't know your data well. So, it guesses. And guessing at that scale is how you get plausible looking code that silently breaks."

GStack: Thin Harness, Fat Skills [02:12]

  • GStack implements the "thin harness, fat skills" approach, providing a lightweight framework for specialized AI capabilities.
  • It transforms AI models into an AI engineering team, where each "skill" acts like a specialist.
  • This architecture allows for modular and powerful AI-driven development.
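
The "thin harness, fat skills" idea in the bullets above can be illustrated with a small sketch. This is not GStack's actual code or API: the `Harness` class and the `office_hours` and `review` functions are hypothetical names chosen for illustration. The point is only that the harness does nothing except route a request to a named skill, while each skill carries its own specialist logic.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical illustration only: these names are not GStack's real API.
Skill = Callable[[str], str]


@dataclass
class Harness:
    """Thin harness: it only holds skills and routes requests to them."""
    skills: Dict[str, Skill]

    def run(self, skill_name: str, request: str) -> str:
        if skill_name not in self.skills:
            raise KeyError(f"unknown skill: {skill_name}")
        return self.skills[skill_name](request)


def office_hours(idea: str) -> str:
    """Fat skill: the domain knowledge lives here, not in the harness."""
    return f"Who urgently needs this today: {idea}?"


def review(diff: str) -> str:
    """Another skill; a real one would inspect the change in depth."""
    return f"Reviewing {len(diff.splitlines())} changed line(s)"


harness = Harness(skills={"office-hours": office_hours, "review": review})
print(harness.run("office-hours", "an app that finds 1099s in Gmail"))
```

Adding a new specialist under this assumption means registering one more function; the harness itself does not change.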

"GStack is my implementation of the thin harness fat skills approach. It's an open-source repo that I built that turns clawed code into an AI engineering team for you."

Office Hours Skill: YC's Philosophy in Action [02:31]

  • The "Office Hours" skill is a core component of GStack, modeled directly after the Y Combinator (YC) process of coaching startups.
  • It begins by asking six "forcing questions" to help founders reframe their product and business ideas.
  • This skill distills thousands of hours of experience from YC partners into a more accessible format.

"Office hours is one of those skills. It's actually modeled exactly after what we go through at YC as a partner doing office hours with startups."

Building a Tax App: From Idea to Business Model [03:03]

  • The demonstration involves creating a tax app that collects 1099 forms from Gmail and from financial institutions' websites.
  • The "Office Hours" skill helps to refine the startup idea by asking critical questions about user demand and business viability.
  • The AI identified that the user's need extends beyond simple document aggregation to a potential funnel for tax preparer lead generation, a more lucrative business model.

"The hook is we'll find all your 1099-INTs for you, solving an immediate pain. But the expansion is now that you have your docs, let's actually get your taxes prepared, which is matchmaking and lead gen for tax preparers."

AI Browser Automation and Security [08:00]

  • GStack's browser automation feature allows AI to log in, navigate to tax documents, and download PDFs, with the user observing the process.
  • This approach bypasses the need for storing credentials or using services like Plaid, and operations occur within the user's actual browser.
  • The system can also leverage code-level tools like Playwright and Chromium for bug fixing and complex interactions.
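
A minimal sketch of the flow described above (user logs in, automation finds and downloads the documents, nothing is stored), written against the Playwright for Python sync API. It is an assumption-laden illustration, not GStack's implementation: `looks_like_1099` and `download_1099s` are hypothetical names, the link-text heuristic is made up, and the sketch launches a visible Chromium window through Playwright where the video describes working in the user's own browser. It also assumes each matching link triggers a file download when clicked.

```python
from pathlib import Path


def looks_like_1099(link_text: str) -> bool:
    """Heuristic filter (an assumption): keep links whose text mentions 1099."""
    return "1099" in link_text


def download_1099s(tax_docs_url: str, out_dir: str) -> list[str]:
    """Open a visible browser, let the user log in, then save 1099 downloads."""
    # Imported here so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    saved: list[str] = []
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        # headless=False keeps the window visible so the user can watch.
        browser = p.chromium.launch(headless=False)
        page = browser.new_page()
        page.goto(tax_docs_url)
        # The user types credentials themselves; the script never sees them.
        input("Log in in the browser window, then press Enter here...")
        for link in page.get_by_role("link").all():
            if not looks_like_1099(link.inner_text()):
                continue
            with page.expect_download() as download_info:
                link.click()
            download = download_info.value
            target = Path(out_dir) / download.suggested_filename
            download.save_as(target)
            saved.append(str(target))
        browser.close()
    return saved
```

Calling `download_1099s(url, "tax_docs")` with the institution's tax-documents URL would return the saved file paths. Running it requires the `playwright` package and its Chromium build to be installed.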

"The model would be user logs in AI takes over, navigates to tax docs, finds the 1099 ins, downloads it. No plaid, no stored credentials. The user watches the whole thing happen in the visible browser."

Strategic Planning and Design Approaches [09:29]

  • The AI, through the "Office Hours" skill, can generate multiple strategic approaches to a problem, considering different levels of technical complexity and market strategy.
  • Three distinct approaches for the tax app were presented: a simple Gmail search, a full-stack browser automation with a CPA marketplace, and a CPA-first go-to-market strategy.
  • The AI can also integrate user feedback to refine these strategies, for instance, suggesting using browser interaction to bypass OAuth and directly access Gmail.
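
The first approach above, a simple Gmail search that ends in a checklist of institutions, might look roughly like the following once the messages have been fetched. This is a sketch under stated assumptions: the `Message` type, the `TAX_KEYWORDS` list, and the sample inbox are invented for illustration, and fetching mail (via OAuth or the browser) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str   # e.g. "Example Bank <tax@examplebank.test>"
    subject: str


# Assumed keywords; a real search would need tuning per institution.
TAX_KEYWORDS = ("1099", "tax document", "tax form")


def institutions_with_tax_docs(messages: list[Message]) -> list[str]:
    """Return a sorted, de-duplicated checklist of senders that mention tax docs."""
    found = set()
    for msg in messages:
        subject = msg.subject.lower()
        if any(keyword in subject for keyword in TAX_KEYWORDS):
            # Keep the display name that precedes the "<address>" part.
            found.add(msg.sender.split("<")[0].strip())
    return sorted(found)


inbox = [
    Message("Example Bank <tax@examplebank.test>", "Your 1099-INT is ready"),
    Message("Example Brokerage <no-reply@broker.test>", "Tax document available"),
    Message("Newsletter <news@shop.test>", "Spring sale"),
    Message("Example Bank <tax@examplebank.test>", "Corrected 1099-INT"),
]
for name in institutions_with_tax_docs(inbox):
    print(f"[ ] {name}")
```

The output is a two-item checklist (Example Bank, Example Brokerage); the newsletter is ignored and the duplicate bank notice collapses into one entry.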

"It actually thinks through and here's three different approaches. The first approach is Gmail off then search for tax doc not notification then output a checklist of banks which issue 1099s."

Adversarial Review and Design Improvement [13:04]

  • GStack incorporates a multi-step adversarial review process to critically assess and improve product ideas and design documents.
  • The AI actively identifies potential issues like lack of failure handling, privacy concerns, and two-factor authentication gaps.
  • This automated review process can catch and fix numerous issues, improving a design doc's score from 6/10 to 8/10.
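
The review-fix-rescore cycle described above can be sketched as a small loop. Everything here is a toy stand-in: the checklist in `REQUIRED_TOPICS`, the two-points-per-gap weighting, and the stub "fix" are assumptions for illustration, whereas the video describes the model itself finding and fixing issues.

```python
# Assumed checklist, taken from the kinds of issues named in the bullets above.
REQUIRED_TOPICS = ("failure handling", "privacy", "two-factor")


def find_gaps(doc: str) -> list[str]:
    """Adversarial pass: list required topics the doc never mentions."""
    text = doc.lower()
    return [topic for topic in REQUIRED_TOPICS if topic not in text]


def score(doc: str) -> int:
    """Toy score out of 10; the 2-points-per-gap weighting is made up."""
    return max(0, 10 - 2 * len(find_gaps(doc)))


def review_loop(doc: str, max_rounds: int = 3) -> str:
    """Review, patch one gap per round, and re-review until clean or out of rounds."""
    for _ in range(max_rounds):
        gaps = find_gaps(doc)
        if not gaps:
            break
        # Stand-in for a model-written fix: add a stub section for the first gap.
        doc += f"\n\n{gaps[0].title()}: TODO"
    return doc


draft = "Design doc: log in, find 1099s, download PDFs. Privacy: no stored credentials."
print(score(draft))               # 6
print(score(review_loop(draft)))  # 10
```

Bounding the loop with `max_rounds` keeps the sketch from cycling forever if a fix fails to close a gap.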

"Now, what it's doing is a multi-step adversarial review. It's trying to put your idea through the paces. And as you can see, it's already found a bunch of things and it's going to try to autofix it."

Visual Brainstorming and UI Design [14:06]

  • The "Design Shotgun" tool within GStack generates multiple AI-driven visual designs for user interfaces based on a chosen checklist or dashboard.
  • These designs are created rapidly, often leveraging image generation capabilities.
  • Users can then rate and select the most suitable design, or regenerate options based on feedback.

"These are three directions. It takes about 60 seconds. It actually farms it out to uh OpenAI Codex, which um is able to use image gen."

From Code Generation to QA and Deployment [16:13]

  • GStack offers a comprehensive suite of 28 commands, supporting various stages of the development lifecycle from planning to deployment.
  • The "Auto Plan" feature generates default recommendations for CEO, engineering, design, and developer experience reviews.
  • Post-coding, a "Review" skill performs bug catching, similar to a staff-level code review, to find issues missed during the planning phase.

"The sprint process actually works. We already talked about office hours, but if you don't want to do a lot of back and forth, if you don't want to be in the weeds, I did create auto plan, which gets you through CEO, engineering, design, and developer experience review."

Automating QA and Browser Interactions [18:10]

  • A significant bottleneck identified was the manual QA process, which the speaker found to be the least enjoyable part of software development.
  • To address this, a new tool was developed that wraps Playwright and Chromium at the CLI level, enabling automated browser interactions, screenshots, complex actions, media downloads, and regression testing for JavaScript and CSS issues.
  • This automation extends to assessing real browser bug issues.
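
As a rough idea of what a CLI-level wrapper around Playwright and Chromium can look like, the sketch below loads a page, records JavaScript console errors and uncaught exceptions, and saves a full-page screenshot. It is not the tool from the video: `parse_args`, `check_page`, and `main` are hypothetical names, and the sketch covers only a single smoke check, not the complex actions or media downloads mentioned above.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line: a URL plus an optional screenshot path."""
    parser = argparse.ArgumentParser(description="Smoke-check a page in Chromium")
    parser.add_argument("url")
    parser.add_argument("--screenshot", default="page.png")
    return parser.parse_args(argv)


def check_page(url: str, screenshot_path: str) -> list[str]:
    """Load the page, collect JavaScript errors, and save a screenshot."""
    # Imported here so argument parsing works without Playwright installed.
    from playwright.sync_api import sync_playwright

    errors: list[str] = []

    def on_console(msg) -> None:
        # Keep only console messages logged at error level.
        if msg.type == "error":
            errors.append(msg.text)

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.on("console", on_console)
        page.on("pageerror", lambda exc: errors.append(str(exc)))
        page.goto(url)
        page.screenshot(path=screenshot_path, full_page=True)
        browser.close()
    return errors


def main(argv=None) -> int:
    """Return 0 when the page loaded without errors, 1 otherwise."""
    args = parse_args(argv)
    problems = check_page(args.url, args.screenshot)
    for problem in problems:
        print(f"ERROR: {problem}")
    return 1 if problems else 0
```

Calling `main(["https://example.test", "--screenshot", "home.png"])` would run one check. Comparing screenshots across runs, which a regression test for CSS issues would need, is left out of this sketch.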

"One of the bottlenecks I ran into was that, you know, once the agent was doing all the work of planning and design and coding it, I found myself sitting there doing QA, probably the least fun part of software development."

The Power of Parallel Development [17:50]

  • The speaker uses GStack to run 10 to 15 Claude Code sessions in parallel, managing multiple projects or features concurrently.
  • This allows for parallel PRs, branches, and features to be developed and merged more or less simultaneously.
  • The system effectively manages a large volume of work, including reviewing hundreds of open-source PRs.

"I run 10 to 15 parallel Claude Code sessions all at the same time. I might in one session be running office hours on a brand new idea."

GStack's Impact on Productivity and Workflow [20:13]

  • The fear of AI coding leading to supply chain attacks is mitigated by GStack's comprehensive oversight.
  • The system eliminates the need for a traditional to-do list; new ideas, bug reports, or user feedback can be added as new "work items" that go through the standard review process.
  • This streamlined workflow enables the handling of 10 to 50 PRs per day, depending on meeting schedules.

"One of the things that has emerged is, whenever I have an idea or I get a bug report from a user or I see something on X where someone's frustrated with what GStack or GBrain does, I just click the plus icon in Conductor."

The Unprecedented Opportunity in Software Building [21:28]

  • The current period is described as the most incredible time in history to build software, with the barrier to entry dramatically reduced.
  • The fundamental question for aspiring builders is no longer about the tools or feasibility, but about what they choose to create.
  • The concluding message is an encouragement to innovate and build products that people genuinely want.

"This is the most incredible time in history to build software. The barrier to building just collapsed. The only question left is what are you going to build? It's time to let it rip. Go make something people want."
