How to Think About System Design (GitHub Engineer's Perspective)
Beyond Coding
97,679 views • 25 days ago
Video Summary
The video discusses system design as a core skill for software engineers and stresses the difference between hobbyist practice and professional work, which must show concrete impact. It argues that keeping systems simple at scale is harder than it looks, and that overengineering is a common pitfall, particularly when driven by the desire for "fancy architectures" or status. A key insight is that scaling should be organic and data-driven rather than premature: some services at GitHub that handle millions of requests per second run on as few as five or six containers in a small Kubernetes cluster. The conversation also covers the importance of understanding business constraints, translating technical needs into business impact, and how software development is changing with the advent of AI tools.
The discussion also touches upon the "vicious cycle" of needing experience to get jobs at scale-focused companies, the role of online resources in bridging this gap, and the shift from craftsmanship to business impact as a measure of success. The speakers advocate for continuous learning, increasing breadth of knowledge, and approaching problems with curiosity rather than dogma. They stress that professional software engineering is about solving business problems, not just writing elegant code, and that "good enough for today" is often the most pragmatic approach.
Short Highlights
- Professional software engineering requires measurable business impact, not just "fluffy words."
- Keeping systems simple at scale is harder than it looks; avoid overengineering and premature scaling.
- Services at GitHub handling millions of requests per second can run on as few as 5-6 containers.
- Scale should be addressed organically and based on data and projections, not assumptions.
- Understanding business constraints and translating technical work into business impact is crucial for success.
- Continuous learning, increasing breadth, and curiosity are vital for leveling up as a software engineer.
Key Details
The Importance of Numbers and Business Impact [0:00]
- Professional software engineering necessitates quantifiable results and landed impact on the business, moving beyond mere theoretical concepts.
- Solving business problems sometimes means building software that is "good enough" for the present needs.
- Achieving simplicity at scale is a significant challenge, and often harder than it first appears.
"When you work professionally as a software engineer, this is not practicing a hobby. You need to have numbers, right? Not just fluffy words."
The Unique Nature of the Tech Community [0:47]
- The tech community is unique in its open sharing of software and products, with platforms like GitHub serving as a backbone for many open-source initiatives.
- This culture extends to learning and teaching, where knowledge is openly shared, and individuals who teach are often highly regarded.
- This contrasts sharply with other industries, like container terminals, where operational knowledge is rarely shared publicly.
"People create software, they create products, and they put them out in the open for free. And GitHub is like the backbone that enables a lot of these things."
Bridging the Gap Between System Design Theory and Practice [2:01]
- System design involves a lot of theory, but putting it into practice is challenging, influenced by the organization's context and future outlook.
- A viral post highlighted a startup that nearly failed due to a migration to Kubernetes, driven by VC pressure for "cloud native" rather than technical necessity.
- This situation resulted in increased AWS costs and a halt in feature shipment, demonstrating the pitfalls of adopting complex architectures without clear business needs.
"The problem with system design is again a lot of people want the fancy stuff but the boundaries for when you need to scale are very blurry."
Pragmatic Scaling: Less is More [3:53]
- At GitHub, services handling millions of requests per second can operate on a surprisingly small infrastructure, such as five or six containers in a tiny Kubernetes cluster.
- This demonstrates that significant scale can be achieved with minimal resources, challenging the assumption that large-scale operations require extensive infrastructure.
- The approach to problems of scale at GitHub is to never overengineer, designing only for immediate needs and established scale.
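As a rough illustration of what this implies per container, the sketch below divides a total request rate across a handful of containers. The figures (2 million requests per second, 5 containers) and the function name are assumptions chosen for the arithmetic; the video only says "millions of requests per second" and "five or six containers".

```python
def per_container_rps(total_rps: float, containers: int) -> float:
    """Average requests per second that each container must absorb."""
    if containers <= 0:
        raise ValueError("containers must be positive")
    return total_rps / containers


# Assumed figures for illustration only, not GitHub's actual numbers.
print(per_container_rps(2_000_000, 5))  # 400000.0
```

Under these assumed numbers each container would have to absorb hundreds of thousands of requests per second, which suggests the work done per request is very small.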
"At GitHub, for example, I've built services that handle millions of requests per second that are just simply running on a five or six containers."
Organic Growth and Data-Driven Decisions for Scaling [5:13]
- Scaling decisions should be made when the existing architecture's limits are reached and demand clearly indicates future growth, not prematurely.
- A rewrite or redesign of components is justified when the current architecture is at its limit and future demand is demonstrably growing.
- This approach involves making trade-offs based on available data and realistic projections for future needs.
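One way to make "data and realistic projections" concrete is to estimate how long the current architecture can last. The sketch below assumes compound monthly growth in peak load; the function name, the growth model, and the sample figures are illustrative assumptions, not something described in the video.

```python
import math


def months_until_limit(current_peak: float, capacity: float,
                       monthly_growth: float) -> float:
    """Months until peak load reaches capacity under compound monthly growth.

    Returns 0.0 if the system is already at or over capacity, and math.inf
    if the load is not growing.
    """
    if current_peak <= 0 or capacity <= 0:
        raise ValueError("current_peak and capacity must be positive")
    if current_peak >= capacity:
        return 0.0
    if monthly_growth <= 0:
        return math.inf
    # Solve current_peak * (1 + g)^m = capacity for m.
    return math.log(capacity / current_peak) / math.log(1 + monthly_growth)


# Hypothetical service: 5,000 req/s peak today, 20,000 req/s measured limit,
# 10% growth per month.
print(months_until_limit(5_000, 20_000, 0.10))
```

With these hypothetical inputs the result is roughly 14.5 months, which would argue for staying on the existing architecture for now and revisiting the estimate as real data comes in.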
"We solved for that. We didn't prematurely scale. We just went all the way till the end with the existing architecture."
Startup Scaling: Start Small and Iterate [5:46]
- For startup founders or CTOs, the advice is to design for a smaller user base (e.g., 100 to 1,000 users) rather than anticipating massive scale from day one.
- Initial infrastructure can be as simple as a single VM or two, with basic database setups, avoiding complex clusters.
- Vertical scaling, using powerful single machines with ample CPUs and memory, can support growth significantly before horizontal scaling becomes necessary.
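A back-of-envelope estimate shows why a single VM is often enough at this stage. The sketch below converts a user count into a peak request rate; the requests-per-user figure and the peak factor are assumptions picked for illustration and should be replaced with measured values.

```python
SECONDS_PER_DAY = 86_400


def peak_rps(users: int, requests_per_user_per_day: float,
             peak_factor: float) -> float:
    """Estimated peak requests per second for a given user base."""
    average_rps = users * requests_per_user_per_day / SECONDS_PER_DAY
    return average_rps * peak_factor


# Assumed: 1,000 users, 200 requests per user per day, peaks 10x the average.
print(round(peak_rps(1_000, 200, 10), 1))  # 23.1
```

A peak in the low tens of requests per second is small enough that, under these assumptions, there is a lot of room to grow by moving to a bigger machine before horizontal scaling is worth considering.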
"If I was a startup founder or a CTO of a startup, right? I would never go and start building for 100x scale."
The Pitfall of Over-Reliance on "Big Tech" Architectures [7:53]
- The abundance of information available online about complex system design can lead to a pitfall where companies adopt elaborate architectures unnecessarily.
- This is often driven by observing what large organizations like GitHub use, without considering the specific needs and scale of their own operations.
- Many industries do not require the high availability or global reach that necessitates extremely complex systems.
"The benefit that we have, a lot of information out there so people can self-educate, is now also turning into this pitfall, where complex system design is, I think, in most industries where we don't actually have high availability across the globe, not needed."
Hiring for Immediate Needs, Not Future Speculation [8:14]
- Big tech companies, like GitHub, hire for skills that are immediately applicable, as they operate at a scale where complex problems are present from day one.
- While theoretical knowledge of system design is important for interviews, it doesn't always equate to hands-on experience.
- For smaller companies, hiring should focus on what is needed today, with the understanding that future hiring needs will be addressed when the company reaches those scales.
"We just hire for what we need today. Because let's say you're shipping features to a service and a lot of our services have hundreds of millions of requests in short periods of time."
The Elusive Nature of Software Maintenance and Evolution [11:14]
- Software is not a "build once" product; it requires continuous evolution and maintenance to fight entropy.
- Unlike physical products like cars, software has a high and regular maintenance cost, which businesses often fail to fully grasp.
- Companies frequently seek a one-time investment for a solution that scales infinitely, which is unrealistic in the fast-evolving tech landscape.
"Software is evolved. It's not built. And the term here is evolving it, so software needs to continuously be evolved and it needs to continuously fight entropy."
Designing for the Next Order of Magnitude [11:53]
- The recommended approach is to design for the next order of magnitude (e.g., if at zero, design for 10x or 100x; if at 10x, design for 100x or 1000x).
- This iterative design philosophy acknowledges that technologies and scales change rapidly, making long-term, fixed designs obsolete.
- This requires continual investment in software evolution, which can be challenging for business stakeholders to predict or budget for.
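The rule of thumb above can be turned into a quick sanity check on a proposed design target. The sketch below flags targets that sit more than two orders of magnitude above the current load; the function name, the 100x default cutoff, and the treatment of a near-zero load as 1 are assumptions made for this example.

```python
def within_next_orders(current_load: float, proposed_capacity: float,
                       max_factor: float = 100.0) -> bool:
    """True if proposed_capacity is at most max_factor times the current load.

    A current load below 1 is treated as 1 so that a brand-new system still
    gets a finite target (an assumption made for this sketch).
    """
    if current_load < 0 or proposed_capacity < 0:
        raise ValueError("loads must be non-negative")
    baseline = max(current_load, 1.0)
    return proposed_capacity <= baseline * max_factor


# Hypothetical system with 1,000 users today.
print(within_next_orders(1_000, 50_000))      # True: 50x the current load
print(within_next_orders(1_000, 10_000_000))  # False: 10,000x the current load
```

A False result does not prove a design is wrong, but it is a prompt to ask what data justifies building that far ahead.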
"We need to build literally for the next order of magnitude."
Specificity Over Generality in Software Design [15:46]
- It's more effective to solve specific problems rather than building generic frameworks that attempt to address hypothetical future issues.
- Trade-offs in software design are inevitable, and focusing on a specific problem allows for clearer and more informed decisions about these trade-offs.
- At GitHub, the team prioritizes solving current problems without over-engineering, introducing optimizations like caching only when necessary.
"I like solving specific problems. I don't like solving generic problems because how can you make trade-offs then, right?"
The Vicious Cycle of Experience and the Role of Online Learning [18:18]
- There's a "vicious cycle" where companies at scale require experience that smaller companies cannot provide, making it difficult for engineers to gain that experience.
- The proliferation of system design courses and online resources (like Hello Interview) helps engineers learn the concepts in theory and pass interviews, even without extensive hands-on experience.
- This allows individuals to gain the foundational knowledge needed to be considered for roles at large tech companies, where they will then gain practical experience.
"This is the vicious cycle, right? So how do you build experience that is required by these types of companies, but there's no other type of company that's going to give you that experience?"
The Importance of Understanding Business Context [21:48]
- Engineers must understand business constraints and communicate technical solutions in the language of business impact and value.
- Simply explaining technical details is often too abstract for business stakeholders, who may not grasp the magnitude of technical challenges.
- Empathetically understanding the stakeholder's perspective, including their business goals and pressures, is key to successful technical discussions and investment proposals.
"For people that don't understand how to build software, understand the business constraints. So you're never going to be able to convince the business folks about something technically."
The Impact of Business Constraints on Technical Decisions [23:18]
- In critical industries like container terminals, software failures can have life-threatening consequences. Here, sub-millisecond latencies and extreme reliability are paramount.
- The impact of any delay or malfunction directly translates to significant financial losses, emphasizing the need for robust and responsive systems.
- Understanding the daily realities and pressures faced by operators is essential for engineers to propose and justify technical solutions effectively.
"All they care about is I have a ship here who's docked. There's like 10,000 containers on it. I need to empty them into my yard. I have a very fixed window of time and if I don't do it in that window of time, this is going to cost me money."
The Trade-off Between Engineering Craftsmanship and Business Impact [29:39]
- In successful tech companies, rewards and growth are tied to business impact, not solely to technical engineering achievements.
- Engineers must justify their work by explaining its contribution to revenue, customer acquisition, problem-solving, or developer experience, using quantifiable metrics.
- This incentivizes engineers to align their efforts with strategic business goals, preventing them from getting lost in purely technical pursuits without clear business value.
"Your work or your reward bonuses growth whatever is associated with your business impact not your software engineering impact."
The Pragmatism of "Good Enough" Software [31:50]
- The primary priority in professional software engineering is solving the business problem, and sometimes this means delivering software that is "good enough" for the current needs, rather than striving for absolute perfection.
- This "good enough" approach provides a runway and allows the business to function, even if the solution isn't ideal for the distant future.
- A perfectionist mindset can be detrimental if it leads to over-investment in aesthetics or optimizations that do not align with business priorities or timelines.
"But the priority is for the business problem. And sometimes solving business problems means building software that is good enough."
The Evolution of Software Development and AI's Role [37:20]
- The term "vibe coding" is controversial, but AI tools like GitHub Copilot are transforming how code is written, with a significant portion of code now generated by agents.
- This shift frees engineers to focus more on operational aspects, system reliability, and preventing bugs with far-reaching consequences.
- AI can significantly accelerate tasks like benchmarking and analysis, allowing engineers to tackle complex problems more efficiently.
"90% of my code is written by agents. So I use the VS Code agent mode all the time and I've been shipping features to production with it all the time."
The Power of Continuous Learning and Breadth [41:10]
- To become a great software engineer, increasing breadth of knowledge is as important as deepening expertise.
- The ability to learn effectively and quickly adapt to new technologies and concepts is a critical skill in a rapidly evolving field.
- This continuous learning mindset allows engineers to tackle diverse problems, from quantum mechanics to specific technical challenges, fostering innovation and cross-disciplinary thinking.
"Increase your breadth. There's a lot of focus on the depth early on in your career, which is important. And there's an advice that perpetuates which says focus, true focus, but at the same time don't dismiss everything else that's happening."
Embracing Curiosity as a Driving Force [43:14]
- Becoming comfortable with being uncomfortable and embracing the process of learning new and complex things is key.
- Curiosity fuels drive and allows for the cross-pollination of ideas from different disciplines, leading to a more diversified view of the world and enhanced empathy.
- Approaching the world with curiosity, rather than dogma or stubbornness, leads to more interesting and effective outcomes.
"And I've become very comfortable with being uncomfortable learning something new. And actually, that gives me a joy and kick to the point where it became sort of an addiction."