CPSC 436C W25T2 Cloud Computing
fsgeek
Video Summary
The video emphasizes the critical importance of thorough peer review in software development, distinguishing between performative feedback and substantive analysis. It highlights that developers often see what they intend the code to do, not what it actually does, making external review essential to catch bugs, especially those that manifest in production and are exponentially more expensive to fix. The discussion delves into various aspects of effective peer review, including focusing on actionable feedback, avoiding subjective opinions, critically examining technical decisions, and ensuring security best practices. A significant portion of the video is dedicated to analyzing sample student projects, demonstrating how to identify strengths and weaknesses in areas like technology choices, testing strategies, security measures, and documentation. The speaker uses these examples to illustrate how to provide constructive criticism, probe for deeper understanding, and learn from mistakes, ultimately fostering better engineering practices and protecting oneself from liability.
A particularly striking fact from the video is the anecdote about Target correctly predicting a woman's pregnancy based on her purchasing patterns before she even knew she was pregnant, underscoring the profound implications of data collection and analysis.
Short Highlights
- Peer review is crucial because developers often miss bugs in their own code, seeing what they think it does rather than what it actually does.
- Fixing bugs in the design phase is significantly cheaper than fixing them in production, which can cost millions of dollars.
- Effective feedback must be concrete, actionable, and grounded in technical reasoning, not personal preference.
- Reviewers should question technology choices, architecture, security practices (like least privilege and secret handling), and the presence of automated testing and deployment pipelines (CI/CD).
- Analyzing sample projects reveals common pitfalls, such as vague claims, insufficient testing evidence, poor secret management, and a lack of clear reasoning behind technical decisions.
- The Flo Health case study highlights the severe consequences of sloppy engineering, including sharing sensitive user data with third parties, and the importance of documenting concerns to protect oneself.
- A significant insight from the transcript is that "everything will break in the cloud," emphasizing the need for robust rollback plans and rigorous testing.
Key Details
The Necessity and Cost of Peer Review [00:31]
- The expectation of peer review is a critical part of professional software development.
- Developers often fail to see errors in their own code because they project their intended logic onto it, rather than seeing what is actually written.
- Bugs in production are exponentially more expensive to fix than those caught during design or development. One example cited a half-million-dollar cost for a bug missed in a peer review.
"The reason that we have peer review is because the person who wrote the code doesn't see what the code does. They see what the code is supposed to do."
The High Cost of Bugs in Production and Design [03:31]
- The most expensive place to fix a bug is in the field (production).
- Traveling to remote data centers to diagnose and fix problems in the field is a costly and time-consuming endeavor.
- Design bugs are particularly impactful because they lead to building the wrong infrastructure from the start, and identifying these issues early in the design phase has the greatest impact.
"The best place to fix it is actually in the design. Why? Because you haven't written any code yet."
Principles of Effective Feedback [06:30]
- Good feedback should identify something concrete and actionable.
- If something is unclear, asking a question is more effective than making a direct accusation of error. This approach encourages the other person to re-examine the problem.
- Avoid "taste-driven feedback," such as disliking a font color, as it's subjective and often irrelevant to code functionality.
"If something is clear to you, but you're asking a question of a senior person and you don't want to offend them, ask a question. Don't tell them they're wrong."
Critiquing Technical Decisions and Choices [09:05]
- Reviewers should scrutinize technology choices (e.g., VMs vs. containers vs. serverless) and question the rationale behind them, rather than accepting choices driven by trends or a metaphorical magic eight-ball.
- The choice of database (relational, NoSQL, vector) and the motivation for that choice are important review points; for example, understanding why a team might use PostgreSQL's built-in full-text search instead of Elasticsearch (see the sketch after this section).
- Elasticsearch is a dedicated engine for searching free-text data, used when a database's native search capabilities are insufficient.
"What trade-offs were they considering? Or did they just simply roll the magic, you know, pull the magic eightball out and go, 'Uh, oh my eight-ball. What should I be doing?'"
Security and Infrastructure Reviews [11:44]
- Evaluating managed vs. self-hosted solutions and the use of multi-cloud strategies is important; understanding concepts like egress charges (the fees a provider bills for data leaving its network) is crucial.
- IAM design should follow the principle of least privilege, with separated roles and clearly defined permissions for each system component.
- Secrets should never be stored in plain text or checked into repositories; use a dedicated secret-management system (see the sketch at the end of this section). GitHub's scanners can detect secrets leaked into repositories.
- Service identities should have scoped permissions to prevent over-privileging.
"When you're doing a code review, you get to be on the other side of this and say, 'Why did you choose a plain text file in your root directory of your repo to store all of your secrets?'"
CI/CD, Testing, and Rollback Strategies [14:54]
- The absence of an automated build, test, deploy (CI/CD) pipeline is a significant red flag.
- Tests must block deployments when they fail; tests that do not block are routinely ignored (see the gating sketch after this section).
- A rollback plan is essential because failures are inevitable in cloud environments.
- Development, staging, and production environments should be separate yet closely matched, so that issues caught in testing are representative of what would fail in production.
"If a test actually fails, does it block deployment or do you just simply put a red button on a console someplace that nobody pays attention to?"
Verifying Claims and Compliance [17:42]
- Assertions made in projects, such as not storing Personally Identifiable Information (PII) or adhering to specific regional data policies, must be verifiable (see the sketch after this section).
- There needs to be an enforcement mechanism for controls and tests. If a claim is made, the evidence and the testing process to validate it should be readily available.
- Compliance standards like SOC 2 require diligent data handling, sensitivity assessment, and regulatory adherence, including encryption at rest and proper key management.
"If somebody says we would test this, it's like great. Okay, well the road to hell is paved with these kinds of things, right?"
Analyzing Sample Projects and Identifying Weaknesses [21:06]
- Sample projects were provided to allow students to practice review skills.
- Effective project documentation is concise, clear, and demonstrates reasoning, not just a laundry list of technologies.
- Weaknesses in projects often include vague claims, lack of specific metrics, unsubstantiated assertions of testing, and a failure to explain trade-offs.
- The concept of "performative BS" applies when claims are made without evidence, placing the burden of proof entirely on the reviewer.
"I learned a lot. Okay. So was that what's the units for a lot? I mean, is it based in you know grams kilograms barrels?"
Understanding Cloud Technologies and Terminology [36:29]
- Cloud computing is laden with terminology, necessitating a willingness to look up unfamiliar terms (e.g., CDN, NAT, WireGuard).
- A Content Delivery Network (CDN) moves data closer to users to improve speed and reduce latency, a critical factor for user engagement and app adoption (a caching sketch follows this section).
- The internet's inherent slowness makes CDNs essential for delivering web content efficiently.
"The internet is slow. And the way that we speed things up on the internet... is we move the data that you might use so it's close to you."
Real-World Examples and Learning from Mistakes [42:40]
- Projects that detail what was tried, what failed, and what was learned are highly valuable.
- Providing paths to actual code and real-world test examples demonstrates transparency and allows for deeper inspection.
- The speaker emphasizes that when reviews leave questions unanswered, reaching out to the project owner for clarification is key.
- A sign of a good learning experience is admitting what went wrong, naming the trade-offs, and showing reasoning, even if the project isn't a complete technical success.
"Guess what for me. That's what I want to see. I tried something, it didn't work. This is what I learned. I tried something, it didn't work, I found something else, it didn't work as well as I wanted, but it worked better."
The Importance of Honesty and Skepticism in Review [49:35]
- A project where "everything worked" and "all tests pass" on the first try should raise suspicion; such perfection often indicates that the tests are insufficient or fabricated.
- LLMs, like ChatGPT and Grok, can generate confident-sounding but incorrect or misleading information, highlighting the need for human critical evaluation.
- When analyzing projects, a healthy skepticism toward overly positive outcomes and unsubstantiated claims is warranted.
"Ever done a scientific experiment of any sort, or run a piece of test code? If all the tests pass and all the code compiles the first time out, I'm like, really? There is something that just does not work here."
Data Privacy, Ethics, and Legal Ramifications [52:50]
- The Flo Health case illustrates how sloppy engineering and the sharing of sensitive user data (like menstrual-cycle information) with third-party providers (Google, Facebook) can lead to severe privacy breaches.
- This practice can enable companies to infer sensitive information, such as a woman's pregnancy, before she might even be aware of it.
- Documenting concerns about engineering practices, especially when instructed not to pursue them by management, is crucial for self-protection in legal contexts.
"Why are you sending labeled data, not a meaningless label, to third-party providers? I'm a big fan of UUIDs, because they really don't have any semantic information in them."