Caching Strategies on AWS
AWS Developers
Video Summary
Databases can be slow, especially those backed by spinning disks. To reduce latency, developers often add a cache: an in-memory data store placed in front of the database. This video explores common caching techniques, including "cache aside" (lazy loading) and "write through," explaining their mechanics, benefits, and drawbacks. It also covers practical considerations such as serialization overhead, Time-To-Live (TTL) management, cache invalidation, and mitigating cache stampedes. The discussion extends to in-process caching and introduces Valkey, an open-source fork of Redis available on AWS, highlighting its serverless and provisioned deployment options for performance and scalability. Adding a cache can cut latency from roughly 100 milliseconds to single-digit milliseconds, a substantial improvement for user experience.
Short Highlights
- Disk-backed databases are comparatively slow, with spinning-disk databases offering latencies in the tens or hundreds of milliseconds.
- Caching involves placing an in-memory data store in front of a slower database to achieve single-digit millisecond latencies.
- The "cache aside" or "lazy loading" strategy involves the application first checking the cache; if data is not found (a miss), it retrieves it from the database, stores it in the cache with a TTL, and then returns it to the application.
- "Write through" caching synchronously writes data to both the database and the cache upon any database write operation, ensuring data freshness but potentially increasing write latency and cache size.
- Key challenges in caching include serialization/deserialization overhead, setting appropriate TTLs, cache invalidation, and mitigating "thundering herd" problems where multiple requests simultaneously hit the database on a cache miss.
- The video demonstrates implementing caching strategies in Rust code and introduces Valkey, an open-source fork of Redis available on AWS, offering serverless and provisioned deployment options.
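The write-through strategy from the highlights above can be sketched in a few lines of Rust. This is a minimal illustration, not the video's actual code: the `Store` struct and its two `HashMap`s are hypothetical stand-ins for a real database and a real cache such as Valkey.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real components: `db` plays the slow
// backing database, `cache` plays the fast in-memory cache.
struct Store {
    db: HashMap<String, String>,
    cache: HashMap<String, String>,
}

impl Store {
    fn new() -> Self {
        Store { db: HashMap::new(), cache: HashMap::new() }
    }

    // Write-through: every write goes synchronously to BOTH the database
    // and the cache, so reads never see stale cached data. The trade-off
    // is higher write latency and a cache that holds everything written,
    // whether or not it is ever read.
    fn write_through(&mut self, key: &str, value: &str) {
        self.db.insert(key.to_string(), value.to_string());
        self.cache.insert(key.to_string(), value.to_string());
    }

    // Reads are served from the cache first, with the database as fallback.
    fn read(&self, key: &str) -> Option<&String> {
        self.cache.get(key).or_else(|| self.db.get(key))
    }
}

fn main() {
    let mut store = Store::new();
    store.write_through("user:1", "Alice");
    // The freshly written value is immediately available from the cache.
    assert_eq!(store.read("user:1"), Some(&"Alice".to_string()));
}
```

In a real deployment the two inserts would be network calls (a database write plus a cache `SET`), which is where the extra write latency comes from.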
Key Details
Databases and Caching Fundamentals [0:00]
- Databases, especially those on spinning disks, are relatively slow.
- In-memory databases are significantly faster.
- Caching is a strategy to put an in-memory store in front of a database to achieve single-digit millisecond latencies.
- The basic caching flow involves checking the cache first; if the data is not found (a miss), it's retrieved from the database, stored in the cache with a Time-To-Live (TTL), and then returned to the application.
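The flow described above can be sketched in Rust. This is an illustrative sketch, not the video's code: the in-process `HashMap` stands in for an external cache like Valkey, and the `load_from_db` closure stands in for the slow database query.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Cache-aside (lazy loading) with a TTL, using a HashMap as a
// stand-in for a real cache. Each entry stores the value plus the
// instant at which it expires.
struct LazyCache {
    entries: HashMap<String, (String, Instant)>,
    ttl: Duration,
}

impl LazyCache {
    fn new(ttl: Duration) -> Self {
        LazyCache { entries: HashMap::new(), ttl }
    }

    // 1. Check the cache. 2. On a miss (or an expired entry), call the
    // database, store the result with a fresh TTL, and return it.
    fn get<F>(&mut self, key: &str, load_from_db: F) -> String
    where
        F: FnOnce() -> String,
    {
        if let Some((value, expires_at)) = self.entries.get(key) {
            if Instant::now() < *expires_at {
                return value.clone(); // cache hit
            }
        }
        let value = load_from_db(); // cache miss: query the database
        self.entries
            .insert(key.to_string(), (value.clone(), Instant::now() + self.ttl));
        value
    }
}

fn main() {
    let mut cache = LazyCache::new(Duration::from_secs(60));
    // First call misses and "queries the database".
    let first = cache.get("user:1", || "Alice".to_string());
    // Second call hits the cache; the closure is never invoked.
    let second = cache.get("user:1", || panic!("should not hit the database"));
    assert_eq!(first, second);
}
```

Note that nothing here prevents two concurrent misses for the same key from both querying the database, which is exactly the stampede problem the video discusses; mitigations such as request coalescing sit on top of this basic flow.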
Friends, let's talk about caching. What is it? How does it apply to you? How do you implement it?