
🍔🧠 How AWS Lambda Serves Trillions of Requests a Month

PLUS: Instagram System Design 📸, How Cursor IDE Works 🖥️, Netflix's 140 Million Hours of Data Daily 📈


Happy Monday! ☀️

Welcome to the 526 new hungry minds who have joined us since last Monday!

If you aren't subscribed yet, join smart, curious, and hungry folks by subscribing here.

📚 Software Engineering Articles

🗞️ Tech and AI Trends

👨🏻‍💻 Coding Tip

  • Use the __slots__ attribute in Python

Time-to-digest: 5 minutes

Big thanks to our partners for keeping this newsletter free.

If you have a second, clicking the ad below helps us a ton, and who knows, you might find something you love. 💚

Optimize global IT operations with our World at Work Guide

Explore this ready-to-go guide to support your IT operations in 130+ countries. Discover how:

  • Standardizing global IT operations enhances efficiency and reduces overhead

  • Ensuring compliance with local IT legislation safeguards your operations

  • Integrating Deel IT with EOR, global payroll, and contractor management optimizes your tech stack

Leverage Deel IT to manage your global operations with ease.

AWS Lambda processes tens of trillions of monthly invocations across 1.5M+ active customers. The service has evolved sophisticated techniques to handle massive scale while maintaining performance and isolation between tenants.

The challenge:
Managing billions of asynchronous requests while preventing noisy neighbors from impacting other tenants and maintaining system stability during traffic spikes and component failures.

Implementation highlights:

  • Uses shuffle-sharding to distribute tenants across multiple queues, minimizing the blast radius of noisy neighbors

  • Implements proactive detection and automated isolation of high-traffic tenants to dedicated queues

  • Stays resilient by working through backlogs that accumulate during outages and by using load shedding for a controlled recovery

  • Provides detailed observability through metrics like AsyncEventAge and AsyncEventDropped for monitoring
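To make the shuffle-sharding idea concrete, here is a minimal sketch in Python. It is not Lambda's actual implementation: the function names, the hash-seeded selection, and the "pick the shorter queue" routing rule are all assumptions made for illustration. Only the 100 queues and 2 queues per tenant come from the article.

```python
import hashlib
import random

NUM_QUEUES = 100        # figure quoted in the article
QUEUES_PER_TENANT = 2   # figure quoted in the article


def assign_queues(tenant_id: str, num_queues: int = NUM_QUEUES,
                  shard_size: int = QUEUES_PER_TENANT) -> tuple:
    """Deterministically pick `shard_size` distinct queues for a tenant.

    Hypothetical helper: seeds a private RNG from a stable hash so the
    same tenant always maps to the same queues.
    """
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return tuple(sorted(rng.sample(range(num_queues), shard_size)))


def pick_queue(tenant_id: str, queue_depths: list) -> int:
    """Route a request to the less loaded queue in the tenant's shard.

    The routing rule is an assumption; the article does not say how a
    queue is chosen within a shard.
    """
    shard = assign_queues(tenant_id, len(queue_depths))
    return min(shard, key=lambda q: queue_depths[q])
```

Because each tenant only ever touches its own small shard, a noisy tenant can fill at most its two queues, and most other tenants still have at least one unaffected queue to fall back on.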

Results & Learnings:

  • Shuffle-sharding with 100 queues and 2 queues per tenant yields 4,950 unique queue pairs, so two given tenants have only about a 0.02% (1 in 4,950) chance of being assigned exactly the same pair

  • The system automatically detects and isolates traffic spikes to dedicated queues while maintaining overall stability

  • Load shedding during recovery ensures fair resource allocation and improves mean time to recovery
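The shuffle-sharding numbers are easy to check. The short sketch below recomputes them with Python's standard library; the function names are made up for this example, and the "share at least one queue" figure is an extra calculation that is not in the original article.

```python
from math import comb


def shard_combinations(queues: int, shard_size: int) -> int:
    """Number of distinct shards (unordered sets of queues)."""
    return comb(queues, shard_size)


def full_overlap_probability(queues: int, shard_size: int) -> float:
    """Chance that two independently assigned tenants get the identical shard."""
    return 1 / shard_combinations(queues, shard_size)


def any_overlap_probability(queues: int, shard_size: int) -> float:
    """Chance that two tenants share at least one queue (not from the article)."""
    disjoint = comb(queues - shard_size, shard_size)
    return 1 - disjoint / shard_combinations(queues, shard_size)


print(shard_combinations(100, 2))                  # 4950
print(f"{full_overlap_probability(100, 2):.4%}")   # 0.0202%
print(f"{any_overlap_probability(100, 2):.2%}")    # 3.98%
```

So a complete overlap is rare, and even a partial overlap (one shared queue) affects only about 4% of tenant pairs, in which case each tenant still has a second queue.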

Pro tip: Monitor key metrics and configure failure handling through destinations/DLQs to avoid data loss 😉
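As a rough sketch of what that configuration can look like with boto3, the snippet below builds the arguments for `put_function_event_invoke_config` and applies them in a separate step. The function name and ARN in the usage comment are placeholders, and the helper names and default values are assumptions for this example, not recommendations from the article.

```python
def build_async_config(function_name: str, on_failure_arn: str,
                       max_retries: int = 2,
                       max_event_age_s: int = 3600) -> dict:
    """Build kwargs for Lambda's put_function_event_invoke_config call."""
    # Limits documented for asynchronous invocation settings.
    if not 0 <= max_retries <= 2:
        raise ValueError("MaximumRetryAttempts must be between 0 and 2")
    if not 60 <= max_event_age_s <= 21600:
        raise ValueError("MaximumEventAgeInSeconds must be between 60 and 21600")
    return {
        "FunctionName": function_name,
        "MaximumRetryAttempts": max_retries,
        "MaximumEventAgeInSeconds": max_event_age_s,
        # Events that exhaust retries or age out are sent here instead of
        # being dropped silently.
        "DestinationConfig": {"OnFailure": {"Destination": on_failure_arn}},
    }


def apply_async_config(config: dict) -> dict:
    """Send the configuration to AWS (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3
    return boto3.client("lambda").put_function_event_invoke_config(**config)


# Example usage with placeholder names:
# cfg = build_async_config(
#     "my-function",
#     "arn:aws:sqs:us-east-1:123456789012:my-failure-queue",
# )
# apply_async_config(cfg)
```

Pairing an on-failure destination with alarms on AsyncEventAge and AsyncEventDropped gives both a safety net for failed events and an early warning that the backlog is growing.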

ESSENTIAL (project wizardry)
How I’ve Run Major Projects

ESSENTIAL (praise the code)
How To Praise

GITHUB REPO (deep research dive)
local-deep-researcher

GITHUB REPO (ai tutorial treasure)
ai-engineering-hub

ARTICLE (s3 simplicity saga)
In S3 simplicity is table stakes

ARTICLE (sync engine dreams)
Sync Engines are the Future

ARTICLE (stamina superpower)
Stamina is a Quiet Advantage

ARTICLE (ai code review conundrum)
Why AI will never replace human code review

ARTICLE (next.js vs tanstack tussle)
Next.js vs TanStack

Want to reach 150,000+ engineers?

Let’s work together! Whether it’s your product, service, or event, we’d love to help you connect with this awesome community.

Brief: Manus, a new autonomous AI agent from China, is impressing early testers with its ability to complete complex tasks rapidly, raising questions about AI leadership and the future of human-machine collaboration.

Brief: Claude Code offers a unique vibe coding experience that prioritizes fun and creativity over precision, but its high cost raises questions about its value for serious projects.

Brief: OpenAI launches the Responses API to empower developers in creating autonomous AI agents, aiming to fulfill the vision of AI joining the workforce by 2025.

Brief: Google introduces Gemini Robotics, aiming to create general purpose robots that can adapt, interact, and perform complex tasks using advanced AI models.

Brief: Apple is set to launch live translation for AirPods with iOS 19, enabling real-time conversation translations, a feature already available in Google's Pixel Buds since 2017.

This week’s coding challenge:

This week’s tip:

In Python, the __slots__ attribute can be used to explicitly declare data members in a class, which can significantly reduce memory usage and improve attribute access speed. By using __slots__, you prevent the creation of a __dict__ for each instance, which is especially beneficial when dealing with a large number of instances.

When?

  • Memory Optimization: Ideal for classes that will have a large number of instances, as it reduces the memory footprint by preventing the creation of a __dict__ for each instance.

  • Performance Improvement: Useful in scenarios where attribute access speed is critical, as __slots__ provides faster attribute access compared to the default __dict__-based attribute lookup.

  • Fixed Attribute Sets: Beneficial when you want to enforce a fixed set of attributes. __slots__ prevents new attributes from being added at runtime (existing ones remain writable, so instances are not truly immutable), which helps keep the class structure predictable and controlled.
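Here is a small sketch of the tip in action. The two `Point` classes are invented for this example; it shows that a slotted instance has no per-instance `__dict__` and rejects attributes that were not declared, while declared attributes can still be reassigned.

```python
class PointDict:
    """Regular class: every instance carries its own __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointSlots:
    """Slotted class: attributes live in fixed slots, no per-instance __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


regular = PointDict(1, 2)
slotted = PointSlots(1, 2)

print(hasattr(regular, "__dict__"))  # True
print(hasattr(slotted, "__dict__"))  # False

slotted.x = 10  # declared slots are still writable

try:
    slotted.z = 3  # "z" is not declared in __slots__
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```

The memory savings depend on the Python version and the number of attributes, so it is worth measuring (for example with `tracemalloc`) before restructuring a class around `__slots__`.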

“When one door of happiness closes, another opens, but often we look so long at the closed door that we do not see the one that has been opened for us.”
Helen Keller

That’s it for today! ☀️

Enjoyed this issue? Send it to your friends here to sign up, or share it on Twitter!

If you want to submit a section to the newsletter or tell us what you think about today’s issue, reply to this email or DM me on Twitter! 🐦

Thanks for spending part of your Monday morning with Hungry Minds.
See you in a week โ€” Alex.

Icons by Icons8.

*I may earn a commission if you get a subscription through the links marked with “aff.” (at no extra cost to you).