๐Ÿ”๐Ÿง  How to Build a 2,000x Smaller Time Series Database in Rust

PLUS: SQL Interview Tips for Big Tech 💼, Stateful vs Stateless Architecture 🔄, Build Solid APIs with Next.js 🔧

Today's issue of Hungry Minds is brought to you by:

Happy Monday! ☀️

Welcome to the 493 new hungry minds who have joined us since last Monday!

If you aren't subscribed yet, join smart, curious, and hungry folks by subscribing here.

📚 Software Engineering Articles

🗞️ Tech and AI Trends

👨🏻‍💻 Coding Tip

  • What are Rust lifetimes and how do they work?

Time-to-digest: 5 minutes

Big thanks to our partners for keeping this newsletter free.

If you have a second, clicking the ad below helps us a ton, and who knows, you might find something you love. 💚

  • Industry-leading voice models featuring 40ms latency and best-in-class voice quality

  • Instant cloning with 3 seconds of audio, voice changer allowing for fine-grained control, and audio infilling with templated scripts to generate personalized content, at scale

  • Launch voice agents with only 50 lines of code

When dealing with time series data, you often end up with lots of repeated information. A public transport API in Paris was storing 10GB of JSON data about service disruptions, with new entries added every 2 minutes. Could we make this more efficient?

The challenge: 
Compressing time series data while keeping it queryable is hard, as you need to balance size reduction with access speed and data structure complexity.

Implementation highlights:

  • Used the interning pattern to deduplicate repeated data, storing each unique value once in a lookup table

  • Applied specialized data structures for UUIDs and timestamps instead of strings

  • Optimized serialization with delta encoding and run-length encoding, with compression (gzip/brotli/xz) as the final optimization layer
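
To make the interning idea concrete, here is a minimal sketch of the pattern. It is not the article's actual code: the `Interner` type, its method names, and the sample disruption messages are all hypothetical, and a real implementation would likely intern more than strings.

```rust
use std::collections::HashMap;

/// A minimal string interner (hypothetical, for illustration only):
/// each distinct value is stored once and referred to by a small id.
#[derive(Default)]
struct Interner {
    ids: HashMap<String, u32>, // value -> id, used when inserting
    values: Vec<String>,       // id -> value, used when reading back
}

impl Interner {
    /// Returns the existing id for `value`, or assigns the next free one.
    fn intern(&mut self, value: &str) -> u32 {
        if let Some(&id) = self.ids.get(value) {
            return id;
        }
        let id = self.values.len() as u32;
        self.ids.insert(value.to_owned(), id);
        self.values.push(value.to_owned());
        id
    }

    /// Looks a value back up by its id.
    fn resolve(&self, id: u32) -> Option<&str> {
        self.values.get(id as usize).map(String::as_str)
    }
}

fn main() {
    let mut interner = Interner::default();
    // Made-up messages that repeat from one snapshot to the next.
    let snapshots = ["Line 4: delays", "Line 9: closed", "Line 4: delays", "Line 4: delays"];
    let ids: Vec<u32> = snapshots.iter().map(|s| interner.intern(s)).collect();
    // prints "ids = [0, 1, 0, 0], unique values = 2"
    println!("ids = {:?}, unique values = {}", ids, interner.values.len());
    println!("{:?}", interner.resolve(1)); // prints Some("Line 9: closed")
}
```

Each repeated message now costs four bytes instead of a full string, which is where most of the savings come from when snapshots barely change between scrapes.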

Results and learnings:

  • Achieved a 2,000x size reduction (from 1.1GB to 530KB) while keeping the data queryable

  • Interning works best when applied broadly across all data types, not just strings

  • The resulting system is effectively a simple append-only database perfect for time series
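
The delta and run-length encoding from the highlights suit append-only data that arrives on a fixed schedule. Here is a rough sketch, assuming timestamps are plain Unix seconds in an `i64` and entries land exactly every 2 minutes. The function names are made up for this example and are not taken from the article.

```rust
/// Delta encoding: keep the first value, then only the difference
/// between each value and the one before it.
fn delta_encode(values: &[i64]) -> Vec<i64> {
    let mut out = Vec::with_capacity(values.len());
    let mut prev = 0;
    for &v in values {
        out.push(v - prev);
        prev = v;
    }
    out
}

/// Run-length encoding: collapse runs of equal values into (value, count).
fn rle_encode(values: &[i64]) -> Vec<(i64, u32)> {
    let mut runs: Vec<(i64, u32)> = Vec::new();
    for &v in values {
        if let Some((last, count)) = runs.last_mut() {
            if *last == v {
                *count += 1;
                continue;
            }
        }
        runs.push((v, 1));
    }
    runs
}

/// Inverse of both steps, to check that nothing was lost.
fn decode(runs: &[(i64, u32)]) -> Vec<i64> {
    let mut out = Vec::new();
    let mut acc = 0;
    for &(delta, count) in runs {
        for _ in 0..count {
            acc += delta;
            out.push(acc);
        }
    }
    out
}

fn main() {
    // Hypothetical scrape times, 120 seconds apart.
    let timestamps = [1_700_000_000_i64, 1_700_000_120, 1_700_000_240, 1_700_000_360];
    let runs = rle_encode(&delta_encode(&timestamps));
    println!("{:?}", runs); // prints [(1700000000, 1), (120, 3)]
    assert_eq!(decode(&runs), timestamps.to_vec());
}
```

However many regular entries get appended, the timestamp column stays at two pairs, and a general-purpose compressor such as gzip can then squeeze whatever is left.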

This was a pretty nice one imo. At least rebuilding the database in Rust left the data less disrupted than the actual service 🤣

ARTICLE (cross-site shenanigans)
Cross-Site Requests

ARTICLE (code reviews: the art of nitpicking)
How to Do Thoughtful Code Reviews

ARTICLE (API building blocks)
Building APIs with Next.js

ARTICLE (state transitions: the React dance)
{transitions} = f(state)

Want to reach 150,000+ engineers?

Let's work together! Whether it's your product, service, or event, we'd love to help you connect with this awesome community.

Brief: OpenAI's Deep Research System Card details rigorous safety testing and risk evaluations for its new model, emphasizing privacy protections and mitigations against potential threats.

Brief: Perplexity announces its upcoming web browser Comet, aiming to reinvent browsing while inviting users to sign up for updates, amidst a competitive landscape.

Brief: DeepSeek aims to accelerate the release of its next-gen R2 AI model, promising enhanced coding skills and multilingual reasoning, following the success of its R1 model.

Brief: Anthropic's Claude 3.7 Sonnet is now live on Twitch, attempting to play Pokémon Red, showcasing its AI reasoning skills while amusingly struggling with basic game mechanics.

This week's coding challenge:

This week's tip:

In Rust, lifetimes are a powerful feature that ensures references are valid for as long as they are used. They are particularly important when working with borrowed data to prevent dangling references. By explicitly annotating lifetimes, you can help the compiler understand the relationships between references and ensure memory safety.

When?

  • Borrowed Data: Use lifetimes when working with borrowed data (references) to ensure that the data remains valid for the duration of its use, preventing dangling references.

  • Structs with References: Essential when defining structs that hold references, as you need to specify how long the referenced data should live to avoid invalid memory access.

  • Complex Function Signatures: Useful in functions that take multiple references and return a reference, ensuring the returned reference is valid for the appropriate scope.
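
Here is a small sketch of the last two cases: a struct that holds a reference, and a function that takes two references and returns one. `Headline` and `longest` are illustrative names, not part of any library.

```rust
/// A struct that borrows its text, so it needs a lifetime parameter:
/// a `Headline` cannot outlive the string it points into.
struct Headline<'a> {
    text: &'a str,
}

impl<'a> Headline<'a> {
    /// Borrows the first line of `article` (empty if there is none).
    fn first_line(article: &'a str) -> Headline<'a> {
        Headline { text: article.lines().next().unwrap_or("") }
    }
}

/// Two input references and one returned reference: `'a` tells the
/// compiler the result is only valid while both inputs are.
fn longest<'a>(left: &'a str, right: &'a str) -> &'a str {
    if left.len() >= right.len() { left } else { right }
}

fn main() {
    let article = String::from("Time series in Rust\nSecond line");
    let headline = Headline::first_line(&article);
    println!("{}", headline.text); // prints "Time series in Rust"
    println!("{}", longest("interning", "delta")); // prints "interning"
    // Dropping `article` while `headline` is still in use would be
    // rejected at compile time, which is the whole point of `'a`.
}
```

Without the `'a` on `longest`, the compiler cannot tell which input the result borrows from, so it asks for an explicit annotation.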

"Do not ignore your gift. Your gift is the thing you do the absolute best with the least amount of effort."
Steve Harvey

That's it for today! ☀️

Enjoyed this issue? Send it to your friends here to sign up, or share it on Twitter!

If you want to submit a section to the newsletter or tell us what you think about today's issue, reply to this email or DM me on Twitter! 🐦

Thanks for spending part of your Monday morning with Hungry Minds.
See you in a week - Alex.

Icons by Icons8.

*I may earn a commission if you get a subscription through the links marked with "aff." (at no extra cost to you).