Building a Scalable System for Tracking Likes and Interests in Conference Events (Part 1)

When building a platform for conferences, it’s essential to handle a fundamental question efficiently: which events has a user liked or marked as interested? In a conference setting, users often browse many talks, and a snappy homepage experience is critical. This post describes how we designed and tested a system that meets these goals, the trade-offs we made, and the performance we observed.

Overview

We designed a conference scheduling platform where users can browse and attend talks. Our core functionality is straightforward:

  1. Users can mark talks as “liked” or “interested.”
  2. Events (i.e., talks) each have an aggregate count of how many users like or are interested in them.
  3. Users’ home pages need to load quickly, showing details for all available events, along with:
    • Whether the current user has liked or is interested in each event.
    • Total likes and interests per event (slightly out of date is acceptable).

Data Model and Consistency

To handle this efficiently, we split the data into a few key tables:

  • events: Our main table storing each talk’s fundamental info (IDs, titles, etc.).
  • event_details: Extended fields (like descriptions, scheduled times).
  • user_event_likes & user_event_interest: Many-to-many relationship tables linking users to events. Each row records that a given user either “likes” or “is interested” in a specific event.
  • event_metadata: Stores aggregate stats (total likes, total interests) for each event.
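
For concreteness, here is a minimal sketch of what these tables might look like. The column names and types are assumptions for illustration, not our exact production schema; the important detail is that the link tables are keyed by (user_id, event_id), so per-user lookups are index-backed.

  -- Core talk info.
  CREATE TABLE events (
      id    bigint PRIMARY KEY,
      title text NOT NULL
  );

  -- Extended fields for each talk.
  CREATE TABLE event_details (
      event_id     bigint PRIMARY KEY REFERENCES events (id),
      description  text,
      scheduled_on timestamptz NOT NULL
  );

  -- Many-to-many link tables. The composite key leads with user_id,
  -- which is what makes "all events this user liked" an index lookup.
  CREATE TABLE user_event_likes (
      user_id  bigint NOT NULL,
      event_id bigint NOT NULL REFERENCES events (id),
      PRIMARY KEY (user_id, event_id)
  );

  CREATE TABLE user_event_interest (
      user_id  bigint NOT NULL,
      event_id bigint NOT NULL REFERENCES events (id),
      PRIMARY KEY (user_id, event_id)
  );

  -- Aggregate stats, maintained asynchronously (see the next section).
  CREATE TABLE event_metadata (
      event_id       bigint PRIMARY KEY REFERENCES events (id),
      total_likes    bigint NOT NULL DEFAULT 0,
      total_interest bigint NOT NULL DEFAULT 0
  );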

Eventual Consistency

Whenever a user likes or becomes interested in a talk, we:

  1. Insert a row in the appropriate many-to-many table (user_event_likes or user_event_interest).
  2. Emit a small message/event that updates event_metadata asynchronously.

This makes the design eventually consistent: the user's own like/interest status is written synchronously and is always accurate, while the total counts in event_metadata may lag slightly behind.

From a user’s point of view:

  • They always see the correct, up-to-date status of their own actions (liked/not liked, interested/not interested).
  • Event totals might be delayed by a few seconds. This delay is acceptable for a typical conference scenario.
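
A minimal sketch of this write path follows. The event_counter_outbox table is a hypothetical stand-in for whatever async transport you use (an outbox table, LISTEN/NOTIFY, a message queue); the point is only that step 1 commits synchronously and step 2 gets folded into event_metadata later.

  -- Hypothetical outbox table used as the async transport.
  CREATE TABLE event_counter_outbox (
      event_id       bigint NOT NULL,
      delta_likes    int    NOT NULL DEFAULT 0,
      delta_interest int    NOT NULL DEFAULT 0
  );

  -- 1. Record the user's action. Strongly consistent: the row is visible
  --    to that user as soon as the transaction commits.
  INSERT INTO user_event_likes (user_id, event_id)
  VALUES ($1, $2)
  ON CONFLICT DO NOTHING;

  -- 2. Emit the "count changed" message; here it is just an outbox row.
  INSERT INTO event_counter_outbox (event_id, delta_likes)
  VALUES ($2, 1);

  -- A background worker periodically folds pending deltas into
  -- event_metadata and discards them, so the totals lag by a few seconds.
  WITH pending AS (
      DELETE FROM event_counter_outbox
      RETURNING event_id, delta_likes, delta_interest
  )
  UPDATE event_metadata em
  SET    total_likes    = em.total_likes    + p.likes,
         total_interest = em.total_interest + p.interest
  FROM  (SELECT event_id,
                sum(delta_likes)    AS likes,
                sum(delta_interest) AS interest
         FROM   pending
         GROUP  BY event_id) p
  WHERE  em.event_id = p.event_id;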

Querying the Home Page

When a user opens their home page, our query must:

  1. Join events, event_details, and event_metadata to load each event’s details and precomputed stats.
  2. Check if the current user has liked or is interested in each event (by joining user_event_likes and user_event_interest on matching event IDs).
  3. Sort by time (usually by ed.scheduled_on) and limit the result (for pagination or preview).
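
Put together, the home-page query looks roughly like this, using the column names assumed in the schema sketch above, with $1 standing in for the current user's ID:

  SELECT e.id,
         e.title,
         ed.description,
         ed.scheduled_on,
         em.total_likes,
         em.total_interest,
         (uel.user_id IS NOT NULL) AS liked_by_me,
         (uei.user_id IS NOT NULL) AS interested_by_me
  FROM   events e
  JOIN   event_details  ed ON ed.event_id = e.id
  JOIN   event_metadata em ON em.event_id = e.id
  LEFT JOIN user_event_likes    uel ON uel.event_id = e.id AND uel.user_id = $1
  LEFT JOIN user_event_interest uei ON uei.event_id = e.id AND uei.user_id = $1
  ORDER  BY ed.scheduled_on
  LIMIT  5;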

Why Empirical Testing Matters

Before finalizing our design, we verified that this approach worked well in practice. We used:

  • EXPLAIN / EXPLAIN ANALYZE to ensure we’re getting index scans on our many-to-many tables and efficient hash joins on the broader tables.
  • Real-world execution time measurements on a local dev machine (a “battered Dell XPS”) to confirm our assumptions.
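
For example, to check that a per-user lookup on a link table is index-backed rather than a sequential scan, we can run something like the following (1234 stands in for any user ID):

  -- ANALYZE executes the query and reports actual timings; BUFFERS adds
  -- cache-hit information, useful for cold-vs-warm comparisons.
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT event_id
  FROM   user_event_likes
  WHERE  user_id = 1234;

The full home-page query is checked the same way: prefix it with the same EXPLAIN options and read off the plan and timings.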

Stress-Testing Our Design

We set up a stress test:

  1. 10,000 users and 10,000 events in the database.
  2. For every user–event pair, there’s:
    • A 25% chance the user “likes” that event.
    • A 25% chance the user is “interested” in that event.
  3. Randomly populated event_metadata, so each event starts with some placeholder stats.
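
The data load itself can be done entirely in SQL. A rough sketch of one way to generate it (assuming events with IDs 1 through 10,000 already exist and the link tables match the schema sketch above):

  -- Keep ~25% of the 10,000 x 10,000 user-event pairs in each link table.
  -- random() is evaluated per row, so exact counts vary run to run.
  INSERT INTO user_event_likes (user_id, event_id)
  SELECT u, e
  FROM   generate_series(1, 10000) AS u,
         generate_series(1, 10000) AS e
  WHERE  random() < 0.25;

  INSERT INTO user_event_interest (user_id, event_id)
  SELECT u, e
  FROM   generate_series(1, 10000) AS u,
         generate_series(1, 10000) AS e
  WHERE  random() < 0.25;

  -- Placeholder aggregate stats for every event.
  INSERT INTO event_metadata (event_id, total_likes, total_interest)
  SELECT e, (random() * 5000)::int, (random() * 5000)::int
  FROM   generate_series(1, 10000) AS e;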

This generated about 30 million total rows across user_event_likes and user_event_interest. Our question: Could we still query the home page quickly under these conditions?

Performance Results

  • Cold Execution Time: From a cold start, loading all events plus the user’s likes/interests took around 50 ms. That might sound somewhat high, but consider:
    • A single primary key lookup from cold can be ~5 ms on our machine.
    • We’re doing a lot more work (multiple index scans, merges, sorting, limiting, and joins).
    • A ~10× difference (5 ms → 50 ms) felt justifiable given the added complexity.
  • Warm Caches, Random Users: Additional testing with random user profiles (still on a warm cache) brought the average execution time down to ~35 ms. Different users liked or were interested in different sets of events, so we saw some variance. But ~35 ms was relatively stable.

Under the Hood: The Query Plan

Postgres uses:

  1. Index-based bitmap scans on user_event_likes / user_event_interest to filter rows by user ID quickly.
  2. Hash joins for the larger sets (events, event_details, event_metadata), which is efficient for equality-based joins.
  3. Sorting + Top-N Heapsort to order rows by scheduled time, returning only the top few (for example, if we show the next 5 events).
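
As an illustration only (this is a hand-drawn sketch of the plan shape those steps describe, not captured EXPLAIN output), the home-page query's plan is shaped roughly like this:

  Limit
    ->  Sort                              (top-N heapsort on ed.scheduled_on)
          ->  Hash Left Join              (attach "interested by me")
                ->  Hash Left Join        (attach "liked by me")
                      ->  Hash Join       (events / event_details / event_metadata)
                            ...
                      ->  Hash
                            ->  Bitmap Heap Scan on user_event_likes
                                  ->  Bitmap Index Scan on user_event_likes_pkey
                                        Index Cond: (user_id = $1)
                ->  Hash
                      ->  Bitmap Heap Scan on user_event_interest
                            ->  Bitmap Index Scan on user_event_interest_pkey
                                  Index Cond: (user_id = $1)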

These plan details confirm that:

  • We’re not scanning entire many-to-many tables blindly; we index on (user_id) to limit to relevant rows.
  • Postgres merges everything efficiently into the final row set, sorts, and returns them.
  • Even with millions of rows, the total runtime remained under ~50 ms from a cold start and ~35 ms in a more warmed-up scenario.

Final Thoughts

Our design highlights a few key principles:

  1. Separation of Concerns: We keep the “current user state” (likes/interests) strongly consistent, while the “aggregate stats” in event_metadata remain eventually consistent.
  2. Proper Indexing: Many-to-many link tables are indexed by (user_id), providing fast lookups.
  3. Empirical Testing > Guesswork: We confirm the plan with EXPLAIN ANALYZE but also rely on real test data to see actual performance.
  4. Performance Context: If a single key lookup takes ~5 ms, then 50 ms for a large multi-join query is acceptable. Adjust for your scenario and hardware to see if this meets your SLAs.

In production, final times will differ depending on your system resources, concurrency, and data distribution. But as a pattern, this approach—using eventual consistency for aggregated counters and direct many-to-many links for user states—offers a robust mix of performance and correctness for any system dealing with likes, follows, or interest markers.

We remain confident that this strategy scales smoothly, provides near-real-time updates for the current user, and handles aggregated stats in a way that’s “good enough” in most real-world usage scenarios.

Author

Robert Koletka