Design Leaderboard System
A real-time leaderboard system is a critical component of competitive games and applications, ranking millions of players based on their scores while maintaining sub-100ms latency for reads and supporting thousands of concurrent score updates per second. This system must handle diverse ranking strategies (global, regional, friend-based), multiple time windows (all-time, daily, weekly, monthly), and provide accurate percentile calculations at massive scale.
Step 1: Requirements and Scale Estimation
Functional Requirements
Core Leaderboard Operations:
- Submit Score: Players submit scores after completing games/matches
- Get Player Rank: Retrieve a specific player’s current rank and score
- Get Top N Players: Fetch the top players (e.g., top 100) with pagination
- Get Players Around Rank: Retrieve players ranked around a specific position
- Multiple Leaderboards: Support different leaderboard types (global, regional, friend-based)
- Time-Based Leaderboards: Daily, weekly, monthly, and all-time leaderboards
- Rank History: Track historical rank changes for players
- Percentile Calculation: Show player’s percentile (e.g., “You’re in the top 5%”)
Advanced Features:
- Real-time Updates: Leaderboard reflects score changes within seconds
- Tie Breaking: Handle players with identical scores using secondary criteria
- Score Verification: Prevent cheating through anomaly detection
- Leaderboard Reset: Reset time-based leaderboards at intervals
- Multiple Game Modes: Separate leaderboards per game mode/category
Non-Functional Requirements
Performance:
- Read Latency: < 100ms for fetching leaderboard data (p99)
- Write Latency: < 200ms for score submissions (p99)
- Rank Calculation: < 50ms for individual rank lookups
- Throughput: Handle 50,000+ score updates per second during peak hours
- Concurrent Users: Support 10+ million concurrent active players
Reliability:
- Availability: 99.99% uptime (< 1 hour downtime per year)
- Consistency: Eventual consistency for leaderboard updates acceptable
- Durability: No score data loss; all submissions persisted
- Fault Tolerance: System degrades gracefully under high load
Scalability:
- Player Base: Support 500+ million registered players
- Active Players: 50+ million daily active users
- Data Growth: Handle continuous growth without performance degradation
- Geographic Distribution: Low latency globally through regional deployments
Scale Estimation
Storage Calculations:
Per player leaderboard entry:
- Player ID: 8 bytes (BIGINT)
- Score: 8 bytes (DOUBLE)
- Timestamp: 8 bytes
- Metadata: 100 bytes (username, level, avatar, etc.)
- Total per entry: ~124 bytes
Storage per leaderboard type:
- Global Leaderboard: 500M players × 124 bytes = 62 GB
- Daily Leaderboard: 50M active players × 124 bytes = 6.2 GB
- Weekly Leaderboard: 150M players × 124 bytes = 18.6 GB
- Monthly Leaderboard: 300M players × 124 bytes = 37.2 GB
- Per-Game-Mode (5 modes): 5 × 62 GB = 310 GB
Historical data (1 year of daily snapshots):
- 50M players × 365 days × 50 bytes (compressed) = 912.5 GB
Total Storage: ~1.5 TB for active leaderboards + historical data
Traffic Estimation:
Score submissions:
- 50M DAU, each plays 10 games/day = 500M score submissions/day
- Peak traffic (20% in 4 hours): 100M submissions in 4 hours = 7,000 writes/sec
- With spikes (3x average): 21,000 writes/sec peak
Leaderboard reads:
- Players check leaderboards 3 times per session
- 50M DAU × 3 = 150M reads/day
- Peak: 30M reads in 4 hours = 2,100 reads/sec
- With spikes: 6,300 reads/sec peak
Bandwidth:
Score submission:
- Payload: 200 bytes per submission
- 21,000 writes/sec × 200 bytes = 4.2 MB/sec ingress
Leaderboard reads:
- Response size: 5 KB (100 entries with metadata)
- 6,300 reads/sec × 5 KB = 31.5 MB/sec egress
Memory Requirements (Redis):
Hot leaderboards in memory:
- (1 all-time + 4 time-based windows) × 5 game modes = 25 leaderboards
- 50M active entries per major leaderboard
- 124 bytes per entry in sorted set (with overhead: ~200 bytes)
- 25 leaderboards × 50M entries × 200 bytes = 250 GB RAM
- With replication (3x): 750 GB total RAM
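The headline numbers above follow from simple arithmetic; a quick sanity check (figures match the estimates, before rounding):

```go
package main

import "fmt"

func main() {
	const bytesPerEntry = 8 + 8 + 8 + 100 // player ID + score + timestamp + metadata = 124

	// Storage: 500M registered players in the global leaderboard
	globalGB := 500e6 * bytesPerEntry / 1e9
	fmt.Printf("global leaderboard: %.0f GB\n", globalGB)

	// Writes: 50M DAU × 10 games/day, 20% of traffic in a 4-hour peak
	perDay := 50e6 * 10.0
	peakPerSec := perDay * 0.20 / (4 * 3600)
	fmt.Printf("peak writes: %.0f/sec (×3 spike ≈ %.0f/sec)\n", peakPerSec, 3*peakPerSec)

	// Redis memory: 25 leaderboards × 50M entries × ~200 bytes with overhead
	ramGB := 25 * 50e6 * 200 / 1e9
	fmt.Printf("hot leaderboard RAM: %.0f GB (×3 replication = %.0f GB)\n", ramGB, 3*ramGB)
}
```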
Step 2: High-Level Design
System Architecture
┌─────────────────────────────────────────────────────────┐
│                       Client Layer                      │
│        (Game Clients, Mobile Apps, Web Dashboard)       │
└────────────────────────────┬────────────────────────────┘
                             │ HTTPS/WebSocket
┌────────────────────────────▼────────────────────────────┐
│                    API Gateway Layer                    │
│ (Rate Limiting, Auth, Request Routing, DDoS Protection) │
└────────┬──────────────────┬──────────────────┬──────────┘
         │                  │                  │
    ┌────▼─────┐       ┌────▼─────┐       ┌────▼─────┐
    │  Score   │       │ Ranking  │       │Analytics │
    │ Service  │       │ Service  │       │ Service  │
    └────┬─────┘       └────┬─────┘       └────┬─────┘
         │                  │                  │
┌────────▼──────────────────▼──────────────────▼──────────┐
│                  Message Queue (Kafka)                  │
│Topics: score-submissions, rank-updates, analytics-events│
└────────┬──────────────────┬─────────────────────────────┘
         │                  │
    ┌────▼─────┐       ┌────▼────────────┐
    │  Score   │       │  Leaderboard    │
    │Processor │       │    Service      │
    │(Workers) │       │(Read Optimized) │
    └────┬─────┘       └────┬────────────┘
         │                  │
┌────────▼──────────────────▼─────────────────────────────┐
│                        Data Layer                       │
│   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│   │Redis Cluster │  │ PostgreSQL   │  │ TimescaleDB  │  │
│   │(Sorted Sets) │  │ (Metadata)   │  │ (History)    │  │
│   └──────────────┘  └──────────────┘  └──────────────┘  │
└─────────────────────────────────────────────────────────┘
Core Services
1. Score Service (Write Path)
Responsibilities:
- Validate and accept score submissions from game clients
- Perform basic anti-cheat checks (score range validation, submission rate)
- Enqueue validated scores to Kafka for asynchronous processing
- Return immediate acknowledgment to clients
API Endpoints:
POST /api/v1/scores
Body: {
"player_id": 123456789,
"game_mode": "ranked_solo",
"score": 9850,
"match_id": "abc-def-123",
"timestamp": 1709876543,
"metadata": {
"duration_seconds": 1200,
"kills": 15,
"deaths": 3
}
}
Response: {
"status": "accepted",
"submission_id": "uuid-here",
"estimated_rank": 1247 // Optional cached approximation
}
Technology Stack:
- Language: Go (high throughput, low latency)
- Framework: Gin or FastHTTP
- Rate Limiting: Redis-based token bucket per player
- Validation: Schema validation + business rules
2. Score Processor Workers
Responsibilities:
- Consume score submissions from Kafka
- Update Redis sorted sets for all relevant leaderboards
- Detect anomalies and flag suspicious submissions
- Emit events for rank changes and milestones
- Persist scores to PostgreSQL for durability
Processing Pipeline:
func ProcessScore(submission ScoreSubmission) error {
// 1. Anti-cheat analysis
if isAnomalous(submission) {
flagForReview(submission)
return nil
}
// 2. Update Redis sorted sets (multiple leaderboards)
leaderboards := getRelevantLeaderboards(submission)
for _, lb := range leaderboards {
updateSortedSet(lb, submission.PlayerID, submission.Score)
}
// 3. Calculate new rank
newRank := getRank(submission.PlayerID, submission.GameMode)
// 4. Check for rank changes and milestones
if rankChanged(submission.PlayerID, newRank) {
emitRankChangeEvent(submission.PlayerID, newRank)
}
// 5. Persist to PostgreSQL
saveScoreToDatabase(submission)
return nil
}
3. Leaderboard Service (Read Path)
Responsibilities:
- Serve leaderboard queries with low latency
- Implement pagination for large result sets
- Calculate player percentiles
- Provide context (players around a rank)
- Handle cache invalidation
API Endpoints:
GET /api/v1/leaderboards/{game_mode}/top?limit=100&offset=0
GET /api/v1/leaderboards/{game_mode}/player/{player_id}
GET /api/v1/leaderboards/{game_mode}/around/{rank}?radius=50
GET /api/v1/leaderboards/{game_mode}/friends?player_id=123
GET /api/v1/players/{player_id}/percentile
Response Format:
{
"leaderboard_id": "ranked_solo_global",
"time_window": "all_time",
"updated_at": "2025-01-27T10:30:00Z",
"total_players": 45000000,
"entries": [
{
"rank": 1,
"player_id": 987654321,
"username": "ProGamer2025",
"score": 125000,
"percentile": 99.99,
"previous_rank": 2,
"rank_change": 1
}
],
"pagination": {
"current_page": 1,
"total_pages": 450000,
"has_next": true
}
}
4. Ranking Service
Responsibilities:
- Calculate accurate ranks for individual players
- Compute percentile rankings
- Handle tie-breaking logic
- Manage rank history snapshots
- Coordinate leaderboard resets
Key Operations:
- GetPlayerRank(playerID, leaderboardID): O(log N) via ZREVRANK
- GetPlayerPercentile(playerID, leaderboardID): O(1) with caching
- GetRankHistory(playerID, startDate, endDate): query TimescaleDB
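Tie-breaking can be folded directly into the sorted-set score. One common approach (a sketch, not mandated by this design) packs the game score and an inverted submission timestamp into the single float64 that Redis stores, so equal scores rank the earlier submission higher:

```go
package main

import (
	"fmt"
	"math"
)

// compositeScore packs a game score and submission timestamp into one
// float64 so a single sorted-set score both orders by score and breaks
// ties in favor of the earlier submission. Exact for scores < 2^21 and
// 32-bit Unix timestamps, since products stay within float64's 53-bit
// mantissa. Field widths are assumptions for illustration.
func compositeScore(score uint32, unixTime uint32) float64 {
	inverted := math.MaxUint32 - uint64(unixTime) // earlier time → larger value
	return float64(uint64(score))*float64(1<<32) + float64(inverted)
}

func main() {
	early := compositeScore(9850, 1_709_876_000)
	late := compositeScore(9850, 1_709_876_543) // same score, submitted later
	higher := compositeScore(9851, 1_709_999_999)

	fmt.Println(early > late)   // tie broken toward the earlier submission
	fmt.Println(higher > early) // a higher score always outranks, regardless of time
}
```

The decoded score is recovered as `uint32(composite / float64(1<<32))` when displaying the leaderboard.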
5. Analytics Service
Responsibilities:
- Track leaderboard engagement metrics
- Generate reports on score distribution
- Identify trending players
- Monitor system health and performance
- Support business intelligence queries
Metrics Tracked:
- Score submission rate per game mode
- Leaderboard view frequency
- Rank volatility (how often top ranks change)
- Player retention based on rank changes
- Cheating detection accuracy
Step 3: Deep Dives
3.1 Redis Sorted Sets for Real-Time Rankings
Why Redis Sorted Sets?
Redis sorted sets (ZSET) are the perfect data structure for leaderboards because they:
- Store members with associated scores
- Maintain automatic sorting by score
- Provide O(log N) insertion and rank lookups
- Support range queries efficiently
- Handle millions of entries in memory
Core Redis Operations:
# Add or update a player's score (O(log N))
ZADD leaderboard:global:ranked_solo 9850 "player:123456789"
# Get player's rank (0-based, 0 is highest) (O(log N))
ZREVRANK leaderboard:global:ranked_solo "player:123456789"
# Get player's score (O(1))
ZSCORE leaderboard:global:ranked_solo "player:123456789"
# Get top 100 players with scores (O(log N + M))
ZREVRANGE leaderboard:global:ranked_solo 0 99 WITHSCORES
# Get players ranked 1000-1099 (O(log N + M))
ZREVRANGE leaderboard:global:ranked_solo 1000 1099 WITHSCORES
# Get total number of players (O(1))
ZCARD leaderboard:global:ranked_solo
# Get players with scores between 8000-10000 (O(log N + M))
ZREVRANGEBYSCORE leaderboard:global:ranked_solo 10000 8000 WITHSCORES
# Get count of players with scores in range (O(log N))
ZCOUNT leaderboard:global:ranked_solo 8000 10000
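To make these semantics concrete without a Redis instance, here is a toy in-memory model of the two operations the leaderboard leans on hardest. It is illustrative only: real sorted sets pair a skiplist with a hash table for O(log N) updates, whereas this sketch simply re-sorts on read.

```go
package main

import (
	"fmt"
	"sort"
)

// miniZSet mimics just enough sorted-set behavior to show leaderboard
// semantics: zadd (ZADD), zrevrank (ZREVRANK: 0-based, 0 = highest
// score), and revOrder (ZREVRANGE over all members).
type miniZSet struct{ scores map[string]float64 }

func (z *miniZSet) zadd(member string, score float64) { z.scores[member] = score }

func (z *miniZSet) revOrder() []string {
	members := make([]string, 0, len(z.scores))
	for m := range z.scores {
		members = append(members, m)
	}
	// Highest score first, matching ZREVRANGE
	sort.Slice(members, func(i, j int) bool {
		return z.scores[members[i]] > z.scores[members[j]]
	})
	return members
}

func (z *miniZSet) zrevrank(member string) int {
	for i, m := range z.revOrder() {
		if m == member {
			return i
		}
	}
	return -1 // member not in the set
}

func main() {
	lb := &miniZSet{scores: map[string]float64{}}
	lb.zadd("player:1", 9850)
	lb.zadd("player:2", 12000)
	lb.zadd("player:3", 7300)

	fmt.Println(lb.zrevrank("player:2")) // 0 — highest score ranks first
	fmt.Println(lb.zrevrank("player:1")) // 1
	fmt.Println(lb.revOrder()[:2])       // [player:2 player:1]
}
```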
Leaderboard Key Naming Convention:
leaderboard:{scope}:{game_mode}:{time_window}
leaderboard:{region}:{game_mode}:{time_window}
Examples:
- leaderboard:global:ranked_solo:alltime
- leaderboard:global:ranked_solo:daily:2025-01-27
- leaderboard:global:ranked_solo:weekly:2025-W04
- leaderboard:global:ranked_solo:monthly:2025-01
- leaderboard:us-west:ranked_duo:alltime
- leaderboard:friends:123456:ranked_solo:weekly:2025-W04
Atomic Score Updates with Lua Scripts:
To ensure atomic operations when updating scores and maintaining consistency:
-- Update score only if new score is higher
local key = KEYS[1]
local member = ARGV[1]
local new_score = tonumber(ARGV[2])
local current_score = redis.call('ZSCORE', key, member)
if current_score == false or new_score > tonumber(current_score) then
redis.call('ZADD', key, new_score, member)
return 1
else
return 0
end
Implementation Example:
type RedisLeaderboard struct {
client *redis.ClusterClient
}
func (r *RedisLeaderboard) UpdateScore(ctx context.Context,
leaderboardID string, playerID int64, score float64) error {
key := fmt.Sprintf("leaderboard:%s", leaderboardID)
member := fmt.Sprintf("player:%d", playerID)
// Use pipeline for multiple leaderboard updates
pipe := r.client.Pipeline()
pipe.ZAdd(ctx, key, redis.Z{
Score: score,
Member: member,
})
// Set expiry for time-based leaderboards
if isTimeBased(leaderboardID) {
pipe.Expire(ctx, key, getExpiryDuration(leaderboardID))
}
_, err := pipe.Exec(ctx)
return err
}
func (r *RedisLeaderboard) GetPlayerRank(ctx context.Context,
leaderboardID string, playerID int64) (int64, error) {
key := fmt.Sprintf("leaderboard:%s", leaderboardID)
member := fmt.Sprintf("player:%d", playerID)
// ZREVRANK returns 0-based rank (0 is highest)
rank, err := r.client.ZRevRank(ctx, key, member).Result()
if err != nil {
return 0, err
}
// Convert to 1-based rank
return rank + 1, nil
}
func (r *RedisLeaderboard) GetTopPlayers(ctx context.Context,
leaderboardID string, limit, offset int64) ([]LeaderboardEntry, error) {
key := fmt.Sprintf("leaderboard:%s", leaderboardID)
// Get players with scores
results, err := r.client.ZRevRangeWithScores(ctx, key,
offset, offset+limit-1).Result()
if err != nil {
return nil, err
}
entries := make([]LeaderboardEntry, len(results))
for i, z := range results {
playerID := extractPlayerID(z.Member.(string))
entries[i] = LeaderboardEntry{
Rank: offset + int64(i) + 1,
PlayerID: playerID,
Score: z.Score,
}
}
return entries, nil
}
3.2 Handling Millions of Concurrent Score Updates
Challenge: During peak hours, the system receives 20,000+ score submissions per second. Each submission must update multiple leaderboards (global, daily, weekly, monthly, regional).
Solution: Asynchronous Processing with Kafka
Architecture:
Game Clients → Score Service → Kafka Topic → Score Processor Workers → Redis
                     │                                  │
                     ▼                                  ▼
                Acknowledge                        PostgreSQL
Kafka Topic Design:
Topic: score-submissions
Partitions: 64 # Allows parallel processing by 64 consumers
Replication Factor: 3
Retention: 7 days
Compression: lz4
Partitioning Strategy: Hash by player_id
# Ensures all scores for a player go to same partition
# Maintains ordering per player
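The hash-partitioning step can be sketched as follows. FNV-1a is an assumption for illustration (Kafka's default key partitioner uses murmur2), but any stable hash preserves the property that matters: one player always maps to one partition, so that player's scores stay ordered.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const numPartitions = 64 // matches the topic's partition count

// partitionFor maps a player ID to a Kafka partition. Keying messages
// by player ID means every score from one player lands on the same
// partition, preserving per-player ordering. FNV-1a stands in for
// Kafka's murmur2 default here.
func partitionFor(playerID int64) int {
	h := fnv.New32a()
	fmt.Fprintf(h, "%d", playerID)
	return int(h.Sum32() % numPartitions)
}

func main() {
	p1 := partitionFor(123456789)
	p2 := partitionFor(123456789)
	fmt.Println(p1 == p2)                      // same player → same partition, always
	fmt.Println(p1 >= 0 && p1 < numPartitions) // within [0, 64)
}
```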
Consumer Group Configuration:
type ScoreProcessor struct {
consumer *kafka.Consumer
redis *RedisLeaderboard
db *sql.DB
workers int
}
func (sp *ScoreProcessor) Start(ctx context.Context) error {
	// Create consumer group with multiple workers
	group, err := kafka.NewConsumerGroup(kafka.ConsumerGroupConfig{
		GroupID:           "score-processors",
		Topics:            []string{"score-submissions"},
		Workers:           64, // Match partition count
		MaxPollRecords:    1000,
		SessionTimeout:    30 * time.Second,
		HeartbeatInterval: 3 * time.Second,
	})
	if err != nil {
		return err
	}
	// Process messages in batches for efficiency
	for {
		select {
		case <-ctx.Done():
			return nil
		default:
			messages := group.Poll(ctx)
			sp.processBatch(messages)
		}
	}
}
func (sp *ScoreProcessor) processBatch(messages []kafka.Message) {
	ctx := context.Background()
	// Batch Redis operations using pipeline
	pipe := sp.redis.Pipeline()
	for _, msg := range messages {
		var submission ScoreSubmission
		if err := json.Unmarshal(msg.Value, &submission); err != nil {
			log.Printf("skipping malformed submission: %v", err)
			continue
		}
		// Update multiple leaderboards in pipeline
		leaderboards := sp.getRelevantLeaderboards(submission)
		for _, lb := range leaderboards {
			key := fmt.Sprintf("leaderboard:%s", lb)
			member := fmt.Sprintf("player:%d", submission.PlayerID)
			pipe.ZAdd(ctx, key, redis.Z{
				Score:  submission.Score,
				Member: member,
			})
		}
	}
	// Execute all Redis commands at once
	if _, err := pipe.Exec(ctx); err != nil {
		log.Printf("redis pipeline exec failed: %v", err)
	}
	// Batch insert to PostgreSQL
	sp.batchInsertScores(messages)
}
Rate Limiting Per Player:
Prevent spam and abuse:
func (s *ScoreService) SubmitScore(playerID int64, score float64) error {
// Token bucket rate limiter using Redis
key := fmt.Sprintf("ratelimit:score:%d", playerID)
// Allow 20 submissions per minute per player
allowed, err := s.redis.CheckRateLimit(key, 20, time.Minute)
if err != nil {
return err
}
if !allowed {
return ErrRateLimitExceeded
}
// Proceed with score submission
return s.publishToKafka(playerID, score)
}
3.3 Time-Based Leaderboards
Challenge: Support daily, weekly, and monthly leaderboards that reset at specific intervals while maintaining historical data.
Solution: Separate Sorted Sets with Scheduled Resets
Implementation Strategy:
- Separate Redis Keys per Time Window:
func getLeaderboardKey(gameMode string, timeWindow string) string {
now := time.Now()
switch timeWindow {
case "daily":
date := now.Format("2006-01-02")
return fmt.Sprintf("leaderboard:global:%s:daily:%s", gameMode, date)
case "weekly":
year, week := now.ISOWeek()
return fmt.Sprintf("leaderboard:global:%s:weekly:%d-W%02d",
gameMode, year, week)
case "monthly":
month := now.Format("2006-01")
return fmt.Sprintf("leaderboard:global:%s:monthly:%s", gameMode, month)
case "alltime":
return fmt.Sprintf("leaderboard:global:%s:alltime", gameMode)
default:
return ""
}
}
- Automatic Key Expiration:
func setLeaderboardExpiry(key string, timeWindow string) {
var expiry time.Duration
switch timeWindow {
case "daily":
// Expire 2 days after creation
expiry = 48 * time.Hour
case "weekly":
expiry = 14 * 24 * time.Hour
case "monthly":
expiry = 60 * 24 * time.Hour
default:
return // No expiry for all-time
}
redis.Expire(context.Background(), key, expiry)
}
- Snapshot Mechanism for Historical Data:
func (s *LeaderboardService) SnapshotDailyLeaderboard(gameMode string, date time.Time) error {
	key := getLeaderboardKey(gameMode, "daily")
	// Get top 10,000 players from Redis
	topPlayers, err := s.redis.ZRevRangeWithScores(
		context.Background(), key, 0, 9999).Result()
	if err != nil {
		return err
	}
	// Store snapshot in TimescaleDB for historical analysis
	for rank, player := range topPlayers {
		_, err := s.db.Exec(`
			INSERT INTO leaderboard_snapshots
			(snapshot_date, game_mode, leaderboard_type, rank, player_id, score)
			VALUES ($1, $2, 'daily', $3, $4, $5)
		`, date, gameMode, rank+1, extractPlayerID(player.Member.(string)), player.Score)
		if err != nil {
			return err
		}
	}
	return nil
}
- Scheduled Reset Jobs:
func (s *LeaderboardService) StartResetScheduler() {
// Daily reset at midnight UTC
gocron.NewScheduler(time.UTC).Every(1).Day().At("00:00").Do(func() {
s.snapshotAndResetDaily()
})
// Weekly reset on Monday midnight UTC
gocron.NewScheduler(time.UTC).Every(1).Monday().At("00:00").Do(func() {
s.snapshotAndResetWeekly()
})
// Monthly reset on 1st of month
gocron.NewScheduler(time.UTC).Every(1).Month(1).At("00:00").Do(func() {
s.snapshotAndResetMonthly()
})
}
3.4 Approximate Rankings for Massive Scale
Challenge: With 500M players, calculating exact ranks for all players is computationally expensive. Players outside top 10,000 rarely need exact ranks.
Solution: Exact Ranks for the Head, Range Counts for the Tail
Strategy:
- Exact rankings for top 100,000 players (stored in Redis ZSET)
- Approximate rankings for everyone else, derived from ZCOUNT range queries over a trimmed sorted set (HyperLogLog is not suitable here: it estimates distinct-element cardinality, not how many scores exceed a threshold)
Implementation:
func (r *RedisLeaderboard) GetApproximateRank(ctx context.Context,
leaderboardID string, playerID int64, score float64) (int64, error) {
key := fmt.Sprintf("leaderboard:%s", leaderboardID)
// First check if player is in top 100k (exact tracking)
rank, err := r.client.ZRevRank(ctx, key, fmt.Sprintf("player:%d", playerID)).Result()
if err == nil && rank < 100000 {
return rank + 1, nil
}
// For players outside top 100k, use approximate count
// Count players with score > current player's score
count, err := r.client.ZCount(ctx, key,
fmt.Sprintf("(%f", score), "+inf").Result()
if err != nil {
return 0, err
}
// Add 1 to convert to 1-based rank
return count + 1, nil
}
Optimized Storage Strategy:
// Only store top 1M players in sorted set
// Periodically trim the sorted set
func (r *RedisLeaderboard) TrimLeaderboard(ctx context.Context,
leaderboardID string) error {
key := fmt.Sprintf("leaderboard:%s", leaderboardID)
// Keep only top 1 million players
return r.client.ZRemRangeByRank(ctx, key, 0, -1000001).Err()
}
Trade-offs:
- Top players get exact ranks (important for competitive players)
- Casual players get approximate ranks (acceptable accuracy: ±0.1%)
- Reduces Redis memory usage by 80%
- Maintains sub-100ms query performance
3.5 Percentile Calculations
Challenge: Calculate “You’re in the top 5%” efficiently without scanning entire leaderboard.
Solution: Pre-computed Percentile Buckets
type PercentileCalculator struct {
redis *redis.ClusterClient
}
func (p *PercentileCalculator) CalculatePercentile(ctx context.Context,
leaderboardID string, playerID int64) (float64, error) {
// Get player's rank
rank, err := p.getRank(ctx, leaderboardID, playerID)
if err != nil {
return 0, err
}
// Get total player count
key := fmt.Sprintf("leaderboard:%s", leaderboardID)
totalPlayers, err := p.redis.ZCard(ctx, key).Result()
if err != nil {
return 0, err
}
// Calculate percentile
percentile := (1 - float64(rank-1)/float64(totalPlayers)) * 100
return percentile, nil
}
// Cache percentile thresholds for fast lookups
func (p *PercentileCalculator) GetPercentileThresholds(ctx context.Context,
	leaderboardID string) (map[int]float64, error) {
	key := fmt.Sprintf("leaderboard:%s", leaderboardID)
	totalPlayers, _ := p.redis.ZCard(ctx, key).Result()
	thresholds := make(map[int]float64)
	percentiles := []int{1, 5, 10, 25, 50, 75, 90, 95, 99}
	for _, percentile := range percentiles {
		// Calculate rank for this percentile
		rank := int64(float64(totalPlayers) * (1 - float64(percentile)/100))
		// Get score at this rank
		results, _ := p.redis.ZRevRange(ctx, key, rank, rank).Result()
		if len(results) > 0 {
			score, _ := p.redis.ZScore(ctx, key, results[0]).Result()
			thresholds[percentile] = score
		}
	}
	// Cache thresholds for 5 minutes (Redis stores strings, so serialize first)
	data, _ := json.Marshal(thresholds)
	cacheKey := fmt.Sprintf("percentile_thresholds:%s", leaderboardID)
	p.redis.Set(ctx, cacheKey, data, 5*time.Minute)
	return thresholds, nil
}
3.6 Rank History Tracking
Challenge: Track historical rank changes for player profiles and analytics.
Solution: TimescaleDB for Time-Series Data
Schema Design:
-- Create hypertable for time-series rank data
CREATE TABLE rank_history (
time TIMESTAMPTZ NOT NULL,
player_id BIGINT NOT NULL,
game_mode VARCHAR(50) NOT NULL,
leaderboard_type VARCHAR(20) NOT NULL,
rank INTEGER NOT NULL,
score DOUBLE PRECISION NOT NULL,
percentile DECIMAL(5,2)
);
-- Convert to hypertable (TimescaleDB)
SELECT create_hypertable('rank_history', 'time');
-- Create indexes
CREATE INDEX idx_rank_history_player ON rank_history (player_id, time DESC);
CREATE INDEX idx_rank_history_game_mode ON rank_history (game_mode, time DESC);
-- Continuous aggregate for daily summaries
CREATE MATERIALIZED VIEW rank_history_daily
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 day', time) AS day,
player_id,
game_mode,
leaderboard_type,
FIRST(rank, time) AS starting_rank,
LAST(rank, time) AS ending_rank,
MIN(rank) AS best_rank,
MAX(rank) AS worst_rank,
AVG(rank) AS avg_rank
FROM rank_history
GROUP BY day, player_id, game_mode, leaderboard_type;
Recording Rank Changes:
func (s *RankHistoryService) RecordRankChange(ctx context.Context,
playerID int64, gameMode string, newRank int64, score float64) error {
// Only record if rank changed significantly (±10 ranks or better than top 1000)
lastRank := s.getLastRecordedRank(playerID, gameMode)
if math.Abs(float64(newRank-lastRank)) < 10 && newRank > 1000 {
return nil // Skip minor changes for players outside top 1000
}
percentile := s.calculatePercentile(gameMode, newRank)
_, err := s.db.ExecContext(ctx, `
INSERT INTO rank_history
(time, player_id, game_mode, leaderboard_type, rank, score, percentile)
VALUES (NOW(), $1, $2, 'global', $3, $4, $5)
`, playerID, gameMode, newRank, score, percentile)
return err
}
func (s *RankHistoryService) GetPlayerRankHistory(ctx context.Context,
playerID int64, gameMode string, days int) ([]RankHistoryEntry, error) {
rows, err := s.db.QueryContext(ctx, `
SELECT time, rank, score, percentile
FROM rank_history
WHERE player_id = $1
AND game_mode = $2
AND time > NOW() - INTERVAL '1 day' * $3
ORDER BY time ASC
`, playerID, gameMode, days)
if err != nil {
return nil, err
}
defer rows.Close()
var history []RankHistoryEntry
for rows.Next() {
var entry RankHistoryEntry
rows.Scan(&entry.Time, &entry.Rank, &entry.Score, &entry.Percentile)
history = append(history, entry)
}
return history, nil
}
3.7 Anti-Cheat Detection
Challenge: Prevent score manipulation and identify suspicious patterns.
Solution: Multi-Layer Anomaly Detection
Detection Strategies:
- Statistical Outlier Detection:
func (a *AntiCheatService) DetectStatisticalAnomalies(
	submission ScoreSubmission) (bool, string) {
	// Get player's historical scores
	avgScore := a.getPlayerAverageScore(submission.PlayerID, submission.GameMode)
	stdDev := a.getPlayerStdDev(submission.PlayerID, submission.GameMode)
	if stdDev == 0 {
		return false, "" // Too little history to establish a baseline
	}
	// Z-score: how many standard deviations from the player's own mean
	zScore := (submission.Score - avgScore) / stdDev
	// Flag if score is >4 standard deviations from mean
	if math.Abs(zScore) > 4 {
		return true, fmt.Sprintf("Score %.0f is %.2f std devs from mean %.0f",
			submission.Score, zScore, avgScore)
	}
	return false, ""
}
- Rate-Based Detection:
func (a *AntiCheatService) DetectRapidSubmissions(playerID int64) bool {
	ctx := context.Background()
	key := fmt.Sprintf("submissions:count:%d", playerID)
	// Increment first so the read and write cannot race across instances
	count, err := a.redis.Incr(ctx, key).Result()
	if err != nil {
		return false // Fail open on Redis errors
	}
	if count == 1 {
		// First submission starts the 5-minute window
		a.redis.Expire(ctx, key, 5*time.Minute)
	}
	// Flag if >30 submissions in 5 minutes
	return count > 30
}
- Impossible Score Detection:
func (a *AntiCheatService) ValidateScoreRange(
gameMode string, score float64, duration int) bool {
// Get theoretical max score for game mode
maxPossibleScore := a.getMaxScore(gameMode)
// Check if score exceeds maximum
if score > maxPossibleScore {
return false
}
// Check score vs. time ratio
maxScorePerSecond := a.getMaxScoreRate(gameMode)
if score/float64(duration) > maxScorePerSecond {
return false
}
return true
}
- Behavioral Analysis:
func (a *AntiCheatService) AnalyzeBehaviorPattern(
playerID int64, submission ScoreSubmission) (float64, error) {
// Get recent match metadata
recentMatches := a.getRecentMatches(playerID, 20)
features := map[string]float64{
"score_variance": calculateVariance(recentMatches),
"submission_interval": calculateAvgInterval(recentMatches),
"score_progression": calculateProgression(recentMatches),
"win_rate_anomaly": detectWinRateSpike(recentMatches),
}
// Use ML model to calculate cheat probability
cheatProbability := a.mlModel.Predict(features)
return cheatProbability, nil
}
Handling Flagged Submissions:
func (s *ScoreProcessor) ProcessWithCheatDetection(
submission ScoreSubmission) error {
isSuspicious, reason := s.antiCheat.Analyze(submission)
if isSuspicious {
// Store in separate review queue
s.quarantineScore(submission, reason)
// Update temporary "under review" leaderboard
s.redis.ZAdd(context.Background(),
"leaderboard:quarantine:"+submission.GameMode,
redis.Z{Score: submission.Score, Member: submission.PlayerID})
// Alert moderation team
s.alertModerators(submission, reason)
return nil
}
// Normal processing for clean submissions
return s.updateLeaderboards(submission)
}
3.8 Pagination of Large Leaderboards
Challenge: Efficiently paginate through millions of entries without performance degradation.
Solution: Cursor-Based Pagination
Implementation:
type PaginationCursor struct {
Rank int64
Score float64
LastPlayerID int64
}
func (l *LeaderboardService) GetLeaderboardPage(ctx context.Context,
leaderboardID string, cursor *PaginationCursor, limit int)
(*LeaderboardPage, error) {
key := fmt.Sprintf("leaderboard:%s", leaderboardID)
var entries []LeaderboardEntry
var err error
if cursor == nil {
// First page - start from top
entries, err = l.getTopN(ctx, key, limit)
} else {
// Subsequent page - continue from cursor
entries, err = l.getAfterCursor(ctx, key, cursor, limit)
}
if err != nil {
return nil, err
}
// Generate cursor for next page
var nextCursor *PaginationCursor
if len(entries) == limit {
lastEntry := entries[len(entries)-1]
nextCursor = &PaginationCursor{
Rank: lastEntry.Rank,
Score: lastEntry.Score,
LastPlayerID: lastEntry.PlayerID,
}
}
return &LeaderboardPage{
Entries: entries,
NextCursor: nextCursor,
HasMore: nextCursor != nil,
}, nil
}
func (l *LeaderboardService) getAfterCursor(ctx context.Context,
	key string, cursor *PaginationCursor, limit int) ([]LeaderboardEntry, error) {
	// Use ZREVRANGEBYSCORE to continue from the cursor score.
	// Note: the exclusive bound skips players tied at cursor.Score; the
	// cursor carries LastPlayerID so tie-aware paging can instead resume
	// just past the last-seen member when exact tie handling matters.
	results, err := l.redis.ZRevRangeByScoreWithScores(ctx, key,
		&redis.ZRangeBy{
			Min:    "-inf",
			Max:    fmt.Sprintf("(%f", cursor.Score), // Exclusive
			Offset: 0,
			Count:  int64(limit),
		}).Result()
	if err != nil {
		return nil, err
	}
	entries := make([]LeaderboardEntry, len(results))
	for i, z := range results {
		entries[i] = LeaderboardEntry{
			Rank:     cursor.Rank + int64(i) + 1,
			PlayerID: extractPlayerID(z.Member.(string)),
			Score:    z.Score,
		}
	}
	return entries, nil
}
API Response with Cursor:
{
"entries": [...],
"pagination": {
"next_cursor": "eyJyYW5rIjoxMDAsInNjb3JlIjo4NTAwLCJsYXN0X3BsYXllcl9pZCI6MTIzNDU2fQ==",
"has_more": true,
"limit": 100
}
}
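The opaque cursor string is just the cursor struct serialized to JSON and base64-encoded; a round-trip sketch (JSON field names are assumptions matching the example token above):

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// PaginationCursor mirrors the struct from the service code; it is
// JSON-serialized and base64-encoded so clients can treat it as opaque.
type PaginationCursor struct {
	Rank         int64   `json:"rank"`
	Score        float64 `json:"score"`
	LastPlayerID int64   `json:"last_player_id"`
}

func encodeCursor(c PaginationCursor) (string, error) {
	b, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	return base64.URLEncoding.EncodeToString(b), nil
}

func decodeCursor(s string) (PaginationCursor, error) {
	var c PaginationCursor
	b, err := base64.URLEncoding.DecodeString(s)
	if err != nil {
		return c, err
	}
	return c, json.Unmarshal(b, &c)
}

func main() {
	token, _ := encodeCursor(PaginationCursor{Rank: 100, Score: 8500, LastPlayerID: 123456})
	back, _ := decodeCursor(token)
	fmt.Println(back.Rank, back.Score, back.LastPlayerID) // 100 8500 123456
}
```

Because the server echoes only data the client already saw, a tampered cursor can at worst re-request a different page; signing the token is an option if that matters.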
Client Usage:
async function loadLeaderboard() {
let cursor = null;
let allEntries = [];
do {
const response = await fetch(
`/api/v1/leaderboards/ranked_solo/top?limit=100${cursor ? `&cursor=${cursor}` : ''}`
);
const data = await response.json();
allEntries.push(...data.entries);
cursor = data.pagination.next_cursor;
} while (cursor && allEntries.length < 1000);
return allEntries;
}
Step 4: Additional Considerations
4.1 Database Schema (PostgreSQL)
Persistent Storage for Durability:
-- Player master table
CREATE TABLE players (
player_id BIGSERIAL PRIMARY KEY,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW(),
last_active TIMESTAMPTZ
);
CREATE INDEX idx_players_username ON players(username);
-- Score submissions (immutable log)
CREATE TABLE score_submissions (
submission_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
player_id BIGINT NOT NULL REFERENCES players(player_id),
game_mode VARCHAR(50) NOT NULL,
score DOUBLE PRECISION NOT NULL,
match_id VARCHAR(100),
submitted_at TIMESTAMPTZ DEFAULT NOW(),
metadata JSONB,
is_verified BOOLEAN DEFAULT FALSE,
is_flagged BOOLEAN DEFAULT FALSE
);
CREATE INDEX idx_scores_player ON score_submissions(player_id, submitted_at DESC);
CREATE INDEX idx_scores_game_mode ON score_submissions(game_mode, submitted_at DESC);
-- Partitioning by month for scalability
CREATE TABLE score_submissions_2025_01 PARTITION OF score_submissions
FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
-- Leaderboard snapshots (daily/weekly/monthly)
CREATE TABLE leaderboard_snapshots (
snapshot_id BIGSERIAL PRIMARY KEY,
snapshot_date DATE NOT NULL,
game_mode VARCHAR(50) NOT NULL,
leaderboard_type VARCHAR(20) NOT NULL,
rank INTEGER NOT NULL,
player_id BIGINT NOT NULL REFERENCES players(player_id),
score DOUBLE PRECISION NOT NULL,
percentile DECIMAL(5,2)
);
CREATE INDEX idx_snapshots_date ON leaderboard_snapshots(snapshot_date, game_mode);
CREATE INDEX idx_snapshots_player ON leaderboard_snapshots(player_id, snapshot_date DESC);
4.2 Caching Strategy
Multi-Layer Caching:
- Redis (L1 Cache): Hot leaderboard data, player ranks
- Application Cache (L2): Leaderboard metadata, thresholds
- CDN (L3): Static leaderboard pages for anonymous users
Cache Invalidation:
func (s *CacheService) InvalidateLeaderboardCache(gameMode, timeWindow string) {
keys := []string{
fmt.Sprintf("leaderboard:top100:%s:%s", gameMode, timeWindow),
fmt.Sprintf("leaderboard:thresholds:%s:%s", gameMode, timeWindow),
fmt.Sprintf("leaderboard:count:%s:%s", gameMode, timeWindow),
}
for _, key := range keys {
s.redis.Del(context.Background(), key)
}
}
4.3 Monitoring and Observability
Key Metrics:
Service Health:
- Score submission rate (writes/sec)
- Leaderboard query rate (reads/sec)
- API latency (p50, p95, p99)
- Error rate by endpoint
Data Quality:
- Flagged submissions rate
- Score distribution anomalies
- Leaderboard consistency checks
Infrastructure:
- Redis memory usage
- Kafka consumer lag
- Database connection pool utilization
- Cache hit/miss ratio
Alerting Rules:
alerts:
- name: HighScoreSubmissionLatency
condition: p99_latency > 500ms
severity: warning
- name: KafkaConsumerLag
condition: lag > 100000 messages
severity: critical
- name: RedisMemoryHigh
condition: memory_usage > 90%
severity: warning
- name: AnomalousScoreRate
condition: flagged_rate > 5%
severity: critical
4.4 Disaster Recovery
Backup Strategy:
Redis:
- RDB snapshots every 6 hours
- AOF with fsync every second
- Replicate to standby cluster in different region
PostgreSQL:
- Continuous archiving with WAL
- Daily full backups
- Point-in-time recovery capability
Kafka:
- Replication factor of 3
- Cross-datacenter replication
- 7-day message retention
Recovery Time Objectives:
- RPO (Recovery Point Objective): < 5 minutes
- RTO (Recovery Time Objective): < 15 minutes
- Data Loss Tolerance: < 0.01% of scores
Step 5: Wrap-Up
System Characteristics Summary
Performance:
- Sub-100ms read latency for leaderboard queries
- 20,000+ concurrent score updates per second
- O(log N) rank calculations using Redis sorted sets
- Efficient pagination for millions of entries
Scalability:
- Horizontal scaling via Redis Cluster (sharding)
- Kafka partitioning for parallel processing
- Separate time-based leaderboards to distribute load
- Approximate rankings for non-competitive players
Reliability:
- 99.99% availability through multi-region deployment
- Asynchronous processing prevents data loss
- Eventual consistency acceptable for leaderboards
- Comprehensive anti-cheat detection
Data Architecture:
- Redis for hot, real-time leaderboard data
- PostgreSQL for durable score storage
- TimescaleDB for historical rank tracking
- Kafka for reliable message delivery
Trade-offs Made
| Decision | Trade-off | Justification |
|---|---|---|
| Eventual consistency | Strong consistency | Acceptable for leaderboards; reduces latency |
| Approximate ranks for tail | Exact ranks for all | 80% memory savings; casual players don’t need precision |
| Async processing | Immediate rank updates | Prevents data loss; rank updates within 1-2 seconds acceptable |
| Limited history in Redis | Full historical data | Cost-effective; TimescaleDB stores long-term history |
| Cursor pagination | Offset pagination | Stable for large datasets; prevents page drift |
Future Enhancements
- Machine Learning for Cheat Detection: Train models on historical patterns
- Predictive Rankings: Estimate future rank based on trends
- Social Features: Friend leaderboards, team rankings
- Live Tournaments: Bracket-style competitions with real-time updates
- Reward System Integration: Automatic rewards for reaching milestones
- Multi-Region Leaderboards: Federated rankings across regions
- Seasonal Resets: Competitive seasons with ranking decay
Key Takeaways
This leaderboard system design demonstrates how to build a production-grade, high-performance ranking system capable of handling millions of concurrent users. The architecture leverages Redis sorted sets for O(log N) operations, Kafka for reliable asynchronous processing, and TimescaleDB for historical analytics. By implementing approximate rankings for the tail, multi-layer caching, and comprehensive anti-cheat mechanisms, the system achieves the perfect balance of performance, scalability, and fairness required for competitive gaming platforms.
The design patterns presented here—time-based leaderboards, cursor pagination, percentile calculations, and rank history tracking—are applicable beyond gaming to any application requiring real-time rankings, such as fitness apps, e-learning platforms, sales leaderboards, and social media engagement metrics.