Reducing Latency in Go APIs: Lessons from Production
Our API’s P99 latency was 800ms. Users were complaining. After two weeks of profiling and optimization, we got it under 100ms. Here’s exactly what we did.
Step 1: Measure Everything
Before touching code, we instrumented every layer:
```go
func (s *Service) GetOrder(ctx context.Context, id string) (*Order, error) {
	defer trackLatency("GetOrder", time.Now())

	order, err := s.cache.Get(ctx, id)
	if err == nil {
		cacheHits.Inc()
		return order, nil
	}
	cacheMisses.Inc()

	order, err = s.repo.GetOrder(ctx, id)
	if err != nil {
		return nil, err
	}
	// Best-effort write-back: a failed cache write shouldn't fail the request.
	s.cache.Set(ctx, id, order, 5*time.Minute)
	return order, nil
}
```
The breakdown revealed:
- 60% of time: database queries
- 25% of time: downstream HTTP calls
- 10% of time: JSON serialization
- 5% of time: application logic
Step 2: Fix the Database
The biggest offender was a missing composite index. One query was doing a sequential scan on a 50M-row table:
```sql
-- Before: 400ms (sequential scan)
SELECT * FROM orders
WHERE customer_id = $1 AND status = 'active'
ORDER BY created_at DESC
LIMIT 10;

-- After adding index: 2ms
CREATE INDEX CONCURRENTLY idx_orders_customer_active
ON orders (customer_id, created_at DESC)
WHERE status = 'active';
```
Partial indexes are criminally underused. If you always filter on status = 'active', index only those rows.
We also found N+1 queries hiding in a loop:
```go
// Before: 50 queries for 50 orders
for _, order := range orders {
	items, _ := repo.GetItemsByOrderID(ctx, order.ID)
	order.Items = items
}

// After: 1 query
itemsByOrder, _ := repo.GetItemsByOrderIDs(ctx, orderIDs)
for _, order := range orders {
	order.Items = itemsByOrder[order.ID]
}
```
This alone cut 300ms off the P99.
Step 3: Fix Downstream Calls
Our service called three downstream services sequentially, even though the calls were independent of one another:
```go
// Before: sequential, ~200ms total
user, _ := userService.Get(ctx, userID)        // 80ms
preferences, _ := prefService.Get(ctx, userID) // 70ms
history, _ := historyService.Get(ctx, userID)  // 50ms
```

```go
// After: parallel independent calls, ~80ms total
// (errgroup is golang.org/x/sync/errgroup)
g, ctx := errgroup.WithContext(ctx)

var (
	user        *User
	preferences *Preferences
	history     *History
)

g.Go(func() error {
	var err error
	user, err = userService.Get(ctx, userID)
	return err
})
g.Go(func() error {
	var err error
	preferences, err = prefService.Get(ctx, userID)
	return err
})
g.Go(func() error {
	var err error
	history, err = historyService.Get(ctx, userID)
	return err
})
if err := g.Wait(); err != nil {
	return nil, err
}
```
Parallel calls reduced the downstream time from 200ms to 80ms.
Step 4: Add Caching
For data that changes infrequently, cache aggressively:
```go
type TieredCache struct {
	local *lru.Cache    // L1: in-process, ~1ms
	redis *redis.Client // L2: network, ~5ms
}

func (c *TieredCache) Get(ctx context.Context, key string) ([]byte, error) {
	// L1: in-process
	if val, ok := c.local.Get(key); ok {
		return val.([]byte), nil
	}
	// L2: Redis; on a hit, backfill L1 for subsequent requests
	val, err := c.redis.Get(ctx, key).Bytes()
	if err == nil {
		c.local.Add(key, val)
		return val, nil
	}
	return nil, ErrCacheMiss
}
```
In-process caching eliminated 40% of Redis calls. For our most-hit endpoints, cache hit rate was 85%.
Step 5: Async What You Can
Some work doesn’t need to happen in the request path:
```go
// Before: send email synchronously (adds 100-500ms)
func (s *Service) CreateOrder(ctx context.Context, order Order) error {
	if err := s.repo.Create(ctx, order); err != nil {
		return err
	}
	return s.emailService.SendConfirmation(ctx, order) // Slow!
}

// After: publish event, handle email asynchronously
func (s *Service) CreateOrder(ctx context.Context, order Order) error {
	if err := s.repo.Create(ctx, order); err != nil {
		return err
	}
	s.events.Publish(ctx, OrderCreatedEvent{OrderID: order.ID})
	return nil // Return immediately
}
```
If the user doesn’t need to see the result in this response, don’t make them wait.
Results
| Metric | Before | After |
|---|---|---|
| P50 | 200ms | 25ms |
| P99 | 800ms | 95ms |
| P99.9 | 2.5s | 200ms |
The fixes weren’t exotic. Missing index, N+1 queries, sequential calls that should be parallel, missing cache, synchronous work that should be async. Boring fundamentals — dramatic results.