Avoiding Goroutine Leaks in Long-Running Go Services

Goroutine leaks are the memory leaks of Go. They’re silent, gradual, and eventually fatal. A leaked goroutine holds its stack memory, every object it references, and a slot in the scheduler for the life of the process.

How Leaks Happen

Nearly every goroutine leak has the same root cause: a goroutine is blocked on an operation that will never complete, or it loops with no exit condition.

Leak #1: Unbuffered channel with no receiver

func process() {
    ch := make(chan Result)
    go func() {
        result := expensiveWork()
        ch <- result // Blocks forever if nobody reads from ch
    }()

    // Function returns without reading from ch
    // The goroutine is leaked
}

Fix: give the channel a buffer so the send can complete without a receiver, or make sure the caller always reads from it:

func process() {
    ch := make(chan Result, 1) // Buffer of 1
    go func() {
        ch <- expensiveWork() // Never blocks
    }()
    // Even if we don't read, the goroutine completes
}

Leak #2: Missing context cancellation

func watchForUpdates(userID string) {
    ctx := context.Background() // Never cancelled!
    go func() {
        for {
            select {
            case <-ctx.Done():
                return
            case <-time.After(time.Second):
                checkUpdates(userID)
            }
        }
    }()
    // Goroutine runs forever
}

Fix: pass a cancellable context:

func watchForUpdates(ctx context.Context, userID string) {
    go func() {
        ticker := time.NewTicker(time.Second)
        defer ticker.Stop()
        for {
            select {
            case <-ctx.Done():
                return
            case <-ticker.C:
                checkUpdates(userID)
            }
        }
    }()
}

Leak #3: Blocked HTTP request

func fetchData(url string) {
    go func() {
        // No timeout — if the server hangs, this goroutine hangs forever
        resp, err := http.Get(url)
        if err != nil {
            return
        }
        defer resp.Body.Close()
        process(resp)
    }()
}

Fix: always use a context with a timeout:

func fetchData(ctx context.Context, url string) {
    go func() {
        ctx, cancel := context.WithTimeout(ctx, 10*time.Second)
        defer cancel()

        req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
        if err != nil {
            return // malformed URL; there is nothing to wait on
        }
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            return
        }
        defer resp.Body.Close()
        process(resp)
    }()
}

Leak #4: Ticker not stopped

func monitor() {
    ticker := time.NewTicker(time.Second)
    // ticker.Stop() never called
    for range ticker.C {
        collectMetrics()
    }
}

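time.NewTicker does not start a goroutine of its own, but before Go 1.23 a ticker that was never stopped could not be garbage collected. The bigger problem here is that the range loop has no exit condition, so whichever goroutine runs monitor is stuck for the life of the process. Always defer ticker.Stop(), and give the loop a way out such as ctx.Done().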

Detection in Tests

Use goleak to catch leaks in tests:

import (
    "context"
    "testing"

    "go.uber.org/goleak"
)

func TestMain(m *testing.M) {
    goleak.VerifyTestMain(m)
}

// Or per-test:
func TestProcessOrder(t *testing.T) {
    defer goleak.VerifyNone(t)

    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    processOrder(ctx, testOrder)
}

If any unexpected goroutine is still running when the test finishes, goleak fails the test and prints that goroutine's stack trace.

Runtime Detection

Monitor goroutine count in production:

func monitorGoroutines(ctx context.Context) {
    ticker := time.NewTicker(10 * time.Second)
    defer ticker.Stop()

    lastCount := runtime.NumGoroutine() // baseline, so the first tick is not reported as a spike
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            count := runtime.NumGoroutine()
            goroutineGauge.Set(float64(count))

            if count > lastCount+100 {
                slog.Warn("goroutine count spike",
                    "current", count,
                    "previous", lastCount,
                )
            }
            lastCount = count
        }
    }
}

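The log line above only catches sudden jumps; a slow leak shows up in the gauge instead. A healthy service's goroutine count rises and falls with load but returns to a stable baseline, so alert when it trends upward over hours or days.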

Prevention Rules

  1. Every goroutine must have an exit condition. Usually ctx.Done() or a channel close.
  2. Every channel send must have a matching receive (or a buffer large enough that the send cannot block).
  3. Every time.NewTicker needs a defer ticker.Stop().
  4. Every HTTP request needs a timeout.
  5. Every goroutine spawned in a request handler must be tied to the request context.
  6. Use goleak in tests.

Goroutine leaks are preventable. Follow these rules, and you are far less likely to be woken up by an OOM kill caused by goroutines that leaked steadily for three days.