Performance Guide

This guide covers performance tuning and optimization for the HAProxy Template Ingress Controller.

Overview

Performance optimization involves three areas:

  • Controller performance - Template rendering, reconciliation cycles
  • HAProxy performance - Load balancer throughput and latency
  • Kubernetes integration - Resource watching and event handling

Controller Resource Sizing

Deployment Size             CPU Request   CPU Limit   Memory Request   Memory Limit
Small (<50 Ingresses)       50m           200m        64Mi             256Mi
Medium (50-200 Ingresses)   100m          500m        128Mi            512Mi
Large (200+ Ingresses)      200m          1000m       256Mi            1Gi

Configure via Helm values:

# values.yaml
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi

Memory Considerations

Memory usage scales with:

  • Number of watched resources (Ingresses, Services, Endpoints)
  • Size of template library
  • Event buffer size (default 1000 events)
  • Number of HAProxy pods being managed

Monitor memory usage:

container_memory_working_set_bytes{container="haptic"}

CPU Considerations

CPU spikes occur during:

  • Template rendering (complex templates with many resources)
  • Initial resource synchronization (startup)
  • Bursts of resource changes (rolling updates)

Monitor CPU usage:

rate(container_cpu_usage_seconds_total{container="haptic"}[5m])
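
Both queries can also back alert rules. A minimal PromQL sketch, assuming kube-state-metrics is installed so the resource-limit metrics exist; the 80% thresholds are illustrative choices, not project recommendations:

# Working set above 80% of the configured memory limit
container_memory_working_set_bytes{container="haptic"}
  / on(namespace, pod, container)
    kube_pod_container_resource_limits{container="haptic", resource="memory"}
  > 0.8

# Sustained CPU usage above 80% of the configured CPU limit
rate(container_cpu_usage_seconds_total{container="haptic"}[5m])
  / on(namespace, pod, container)
    kube_pod_container_resource_limits{container="haptic", resource="cpu"}
  > 0.8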

Reconciliation Tuning

Debounce Interval

The controller debounces resource changes to avoid excessive reconciliation:

# HAProxyTemplateConfig CRD
spec:
  controller:
    reconciliation:
      debounceInterval: 500ms  # Default

Tuning guidelines:

  • Lower (100-300ms): Faster response to changes, higher CPU usage
  • Default (500ms): Balanced for most workloads
  • Higher (1-5s): Better for high-churn environments with many changes

Reconciliation Metrics

Monitor reconciliation performance:

# Average reconciliation duration
rate(haptic_reconciliation_duration_seconds_sum[5m]) /
rate(haptic_reconciliation_duration_seconds_count[5m])

# Reconciliation rate
rate(haptic_reconciliation_total[5m])

# P95 reconciliation latency
histogram_quantile(0.95, rate(haptic_reconciliation_duration_seconds_bucket[5m]))

Target metrics:

  • Average reconciliation: <500ms
  • P95 reconciliation: <2s
  • Error rate: <1%
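
The targets above translate directly into alert thresholds. A hedged PromQL sketch using only the metrics shown in this section:

# Average reconciliation slower than 500ms
(rate(haptic_reconciliation_duration_seconds_sum[5m]) /
 rate(haptic_reconciliation_duration_seconds_count[5m])) > 0.5

# P95 reconciliation slower than 2s
histogram_quantile(0.95, rate(haptic_reconciliation_duration_seconds_bucket[5m])) > 2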

Template Optimization

Efficient Template Patterns

Use early filtering:

{#- GOOD: Filter early, process less data -#}
{%- var matching_ingresses = []any{} %}
{%- for _, ingress := range resources.ingresses.List() %}
  {%- if ingress.spec.ingressClassName == "haproxy" %}
    {%- matching_ingresses = append(matching_ingresses, ingress) %}
  {%- end %}
{%- end %}
{%- for _, ingress := range matching_ingresses %}
  ...
{%- end %}

{#- ALTERNATIVE: Process with inline filtering -#}
{%- for _, ingress := range resources.ingresses.List() %}
  {%- if ingress.spec.ingressClassName == "haproxy" %}
    ...
  {%- end %}
{%- end %}

Use caching for expensive operations:

{%- if !has_cached("sorted_routes") %}
  {#- Expensive computation only runs once per render -#}
  {%- var sorted_routes = []any{} %}
  {%- for _, route := range resources.httproutes.List() %}
    {%- sorted_routes = append(sorted_routes, route) %}
  {%- end %}
  {%- set_cached("sorted_routes", sorted_routes) %}
{%- end %}
{%- var analysis_routes = get_cached("sorted_routes") %}

Avoid nested loops when possible:

{#- AVOID: O(n*m) complexity -#}
{%- for _, ingress := range ingresses %}
  {%- for _, service := range services %}
    {%- if ingress.spec.backend.service.name == service.metadata.name %}
      ...
    {%- end %}
  {%- end %}
{%- end %}

{#- BETTER: Use indexing or filtering -#}
{%- var service_map = map[string]any{} %}
{%- for _, service := range services %}
  {%- service_map[service.metadata.name] = service %}
{%- end %}
{%- for _, ingress := range ingresses %}
  {%- var service = service_map[ingress.spec.backend.service.name] %}
  ...
{%- end %}

Template Debugging

Profile template rendering:

# Enable template tracing
./bin/haptic-controller validate -f config.yaml --trace

# View trace output
cat /tmp/template-trace.log

HAProxy Optimization

Configuration Parameters

Key HAProxy parameters for performance:

global
    maxconn {{ fallback(controller.config.haproxy.maxconn, 2000) }}
    nbthread {{ fallback(controller.config.haproxy.nbthread, 4) }}
    tune.bufsize {{ fallback(controller.config.haproxy.bufsize, 16384) }}
    tune.ssl.default-dh-param 2048

defaults
    timeout connect 5s
    timeout client 50s
    timeout server 50s
    timeout http-request 10s
    timeout queue 60s

Connection Limits

Calculate maxconn based on expected load:

maxconn = (expected_concurrent_connections * safety_factor) / num_haproxy_pods

Example:

  • Expected: 10,000 concurrent connections
  • Safety factor: 1.5
  • HAProxy pods: 3
  • maxconn = (10,000 * 1.5) / 3 = 5,000
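
The same arithmetic as a quick shell check. The values are the example numbers above; the variables are purely illustrative:

# Illustrative only: expected connections, safety factor (in tenths), pod count
expected=10000
safety_tenths=15   # 1.5 expressed as an integer for shell arithmetic
pods=3
echo $(( expected * safety_tenths / 10 / pods ))   # prints 5000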

Thread Configuration

Match nbthread to available CPU cores:

# HAProxy pod resources
resources:
  requests:
    cpu: 2
  limits:
    cpu: 4

# HAProxy config
global
    nbthread 4  # Match CPU limit

Buffer Sizing

Increase buffers for large headers or payloads:

global
    tune.bufsize 32768        # 32KB for large headers
    tune.http.maxhdr 128      # Allow more headers

Password Hash Performance

HAProxy validates password hash formats during configuration parsing by running the full hashing algorithm. This can significantly slow down config validation when using expensive hash algorithms.

Hash algorithm validation times:

Algorithm          Example            Time per hash
MD5                $1$salt$hash       ~0.004ms
SHA-256            $5$salt$hash       ~3ms
SHA-512            $6$salt$hash       ~3ms
bcrypt (cost 10)   $2y$10$salt$hash   ~85ms

bcrypt with high cost factors is expensive

A configuration with 200 bcrypt passwords at cost factor 10 adds ~17 seconds to every config validation. This directly impacts reconciliation time and webhook validation latency.

Recommendations:

  • Prefer SHA-512 ($6$) for password hashes - cryptographically strong with fast validation
  • Avoid bcrypt cost factors above 8 in high-frequency validation scenarios
  • Consolidate userlists to avoid duplicate password entries - HAProxy validates each occurrence separately, even for identical hashes (see the sketch after this list)
  • Consider external authentication (OAuth, OIDC) for large user bases instead of embedding passwords in config
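
For the consolidation point, a single shared userlist referenced from several backends keeps each password hash down to one occurrence. A minimal HAProxy sketch; the userlist name, backend names, and hash are placeholders:

userlist shared_users
    user admin password $6$examplesalt$examplehash

backend api
    acl auth_ok http_auth(shared_users)
    http-request auth realm protected if !auth_ok

backend admin-ui
    acl auth_ok http_auth(shared_users)
    http-request auth realm protected if !auth_ok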

Checking your config:

# Count expensive bcrypt hashes
grep -c '\$2[aby]\$' /path/to/haproxy.cfg

# Estimate validation overhead in milliseconds (bcrypt count × 85ms)
echo "$(( $(grep -c '\$2[aby]\$' /path/to/haproxy.cfg) * 85 )) ms"

Scaling Strategies

Horizontal Scaling

Scale HAProxy pods for increased traffic:

kubectl scale deployment haproxy --replicas=5

The controller automatically discovers new pods and deploys configuration.
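
Scaling can also be automated. A minimal HorizontalPodAutoscaler sketch, assuming the HAProxy pods run as a Deployment named haproxy; the replica bounds and 70% CPU target are illustrative:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: haproxy
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: haproxy
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70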

Controller Scaling (HA Mode)

For high availability, run multiple controller replicas:

# values.yaml
replicaCount: 3

controller:
  config:
    controller:
      leader_election:
        enabled: true

Only the leader performs deployments; the followers remain on hot standby.

Resource Watching Optimization

Reduce watched resources to minimize controller load:

# Only watch specific namespaces
spec:
  watchedResources:
    ingresses:
      apiVersion: networking.k8s.io/v1
      resources: ingresses
      namespaceSelector:
        matchNames:
          - production
          - staging

# Use label selectors
spec:
  watchedResources:
    ingresses:
      apiVersion: networking.k8s.io/v1
      resources: ingresses
      labelSelector:
        matchLabels:
          managed-by: haptic

Deployment Performance

Deployment Latency

Monitor deployment time:

# Average deployment duration
rate(haptic_deployment_duration_seconds_sum[5m]) /
rate(haptic_deployment_duration_seconds_count[5m])

# P95 deployment latency
histogram_quantile(0.95, rate(haptic_deployment_duration_seconds_bucket[5m]))

Target metrics:

  • Average deployment: <1s per HAProxy pod
  • P95 deployment: <3s

Parallel Deployment

The controller deploys to multiple HAProxy pods in parallel. If deployment is slow:

  1. Check DataPlane API responsiveness (see the probe sketch after this list)
  2. Verify network connectivity to HAProxy pods
  3. Consider reducing config complexity
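
For the first check, a rough latency probe against the Data Plane API can be run from inside an HAProxy pod. This is a sketch only: the port (5555), the /v3/info path, and the credential variables are assumptions that depend on how your Data Plane API sidecar is configured.

# Hypothetical probe: measure total request time against the Data Plane API
kubectl exec deploy/haproxy -- \
  curl -s -o /dev/null -w '%{time_total}s\n' \
  -u "$DATAPLANE_USER:$DATAPLANE_PASS" \
  http://localhost:5555/v3/info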

Drift Prevention

Configure drift prevention to avoid unnecessary deployments:

spec:
  controller:
    deployment:
      driftPreventionInterval: 60s  # Check for drift every 60s

Event Processing

Event Buffer Sizing

The controller maintains event buffers for debugging:

spec:
  controller:
    eventBufferSize: 1000  # Default

Increase it in high-throughput environments if you need more event history.

Subscriber Performance

Monitor event subscriber health:

# Event publishing rate
rate(haptic_events_published_total[5m])

# Subscriber count (should be constant)
haptic_event_subscribers

If the subscriber count drops, one or more internal components may have stopped consuming events.
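
A simple PromQL sketch that flags such a drop; the one-hour baseline window is an arbitrary choice:

# Fewer subscribers now than at any point in the last hour
haptic_event_subscribers < max_over_time(haptic_event_subscribers[1h])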

Profiling

Go Profiling

Access pprof endpoints for profiling:

# CPU profile (30 seconds)
curl http://localhost:6060/debug/pprof/profile?seconds=30 > cpu.pprof
go tool pprof -http=:8080 cpu.pprof

# Memory profile
curl http://localhost:6060/debug/pprof/heap > heap.pprof
go tool pprof -http=:8080 heap.pprof

# Goroutine dump
curl http://localhost:6060/debug/pprof/goroutine?debug=1
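
When the controller runs in-cluster, the pprof endpoints usually have to be reached through a port-forward first. The namespace and deployment name below follow the examples later in this guide; port 6060 matches the queries above and may differ in your deployment:

kubectl -n haptic port-forward deploy/haptic-controller 6060:6060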

Profile-Guided Optimization (PGO)

The controller is built with Go's Profile-Guided Optimization (PGO) for improved performance. PGO typically provides a 2-7% CPU improvement by optimizing frequently called functions.

How it works:

A baseline CPU profile (cmd/controller/default.pgo) is committed to the repository. Go automatically uses this profile during builds to optimize hot paths.

Updating the profile:

To collect a fresh profile from the development environment:

  1. Start the dev environment:

    ./scripts/start-dev-env.sh
    
  2. Port-forward to the controller's debug port:

    kubectl -n haptic port-forward deploy/haptic-controller 8080:8080
    
  3. Generate workload (trigger reconciliation by modifying resources)

  4. Collect a 30-second CPU profile:

    make pgo-profile
    # Or manually:
    curl -o cmd/controller/default.pgo http://localhost:8080/debug/pprof/profile?seconds=30
    
  5. Rebuild with the new profile:

    make build
    

Production profiles:

For optimal results, collect profiles from production during representative workloads. Merge multiple profiles for broader coverage:

make pgo-merge PROFILES='profile1.pgo profile2.pgo'

Common Performance Issues

High memory usage:

  • Check for memory leaks: growing heap over time
  • Reduce event buffer size
  • Limit watched resources

High CPU usage:

  • Profile to find hot spots
  • Optimize template complexity
  • Increase debounce interval

Slow deployments:

  • Check DataPlane API health
  • Verify network latency to HAProxy pods
  • Consider reducing config size

Performance Checklist

Initial Deployment

  • [ ] Set appropriate resource requests/limits
  • [ ] Configure debounce interval for workload
  • [ ] Set HAProxy maxconn based on expected load
  • [ ] Match nbthread to CPU allocation

Ongoing Optimization

  • [ ] Monitor reconciliation latency
  • [ ] Monitor deployment latency
  • [ ] Watch for memory growth
  • [ ] Track event subscriber count

High-Load Environments

  • [ ] Scale HAProxy pods horizontally
  • [ ] Enable HA mode for controller
  • [ ] Limit watched namespaces
  • [ ] Use label selectors to filter resources
  • [ ] Profile and optimize templates

See Also