Performance Guide¶
This guide covers performance tuning and optimization for the HAProxy Template Ingress Controller.
Overview¶
Performance optimization involves three areas:
- Controller performance - Template rendering, reconciliation cycles
- HAProxy performance - Load balancer throughput and latency
- Kubernetes integration - Resource watching and event handling
Controller Resource Sizing¶
Recommended Resources¶
| Deployment Size | CPU Request | CPU Limit | Memory Request | Memory Limit |
|---|---|---|---|---|
| Small (<50 Ingresses) | 50m | 200m | 64Mi | 256Mi |
| Medium (50-200 Ingresses) | 100m | 500m | 128Mi | 512Mi |
| Large (200+ Ingresses) | 200m | 1000m | 256Mi | 1Gi |
Configure via Helm values:
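For example, the medium profile expressed as Helm values (a sketch; the exact value keys depend on the chart version you use):
# values.yaml - illustrative key names
controller:
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 500m
      memory: 512Mi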
Memory Considerations¶
Memory usage scales with:
- Number of watched resources (Ingresses, Services, Endpoints)
- Size of template library
- Event buffer size (default 1000 events)
- Number of HAProxy pods being managed
Monitor memory usage:
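For example, with standard cAdvisor metrics or a quick kubectl check (the pod name pattern and label are assumptions; adjust them to your release):
# Controller working set memory (Prometheus)
container_memory_working_set_bytes{pod=~"haptic-controller-.*"}
# Quick check without Prometheus
kubectl top pod -l app.kubernetes.io/name=haptic-controller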
CPU Considerations¶
CPU spikes occur during:
- Template rendering (complex templates with many resources)
- Initial resource synchronization (startup)
- Burst of resource changes (rolling updates)
Monitor CPU usage:
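For example (same caveat about the pod selector being an assumption):
# Controller CPU usage in cores
rate(container_cpu_usage_seconds_total{pod=~"haptic-controller-.*"}[5m])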
Reconciliation Tuning¶
Debounce Interval¶
The controller debounces resource changes to avoid excessive reconciliation:
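As an illustration only - the field names below are assumptions, not the controller's documented schema:
spec:
  reconciliation:
    debounceInterval: 500ms  # assumed field name; default value shown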
Tuning guidelines:
- Lower (100-300ms): Faster response to changes, higher CPU usage
- Default (500ms): Balanced for most workloads
- Higher (1-5s): Better for high-churn environments with many changes
Reconciliation Metrics¶
Monitor reconciliation performance:
# Average reconciliation duration
rate(haptic_reconciliation_duration_seconds_sum[5m]) /
rate(haptic_reconciliation_duration_seconds_count[5m])
# Reconciliation rate
rate(haptic_reconciliation_total[5m])
# P95 reconciliation latency
histogram_quantile(0.95, rate(haptic_reconciliation_duration_seconds_bucket[5m]))
Target metrics:
- Average reconciliation: <500ms
- P95 reconciliation: <2s
- Error rate: <1%
Template Optimization¶
Efficient Template Patterns¶
Use early filtering:
{#- GOOD: Filter early, process less data -#}
{%- var matching_ingresses = []any{} %}
{%- for _, ingress := range resources.ingresses.List() %}
{%- if ingress.spec.ingressClassName == "haproxy" %}
{%- matching_ingresses = append(matching_ingresses, ingress) %}
{%- end %}
{%- end %}
{%- for _, ingress := range matching_ingresses %}
...
{%- end %}
{#- ALTERNATIVE: Process with inline filtering -#}
{%- for _, ingress := range resources.ingresses.List() %}
{%- if ingress.spec.ingressClassName == "haproxy" %}
...
{%- end %}
{%- end %}
Use caching for expensive operations:
{%- if !has_cached("sorted_routes") %}
{#- Expensive computation only runs once per render -#}
{%- var sorted_routes = []any{} %}
{%- for _, route := range resources.httproutes.List() %}
{%- sorted_routes = append(sorted_routes, route) %}
{%- end %}
{%- set_cached("sorted_routes", sorted_routes) %}
{%- end %}
{%- var analysis_routes = get_cached("sorted_routes") %}
Avoid nested loops when possible:
{#- AVOID: O(n*m) complexity -#}
{%- for _, ingress := range ingresses %}
{%- for _, service := range services %}
{%- if ingress.spec.backend.service.name == service.metadata.name %}
...
{%- end %}
{%- end %}
{%- end %}
{#- BETTER: Use indexing or filtering -#}
{%- var service_map = map[string]any{} %}
{%- for _, service := range services %}
{%- service_map[service.metadata.name] = service %}
{%- end %}
{%- for _, ingress := range ingresses %}
{%- var service = service_map[ingress.spec.backend.service.name] %}
...
{%- end %}
Template Debugging¶
Profile template rendering:
# Enable template tracing
./bin/haptic-controller validate -f config.yaml --trace
# View trace output
cat /tmp/template-trace.log
HAProxy Optimization¶
Configuration Parameters¶
Key HAProxy parameters for performance:
global
maxconn {{ fallback(controller.config.haproxy.maxconn, 2000) }}
nbthread {{ fallback(controller.config.haproxy.nbthread, 4) }}
tune.bufsize {{ fallback(controller.config.haproxy.bufsize, 16384) }}
tune.ssl.default-dh-param 2048
defaults
timeout connect 5s
timeout client 50s
timeout server 50s
timeout http-request 10s
timeout queue 60s
Connection Limits¶
Calculate maxconn based on expected load:
Example:
- Expected: 10,000 concurrent connections
- Safety factor: 1.5
- HAProxy pods: 3
- maxconn = (10,000 * 1.5) / 3 = 5,000
Thread Configuration¶
Match nbthread to available CPU cores:
# HAProxy pod resources
resources:
requests:
cpu: 2
limits:
cpu: 4
# HAProxy config
global
nbthread 4 # Match CPU limit
Buffer Sizing¶
Increase buffers for large headers or payloads:
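For example, in the global section (these are standard HAProxy tunables; the values shown are illustrative starting points, not universal recommendations):
global
    tune.bufsize 32768       # default 16384; room for larger headers/payloads per buffer
    tune.maxrewrite 2048     # headroom reserved in each buffer for header rewrites
    tune.http.maxhdr 128     # maximum number of headers per request (default 101)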
Password Hash Performance¶
HAProxy validates password hash formats during configuration parsing by running the full hashing algorithm. This can significantly slow down config validation when using expensive hash algorithms.
Hash algorithm validation times:
| Algorithm | Example | Time per hash |
|---|---|---|
| MD5 | `$1$salt$hash` | ~0.004ms |
| SHA-256 | `$5$salt$hash` | ~3ms |
| SHA-512 | `$6$salt$hash` | ~3ms |
| bcrypt (cost 10) | `$2y$10$salt$hash` | ~85ms |
bcrypt with high cost factors is expensive
A configuration with 200 bcrypt passwords at cost factor 10 adds ~17 seconds to every config validation. This directly impacts reconciliation time and webhook validation latency.
Recommendations:
- Prefer SHA-512 (`$6$`) for password hashes - cryptographically strong with fast validation
- Avoid bcrypt cost factors above 8 in high-frequency validation scenarios
- Consolidate userlists to avoid duplicate password entries - HAProxy validates each occurrence separately, even for identical hashes
- Consider external authentication (OAuth, OIDC) for large user bases instead of embedding passwords in config
Checking your config:
# Count expensive bcrypt hashes
BCRYPT_COUNT=$(grep -c '\$2[aby]\$' /path/to/haproxy.cfg)
# Estimate validation overhead (bcrypt count × 85ms)
echo "~$(( BCRYPT_COUNT * 85 ))ms of validation overhead"
Scaling Strategies¶
Horizontal Scaling¶
Scale HAProxy pods for increased traffic:
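For example (the Deployment name is an assumption; use your HAProxy workload's actual name):
# Scale the HAProxy data plane to 5 replicas
kubectl scale deployment haproxy --replicas=5
A HorizontalPodAutoscaler can do the same automatically based on CPU or request metrics.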
The controller automatically discovers new pods and deploys configuration.
Controller Scaling (HA Mode)¶
For high availability, run multiple controller replicas:
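A sketch, assuming the chart exposes a standard replica count value (the key name may differ):
# values.yaml
controller:
  replicaCount: 2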
Only the leader performs deployments; followers maintain hot-standby status.
Resource Watching Optimization¶
Reduce watched resources to minimize controller load:
# Only watch specific namespaces
spec:
watchedResources:
ingresses:
apiVersion: networking.k8s.io/v1
resources: ingresses
namespaceSelector:
matchNames:
- production
- staging
# Use label selectors
spec:
watchedResources:
ingresses:
apiVersion: networking.k8s.io/v1
resources: ingresses
labelSelector:
matchLabels:
managed-by: haptic
Deployment Performance¶
Deployment Latency¶
Monitor deployment time:
# Average deployment duration
rate(haptic_deployment_duration_seconds_sum[5m]) /
rate(haptic_deployment_duration_seconds_count[5m])
# P95 deployment latency
histogram_quantile(0.95, rate(haptic_deployment_duration_seconds_bucket[5m]))
Target metrics:
- Average deployment: <1s per HAProxy pod
- P95 deployment: <3s
Parallel Deployment¶
The controller deploys to multiple HAProxy pods in parallel. If deployment is slow:
- Check DataPlane API responsiveness
- Verify network connectivity to HAProxy pods
- Consider reducing config complexity
Drift Prevention¶
Configure drift prevention to avoid unnecessary deployments:
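For illustration only - the field names below are assumptions used to show the idea, not the controller's actual schema:
spec:
  driftPrevention:
    enabled: true
    auditInterval: 60s  # assumed field: how often deployed config is compared with desired state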
Event Processing¶
Event Buffer Sizing¶
The controller maintains event buffers for debugging:
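For illustration (the field name is an assumption; the default of 1000 events is noted above):
spec:
  events:
    bufferSize: 1000  # assumed field name; increase only if you need longer event history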
Increase for high-throughput environments if you need more event history.
Subscriber Performance¶
Monitor event subscriber health:
# Event publishing rate
rate(haptic_events_published_total[5m])
# Subscriber count (should be constant)
haptic_event_subscribers
If subscriber count drops, components may be failing.
Profiling¶
Go Profiling¶
Access pprof endpoints for profiling:
# CPU profile (30 seconds)
curl "http://localhost:6060/debug/pprof/profile?seconds=30" > cpu.pprof
go tool pprof -http=:8080 cpu.pprof
# Memory profile
curl http://localhost:6060/debug/pprof/heap > heap.pprof
go tool pprof -http=:8080 heap.pprof
# Goroutine dump
curl "http://localhost:6060/debug/pprof/goroutine?debug=1"
Profile-Guided Optimization (PGO)¶
The controller is built with Go's Profile-Guided Optimization (PGO) for improved performance. PGO typically provides 2-7% CPU improvement by optimizing frequently-called functions.
How it works:
A baseline CPU profile (cmd/controller/default.pgo) is committed to the repository. Go automatically uses this profile during builds to optimize hot paths.
Updating the profile:
To collect a fresh profile from the development environment:
1. Start the dev environment.
2. Port-forward to the controller's debug port.
3. Generate workload (trigger reconciliation by modifying resources).
4. Collect a 30-second CPU profile.
5. Rebuild with the new profile (see the command sketch below).
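A sketch of steps 2-5, assuming the controller runs as deploy/haptic-controller in the haptic-system namespace and exposes pprof on port 6060 as in the profiling section above:
# Port-forward the debug port (deployment name and namespace are assumptions)
kubectl -n haptic-system port-forward deploy/haptic-controller 6060:6060 &
# Collect a 30-second CPU profile straight into the committed profile path
curl "http://localhost:6060/debug/pprof/profile?seconds=30" > cmd/controller/default.pgo
# Rebuild; Go's default -pgo=auto picks up default.pgo in the main package directory
go build ./cmd/controller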
Production profiles:
For optimal results, collect profiles from production during representative workloads. Merge multiple profiles for broader coverage:
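go tool pprof can merge several profiles into a single file suitable for PGO:
# Merge CPU profiles collected at different times into the committed profile
go tool pprof -proto cpu-1.pprof cpu-2.pprof cpu-3.pprof > cmd/controller/default.pgo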
Common Performance Issues¶
High memory usage:
- Check for memory leaks: growing heap over time
- Reduce event buffer size
- Limit watched resources
High CPU usage:
- Profile to find hot spots
- Optimize template complexity
- Increase debounce interval
Slow deployments:
- Check DataPlane API health
- Verify network latency to HAProxy pods
- Consider reducing config size
Performance Checklist¶
Initial Deployment¶
- [ ] Set appropriate resource requests/limits
- [ ] Configure debounce interval for workload
- [ ] Set HAProxy maxconn based on expected load
- [ ] Match nbthread to CPU allocation
Ongoing Optimization¶
- [ ] Monitor reconciliation latency
- [ ] Monitor deployment latency
- [ ] Watch for memory growth
- [ ] Track event subscriber count
High-Load Environments¶
- [ ] Scale HAProxy pods horizontally
- [ ] Enable HA mode for controller
- [ ] Limit watched namespaces
- [ ] Use label selectors to filter resources
- [ ] Profile and optimize templates
See Also¶
- Monitoring Guide - Performance metrics and alerting
- High Availability - HA deployment patterns
- Debugging Guide - Performance troubleshooting