Keycloak Authentication & JWT Management #207

@jmgilman

Overview

Implement mTLS authentication with Keycloak and JWT token management. This includes establishing mutual TLS connections using certificates, exchanging certificates for JWT tokens, and managing token lifecycle with caching and automatic refresh.

Requirements

Module Structure

Implement the following modules:

internal/
├── auth/
│   ├── keycloak.go           # Keycloak mTLS authentication
│   └── jwt.go                # JWT token management

Keycloak mTLS Authentication

Implement Keycloak client in internal/auth/keycloak.go:

type KeycloakClient struct {
    baseURL       string
    realm         string
    clientID      string
    tokenEndpoint string
    httpClient    *http.Client
    logger        *slog.Logger
}

type TokenResponse struct {
    AccessToken  string `json:"access_token"`
    TokenType    string `json:"token_type"`
    ExpiresIn    int    `json:"expires_in"`
    RefreshToken string `json:"refresh_token,omitempty"`
    Scope        string `json:"scope,omitempty"`
}

// Core functions to implement:
func NewKeycloakClient(config *KeycloakConfig, cert tls.Certificate) (*KeycloakClient, error)
func (k *KeycloakClient) ExchangeCertificateForToken() (*TokenResponse, error)
func (k *KeycloakClient) ValidateConnection() error
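
As a rough illustration of the "parse and validate token response" step, the sketch below decodes the token endpoint's JSON body into TokenResponse and rejects responses without a usable token or expiry. The helper name parseTokenResponse and its validation rules are assumptions made for this example, not part of the specification.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// TokenResponse mirrors the struct defined in this issue.
type TokenResponse struct {
	AccessToken  string `json:"access_token"`
	TokenType    string `json:"token_type"`
	ExpiresIn    int    `json:"expires_in"`
	RefreshToken string `json:"refresh_token,omitempty"`
	Scope        string `json:"scope,omitempty"`
}

// parseTokenResponse is a hypothetical helper: it decodes the raw response
// body and checks the two fields the token manager depends on.
func parseTokenResponse(body []byte) (*TokenResponse, error) {
	var resp TokenResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode token response: %w", err)
	}
	if resp.AccessToken == "" {
		return nil, errors.New("token response has no access_token")
	}
	if resp.ExpiresIn <= 0 {
		return nil, errors.New("token response has no positive expires_in")
	}
	return &resp, nil
}
```

ExchangeCertificateForToken could call a helper like this after reading the HTTP response body, so that malformed responses are reported before anything is cached.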

mTLS Connection Setup

func (k *KeycloakClient) configureTLSClient(cert tls.Certificate) error {
    tlsConfig := &tls.Config{
        // Client certificate presented to Keycloak during the TLS handshake
        Certificates: []tls.Certificate{cert},
        MinVersion:   tls.VersionTLS13,
        // ClientAuth is a server-side setting with no effect on a client, so it is omitted
        // Additional security settings
    }

    k.httpClient = &http.Client{
        Transport: &http.Transport{
            TLSClientConfig: tlsConfig,
            // Connection pooling settings
        },
        Timeout: 30 * time.Second,
    }
    return nil
}

Certificate-to-JWT Exchange Flow

  1. Load current certificate (bootstrap or API-managed)
  2. Configure TLS client with certificate
  3. Connect to the Keycloak token endpoint
  4. Present certificate during TLS handshake
  5. Keycloak validates certificate against Server CA
  6. Send the token request and receive a JWT with service account permissions
  7. Cache token for future use

Request format:

POST /realms/main/protocol/openid-connect/token
Content-Type: application/x-www-form-urlencoded

grant_type=client_credentials&client_id=hetzner-machine
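
The request above could be built with the standard library roughly as follows. This is a sketch only; newTokenRequest is a hypothetical helper, and its parameters correspond to the base_url, token_endpoint, and client_id configuration values.

```go
package main

import (
	"net/http"
	"net/url"
	"strings"
)

// newTokenRequest is a hypothetical helper that builds the
// client_credentials request shown in the request format.
func newTokenRequest(baseURL, tokenEndpoint, clientID string) (*http.Request, error) {
	form := url.Values{}
	form.Set("grant_type", "client_credentials")
	form.Set("client_id", clientID)

	// The form is sent URL-encoded in the request body
	req, err := http.NewRequest(http.MethodPost, baseURL+tokenEndpoint, strings.NewReader(form.Encode()))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	return req, nil
}
```

No client secret appears in the form because the client is identified by the certificate presented during the TLS handshake.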

JWT Token Management

Implement token manager in internal/auth/jwt.go:

type JWTManager struct {
    keycloakClient *KeycloakClient
    currentToken   *Token
    mutex          sync.RWMutex
    logger         *slog.Logger
    refreshTimer   *time.Timer
    stopChan       chan struct{} // Closed by StopAutoRefresh to end the refresh loop
}

type Token struct {
    Value     string
    ExpiresAt time.Time
    IssuedAt  time.Time
    RefreshAt time.Time // 10 minutes before expiry
}

// Core functions to implement:
func (j *JWTManager) GetToken() (string, error)
func (j *JWTManager) RefreshToken() error
func (j *JWTManager) ScheduleRefresh(expiresIn int)
func (j *JWTManager) IsTokenValid() bool
func (j *JWTManager) StartAutoRefresh()
func (j *JWTManager) StopAutoRefresh()

Token Lifecycle Management

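func (j *JWTManager) GetToken() (string, error) {
    // Fast path: return the cached token under the read lock.
    // Expiry is checked inline so no other locking method runs while the lock is held.
    j.mutex.RLock()
    if j.currentToken != nil && time.Now().Before(j.currentToken.ExpiresAt) {
        value := j.currentToken.Value
        j.mutex.RUnlock()
        return value, nil
    }
    // Release the read lock explicitly; combining a deferred RUnlock with a
    // manual RUnlock would unlock twice and panic
    j.mutex.RUnlock()

    // Acquire write lock for refresh
    j.mutex.Lock()
    defer j.mutex.Unlock()

    // Double-check after acquiring write lock: another goroutine may have refreshed
    if j.currentToken != nil && time.Now().Before(j.currentToken.ExpiresAt) {
        return j.currentToken.Value, nil
    }

    // Refresh token (refreshTokenInternal runs with the write lock held)
    return j.refreshTokenInternal()
}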

Automatic Token Refresh

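func (j *JWTManager) StartAutoRefresh() {
    go func() {
        for {
            // Read RefreshAt under the read lock to avoid racing with a refresh
            j.mutex.RLock()
            hasToken := j.currentToken != nil
            refreshIn := time.Second // No token yet: re-check shortly instead of spinning
            if hasToken {
                refreshIn = time.Until(j.currentToken.RefreshAt)
            }
            j.mutex.RUnlock()

            j.refreshTimer = time.NewTimer(refreshIn)

            select {
            case <-j.refreshTimer.C:
                if !hasToken {
                    continue
                }
                if err := j.RefreshToken(); err != nil {
                    j.logger.Error("Failed to refresh JWT token", slog.Any("error", err))
                    // Retry with exponential backoff
                    j.scheduleRetry()
                }
            case <-j.stopChan: // chan struct{} field on JWTManager, closed by StopAutoRefresh
                j.refreshTimer.Stop()
                return
            }
        }
    }()
}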

Error Handling and Resilience

Implement robust error handling:

type AuthError struct {
    Type      AuthErrorType
    Message   string
    Cause     error
    Retryable bool
}

type AuthErrorType string

const (
    ErrCertificateRejected AuthErrorType = "CERTIFICATE_REJECTED"
    ErrTokenExchangeFailed AuthErrorType = "TOKEN_EXCHANGE_FAILED"
    ErrNetworkTimeout      AuthErrorType = "NETWORK_TIMEOUT"
    ErrInvalidResponse     AuthErrorType = "INVALID_RESPONSE"
)

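func (j *JWTManager) handleAuthError(err error) error {
    // Classify error type
    // Determine if retryable
    // Log with appropriate level
    // Return wrapped error with context
    return fmt.Errorf("authentication failed: %w", err) // Placeholder until classification is implemented
}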

Degraded Mode Support

Implement fallback behavior for network issues:

type DegradedModeManager struct {
    jwtManager     *JWTManager
    lastValidToken *Token
    degradedSince  *time.Time
}

func (d *DegradedModeManager) GetTokenWithFallback() (string, error) {
    // Try to get fresh token
    token, err := d.jwtManager.GetToken()
    if err == nil {
        d.exitDegradedMode()
        return token, nil
    }
    
    // Enter degraded mode if not already
    if d.degradedSince == nil {
        d.enterDegradedMode()
    }
    
    // Use cached token if available and not too old
    if d.lastValidToken != nil {
        // Allow expired tokens for 24h in degraded mode. Note that any service
        // validating the token's exp claim will still reject an expired token.
        gracePeriod := 24 * time.Hour
        if time.Since(d.lastValidToken.ExpiresAt) < gracePeriod {
            return d.lastValidToken.Value, nil
        }
    }
    
    return "", fmt.Errorf("no valid token available in degraded mode")
}

Audit Logging

Implement comprehensive audit logging:

type AuthAuditLogger struct {
    logger *slog.Logger
}

func (a *AuthAuditLogger) LogAuthenticationAttempt(identity string, success bool, details map[string]interface{}) {
    level := slog.LevelInfo
    if !success {
        level = slog.LevelError
    }
    
    a.logger.LogAttrs(context.Background(), level, "Authentication attempt",
        slog.String("event_type", "authentication_attempt"),
        slog.String("actor.identity", identity),
        slog.String("actor.auth_method", "mtls"),
        slog.Bool("outcome", success),
        slog.Any("details", details),
    )
}

func (a *AuthAuditLogger) LogTokenRefresh(success bool, expiresAt time.Time) {
    // Log token refresh events
}

func (a *AuthAuditLogger) LogDegradedMode(entering bool) {
    // Log degraded mode transitions
}

Configuration

Required configuration parameters:

keycloak:
  base_url: "https://keycloak.internal"
  realm: "main"
  client_id: "hetzner-machine"
  token_endpoint: "/realms/main/protocol/openid-connect/token"
  jwt_cache_duration: "50m" # Refresh JWT 10 minutes before expiry (assumes a 60-minute token lifetime)

Acceptance Criteria

  1. mTLS Authentication

    • Successfully establishes mTLS connection to Keycloak
    • Presents certificate during TLS handshake
    • Handles certificate validation errors appropriately
    • Supports both bootstrap and API-managed certificates
  2. Token Exchange

    • Exchanges certificate for JWT token successfully
    • Parses and validates token response
    • Extracts expiry information correctly
    • Handles exchange failures with proper errors
  3. Token Management

    • Caches tokens in memory
    • Automatically refreshes tokens 10 minutes before expiry
    • Thread-safe token access
    • No token persistence to disk
  4. Resilience

    • Operates in degraded mode during network issues
    • Uses cached tokens when Keycloak is unavailable
    • Implements exponential backoff for retries
    • Recovers automatically when connectivity is restored
  5. Security

    • Uses TLS 1.3 minimum
    • Validates server certificates
    • Never logs tokens or sensitive data
    • Audit logs all authentication events

Testing Requirements

  • Unit tests for token parsing and validation
  • Unit tests for cache management
  • Mock tests for Keycloak interaction
  • Test automatic refresh scheduling
  • Test degraded mode operation
  • Test concurrent token access
  • Integration tests with test Keycloak instance
  • Test certificate rejection scenarios
  • Test network timeout handling

Dependencies

  • Standard Go crypto/tls package
  • Standard net/http package
  • JWT parsing library (e.g., github.com/golang-jwt/jwt/v5)
  • No direct Keycloak SDK required (use standard OIDC flow)
