Google Cloud Storage Integration
- Introduction
- Architecture Overview
- Authentication Mechanism
- Configuration Parameters
- Object Naming Hierarchy
- Core Storage Operations
- Fallback Behavior and Synchronization
- Security Considerations
- Performance Optimization
- Troubleshooting Guide
- Implementation Examples
The Google Cloud Storage (GCS) integration in the Post-Quantum WebAuthn Platform provides a scalable, durable storage backend for WebAuthn credentials and metadata. This cloud-backed storage solution enables persistent storage of cryptographic material, session data, and credential artifacts across distributed deployments while maintaining backward compatibility with local storage for development and testing scenarios.
The integration leverages the google-cloud-storage Python SDK to provide robust, fault-tolerant storage operations with automatic retry mechanisms, comprehensive error handling, and seamless fallback capabilities between cloud and local storage backends.
The GCS integration follows a layered architecture pattern that separates concerns between storage operations, authentication, and fallback mechanisms:
```mermaid
graph TB
    subgraph "Application Layer"
        WebAuthn[WebAuthn Operations]
        Metadata[Metadata Management]
        Artifacts[Credential Artifacts]
    end
    subgraph "Storage Abstraction Layer"
        Storage[Storage Interface]
        SessionStore[Session Metadata Store]
        CredentialStore[Credential Store]
        ArtifactStore[Credential Artifacts Store]
    end
    subgraph "Cloud Storage Layer"
        GCSCore[GCS Core Implementation]
        Auth[Authentication]
        Retry[Retry Logic]
        Bucket[Bucket Management]
    end
    subgraph "External Dependencies"
        GCS[Google Cloud Storage]
        Local[Local Filesystem]
    end
    WebAuthn --> Storage
    Metadata --> SessionStore
    Artifacts --> ArtifactStore
    Storage --> GCSCore
    SessionStore --> GCSCore
    CredentialStore --> GCSCore
    ArtifactStore --> GCSCore
    GCSCore --> Auth
    GCSCore --> Retry
    GCSCore --> Bucket
    Bucket --> GCS
    Bucket --> Local
```
Diagram sources
- cloud_storage.py
- storage.py
- session_metadata_store.py
Section sources
- cloud_storage.py
- storage.py
The GCS integration supports multiple authentication methods through service account credentials, providing flexibility for different deployment scenarios:
The system supports three primary authentication approaches:
- Service Account File Path: Load credentials from a JSON key file
- Service Account JSON Content: Direct JSON credential content
- Project Override: Use default application credentials with project override
```mermaid
flowchart TD
    Start([Authentication Request]) --> CheckFile{"Credentials File<br/>Provided?"}
    CheckFile --> |Yes| LoadFromFile[Load from Service Account File]
    CheckFile --> |No| CheckJSON{"Credentials JSON<br/>Provided?"}
    LoadFromFile --> ExtractProject[Extract Project ID]
    ExtractProject --> BuildClient[Build GCS Client]
    CheckJSON --> |Yes| ParseJSON[Parse JSON Credentials]
    CheckJSON --> |No| CheckProject{"Project Override<br/>Provided?"}
    ParseJSON --> ExtractProject2[Extract Project ID]
    ExtractProject2 --> BuildClient
    CheckProject --> |Yes| BuildClientWithProject[Build Client with Project Override]
    CheckProject --> |No| DefaultClient[Build Default Client]
    BuildClientWithProject --> ValidateClient[Validate Client]
    DefaultClient --> ValidateClient
    BuildClient --> ValidateClient
    ValidateClient --> Success([Authentication Success])
```
Diagram sources
- cloud_storage.py
The authentication process is controlled through environment variables:
| Environment Variable | Purpose | Example Value |
|---|---|---|
| `FIDO_SERVER_GCS_CREDENTIALS_FILE` | Path to service account JSON file | `/path/to/service-account.json` |
| `FIDO_SERVER_GCS_CREDENTIALS_JSON` | Inline JSON credentials | `{"type": "service_account", ...}` |
| `FIDO_SERVER_GCS_PROJECT` | Project ID override | `my-gcp-project` |
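The precedence among these variables can be sketched as a small resolver. The function name and its `(mode, value)` return shape are illustrative, not the platform's actual helper; the real code would feed the result into `google.cloud.storage.Client`:

```python
import json

def resolve_gcs_credentials(env):
    # Precedence sketch: credentials file first, then inline JSON,
    # then a bare project override, else application-default creds.
    if env.get("FIDO_SERVER_GCS_CREDENTIALS_FILE"):
        return ("file", env["FIDO_SERVER_GCS_CREDENTIALS_FILE"])
    if env.get("FIDO_SERVER_GCS_CREDENTIALS_JSON"):
        info = json.loads(env["FIDO_SERVER_GCS_CREDENTIALS_JSON"])
        return ("json", info.get("project_id"))
    if env.get("FIDO_SERVER_GCS_PROJECT"):
        return ("project", env["FIDO_SERVER_GCS_PROJECT"])
    return ("default", None)
```

Note that a credentials file wins even when a project override is also set, matching the flowchart's ordering.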
Section sources
- cloud_storage.py
The GCS integration uses a comprehensive set of configuration parameters to control behavior and optimize performance:
| Environment Variable | Purpose | Default Value | Description |
|---|---|---|---|
| `FIDO_SERVER_GCS_ENABLED` | Enable/disable GCS | `False` | Controls whether GCS is used for storage |
| `FIDO_SERVER_GCS_BUCKET` | Bucket name | Required | Target GCS bucket for storage operations |
| `FIDO_SERVER_GCS_USER_FOLDER_PREFIX` | Folder prefix | `"user-data"` | Base prefix for user data organization |
| `FIDO_SERVER_GCS_USER_CREDENTIAL_SUBDIR` | Credential subdir | `"credentials"` | Subdirectory for credential storage |
| `FIDO_SERVER_GCS_USER_METADATA_SUBDIR` | Metadata subdir | `"metadata"` | Subdirectory for session metadata |
| Environment Variable | Purpose | Default Value | Description |
|---|---|---|---|
| `FIDO_SERVER_GCS_CREDENTIAL_PREFIX` | Legacy prefix | `"user-data"` | Backward compatibility prefix |
| `FIDO_SERVER_GCS_SESSION_METADATA_PREFIX` | Metadata prefix | `"user-data"` | Metadata storage prefix |
The system implements exponential backoff retry logic for transient failures:
| Parameter | Default Value | Description |
|---|---|---|
| `max_attempts` | 3 | Maximum retry attempts |
| `base_delay` | 0.5 seconds | Initial retry delay |
| `retryable_exceptions` | Multiple types | Exceptions eligible for retry |
Section sources
- cloud_storage.py
- storage.py
- session_metadata_store.py
The GCS integration implements a hierarchical object naming scheme that organizes data logically and efficiently:
```
[user-data]/[session-id]/credentials/[username]_credential_data.pkl
[user-data]/[session-id]/metadata/.last-access
[user-data]/[session-id]/metadata/[filename].json
[user-data]/[session-id]/metadata/[filename].meta.json
[user-data]/[session-id]/credential-artifacts/[sha256-hash].json
```
The system provides flexible blob name construction with automatic normalization:
```mermaid
flowchart TD
    Input[Input Components] --> NormalizePrefix[Normalize Prefix]
    NormalizePrefix --> FilterComponents[Filter Empty Components]
    FilterComponents --> StripSlashes[Strip Leading/Trailing Slashes]
    StripSlashes --> JoinPath[Join with Forward Slashes]
    JoinPath --> ValidatePath{Validate Path}
    ValidatePath --> |Valid| ConstructName[Construct Final Blob Name]
    ValidatePath --> |Invalid| ThrowError[Throw ValueError]
    ConstructName --> Output[Final Blob Name]
```
Diagram sources
- cloud_storage.py
Section sources
- storage.py
- session_metadata_store.py
- credential_artifacts.py
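The normalization steps above can be sketched as a blob-name builder. This is a hypothetical helper; the actual function name and validation rules live in cloud_storage.py:

```python
def build_blob_name(prefix, *components):
    # Follows the flow above: drop empty components, strip leading and
    # trailing slashes, join with "/", and reject an empty result.
    parts = [prefix.strip("/")] if prefix and prefix.strip("/") else []
    for component in components:
        component = str(component).strip("/")
        if component:
            parts.append(component)
    if not parts:
        raise ValueError("blob name must not be empty")
    return "/".join(parts)
```

Joining with forward slashes (never `os.path`) keeps names portable, since GCS object names are flat strings where `/` is only a naming convention.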
The GCS integration provides comprehensive CRUD operations with built-in retry logic and error handling:
The upload system supports various data types and content types:
```mermaid
sequenceDiagram
    participant App as Application
    participant Storage as Storage Layer
    participant GCS as GCS Client
    participant Retry as Retry Handler
    App->>Storage : upload_bytes(blob_name, data, content_type)
    Storage->>GCS : bucket.blob(blob_name)
    Storage->>Retry : _with_retry(upload_operation)
    loop Retry Attempts (max 3)
        Retry->>GCS : blob.upload_from_string(data, content_type)
        alt Success
            Retry-->>Storage : Operation Complete
        else Transient Error
            Retry->>Retry : Exponential Backoff Delay
        end
    end
    Retry-->>Storage : Final Result
    Storage-->>App : Upload Complete
```
Diagram sources
- cloud_storage.py
The download system handles missing objects gracefully:
```mermaid
flowchart TD
    Start[Download Request] --> GetBlob[Get Blob Reference]
    GetBlob --> TryDownload[Try Download]
    TryDownload --> CheckError{Download Error?}
    CheckError --> |NotFound| ReturnNull[Return None]
    CheckError --> |Other Error| RetryLogic[Apply Retry Logic]
    CheckError --> |Success| ReturnData[Return Data]
    RetryLogic --> RetryAttempt{Retry Attempt?}
    RetryAttempt --> |Yes| WaitDelay[Wait Exponential Delay]
    RetryAttempt --> |No| ReturnNull
    WaitDelay --> TryDownload
```
Diagram sources
- cloud_storage.py
The delete system supports missing object handling:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `missing_ok` | `bool` | `True` | Whether to suppress `NotFound` errors |
The listing system supports prefix-based filtering:
| Parameter | Type | Description |
|---|---|---|
| `prefix` | `str` | Filter blobs by prefix |
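The download, delete, and list semantics described above can be exercised against a dict-backed stand-in for a bucket. Everything here is illustrative; `NotFound` stands in for `google.api_core.exceptions.NotFound`:

```python
class NotFound(Exception):
    """Stand-in for google.api_core.exceptions.NotFound."""

class FakeBucket:
    # Dict-backed stand-in for a GCS bucket, illustrating the
    # download/delete/list behavior described in this section.
    def __init__(self):
        self._blobs = {}

    def upload_bytes(self, blob_name, data):
        self._blobs[blob_name] = data

    def download_bytes(self, blob_name):
        # A missing object yields None rather than an exception.
        return self._blobs.get(blob_name)

    def delete_blob(self, blob_name, missing_ok=True):
        if blob_name not in self._blobs:
            if not missing_ok:
                raise NotFound(blob_name)
            return
        del self._blobs[blob_name]

    def list_blobs(self, prefix=""):
        # Prefix-based filtering, as in GCS list operations.
        return sorted(n for n in self._blobs if n.startswith(prefix))
```

A fake like this is also how the behavior can be unit-tested without network access or real credentials.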
Section sources
- cloud_storage.py
The system implements intelligent fallback behavior that seamlessly transitions between cloud and local storage:
```mermaid
flowchart TD
    Start[Storage Operation] --> CheckEnabled{GCS Enabled?}
    CheckEnabled --> |No| LocalStorage[Use Local Storage]
    CheckEnabled --> |Yes| CheckBucket{Bucket Configured?}
    CheckBucket --> |No| LocalStorage
    CheckBucket --> |Yes| CloudStorage[Use Cloud Storage]
    LocalStorage --> LocalOps[Local File Operations]
    CloudStorage --> CloudOps[GCS Operations]
    CloudOps --> CloudSuccess{Operation Success?}
    CloudSuccess --> |No| LocalFallback[Local Fallback]
    CloudSuccess --> |Yes| Return[Return Result]
    LocalFallback --> LocalOps
    LocalOps --> Return
```
Diagram sources
- storage.py
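The enable/bucket decision at the top of the flowchart reduces to a small predicate. This is a sketch; the platform's actual settings parsing may differ:

```python
def use_gcs(env):
    # GCS is used only when explicitly enabled AND a bucket is set;
    # otherwise every operation goes to local storage.
    enabled = env.get("FIDO_SERVER_GCS_ENABLED", "").strip().lower() in ("1", "true", "yes")
    return enabled and bool(env.get("FIDO_SERVER_GCS_BUCKET"))
```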
The credential storage system implements a sophisticated fallback mechanism:
- Primary Search: Look for new-format credential blobs
- Legacy Fallback: Fall back to legacy credential locations
- Local Backup: Use local storage if cloud fails
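This ordered search can be sketched as a chain of lookups. The callables are hypothetical stand-ins for the new-format, legacy, and local loaders:

```python
def load_credential(username, session_id, lookups):
    # Try each lookup in order; a cloud failure or a miss (None)
    # simply falls through to the next candidate.
    for lookup in lookups:
        try:
            data = lookup(username, session_id)
        except Exception:
            continue
        if data is not None:
            return data
    return None
```

Treating errors and misses identically is what makes the fallback seamless: callers never see a cloud outage, only a slower lookup.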
The session metadata store implements automatic cleanup for inactive sessions:
| Parameter | Value | Description |
|---|---|---|
| `INACTIVE_AGE` | 14 days | Age threshold for cleanup |
| `CLEANUP_INTERVAL` | 6 hours | Cleanup frequency |
| `MAX_ATTEMPTS` | 3 | Retry attempts for cleanup |
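The age check behind this cleanup is simple to state. The values mirror the table above, but the function name is illustrative:

```python
from datetime import datetime, timedelta, timezone

INACTIVE_AGE = timedelta(days=14)

def is_inactive(last_access: datetime, now: datetime) -> bool:
    # A session whose .last-access marker is older than 14 days
    # is eligible for removal on the next cleanup pass.
    return now - last_access > INACTIVE_AGE
```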
Section sources
- storage.py
- session_metadata_store.py
The GCS integration implements multiple layers of security to protect sensitive data:
Google Cloud Storage provides automatic encryption at rest using AES-256 encryption. Additional encryption layers can be implemented at the application level using the cryptography library.
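For example, a Fernet envelope from the cryptography library could wrap payloads before upload. This is a sketch, not the platform's actual scheme, and key management is out of scope here:

```python
from cryptography.fernet import Fernet

def encrypt_before_upload(plaintext: bytes, key: bytes) -> bytes:
    # Authenticated symmetric encryption layered on top of GCS's
    # built-in AES-256 at-rest encryption.
    return Fernet(key).encrypt(plaintext)

def decrypt_after_download(token: bytes, key: bytes) -> bytes:
    return Fernet(key).decrypt(token)
```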
Required IAM roles for optimal functionality:
| Role | Purpose | Scope |
|---|---|---|
| `roles/storage.objectAdmin` | Full object management | Bucket-level |
| `roles/storage.admin` | Bucket administration | Project-level |
| `roles/storage.objectViewer` | Read-only access | Optional |
Service account key management best practices:
- Rotate Keys Regularly: Implement automated key rotation
- Principle of Least Privilege: Grant minimal required permissions
- Audit Logging: Enable comprehensive audit logs
- Network Restrictions: Use VPC Service Controls when appropriate
The system implements comprehensive input validation:
```mermaid
flowchart TD
    Input[User Input] --> ValidateType{Valid Type?}
    ValidateType --> |No| Reject[Reject Operation]
    ValidateType --> |Yes| ValidateLength{Valid Length?}
    ValidateLength --> |No| Reject
    ValidateLength --> |Yes| ValidateChars{Valid Characters?}
    ValidateChars --> |No| Reject
    ValidateChars --> |Yes| Sanitize[Sanitize Input]
    Sanitize --> Process[Process Operation]
```
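The type, length, and character checks above might look like this for session identifiers. The pattern is illustrative, not the platform's actual rule:

```python
import re

SESSION_ID_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")

def validate_session_id(session_id):
    # Reject anything that is not a short, URL-safe string; this also
    # blocks path-traversal attempts like "../" in blob names.
    if not isinstance(session_id, str):
        raise ValueError("session id must be a string")
    if not SESSION_ID_RE.fullmatch(session_id):
        raise ValueError("invalid session id")
    return session_id
```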
Section sources
- storage.py
- session_metadata_store.py
Several strategies can be employed to optimize GCS integration performance:
- Connection Pooling: Reuse GCS client connections
- Batch Operations: Group multiple operations when possible
- Caching: Implement application-level caching for frequently accessed data
- Regional Deployment: Deploy applications close to target GCS regions
Cost optimization techniques:
| Strategy | Description | Impact |
|---|---|---|
| Lifecycle Policies | Automatic object lifecycle management | Reduced storage costs |
| Nearline Storage | Use Nearline for less frequently accessed data | Lower storage rates |
| Compress Data | Compress data before storage | Reduced transfer costs |
| Request Optimization | Minimize unnecessary requests | Lower API costs |
The retry system implements exponential backoff:
```
delay = base_delay * 2^(attempt - 1)
```
This strategy spreads retries out over time, reducing load on the service during transient failures while still giving each operation a good chance of eventual success.
Section sources
- cloud_storage.py
Common issues and their solutions:
Symptoms: GoogleAPICallError or RefreshError exceptions
Causes and Solutions:
| Issue | Cause | Solution |
|---|---|---|
| Invalid credentials | Incorrect service account file | Verify credentials file path and content |
| Expired credentials | Token expiration | Refresh service account credentials |
| Insufficient permissions | Missing IAM roles | Add required IAM roles to service account |
| Network connectivity | Firewall blocking | Configure firewall rules for GCS |
Symptoms: OSError or timeout exceptions
Solutions:
- Increase timeout values: Adjust client timeout settings
- Check network connectivity: Verify internet access
- Region selection: Choose geographically closer GCS regions
- Retry configuration: Increase retry attempts and delays
Symptoms: QuotaExceededError exceptions
Solutions:
- Monitor quotas: Track API usage through Cloud Console
- Request quota increases: Submit quota increase requests
- Optimize operations: Reduce unnecessary API calls
- Implement backoff: Add exponential backoff to operations
Symptoms: NotFound or permission errors
Solutions:
- Verify bucket existence: Confirm bucket name and region
- Check bucket permissions: Ensure proper IAM configuration
- Validate prefixes: Verify object naming conventions
- Test connectivity: Use GCS browser or CLI tools
Section sources
- cloud_storage.py
- test_cloud_storage.py
Storing WebAuthn credentials with automatic fallback:
```python
# Save credential data
def save_webauthn_credential(username: str, credential_data: Any, session_id: str):
    # Automatically uses GCS if enabled, falls back to local storage
    savekey(username, credential_data, session_id=session_id)
```
Managing session metadata with automatic cleanup:
```python
# Touch session to mark activity
def touch_session_activity(session_id: str):
    # Updates last access timestamp
    touch_last_access(session_id)

# Clean up inactive sessions
def cleanup_inactive_sessions():
    # Automatically removes sessions older than 14 days
    _maybe_cleanup_inactive_sessions()
```
Storing advanced credential artifacts:
```python
# Store credential artifact with merging
def store_artifact(storage_id: str, artifact_data: Dict, merge: bool = True):
    # Supports atomic updates and conflict resolution
    store_credential_artifact(storage_id, artifact_data, merge=merge)
```
Robust error handling for production deployments:
```python
def robust_storage_operation(operation_func):
    try:
        return operation_func()
    except gcs_exceptions.GoogleAPICallError as e:
        # Log error and fall back to local storage
        logger.warning(f"GCS operation failed: {e}")
        return fallback_local_operation()
    except Exception as e:
        # Unexpected error, log and re-raise
        logger.error(f"Unexpected storage error: {e}")
        raise
```
Section sources
- storage.py
- session_metadata_store.py
- credential_artifacts.py