Database Controller
Overview
The Database Controller is a critical component of the EDURange Cloud platform that manages database interactions between the Kubernetes cluster and the PostgreSQL database. It consists of two main services that run as containers within the same Kubernetes pod:
- Database API: A Flask-based REST API that provides endpoints for database operations
- Database Sync: A background service that synchronizes the state of challenge instances between Kubernetes and the database
Together, these services ensure data consistency and provide a unified interface for database operations across the platform.
Architecture
The Database Controller is deployed as a single Kubernetes deployment with two containers sharing the same pod. This design allows both services to access the same environment variables and configuration while maintaining separation of concerns.
Direct Database Connection
Unlike other services in the EDURange Cloud platform that use PgBouncer for connection pooling, the Database Controller connects directly to the PostgreSQL database. This direct connection approach is intentional for several reasons:
- Transaction Consistency: The Database API and Sync services require consistent transaction isolation for operations that modify database state, which might be affected by connection pooling settings.
- Long-Running Operations: The Database Sync service performs continuous polling and update operations that benefit from persistent connections without the overhead of connection pool management.
- Reliability Requirement: The Database Controller is a core service that must maintain its database connection even during high-demand periods when the connection pool might be fully utilized.
- Specialized Access Requirements: The Database Controller may require specific session configurations or permissions that are better managed through dedicated connections.
- Performance Optimization: Direct connections eliminate an extra network hop and potential bottleneck for this critical system component.
This architecture ensures that other services can benefit from connection pooling’s efficiency while the Database Controller maintains its dedicated connection pathway for maximum reliability and consistency.
For more information about how connection pooling works in the EDURange Cloud platform, see the Connection Pooling documentation.
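As a concrete illustration of this direct-connection pathway, a dedicated PostgreSQL connection might be opened roughly as follows. The psycopg2 usage and keepalive settings are an assumption for illustration, not the controller's actual code; the environment variable names match those listed in the Deployment section.

```python
# Minimal sketch of a direct connection (no PgBouncer in the path), using the
# POSTGRES_* variables described in the Deployment section. The keepalive
# options are illustrative assumptions, not the controller's exact settings.
import os

import psycopg2

def get_direct_connection():
    """Open a dedicated, persistent connection straight to PostgreSQL."""
    return psycopg2.connect(
        host=os.environ["POSTGRES_HOST"],
        dbname=os.environ["POSTGRES_NAME"],
        user=os.environ["POSTGRES_USER"],
        password=os.environ["POSTGRES_PASSWORD"],
        keepalives=1,        # keep long-lived sync connections from being dropped
        keepalives_idle=30,
    )
```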
Internal Kubernetes Access
The Database API is designed to be accessed only from within the Kubernetes cluster for enhanced security. This internal-only access model improves the security posture of the platform.
Internal DNS Name
The Database API service is accessible within the cluster using the following internal DNS name:
http://database-api-service.default.svc.cluster.local
Access Methods
Components within the cluster can access the Database API in two ways:
- Direct Internal Access: Services running in the same Kubernetes namespace (such as the Instance Manager and Database Sync) can directly access the Database API using the internal DNS name.
- WebOS Proxy: For client-side components in the WebOS application, requests are proxied through the /api/database-proxy endpoint to avoid mixed content issues and maintain security.
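For illustration, an in-cluster service could reach the API through this DNS name as shown below; the /status health endpoint appears in the API tables later in this document, while the timeout and error handling here are just a sketch.

```python
# Illustrative in-cluster call to the Database API via its internal DNS name.
# /status is the documented health-check endpoint; the timeout value is an
# arbitrary choice for the example.
import requests

DATABASE_API_URL = "http://database-api-service.default.svc.cluster.local"

def database_api_healthy() -> bool:
    """Return True if the Database API responds to its health check."""
    try:
        return requests.get(f"{DATABASE_API_URL}/status", timeout=3).ok
    except requests.RequestException:
        return False
```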
Security Benefits
This internal-only access model provides several security advantages:
- Prevents unauthorized external access to sensitive database operations
- Reduces the attack surface of the platform
- Ensures that only authenticated and authorized services can perform database operations
- Simplifies the network security model by keeping database traffic within the cluster
Database API
The Database API is a Flask application that provides RESTful endpoints for interacting with the database. It serves as the primary interface for other components of the EDURange Cloud platform to perform database operations.
Key Features
- Activity Logging: Records user activities and system events
- Points Management: Handles awarding, updating, and retrieving points for users
- Competition Management: Supports creating and managing competition groups
- Challenge Management: Provides endpoints for challenge-related operations
- Question Tracking: Manages question completions and attempts
- Challenge Pack Management: Handles installation and management of challenge packs
API Endpoints
Core Endpoints
Endpoint | Method | Description |
---|---|---|
/activity/log | POST | Records activity events in the system |
/add_points | POST | Adds points to a user’s score |
/set_points | POST | Sets a user’s points to a specific value |
/get_points | GET | Retrieves a user’s current points |
/get_challenge_instance | GET | Gets details about a challenge instance |
/competition/create | POST | Creates a new competition group |
/competition/join | POST | Adds a user to a competition group |
/competition/generate-code | POST | Generates an access code for a competition |
/competition/add-challenge | POST | Adds a challenge to a competition |
/competition/complete-challenge | POST | Marks a challenge as completed |
/competition/<group_id>/leaderboard | GET | Gets the leaderboard for a competition |
/competition/<group_id>/progress/<user_id> | GET | Gets a user’s progress in a competition |
/question/complete | POST | Marks a question as completed |
/question/attempt | POST | Records an attempt to answer a question |
/question/completed | GET | Gets a list of completed questions |
/question/details | GET | Gets details about a specific question |
/challenge/details | GET | Gets details about a specific challenge |
/challenge/list | GET | Gets a list of challenges with optional filters |
/status | GET | Simple health check endpoint for Kubernetes liveness and readiness probes |
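As an example of how these endpoints are typically consumed from other in-cluster services, the points endpoints might be called as follows; the JSON field names are assumptions based on the descriptions above.

```python
# Hypothetical calls against the points endpoints; the user_id/points field
# names are assumptions inferred from the endpoint descriptions.
import requests

BASE = "http://database-api-service.default.svc.cluster.local"

requests.post(f"{BASE}/add_points",
              json={"user_id": "user-123", "points": 50}, timeout=5)
total = requests.get(f"{BASE}/get_points",
                     params={"user_id": "user-123"}, timeout=5).json()
```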
Challenge Pack Endpoints
Endpoint | Method | Description |
---|---|---|
/challenge-pack/install | POST | Installs a new challenge pack |
/challenge-pack/update | POST | Updates an existing challenge pack |
/challenge-pack/uninstall | POST | Uninstalls a challenge pack |
/challenge-pack/list | GET | Lists all installed challenge packs |
/challenge-pack/details/<pack_id> | GET | Gets details about a specific challenge pack |
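A challenge pack operation follows the same pattern; the install payload shape below is an assumption, since the pack format itself is described elsewhere.

```python
# Sketch of installing and then listing challenge packs; the install payload
# (pack identifier plus its challenges) is an assumed shape, not the real one.
import requests

BASE = "http://database-api-service.default.svc.cluster.local"

requests.post(f"{BASE}/challenge-pack/install",
              json={"pack_id": "intro-pack", "challenges": []}, timeout=30)
packs = requests.get(f"{BASE}/challenge-pack/list", timeout=10).json()
```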
CDF Management Endpoints
Endpoint | Method | Description |
---|---|---|
/cdf/validate | POST | Validates a CDF document against the schema |
/cdf/import | POST | Imports a challenge from a CDF document |
/cdf/export/<challenge_id> | GET | Exports a challenge as a CDF document |
/cdf/update/<challenge_id> | POST | Updates a challenge from a CDF document |
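For instance, a challenge could be exported and its CDF re-validated through these endpoints as sketched below; the challenge ID is a placeholder.

```python
# Sketch of round-tripping a challenge through the CDF endpoints; "chal-123"
# is a placeholder challenge ID.
import requests

BASE = "http://database-api-service.default.svc.cluster.local"

cdf_doc = requests.get(f"{BASE}/cdf/export/chal-123", timeout=10).json()
requests.post(f"{BASE}/cdf/validate", json=cdf_doc, timeout=10).raise_for_status()
```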
Implementation Details
- Uses Prisma ORM for database interactions
- Implements proper error handling and validations
- Provides consistent JSON responses
- Logs all operations for debugging and auditing
- Supports transaction management for operations that require atomicity
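To illustrate these conventions (input validation, consistent JSON responses, logging), a handler might be structured roughly as follows; the in-memory helper stands in for the real Prisma-backed logic, and the payload fields are assumed.

```python
# Rough sketch of the API's conventions: validate input, return consistent
# JSON, and log each operation. The in-memory award_points() helper is a
# stand-in for the real Prisma-backed storage; payload fields are assumed.
import logging

from flask import Flask, jsonify, request

app = Flask(__name__)
log = logging.getLogger("database-api")

_points: dict[str, int] = {}  # stand-in for the real database

def award_points(user_id: str, points: int) -> int:
    _points[user_id] = _points.get(user_id, 0) + points
    return _points[user_id]

@app.route("/add_points", methods=["POST"])
def add_points():
    data = request.get_json(silent=True) or {}
    user_id, points = data.get("user_id"), data.get("points")
    if user_id is None or points is None:
        return jsonify({"error": "user_id and points are required"}), 400
    try:
        new_total = award_points(user_id, int(points))
        log.info("Added %s points to user %s", points, user_id)
        return jsonify({"user_id": user_id, "points": new_total}), 200
    except Exception as exc:  # real code would distinguish error types
        log.exception("add_points failed")
        return jsonify({"error": str(exc)}), 500
```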
Database Sync
The Database Sync service is a Python application that runs continuously in the background, synchronizing the state of challenge instances between the Kubernetes cluster and the database.
Key Functions
- Synchronization Loop: Continuously polls for changes in challenge pods
- Instance Management: Creates, updates, and removes challenge instances in the database
- Flag Management: Retrieves and stores challenge flags securely
- Status Tracking: Updates the status of challenge instances based on Kubernetes pod status
- Error Recovery: Attempts to recover from failed operations through retry mechanisms
Enhanced Synchronization Process
- Polling: Every few seconds, the sync service queries the Instance Manager API to get the current list of challenge pods in the Kubernetes cluster
- Comparison: Compares the list of pods with the challenge instances in the database
- Status Updates:
- Updates challenge instance status (CREATING, ACTIVE, TERMINATING, TERMINATED, ERROR)
- Tracks status change times and termination attempts
- Implements a finite state machine for lifecycle management (sketched after this list)
- Instance Management:
- Adds new challenge instances to the database when new pods are detected
- Updates existing challenge instances with current status information
- Marks challenge instances as terminated when pods are deleted
- Error Handling:
- Implements retry logic for failed operations
- Tracks termination attempts for graceful cleanup
- Logs detailed error information for troubleshooting
- Activity Logging: Records relevant events during the synchronization process
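The status handling above behaves like a small finite state machine; a sketch of the states and plausible transitions is shown below. The state names come from the list above, but the transition table is an assumption about the sync logic.

```python
# Sketch of the instance-status state machine. The states match the list
# above; the allowed transitions are an assumption, not the documented rules.
from enum import Enum

class InstanceStatus(str, Enum):
    CREATING = "CREATING"
    ACTIVE = "ACTIVE"
    TERMINATING = "TERMINATING"
    TERMINATED = "TERMINATED"
    ERROR = "ERROR"

ALLOWED_TRANSITIONS = {
    InstanceStatus.CREATING: {InstanceStatus.ACTIVE, InstanceStatus.ERROR},
    InstanceStatus.ACTIVE: {InstanceStatus.TERMINATING, InstanceStatus.ERROR},
    InstanceStatus.TERMINATING: {InstanceStatus.TERMINATED, InstanceStatus.ERROR},
    InstanceStatus.TERMINATED: set(),
    InstanceStatus.ERROR: {InstanceStatus.TERMINATING},
}

def can_transition(current: InstanceStatus, new: InstanceStatus) -> bool:
    """Check whether a status change is legal under the assumed transitions."""
    return new in ALLOWED_TRANSITIONS[current]
```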
Implementation Details
- Uses asynchronous programming with asyncio for efficient database operations
- Implements robust error handling to prevent synchronization failures
- Logs all operations for debugging and troubleshooting
- Uses a finite state machine approach to manage challenge instance lifecycle
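A stripped-down version of this poll-compare-update loop might look like the following; the Instance Manager URL, the response shape, and the in-memory store standing in for the database are all assumptions made for illustration.

```python
# Simplified sketch of the sync loop: poll the Instance Manager, compare with
# the known instances, and reconcile. The URL, response fields, and in-memory
# store (standing in for the database) are illustrative assumptions.
import asyncio

import aiohttp

INSTANCE_MANAGER_URL = "http://instance-manager.default.svc.cluster.local/list-challenge-pods"
POLL_INTERVAL = 5  # seconds between polling cycles

known_instances: dict[str, str] = {}  # pod name -> last known status

async def sync_once(session: aiohttp.ClientSession) -> None:
    async with session.get(INSTANCE_MANAGER_URL) as resp:
        payload = await resp.json()
    pods = {p["name"]: p.get("status", "ACTIVE") for p in payload.get("pods", [])}

    for name, status in pods.items():
        known_instances[name] = status        # add new pods or refresh status
    for name in set(known_instances) - set(pods):
        known_instances[name] = "TERMINATED"  # pod disappeared -> mark terminated

async def main() -> None:
    async with aiohttp.ClientSession() as session:
        while True:
            try:
                await sync_once(session)
            except Exception:
                pass  # one failed cycle must not stop the loop; real code logs it
            await asyncio.sleep(POLL_INTERVAL)

if __name__ == "__main__":
    asyncio.run(main())
```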
Challenge Definition Format Integration
The Database Controller plays a key role in managing the Challenge Definition Format (CDF) and Challenge Packs:
CDF Workflow
- Validation: Validates CDF documents against the JSON schema
- Import: Processes and imports CDF documents to create challenges
- Storage: Stores the full CDF content in the database
- Export: Generates CDF documents from existing challenges
- Updates: Handles updates to challenges from updated CDF documents
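For example, the validation step could be implemented with the jsonschema library roughly as follows; the schema filename is a placeholder rather than the actual location of the CDF schema.

```python
# Rough sketch of CDF validation against a JSON schema; "cdf-schema.json" is a
# placeholder path, not the schema's actual location.
import json

from jsonschema import ValidationError, validate

with open("cdf-schema.json") as f:
    CDF_SCHEMA = json.load(f)

def validate_cdf(document: dict) -> tuple[bool, str | None]:
    """Return (is_valid, error_message) for a CDF document."""
    try:
        validate(instance=document, schema=CDF_SCHEMA)
        return True, None
    except ValidationError as exc:
        return False, exc.message
```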
Challenge Pack Management
- Installation: Processes challenge packs during installation
- Import: Imports challenges from the pack into the database
- Reference: Maintains references between challenges and their packs
- Update: Handles updates to challenge packs
- Uninstall: Properly removes pack data while preserving challenge data
Deployment
The Database Controller is deployed as a Kubernetes deployment with two containers:
- Database API Container:
  - Built from the dockerfile.api file
  - Runs the Flask API on port 8000
  - Exposed through a Kubernetes Service
- Database Sync Container:
  - Built from the dockerfile.sync file
  - Runs continuously in the background
  - Not exposed outside the cluster
Both containers share the same environment variables for database connection:
- POSTGRES_HOST: The hostname of the PostgreSQL server
- POSTGRES_NAME: The name of the database
- POSTGRES_USER: The database username
- POSTGRES_PASSWORD: The database password
- DATABASE_URL: The complete connection string for Prisma
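For reference, DATABASE_URL follows the standard PostgreSQL connection URL shape and can be assembled from the other four variables; the sketch below (including the default port 5432) is illustrative rather than the controller's actual startup code.

```python
# Illustrative only: building DATABASE_URL from the shared variables above.
# Assumes the default PostgreSQL port 5432.
import os

DATABASE_URL = (
    f"postgresql://{os.environ['POSTGRES_USER']}:{os.environ['POSTGRES_PASSWORD']}"
    f"@{os.environ['POSTGRES_HOST']}:5432/{os.environ['POSTGRES_NAME']}"
)
```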
Security Considerations
- Database credentials are stored as environment variables in the deployment configuration
- The Database API is only accessible from within the Kubernetes cluster
- The Database Sync service is not directly accessible from outside the cluster
- Both services implement proper input validation to prevent injection attacks
- API endpoints implement authorization checks for sensitive operations
Monitoring and Maintenance
- The Database Controller’s health can be monitored through the system health endpoints
- Logs from both containers can be collected and analyzed for troubleshooting
- The deployment can be updated by rebuilding and pushing new container images
- Database migrations are handled through Prisma migrations