-
Notifications
You must be signed in to change notification settings - Fork 1
Instrument the amazeeai backend with prometheus metrics #43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Updated docker-compose.yml to include Prometheus and Grafana services. - Added environment variable ENABLE_METRICS to enable Prometheus metrics in the backend. - Implemented Prometheus middleware for FastAPI to track HTTP requests and audit events. - Enhanced audit logging to record metrics for events and durations. - Created Prometheus configuration file for scraping metrics from the FastAPI application. - Added Grafana provisioning files for dashboards and data sources to visualize metrics. - Updated requirements.txt to include Prometheus client libraries.
- Updated docker-compose.yml to include an env_file for environment variables. - Refactored login and registration endpoints in auth.py to utilize a new metrics tracking function for authentication attempts. - Introduced a new auth.py module for tracking authentication metrics with Prometheus. - Simplified error handling and improved user feedback during login and registration processes. - Updated Grafana dashboard configuration to visualize authentication metrics.
- Updated the login and registration endpoints in auth.py to enhance user experience with detailed documentation and error handling. - Removed metrics tracking for authentication attempts to streamline the login process. - Added support for both form data and JSON formats in login and registration requests. - Improved the sign-in process to automatically register users and create teams if they do not exist. - Enhanced error messages for invalid login and sign-in data.
- Added a new function to track authentication requests using Prometheus metrics in auth.py. - Updated login, registration, and email validation endpoints to call the tracking function. - Refactored Prometheus middleware to handle error responses and record metrics accurately. - Modified Grafana dashboard configuration to reflect changes in authentication metrics tracking.
- Added a logging mechanism for authentication events in auth.py, including login attempts, registration, and email validation. - Configured a TimedRotatingFileHandler to manage log files for authentication events. - Updated Prometheus middleware to track authentication request metrics with success and failure statuses. - Enhanced Grafana dashboard configuration to reflect changes in authentication metrics and improve visualization. - Cleaned up unused code related to previous metrics tracking.
- Introduced AuthMiddleware to handle user authentication and store user data in request state. - Updated existing middleware (AuditLogMiddleware and PrometheusMiddleware) to utilize user data from request state instead of directly querying for user information. - Enhanced get_current_user_from_auth function to check for user in request state, improving efficiency. - Cleaned up unused authentication code in middleware for better maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good! keeping in mind that i don't know much about prometheus and didn't review the config files
app/middleware/auth.py
Outdated
class AuthMiddleware(BaseHTTPMiddleware): | ||
async def dispatch(self, request: Request, call_next): | ||
# Skip auth for certain paths | ||
if request.url.path in ["/health", "/docs", "/openapi.json", "/metrics"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super nit picky but could maybe benefit from being extracted to const.PUBLIC_ENDPOINTS
or similar and then using that here and here, just imagining it'd be easier to manage centrally whenever the list changes...
authorization=auth_header if auth_header else None, | ||
db=db | ||
) | ||
# Store essential user data instead of the full SQLAlchemy object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the benefit of doing this? especially given that at least sometimes we need the full SQLAlchemy object later (via get_current_user_from_auth
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug fixing... I got a DetachedInstanceError
in some of my tests due to lazy loading some of the fields on the user later on. We do still have to reload the SA object later, but it is fewer times than it was.
- Updated the `update_budget_period` function to retrieve spend information from the `info` key in the response, improving data handling. - Added error logging to capture exceptions during budget period updates, enhancing traceability. - Refactored user ID retrieval in `AuditLogMiddleware` to handle different user data structures, improving robustness. - Introduced a new test for updating budget duration as a team admin, ensuring proper functionality and API interaction.
- Added a new PUBLIC_PATHS setting in config.py to centralize the definition of public endpoints. - Updated AuditLogMiddleware, AuthMiddleware, and PrometheusMiddleware to utilize the PUBLIC_PATHS setting for path checks, improving maintainability and consistency across middleware.
Adds instrumentation for basic RED metrics, as well as custom tracking on authentication endpoints to keep track of incoming email addresses.
Example dashboard is included (local grafana and prometheus)
