-
Notifications
You must be signed in to change notification settings - Fork 219
Description
Describe the bug
We had a deployment of opal-server that we'd typoed the broadcast URI. We were seeing websockets disconnect errors in the clients, but no errors server side, so no clues it was the broadcast URI.
After turning debug logging on for the server, I spotted the following output:
2024-12-04T15:27:01.018085+0000| fastapi_websocket_rpc.websocket_rpc_e...| INFO | Client connected
2024-12-04T15:27:01.018500+0000| websockets.legacy.server | INFO | connection open
2024-12-04T15:27:01.022749+0000| fastapi_websocket_rpc.rpc_channel |DEBUG | Handling RPC request - {'request': RpcRequest(method='_ping_', arguments={}, call_id='5d8d421e3dc94751a035b202644c8e8a'), 'channel': '72c2a547ffc64741bd095079bc778d7d'}
2024-12-04T15:27:01.023420+0000| fastapi_websocket_pubsub.pub_sub_server |DEBUG | task is done: {<Task finished name='Task-3053' coro=<EventBroadcaster.__read_notifications__() done, defined at /usr/local/lib/python3.10/site-packages/fastapi_websocket_pubsub/event_broadcaster.py:245> exception=gaierror(-2, 'Name or service not known')>}
2024-12-04T15:27:01.023565+0000| fastapi_websocket_pubsub.event_broadc...| INFO | Cancelling broadcast listen task
2024-12-04T15:27:01.023677+0000| fastapi_websocket_pubsub.event_broadc...|DEBUG | Unsubscribing from ALL TOPICS
2024-12-04T15:27:01.023790+0000| fastapi_websocket_pubsub.event_notifier |DEBUG | Removing Subscription of topic='__EventNotifier_ALL_TOPICS__' for subscriber=cc43be7c3ebb41e1b4869ace10d213db
2024-12-04T15:27:01.023948+0000| fastapi_websocket_rpc.websocket_rpc_e...| INFO | Client connection failed - 42723 :: 72c2a547ffc64741bd095079bc778d7d
2024-12-04T15:27:01.024165+0000| fastapi_websocket_pubsub.pub_sub_server |DEBUG | Leaving endpoint's main loop
2024-12-04T15:27:01.026785+0000| websockets.legacy.server | INFO | connection closed
Based on the 4th line, I went to look at fastapi_websocket_pubsub/event_broadcaster.py:245
and saw it was referencing the broadcast URI, at which point I double checked ours and saw the typo.
What I believe to be a bug is that the error is being swallowed, and should probably be elevated to an ERROR
level log for visibility.
To Reproduce
Set the broadcast URI to an invalid value - in our case, a postgres URI with a hostname that didn't resolve.
Logs: se616-opal-opal-server-545c454db-hx2nc.log
Expected behavior
A clear error level log to indicate that the broadcast URI could not be resolved
OPAL version
- Version: Client - 0.7.15, Server - 0.7.8