Description
Disclaimer: Table names, column names, usernames, and values in this report have been intentionally obfuscated to protect sensitive or personally identifiable information (PII). The logic, structure, logs, and symptoms remain intact and technically accurate for troubleshooting purposes.
⸻
Describe the problem
Readyset starts logging broken pipe errors on its client connections after the following statements are run directly on the upstream MySQL database (not through Readyset):
TRUNCATE TABLE schema_x.table_target;
INSERT IGNORE INTO schema_x.table_target (number)
SELECT xxx_number AS number
FROM schema_x.table_source;
Execution results:
Query OK, 0 rows affected (28.48 sec)
Query OK, 20300114 rows affected (53.70 sec)
Records: 20300114 Duplicates: 0 Warnings: 0
Immediately after running this, Readyset logs the following and becomes unresponsive:
2025-08-28T20:51:14.624432Z INFO domain{address=7.0.0}:domain_handle_packet{op=ReplayPiece(link=LocalNodeIndex { id: 0 } -> LocalNodeIndex { id: 1 }, tag=Tag(1), 0 records)}: readyset_util::time_scope: operation took 1.972 seconds
2025-08-28T20:51:14.624482Z INFO domain{address=7.0.0}:handle_one_packet{op=ReplayPiece(link=LocalNodeIndex { id: 0 } -> LocalNodeIndex { id: 1 }, tag=Tag(1), 0 records)}: readyset_util::time_scope: operation took 1.972 seconds
2025-08-28T20:51:14.624488Z INFO domain{address=7.0.0}:handle_packets{op=Some(1)}: readyset_util::time_scope: operation took 1.972 seconds
2025-08-28T20:51:16.508190Z INFO domain{address=7.0.0}:domain_handle_packet{op=RequestReaderReplay(node=L 0, keys=[Equal([UnsignedInt(**************)]), ...])}: readyset_util::time_scope: operation took 1.883 seconds
2025-08-28T20:51:16.508225Z INFO domain{address=7.0.0}:handle_one_packet{op=RequestReaderReplay(node=L 0, keys=[Equal([UnsignedInt(**************)])])}: readyset_util::time_scope: operation took 1.884 seconds
2025-08-28T20:51:16.508234Z INFO domain{address=7.0.0}:handle_packets{op=Some(47)}: readyset_util::time_scope: operation took 1.884 seconds
2025-08-28T20:51:16.512686Z ERROR connection{addr=127.0.0.1:38072}: readyset::mysql: connection lost err=Broken pipe (os error 32)
2025-08-28T20:51:16.513135Z ERROR connection{addr=127.0.0.1:38506}: readyset::mysql: connection lost err=Broken pipe (os error 32)
2025-08-28T20:51:16.513457Z ERROR connection{addr=127.0.0.1:38512}: readyset::mysql: connection lost err=Broken pipe (os error 32)
2025-08-28T20:51:16.513778Z ERROR connection{addr=127.0.0.1:38390}: readyset::mysql: connection lost err=Broken pipe (os error 32)
2025-08-28T20:51:16.513957Z ERROR connection{addr=127.0.0.1:38156}: readyset::mysql: connection lost err=Broken pipe (os error 32)
2025-08-28T20:51:16.514150Z ERROR connection{addr=127.0.0.1:38998}: readyset::mysql: connection lost err=Broken pipe (os error 32)
2025-08-28T20:51:16.514509Z ERROR connection{addr=127.0.0.1:38284}: readyset::mysql: connection lost err=Broken pipe (os error 32)
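To quantify how quickly the connections drop, the log excerpt above can be summarized per second. The following is a minimal sketch, assuming the Readyset log keeps exactly the line format shown above; the function name `broken_pipes_per_second` and the regular expression are illustrative and not part of Readyset.

```python
import re
from collections import Counter

# Assumed line format, copied from the excerpt above:
# 2025-08-28T20:51:16.512686Z ERROR connection{addr=127.0.0.1:38072}: readyset::mysql: connection lost err=Broken pipe (os error 32)
BROKEN_PIPE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+Z\s+ERROR\s+"
    r"connection\{addr=(?P<addr>[^}]+)\}: readyset::mysql: "
    r"connection lost err=Broken pipe"
)


def broken_pipes_per_second(lines):
    """Count broken-pipe errors, keyed by timestamp truncated to the second."""
    counts = Counter()
    for line in lines:
        match = BROKEN_PIPE.match(line.strip())
        if match:
            counts[match.group("ts")] += 1
    return counts
```

Passing an open log file (or any iterable of lines) to `broken_pipes_per_second` returns a `Counter`; lines that are not broken-pipe errors, such as the `time_scope` INFO lines, are ignored.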
⸻
ProxySQL behavior
ProxySQL marks the Readyset instance as SHUNNED, reporting:
2025-08-28 22:02:23 MySQL_Session.cpp:2823:handler_again___status_CONNECTING_SERVER(): [ERROR] Max connect timeout reached while reaching hostgroup 12 after 10000ms . HG status: [{"Bytes_recv":"1083990373","Bytes_sent":"19018077573","ConnERR":"0","ConnFree":"0","ConnOK":"1020","ConnUsed":"0","Latency_us":"417","MaxConnUsed":"88","Queries":"281797867","Queries_GTID_sync":"0","hostgroup":"12","srv_host":"127.0.0.1","srv_port":"3309","status":"SHUNNED"}]
2025-08-28 22:02:24 MyHGC.cpp:186:get_random_MySrvC(): [ERROR] Hostgroup 12 has no servers available! Checking servers shunned for more than 1 second
2025-08-28 22:02:26 MyHGC.cpp:186:get_random_MySrvC(): [ERROR] Hostgroup 12 has no servers available! Checking servers shunned for more than 1 second
2025-08-28 22:02:26 MySQL_Session.cpp:2823:handler_again___status_CONNECTING_SERVER(): [ERROR] Max connect timeout reached while reaching hostgroup 12 after 10510ms . HG status: [{"Bytes_recv":"1083990373","Bytes_sent":"19018077573","ConnERR":"0","ConnFree":"0","ConnOK":"1020","ConnUsed":"0","Latency_us":"375","MaxConnUsed":"88","Queries":"281797867","Queries_GTID_sync":"0","hostgroup":"12","srv_host":"127.0.0.1","srv_port":"3309","status":"SHUNNED"}]
2025-08-28 22:02:28 MyHGC.cpp:186:get_random_MySrvC(): [ERROR] Hostgroup 12 has no servers available! Checking servers shunned for more than 1 second
2025-08-28 22:02:30 MyHGC.cpp:186:get_random_MySrvC(): [ERROR] Hostgroup 12 has no servers available! Checking servers shunned for more than 1 second
2025-08-28 22:02:30 MySQL_Session.cpp:2823:handler_again___status_CONNECTING_SERVER(): [ERROR] Max connect timeout reached while reaching hostgroup 12 after 10000ms . HG status: [{"Bytes_recv":"1083990373","Bytes_sent":"19018077573","ConnERR":"0","ConnFree":"0","ConnOK":"1020","ConnUsed":"0","Latency_us":"315","MaxConnUsed":"88","Queries":"281797867","Queries_GTID_sync":"0","hostgroup":"12","srv_host":"127.0.0.1","srv_port":"3309","status":"SHUNNED"}]
2025-08-28 22:02:33 MyHGC.cpp:186:get_random_MySrvC(): [ERROR] Hostgroup 12 has no servers available! Checking servers shunned for more than 1 second
2025-08-28 22:02:34 MySQL_Session.cpp:2823:handler_again___status_CONNECTING_SERVER(): [ERROR] Max connect timeout reached while reaching hostgroup 12 after 10000ms . HG status: [{"Bytes_recv":"1083990373","Bytes_sent":"19018077573","ConnERR":"0","ConnFree":"0","ConnOK":"1020","ConnUsed":"0","Latency_us":"232","MaxConnUsed":"88","Queries":"281797867","Queries_GTID_sync":"0","hostgroup":"12","srv_host":"127.0.0.1","srv_port":"3309","status":"SHUNNED"}
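The `HG status` payload in these ProxySQL lines is JSON, so the backend state can be checked programmatically. The sketch below assumes the log format shown above and that the payload holds a single server object, as it does in every line of this excerpt; `parse_hg_status` and `is_shunned` are hypothetical helper names, not ProxySQL functions. It slices from the first `{` to the last `}` so that a line pasted without its closing `]` (like the last one above) still parses.

```python
import json

MARKER = "HG status: "


def parse_hg_status(line):
    """Return the hostgroup status dict from a ProxySQL log line, or None.

    Assumes the payload contains exactly one JSON object, as in the
    excerpt above. Lines without an "HG status:" payload yield None.
    """
    idx = line.find(MARKER)
    if idx == -1:
        return None
    payload = line[idx + len(MARKER):]
    start, end = payload.find("{"), payload.rfind("}")
    if start == -1 or end < start:
        return None
    return json.loads(payload[start:end + 1])


def is_shunned(line):
    """True if the line reports a backend whose status is SHUNNED."""
    status = parse_hg_status(line)
    return status is not None and status.get("status") == "SHUNNED"
```

For the lines above, `parse_hg_status` yields a dict whose `srv_port` is `"3309"` and whose `status` is `"SHUNNED"`; the `get_random_MySrvC()` lines carry no payload and return `None`.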
⸻
To Reproduce
1. Set up Readyset connected to MySQL 5.7 with ProxySQL routing traffic.
2. Cache queries using the following tables:
REPLICATION_TABLES="schema_x.table_target, schema_x.table_stage, schema_x.calls_live, schema_x.calls_ivr, schema_x.calls_stat, schema_x.calls_stat_aux, schema_x.calls_stat_route"
3. Execute (directly in upstream MySQL):
TRUNCATE TABLE schema_x.table_target;
INSERT IGNORE INTO schema_x.table_target (number)
SELECT xxxxe_number AS number
FROM schema_x.table_source;
4. Observe Readyset logs and ProxySQL behavior.
⸻
Expected behavior
Readyset should be resilient to bulk changes made upstream (e.g., a TRUNCATE followed by a mass INSERT ... SELECT) and should not crash or hang. Ideally, it should invalidate the affected caches or reconcile its state gracefully.
⸻
Environment
• Readyset version:
release-version: dev
commit_id: 634ff3d0d5edd2a60e4ace9637f80fa7b0e3c2e5
platform: x86_64-unknown-linux-gnu
rustc_version: rustc 1.88.0 (2025-06-23)
profile: release
opt_level: 3
• ProxySQL version: 3.0.1-420-g2c26a42
• MySQL version: mysqld Ver 5.7.44-48 for debian-linux-gnu (Percona Server)
• Deployment: Bare-metal binary install
• Client app: ProxySQL + MySQL CLI
• Replication config:
ALLOWED_USERS="user1:<REDACTED>,user2:<REDACTED>"
⸻
Impact
• All Readyset connections fail with broken pipe
• ProxySQL shuns the instance
• Service does not recover without a manual Readyset data wipe and resync
• High risk in production environments that run upstream TRUNCATE or re-ingestion jobs
⸻