Readyset crashes with broken pipe after upstream TRUNCATE + mass INSERT (requires full resync to recover) #1548

@vgrippa

Description

Disclaimer: Table names, column names, usernames, and values in this report have been intentionally obfuscated to protect sensitive or personally identifiable information (PII). The logic, structure, logs, and symptoms remain intact and technically accurate for troubleshooting purposes.

Describe the problem
Readyset starts logging broken pipe errors after the following statements are run directly on the upstream MySQL database (not through Readyset):

TRUNCATE TABLE schema_x.table_target;

INSERT IGNORE INTO schema_x.table_target (number)
SELECT xxx_number AS number 
FROM schema_x.table_source;

Execution results:

Query OK, 0 rows affected (28.48 sec)

Query OK, 20300114 rows affected (53.70 sec)
Records: 20300114  Duplicates: 0  Warnings: 0
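For scale, the figures above work out to roughly 378,000 rows per second on the upstream. A minimal sketch of that arithmetic follows; the helper name is only illustrative, and whether this burst is what overwhelms Readyset is an assumption, not something the logs confirm.

```python
def rows_per_second(rows: int, seconds: float) -> float:
    """Average insert rate; raises ValueError on a non-positive duration."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return rows / seconds


# Figures copied from the INSERT result above.
rate = rows_per_second(20_300_114, 53.70)
print(f"{rate:,.0f} rows/s")
```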

Immediately after running this, Readyset logs the following and becomes unresponsive:

2025-08-28T20:51:14.624432Z  INFO domain{address=7.0.0}:domain_handle_packet{op=ReplayPiece(link=LocalNodeIndex { id: 0 } -> LocalNodeIndex { id: 1 }, tag=Tag(1), 0 records)}: readyset_util::time_scope: operation took 1.972 seconds
2025-08-28T20:51:14.624482Z  INFO domain{address=7.0.0}:handle_one_packet{op=ReplayPiece(link=LocalNodeIndex { id: 0 } -> LocalNodeIndex { id: 1 }, tag=Tag(1), 0 records)}: readyset_util::time_scope: operation took 1.972 seconds
2025-08-28T20:51:14.624488Z  INFO domain{address=7.0.0}:handle_packets{op=Some(1)}: readyset_util::time_scope: operation took 1.972 seconds
2025-08-28T20:51:16.508190Z  INFO domain{address=7.0.0}:domain_handle_packet{op=RequestReaderReplay(node=L 0, keys=[Equal([UnsignedInt(**************)]), ...])}: readyset_util::time_scope: operation took 1.883 seconds
2025-08-28T20:51:16.508225Z  INFO domain{address=7.0.0}:handle_one_packet{op=RequestReaderReplay(node=L 0, keys=[Equal([UnsignedInt(**************)])])}: readyset_util::time_scope: operation took 1.884 seconds
2025-08-28T20:51:16.508234Z  INFO domain{address=7.0.0}:handle_packets{op=Some(47)}: readyset_util::time_scope: operation took 1.884 seconds
2025-08-28T20:51:16.512686Z ERROR connection{addr=127.0.0.1:38072}: readyset::mysql: connection lost err=Broken pipe (os error 32)
2025-08-28T20:51:16.513135Z ERROR connection{addr=127.0.0.1:38506}: readyset::mysql: connection lost err=Broken pipe (os error 32)
2025-08-28T20:51:16.513457Z ERROR connection{addr=127.0.0.1:38512}: readyset::mysql: connection lost err=Broken pipe (os error 32)
2025-08-28T20:51:16.513778Z ERROR connection{addr=127.0.0.1:38390}: readyset::mysql: connection lost err=Broken pipe (os error 32)
2025-08-28T20:51:16.513957Z ERROR connection{addr=127.0.0.1:38156}: readyset::mysql: connection lost err=Broken pipe (os error 32)
2025-08-28T20:51:16.514150Z ERROR connection{addr=127.0.0.1:38998}: readyset::mysql: connection lost err=Broken pipe (os error 32)
2025-08-28T20:51:16.514509Z ERROR connection{addr=127.0.0.1:38284}: readyset::mysql: connection lost err=Broken pipe (os error 32)
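To make it easier to triage a longer log file, here is a small sketch that counts the broken pipe errors and reports the slowest timed operation. It only assumes the line layout visible in the excerpt above (timestamp, level, then message); the function name and the returned keys are made up for illustration and are not part of Readyset.

```python
import re
from collections import Counter

# Timestamp, log level, then the rest of the line, as in the excerpt above.
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>TRACE|DEBUG|INFO|WARN|ERROR)\s+(?P<rest>.*)$")
# Duration reported by the "operation took N seconds" messages.
SLOW_RE = re.compile(r"operation took (?P<secs>\d+(?:\.\d+)?) seconds")


def summarize(lines):
    """Summarize an iterable of Readyset log lines (hypothetical helper)."""
    levels = Counter()
    broken_pipes = 0
    first_broken_pipe = None
    slowest = 0.0
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip lines that do not follow the expected layout
        levels[m["level"]] += 1
        if m["level"] == "ERROR" and "Broken pipe" in m["rest"]:
            broken_pipes += 1
            if first_broken_pipe is None:
                first_broken_pipe = m["ts"]
        slow = SLOW_RE.search(m["rest"])
        if slow:
            slowest = max(slowest, float(slow["secs"]))
    return {
        "levels": dict(levels),
        "broken_pipes": broken_pipes,
        "first_broken_pipe": first_broken_pipe,
        "slowest_op_seconds": slowest,
    }
```

Calling `summarize(open(path))` on a log file (the path is whatever your deployment uses) gives the timestamp of the first broken pipe, which can then be compared against the time the upstream INSERT finished.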

ProxySQL behavior
ProxySQL marks the Readyset instance as SHUNNED, reporting:

2025-08-28 22:02:23 MySQL_Session.cpp:2823:handler_again___status_CONNECTING_SERVER(): [ERROR] Max connect timeout reached while reaching hostgroup 12 after 10000ms . HG status: [{"Bytes_recv":"1083990373","Bytes_sent":"19018077573","ConnERR":"0","ConnFree":"0","ConnOK":"1020","ConnUsed":"0","Latency_us":"417","MaxConnUsed":"88","Queries":"281797867","Queries_GTID_sync":"0","hostgroup":"12","srv_host":"127.0.0.1","srv_port":"3309","status":"SHUNNED"}]
2025-08-28 22:02:24 MyHGC.cpp:186:get_random_MySrvC(): [ERROR] Hostgroup 12 has no servers available! Checking servers shunned for more than 1 second
2025-08-28 22:02:26 MyHGC.cpp:186:get_random_MySrvC(): [ERROR] Hostgroup 12 has no servers available! Checking servers shunned for more than 1 second
2025-08-28 22:02:26 MySQL_Session.cpp:2823:handler_again___status_CONNECTING_SERVER(): [ERROR] Max connect timeout reached while reaching hostgroup 12 after 10510ms . HG status: [{"Bytes_recv":"1083990373","Bytes_sent":"19018077573","ConnERR":"0","ConnFree":"0","ConnOK":"1020","ConnUsed":"0","Latency_us":"375","MaxConnUsed":"88","Queries":"281797867","Queries_GTID_sync":"0","hostgroup":"12","srv_host":"127.0.0.1","srv_port":"3309","status":"SHUNNED"}]
2025-08-28 22:02:28 MyHGC.cpp:186:get_random_MySrvC(): [ERROR] Hostgroup 12 has no servers available! Checking servers shunned for more than 1 second
2025-08-28 22:02:30 MyHGC.cpp:186:get_random_MySrvC(): [ERROR] Hostgroup 12 has no servers available! Checking servers shunned for more than 1 second
2025-08-28 22:02:30 MySQL_Session.cpp:2823:handler_again___status_CONNECTING_SERVER(): [ERROR] Max connect timeout reached while reaching hostgroup 12 after 10000ms . HG status: [{"Bytes_recv":"1083990373","Bytes_sent":"19018077573","ConnERR":"0","ConnFree":"0","ConnOK":"1020","ConnUsed":"0","Latency_us":"315","MaxConnUsed":"88","Queries":"281797867","Queries_GTID_sync":"0","hostgroup":"12","srv_host":"127.0.0.1","srv_port":"3309","status":"SHUNNED"}]
2025-08-28 22:02:33 MyHGC.cpp:186:get_random_MySrvC(): [ERROR] Hostgroup 12 has no servers available! Checking servers shunned for more than 1 second
2025-08-28 22:02:34 MySQL_Session.cpp:2823:handler_again___status_CONNECTING_SERVER(): [ERROR] Max connect timeout reached while reaching hostgroup 12 after 10000ms . HG status: [{"Bytes_recv":"1083990373","Bytes_sent":"19018077573","ConnERR":"0","ConnFree":"0","ConnOK":"1020","ConnUsed":"0","Latency_us":"232","MaxConnUsed":"88","Queries":"281797867","Queries_GTID_sync":"0","hostgroup":"12","srv_host":"127.0.0.1","srv_port":"3309","status":"SHUNNED"}
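The `HG status` payload in these lines is JSON, so the server state can be pulled out programmatically. The sketch below assumes each server object is flat (no nested braces), as in the lines above, and tolerates a missing closing `]`. The function names are illustrative only and are not part of ProxySQL.

```python
import json
import re

# A flat JSON object: an opening brace, no nested braces, a closing brace.
FLAT_OBJECT_RE = re.compile(r"\{[^{}]*\}")


def parse_hg_status(log_line):
    """Return the server objects that follow 'HG status:' as a list of dicts."""
    marker = "HG status:"
    idx = log_line.find(marker)
    if idx == -1:
        return []
    tail = log_line[idx + len(marker):]
    return [json.loads(obj) for obj in FLAT_OBJECT_RE.findall(tail)]


def shunned_servers(log_line):
    """Return (host, port) pairs for servers reported as SHUNNED."""
    return [
        (srv.get("srv_host"), srv.get("srv_port"))
        for srv in parse_hg_status(log_line)
        if srv.get("status") == "SHUNNED"
    ]
```

Note that every value in the payload is a string, including counters such as `ConnOK` and the port, so convert them before doing any arithmetic.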

To Reproduce
1. Set up Readyset connected to MySQL 5.7 with ProxySQL routing traffic.
2. Cache queries using the following tables:

REPLICATION_TABLES="schema_x.table_target, schema_x.table_stage, schema_x.calls_live, schema_x.calls_ivr, schema_x.calls_stat, schema_x.calls_stat_aux, schema_x.calls_stat_route"

3. Execute directly on the upstream MySQL:
TRUNCATE TABLE schema_x.table_target;

INSERT IGNORE INTO schema_x.table_target (number)
SELECT xxxxe_number AS number 
FROM schema_x.table_source;
4. Observe Readyset logs and ProxySQL behavior.

Expected behavior
Readyset should be resilient to bulk upstream changes such as TRUNCATE (which MySQL classifies as DDL) or a mass INSERT, and should not crash or hang. Ideally, it should invalidate caches or reconcile state gracefully.

Environment

• Readyset version:

release-version: dev
commit_id:       634ff3d0d5edd2a60e4ace9637f80fa7b0e3c2e5
platform:        x86_64-unknown-linux-gnu
rustc_version:   rustc 1.88.0 (2025-06-23)
profile:         release
opt_level:       3

• ProxySQL version: 3.0.1-420-g2c26a42
• MySQL version: mysqld  Ver 5.7.44-48 for debian-linux-gnu (Percona Server)
• Deployment: Bare-metal binary install
• Client app: ProxySQL + MySQL CLI
• Replication config:

ALLOWED_USERS="user1:<REDACTED>,user2:<REDACTED>"

Impact
• All Readyset connections fail with broken pipe
• ProxySQL shuns the instance
• Service becomes unrecoverable without manual Readyset data wipe and resync
• High risk in production environments with upstream TRUNCATE or reingestion jobs

Labels: bug (Something isn't working)
