Adding MVA field in query extremely slowing it down #3381

Open
cappadaan opened this issue May 13, 2025 · 27 comments

@cappadaan

Bug Description:

Adding an MVA field as a SQL filter slows the query down dramatically. Filtering on any other type of field does not have this effect.

Queries to show the effect:

  • Query without MVA field
    SELECT
    COUNT(*) AS doc_count
    FROM
    index
    WHERE
    date >= 1739314800
    AND
    date <= 1747087199
    => 5821426 results - 501ms

  • Query with INT field
    SELECT
    COUNT(*) AS doc_count
    FROM
    index
    WHERE
    date >= 1739314800
    AND
    date <= 1747087199
    AND
    INT_field = 1
    => 5232640 results - 900ms

  • Query with MVA field
    SELECT
    COUNT(*) AS doc_count
    FROM
    index
    WHERE
    date >= 1739314800
    AND
    date <= 1747087199
    AND
    MVA_field = 5
    => 716972 results - 10.1 sec

Manticore Search Version:

9.3.2

Operating System Version:

AlmaLinux

Have you tried the latest development version?

None

Internal Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

  • Implementation completed
  • Tests developed
  • Documentation updated
  • Documentation reviewed
@cappadaan cappadaan added the bug label May 13, 2025
@cappadaan cappadaan changed the title Adding MVA field in query extremly slowing it down Adding MVA field in query extremely slowing it down May 13, 2025
@sanikolaev
Collaborator

@cappadaan what's the schema of the table and what Manticore version are you using?

@sanikolaev
Collaborator

what Manticore version are you using?

I see it's 9.3.2

@sanikolaev
Collaborator

Can you please also provide SHOW META for the slow query?

@cappadaan
Author

This is a realtime index.

Relevant config values:

    max_filter_values       = 100000
    threads                 = 8
    pseudo_sharding         = 0
    max_threads_per_query   = 1

Query
SELECT
COUNT(*) AS doc_count
FROM
index
WHERE
date >= 1739314800
AND
date <= 1747087199
AND
ANY(MVA_field) IN (3,5);

SHOW META
total 1
total_found 1
total_relation eq
time 20.115
index date:SecondaryIndex (100%)

SHOW TABLE index SETTINGS
settings charset_table = non_cjk

SHOW TABLE index INDEXES
date timestamp 1 100
MVA_field uint32set 1 100

These are only the columns relevant to the query. The table itself has over 40 columns.

Let me know if you need any other info.

@cappadaan
Author

The form of the MVA filter does not matter:

AND
ANY(MVA_field) IN (3,5);

or

AND
MVA_field = 5;

Both take forever, sometimes up to 2 minutes.

@sanikolaev
Collaborator

WHERE
date >= 1739314800
AND
date <= 1747087199

  1. How many documents does this alone match?
  2. What's the typical value of MVA_field? How many items does it usually have?
  3. How many documents are there in the table in total?

We can try to reproduce the issue locally if we know these three things.

@cappadaan
Author

  • How many documents does this alone match?
    5.8 million

  • What's the typical value of MVA_field? How many items does it usually have?
    1 or 2.

  • How many documents are there in the table in total?
    54 million

@cappadaan
Author

@sanikolaev Any news on this issue, have you been able to reproduce it?

@sanikolaev
Collaborator

@cappadaan I can't reproduce it with:

manticore-load \
--drop \
--batch-size=10000 \
--threads=5 \
--total=54000000 \
--init="CREATE TABLE test(date timestamp, m multi)" \
--load="INSERT INTO test(id,date,m) VALUES(<increment>,<int/1672531200/1735689599>,(<int/1/10>,<int/1/10>))"

Here's the result:

mysql> select count(*) from test where date >= 1700000000 and date <= 1706500000; select count(*) from test where date >= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5); show meta;
--------------
select count(*) from test where date >= 1700000000 and date <= 1706500000
--------------

+----------+
| count(*) |
+----------+
|  5557992 |
+----------+
1 row in set (0.03 sec)
--- 1 out of 1 results in 33ms ---

--------------
select count(*) from test where date >= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5)
--------------

+----------+
| count(*) |
+----------+
|  2000511 |
+----------+
1 row in set (0.03 sec)
--- 1 out of 1 results in 32ms ---

--------------
show meta
--------------

+----------------+----------------------------+
| Variable_name  | Value                      |
+----------------+----------------------------+
| total          | 1                          |
| total_found    | 1                          |
| total_relation | eq                         |
| time           | 0.032                      |
| index          | date:SecondaryIndex (100%) |
+----------------+----------------------------+
5 rows in set (0.00 sec)
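
As a sanity check on the synthetic result, the count for the MVA filter can be estimated from the load template. This is only a rough sketch: it assumes that each `<int/1/10>` placeholder in manticore-load draws uniformly and independently from 1..10 inclusive, which the thread does not state. The variable names are made up for illustration.

```python
from fractions import Fraction

# Assumption: <int/1/10> is uniform over the 10 integers 1..10, drawn independently.
VALUES_PER_SLOT = 10
WANTED = {3, 5}   # values listed in ANY(m) IN (3,5)
SLOTS = 2         # the template generates two values per document

# A document fails the filter only if both generated values miss the wanted set.
p_miss_one = Fraction(VALUES_PER_SLOT - len(WANTED), VALUES_PER_SLOT)
p_match = 1 - p_miss_one ** SLOTS

date_matches = 5557992   # count(*) for the date range alone, from the output above
observed = 2000511       # count(*) with the MVA filter added, from the output above

expected = float(date_matches * p_match)
relative_gap = abs(observed - expected) / expected

print(f"expected share of matching documents: {float(p_match):.2f}")
print(f"expected about {expected:.0f}, observed {observed}, gap {relative_gap:.3%}")
```

Under those assumptions the observed count is within a small fraction of a percent of the estimate, which suggests the synthetic query returns a plausible result and the timing comparison is meaningful.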

@sanikolaev
Collaborator

@cappadaan can you try the same on your end?

@sanikolaev sanikolaev added the waiting Waiting for the original poster (in most cases) or something else label May 22, 2025
@cappadaan
Author

select count(*) from index where date >= 1700000000 and date <= 1706500000;

4702408 results
401ms


select count(*) from index where date >= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5)

509670 results
14.1sec


show meta

total 1
total_found 1
total_relation eq
time 14.064
index date:SecondaryIndex (100%)

@cappadaan
Author

Can it be hardware related, maybe some limits are hit?

@sanikolaev
Collaborator

@cappadaan what's your version of AlmaLinux?

@cappadaan
Author

9.5

@sanikolaev
Collaborator

AlmaLinux 9.5 and the weakest Hetzner VPS:

Image
mysql> select count(*) from test where date >= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5); show meta;
--------------
select count(*) from test where date >= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5)
--------------

+----------+
| count(*) |
+----------+
|  2000511 |
+----------+
1 row in set (0.57 sec)
--- 1 out of 1 results in 569ms ---

--------------
show meta
--------------

+----------------+----------------------------+
| Variable_name  | Value                      |
+----------------+----------------------------+
| total          | 1                          |
| total_found    | 1                          |
| total_relation | eq                         |
| time           | 0.569                      |
| index          | date:SecondaryIndex (100%) |
+----------------+----------------------------+
5 rows in set (0.00 sec)
show table test status
--------------

+-------------------------------+--------------------------------------------------------------------------------------------------------+
| Variable_name                 | Value                                                                                                  |
+-------------------------------+--------------------------------------------------------------------------------------------------------+
| table_type                    | rt                                                                                                     |
| indexed_documents             | 54000000                                                                                               |
| indexed_bytes                 | 0                                                                                                      |
| ram_bytes                     | 1611869600                                                                                             |
| disk_bytes                    | 3102096370                                                                                             |
| disk_mapped                   | 1622549896                                                                                             |
| disk_mapped_cached            | 1611849728                                                                                             |
| disk_mapped_doclists          | 0                                                                                                      |
| disk_mapped_cached_doclists   | 0                                                                                                      |
| disk_mapped_hitlists          | 0                                                                                                      |
| disk_mapped_cached_hitlists   | 0                                                                                                      |
| killed_documents              | 0                                                                                                      |
| killed_rate                   | 0.00%                                                                                                  |
| ram_chunk                     | 0                                                                                                      |
| ram_chunk_segments_count      | 0                                                                                                      |
| disk_chunks                   | 4                                                                                                      |
| mem_limit                     | 134217728                                                                                              |
| mem_limit_rate                | 95.00%                                                                                                 |

@cappadaan
Author

cappadaan commented May 22, 2025

hardware

VPS, Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz, 20G RAM

limits.conf

manticore soft memlock unlimited
manticore hard memlock unlimited
manticore - nofile 65535
manticore - nproc 65535

manticore config

max_filter_values = 100000
network_timeout = 10
client_timeout = 15m
sphinxql_timeout = 1h
threads = 8
pseudo_sharding = 0
max_threads_per_query = 1 (we tried 4, but it had no effect)

select count(*) from test where date >= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5); show meta;

total 1
total_found 1
total_relation eq
time 17.237
index date:SecondaryIndex (100%)

show table test status

table_type rt
indexed_documents 54045484
indexed_bytes 42867341188
ram_bytes 4655035575
disk_bytes 110084339746
disk_mapped 73083179493
disk_mapped_cached 4653461504
disk_mapped_doclists 0
disk_mapped_cached_doclists 0
disk_mapped_hitlists 0
disk_mapped_cached_hitlists 0
killed_documents 1498679
killed_rate 2.69%
ram_chunk 1326511
ram_chunk_segments_count 25
disk_chunks 57
mem_limit 134217728
mem_limit_rate 95.00%
ram_bytes_retired 0
optimizing 1
locked 0
tid 0
tid_saved 0
query_time_1min {"queries":10141, "avg_sec":0.000, "min_sec":0.000, "max_sec":0.013, "pct95_sec":0.001, "pct99_sec":0.001}
query_time_5min {"queries":29012, "avg_sec":0.001, "min_sec":0.000, "max_sec":0.071, "pct95_sec":0.001, "pct99_sec":0.001}
query_time_15min {"queries":29012, "avg_sec":0.001, "min_sec":0.000, "max_sec":0.071, "pct95_sec":0.001, "pct99_sec":0.001}
query_time_total {"queries":2847736, "avg_sec":0.003, "min_sec":0.000, "max_sec":420.195, "pct95_sec":0.003, "pct99_sec":0.017}
found_rows_1min {"queries":10141, "avg":727, "min":0, "max":4628202, "pct95":1, "pct99":1}
found_rows_5min {"queries":29012, "avg":1862, "min":0, "max":46626572, "pct95":1, "pct99":1}
found_rows_15min {"queries":29012, "avg":1862, "min":0, "max":46626572, "pct95":1, "pct99":1}
found_rows_total {"queries":2847736, "avg":512, "min":0, "max":46626572, "pct95":3, "pct99":4}
command_search 2847736
command_excerpt 0
command_update 0
command_keywords 0
command_status 1
command_delete 0
command_insert 143687
command_replace 1091881
command_commit 0
command_suggest 0
command_callpq 0
command_getfield 0
insert_replace_stats_ms_avg 1.113 1.078 1.078
insert_replace_stats_ms_min 0.716 0.716 0.716
insert_replace_stats_ms_max 13.673 36.425 36.425
insert_replace_stats_ms_pct95 1.425 1.264 1.264
insert_replace_stats_ms_pct99 2.256 1.943 1.943
search_stats_ms_avg 0.000 0.001 0.001
search_stats_ms_min 0.000 0.000 0.000
search_stats_ms_max 0.013 0.071 0.071
search_stats_ms_pct95 0.001 0.001 0.001
search_stats_ms_pct99 0.001 0.001 0.001
update_stats_ms_avg N/A N/A N/A
update_stats_ms_min N/A N/A N/A
update_stats_ms_max N/A N/A N/A
update_stats_ms_pct95 N/A N/A N/A
update_stats_ms_pct99 N/A N/A N/A
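
One figure that may be worth comparing between the two status outputs is how much of the memory-mapped table data is actually cached in RAM. The sketch below only divides `disk_mapped_cached` by `disk_mapped` using numbers copied from this thread. Whether the difference explains the slowdown is not established here, and the helper name is hypothetical.

```python
# Figures copied from the SHOW TABLE ... STATUS outputs posted in this thread.
synthetic_hetzner = {"disk_mapped": 1622549896, "disk_mapped_cached": 1611849728}
reporter_table = {"disk_mapped": 73083179493, "disk_mapped_cached": 4653461504}


def cached_share(status: dict) -> float:
    """Return the fraction of memory-mapped table data that is cached in RAM."""
    return status["disk_mapped_cached"] / status["disk_mapped"]


for name, status in (("synthetic (Hetzner)", synthetic_hetzner),
                     ("reporter's table", reporter_table)):
    print(f"{name}: {cached_share(status):.1%} of mapped data cached")
```

If the share is low, a filter that has to read attribute data for millions of documents may end up waiting on disk reads rather than CPU, but that is a hypothesis to test, not a conclusion from the data above.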

@sanikolaev
Collaborator

@cappadaan

disk_chunks 57

Why do you have so many chunks? Combined with max_threads_per_query = 1, this makes Manticore search through them one by one, which can slow down the query.

Anyway, let's focus on the synthetic case, which also shows a big difference between your server and even the weakest Hetzner VPS. Can you try again with the default Manticore configuration this time?

manticore-load \
--drop \
--batch-size=10000 \
--threads=5 \
--total=54000000 \
--init="CREATE TABLE test(date timestamp, m multi)" \
--load="INSERT INTO test(id,date,m) VALUES(<increment>,<int/1672531200/1735689599>,(<int/1/10>,<int/1/10>))"

select count(*) from test where date >= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5); show meta; show table test status;

@cappadaan
Author

I think the number of chunks is a result of our reindexing method.

We use this method:

  • 1000 rows at once via endpoint json/bulk

Maybe if we set rt_mem_limit to e.g. '5G' and reindex again, there will be far fewer chunks?

Is there a preferred range for the number of chunks? I cannot find anything about it in the documentation.

@sanikolaev
Collaborator

The number of disk chunks isn't related to the batch size. There could be two reasons why there are too many chunks:

  • auto_optimize is turned off and manual OPTIMIZE runs are too rare
  • merging is slower than writing, so it never fully catches up

Please share your searchd log so we can take a closer look.

@cappadaan
Author

auto_optimize is on.

I uploaded the log to your S3 storage under this ticket issue id.
It's the table with "click" in the name.

There is also a crash dump in the log, maybe related.

@sanikolaev
Collaborator

I see a lot of "... optimized progressive chunk(s) 57 (left 56) ..." in the log for all tables. It looks like you have 28 cores, which sets the default optimize_cutoff to 56. That might be slowing down the query, but I'll wait for you to do this:

Anyway, let's focus on the synthetic case, which also shows a big difference between your server and even the weakest Hetzner VPS. Can you try again with the default Manticore configuration this time?
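
The figure of 56 above follows from the default for optimize_cutoff being twice the number of logical CPU cores. A minimal sketch of that arithmetic, assuming that rule as described in the Manticore documentation (the function name is made up for illustration):

```python
def default_optimize_cutoff(logical_cores: int) -> int:
    """Default chunk count that OPTIMIZE merges down to.

    Assumption: the default is logical CPU cores * 2, as stated in the comment above.
    """
    if logical_cores < 1:
        raise ValueError("logical_cores must be positive")
    return logical_cores * 2


# 28 cores is the value inferred from the searchd log in this thread.
print(default_optimize_cutoff(28))
```

With 28 cores this gives 56, which matches the "(left 56)" messages in the log. Setting optimize_cutoff explicitly on the table overrides the default.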

@cappadaan
Author

cappadaan commented May 26, 2025

I ran the load command and the query.
This is the result:

select count(*) from test where date >= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5);

+----------+
| count(*) |
+----------+
|  2000511 |
+----------+
+----------------+----------------------------+
| Variable_name  | Value                      |
+----------------+----------------------------+
| total          | 1                          |
| total_found    | 1                          |
| total_relation | eq                         |
| time           | 3.960                      |
| index          | date:SecondaryIndex (100%) |
+----------------+----------------------------+
+-------------------------------+--------------------------------------------------------------------------------------------------------+
| Variable_name                 | Value                                                                                                  |
+-------------------------------+--------------------------------------------------------------------------------------------------------+
| table_type                    | rt                                                                                                     |
| indexed_documents             | 54000000                                                                                               |
| indexed_bytes                 | 0                                                                                                      |
| ram_bytes                     | 1611488743                                                                                             |
| disk_bytes                    | 3074920076                                                                                             |
| disk_mapped                   | 1601237197                                                                                             |
| disk_mapped_cached            | 1586024448                                                                                             |
| disk_mapped_doclists          | 0                                                                                                      |
| disk_mapped_cached_doclists   | 0                                                                                                      |
| disk_mapped_hitlists          | 0                                                                                                      |
| disk_mapped_cached_hitlists   | 0                                                                                                      |
| killed_documents              | 0                                                                                                      |
| killed_rate                   | 0.00%                                                                                                  |
| ram_chunk                     | 25345615                                                                                               |
| ram_chunk_segments_count      | 23                                                                                                     |
| disk_chunks                   | 27                                                                                                     |
| mem_limit                     | 134217728                                                                                              |
| mem_limit_rate                | 71.83%                                                                                                 |
| ram_bytes_retired             | 0                                                                                                      |
| optimizing                    | 0                                                                                                      |
| locked                        | 0                                                                                                      |
| tid                           | 0                                                                                                      |
| tid_saved                     | 0                                                                                                      |
| query_time_1min               | {"queries":1, "avg_sec":0.002, "min_sec":0.002, "max_sec":0.002, "pct95_sec":0.002, "pct99_sec":0.002} |
| query_time_5min               | {"queries":1, "avg_sec":0.002, "min_sec":0.002, "max_sec":0.002, "pct95_sec":0.002, "pct99_sec":0.002} |
| query_time_15min              | {"queries":1, "avg_sec":0.002, "min_sec":0.002, "max_sec":0.002, "pct95_sec":0.002, "pct99_sec":0.002} |
| query_time_total              | {"queries":1, "avg_sec":0.002, "min_sec":0.002, "max_sec":0.002, "pct95_sec":0.002, "pct99_sec":0.002} |
| found_rows_1min               | {"queries":1, "avg":1, "min":1, "max":1, "pct95":1, "pct99":1}                                         |
| found_rows_5min               | {"queries":1, "avg":1, "min":1, "max":1, "pct95":1, "pct99":1}                                         |
| found_rows_15min              | {"queries":1, "avg":1, "min":1, "max":1, "pct95":1, "pct99":1}                                         |
| found_rows_total              | {"queries":1, "avg":1, "min":1, "max":1, "pct95":1, "pct99":1}                                         |
| command_search                | 1                                                                                                      |
| command_excerpt               | 0                                                                                                      |
| command_update                | 0                                                                                                      |
| command_keywords              | 0                                                                                                      |
| command_status                | 155                                                                                                    |
| command_delete                | 0                                                                                                      |
| command_insert                | 5400                                                                                                   |
| command_replace               | 0                                                                                                      |
| command_commit                | 0                                                                                                      |
| command_suggest               | 0                                                                                                      |
| command_callpq                | 0                                                                                                      |
| command_getfield              | 0                                                                                                      |
| insert_replace_stats_ms_avg   | N/A 127.388 127.388                                                                                    |
| insert_replace_stats_ms_min   | N/A 41.348 41.348                                                                                      |
| insert_replace_stats_ms_max   | N/A 299.551 299.551                                                                                    |
| insert_replace_stats_ms_pct95 | N/A 172.044 172.044                                                                                    |
| insert_replace_stats_ms_pct99 | N/A 191.075 191.075                                                                                    |
| search_stats_ms_avg           | 0.002 0.002 0.002                                                                                      |
| search_stats_ms_min           | 0.002 0.002 0.002                                                                                      |
| search_stats_ms_max           | 0.002 0.002 0.002                                                                                      |
| search_stats_ms_pct95         | 0.002 0.002 0.002                                                                                      |
| search_stats_ms_pct99         | 0.002 0.002 0.002                                                                                      |
| update_stats_ms_avg           | N/A N/A N/A                                                                                            |
| update_stats_ms_min           | N/A N/A N/A                                                                                            |
| update_stats_ms_max           | N/A N/A N/A                                                                                            |
| update_stats_ms_pct95         | N/A N/A N/A                                                                                            |
| update_stats_ms_pct99         | N/A N/A N/A                                                                                            |
+-------------------------------+--------------------------------------------------------------------------------------------------------+

I also tried this on my own index

ALTER TABLE index rt_mem_limit='512M';
ALTER TABLE index optimize_cutoff='5';
OPTIMIZE TABLE index;

select count(*) from index where date >= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5);

total	1
total_found	1
total_relation	eq
time	6.306 
index	date:SecondaryIndex (100%)

table_type	rt
indexed_documents	54271124
indexed_bytes	43546950606
ram_bytes	6443233959
disk_bytes	106620497847
disk_mapped	71046127939
disk_mapped_cached	6438064128
disk_mapped_doclists	0
disk_mapped_cached_doclists	0
disk_mapped_hitlists	0
disk_mapped_cached_hitlists	0
killed_documents	17849
killed_rate	0.03%
ram_chunk	5145663
ram_chunk_segments_count	27
disk_chunks	5
mem_limit	536870912
mem_limit_rate	95.00%
ram_bytes_retired	0
optimizing	0
locked	0
tid	0
tid_saved	0
query_time_1min	{"queries":1, "avg_sec":0.002, "min_sec":0.002, "max_sec":0.002, "pct95_sec":0.002, "pct99_sec":0.002}
query_time_5min	{"queries":2130, "avg_sec":0.002, "min_sec":0.001, "max_sec":0.014, "pct95_sec":0.002, "pct99_sec":0.003}
query_time_15min	{"queries":10253, "avg_sec":0.000, "min_sec":0.000, "max_sec":0.330, "pct95_sec":0.002, "pct99_sec":0.003}
query_time_total	{"queries":40350, "avg_sec":0.003, "min_sec":0.000, "max_sec":32.241, "pct95_sec":0.003, "pct99_sec":0.007}
found_rows_1min	{"queries":1, "avg":1, "min":1, "max":1, "pct95":1, "pct99":1}
found_rows_5min	{"queries":2130, "avg":2, "min":0, "max":31, "pct95":3, "pct99":4}
found_rows_15min	{"queries":10253, "avg":453, "min":0, "max":4637531, "pct95":3, "pct99":4}
found_rows_total	{"queries":40350, "avg":1364, "min":0, "max":46819237, "pct95":3, "pct99":4}
command_search	40350
command_excerpt	0
command_update	0
command_keywords	0
command_status	5
command_delete	0
command_insert	6339
command_replace	10987
command_commit	0
command_suggest	0
command_callpq	0
command_getfield	0
insert_replace_stats_ms_avg	N/A N/A 124.298
insert_replace_stats_ms_min	N/A N/A 0.466
insert_replace_stats_ms_max	N/A N/A 200697.418
insert_replace_stats_ms_pct95	N/A N/A 1.017
insert_replace_stats_ms_pct99	N/A N/A 1.182
search_stats_ms_avg	0.002 0.002 0.000
search_stats_ms_min	0.002 0.001 0.000
search_stats_ms_max	0.002 0.014 0.330
search_stats_ms_pct95	0.002 0.002 0.002
search_stats_ms_pct99	0.002 0.003 0.003
update_stats_ms_avg	N/A N/A N/A
update_stats_ms_min	N/A N/A N/A
update_stats_ms_max	N/A N/A N/A
update_stats_ms_pct95	N/A N/A N/A
update_stats_ms_pct99	N/A N/A N/A

@sanikolaev
Collaborator

select count(*) from test where date >= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5);

So with the same Manticore version and config in your case, it took 3960 ms, while on the weakest VPS it only took 569 ms with the same OS, and just 32 ms on a regular modern MacBook Pro.
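
To put those three timings side by side, here is a small sketch that computes how many times slower the reporter's server is than each of the other machines. The numbers are the ones quoted above and the labels are only for illustration.

```python
# Timings for the same synthetic query, in milliseconds, as reported in this thread.
timings_ms = {
    "reporter's VPS": 3960,
    "weakest Hetzner VPS": 569,
    "MacBook Pro": 32,
}

slow = timings_ms["reporter's VPS"]
slowdown = {
    name: slow / ms
    for name, ms in timings_ms.items()
    if name != "reporter's VPS"
}

for name, factor in slowdown.items():
    print(f"reporter's VPS is {factor:.1f}x slower than the {name}")
```

That is roughly a 7x gap against the cheap VPS and more than 100x against the laptop, on identical data and configuration.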

It looks like there might be something wrong with your server, @cappadaan. You might want to check the CPU performance using some benchmarking tools. I’ve had an issue in the past where the CPU was performing poorly because it was overheating and throttling. That might be what's happening in your case too. Check the CPU temperature.

BTW when Manticore is busy doing this:

select count(*) from test where date >= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5);

+----------+
| count(*) |
+----------+
|  2000511 |
+----------+
+----------------+----------------------------+
| Variable_name  | Value                      |
+----------------+----------------------------+
| total          | 1                          |
| total_found    | 1                          |
| total_relation | eq                         |
| time           | 3.960                      |

what's the CPU load according to dstat/vmstat?

@cappadaan
Author

cappadaan commented May 27, 2025

You may be right; we have been struggling with Manticore on this VPS for a while now.
The same test on a different VPS of ours was indeed very fast.

While running the query:

[root@manticore maintenance]# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 4  7      0 207584   6044 9388936    0    0  1774   537   18    6 14  1 84  1  0

iostat
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          14.09    0.00    1.08    1.04    0.01   83.79

It looks like there is not much happening at all.
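
For reading the vmstat row above, a minimal sketch that maps the values onto the column names and sums the CPU percentages. One caveat: when vmstat is run without an interval, its single report gives averages since the last reboot, so these numbers may not reflect the moment the query was running. Running `vmstat 1` during the query gives per-second samples instead. The helper name is hypothetical.

```python
# Column order as printed in the vmstat header above.
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()


def parse_vmstat_row(row: str) -> dict:
    """Map one whitespace-separated vmstat data row onto its column names."""
    values = [int(v) for v in row.split()]
    if len(values) != len(VMSTAT_COLUMNS):
        raise ValueError(f"expected {len(VMSTAT_COLUMNS)} fields, got {len(values)}")
    return dict(zip(VMSTAT_COLUMNS, values))


# The data row posted in this thread.
sample = parse_vmstat_row("4 7 0 207584 6044 9388936 0 0 1774 537 18 6 14 1 84 1 0")
busy_pct = sample["us"] + sample["sy"]

print(f"CPU busy (user + system): {busy_pct}%")
print(f"CPU idle: {sample['id']}%, waiting on I/O: {sample['wa']}%")
```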

But this query caused a fatal crash now:

[Tue May 27 07:21:56.106 2025] [8251] [BUDDY] [X] <Thrown: QueryProcessor:256> <Logged: EventHandler:73> <[68356814140f33.58265631] processing error> Failed to handle query: select count(*) from index where date>= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5)
------- FATAL: CRASH DUMP -------
[Tue May 27 07:31:36.497 2025] [ 8244]
..

@tomatolog
Contributor

But this query caused a fatal crash now:

If the crash has the same stack trace, leading into the join sorter, it would be better to create a separate issue and upload the data (config, query, indexes) that reproduces the crash so we can investigate it locally.

@sanikolaev
Collaborator

It looks like there is not much happening at all.

We can see that the CPU is 84% idle. Given that the response time is 4 seconds, there are many chunks, and pseudo_sharding is not disabled, Manticore should be parallelizing as much as possible. But for some reason, the CPU isn't being fully used. For example, when I run the same query on my Mac

rm -f /tmp/sql; for n in `seq 1 1000`; do echo "select count(*) from test where date >= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5);" >> /tmp/sql; done; mysql -P9306 -h0 < /tmp/sql > /dev/null

I get this:

Image

i.e. the CPU is utilized very well which is expected.

and upload data (config, query, indexes) that reproduces that crash here locally.

Or even better, if you can reproduce the crash using the synthetic case or a modified version of it, please let us know how to change it so that it causes the crash.

@cappadaan
Author

cappadaan commented Jun 1, 2025

I did some testing on the VPS.
There were multiple searchd and indexer processes running, so I killed them all.

Test 1:
I ran 1 searchd with the test database with

  • unlimited threads
  • most basic rt config there is

rm -f /tmp/sql; for n in `seq 1 1000`; do echo "select count(*) from test where date >= 1700000000 and date <= 1706500000 and ANY(m) IN (3,5);" >> /tmp/sql; done; mysql -P9306 -h0 < /tmp/sql > /dev/null

which shows

Image

looks fine to me.

Test 2:
Then I started only my problem searchd:

  • unlimited threads
  • most basic rt config there is

select count(*) from click where date >= 1700000001 and date <= 1706500001 and ANY(m) IN (3,5);
show meta;
show table click status;

which shows

Image

Took: 14.1 seconds

table_type rt
indexed_documents 54562984
indexed_bytes 44157936702
ram_bytes 5904271523
disk_bytes 107250687127
disk_mapped 71461036070
disk_mapped_cached 5901819904
disk_mapped_doclists 0
disk_mapped_cached_doclists 0
disk_mapped_hitlists 0
disk_mapped_cached_hitlists 0
killed_documents 30839
killed_rate 0.05%
ram_chunk 2427451
ram_chunk_segments_count 26
disk_chunks 5
mem_limit 536870912
mem_limit_rate 95.00%
ram_bytes_retired 0
optimizing 0
locked 0
tid 0
tid_saved 0

We tested the VPS for CPU steal and other performance problems but all seems fine.

I am out of ideas.

4 participants