Description
Is your feature request related to a problem or challenge?
- Related to Adds memory-bound DefaultListFilesCache #18855
- part of [EPIC] ListingTable object store usage improvements #17214
As we roll out the ListingFileCache from @BlakeOrth in #18855, it would be very helpful to be able to see its contents to debug any potential issues we see.
@nuno-faria made a really nice feature to view the contents of the metadata cache: metadata_cache()
For example:
> select * from metadata_cache();
+---------------------------------------------------+---------------------+-----------------+--------------------------------------+---------+---------------------+------+------------------+
| path | file_modified | file_size_bytes | e_tag | version | metadata_size_bytes | hits | extra |
+---------------------------------------------------+---------------------+-----------------+--------------------------------------+---------+---------------------+------+------------------+
| hits_compatible/athena_partitioned/hits_1.parquet | 2022-07-03T15:33:57 | 174965044 | "1f5da68e097309811a675c849491ac48-9" | NULL | 165128 | 0 | page_index=false |
+---------------------------------------------------+---------------------+-----------------+--------------------------------------+---------+---------------------+------+------------------+
1 row(s) fetched.
Elapsed 0.005 seconds.
Describe the solution you'd like
I would like a table function similar to metadata_cache() for the listing files cache. Since each entry is a Vec<ObjectMeta>, one option would be to flatten the entries so there is one row per ObjectMeta stored.
Something like
select * from list_files_cache();
| path | file_modified | file_size_bytes | e_tag | version | metadata_size_bytes | expires |
|---|---|---|---|---|---|---|
| /foo/bar | 2022-07-03T15:33:57 | 1234 | ... | ... | 132 | NULL |
| /foo/baz | 2022-07-03T15:33:57 | 5678 | ... | ... | 3112 | 2026-07-03T15:33:57 |
| ... | ... | ... | ... | ... | ... | ... |
Where metadata_size_bytes shows the size of the statistics in bytes, and expires shows when the entry expires.
This would mean that a single ListFilesEntry object is displayed as multiple rows.
It would also mean we would have to find some way to represent a ListFilesEntry that had no entries (e.g. metas is an empty Vec). Perhaps it could have a row entirely of nulls:
| path | file_modified | file_size_bytes | e_tag | version | metadata_size_bytes | expires |
|---|---|---|---|---|---|---|
| NULL | NULL | NULL | NULL | NULL | NULL | 2026-07-03T15:33:57 |
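To make the flattening concrete, here is a minimal sketch of how entries could be turned into rows, including the all-NULL row for an empty entry. Everything in it is an assumption for illustration: `FileMeta` is a simplified stand-in for `ObjectMeta` (timestamps are plain strings), the `ListFilesEntry` fields and `flatten_entries` are hypothetical names rather than the actual cache API, and `metadata_size_bytes` is left out because how it would be computed is not settled here.

```rust
/// Simplified, hypothetical stand-in for `ObjectMeta`; timestamps are kept as strings.
#[derive(Debug, Clone, PartialEq)]
pub struct FileMeta {
    pub path: String,
    pub last_modified: String,
    pub size: u64,
    pub e_tag: Option<String>,
    pub version: Option<String>,
}

/// Hypothetical cache entry: the files found by one listing plus an optional expiry.
#[derive(Debug, Clone, PartialEq)]
pub struct ListFilesEntry {
    pub metas: Vec<FileMeta>,
    pub expires: Option<String>,
}

/// One output row of the proposed table function; `None` would render as NULL.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub path: Option<String>,
    pub file_modified: Option<String>,
    pub file_size_bytes: Option<u64>,
    pub e_tag: Option<String>,
    pub version: Option<String>,
    pub expires: Option<String>,
}

/// Flattens cache entries into rows: one row per file, or a single row
/// with only `expires` set when an entry holds no files.
pub fn flatten_entries(entries: &[ListFilesEntry]) -> Vec<Row> {
    let mut rows = Vec::new();
    for entry in entries {
        if entry.metas.is_empty() {
            // Empty listing: keep the entry visible as a row of NULLs plus its expiry.
            rows.push(Row {
                path: None,
                file_modified: None,
                file_size_bytes: None,
                e_tag: None,
                version: None,
                expires: entry.expires.clone(),
            });
            continue;
        }
        for meta in &entry.metas {
            // Every file in the entry repeats the entry-level expiry.
            rows.push(Row {
                path: Some(meta.path.clone()),
                file_modified: Some(meta.last_modified.clone()),
                file_size_bytes: Some(meta.size),
                e_tag: meta.e_tag.clone(),
                version: meta.version.clone(),
                expires: entry.expires.clone(),
            });
        }
    }
    rows
}
```

One consequence of this shape is that rows from the same entry are only linked by sharing the same expires value; if grouping by entry matters for debugging, an extra column identifying the listed prefix might be worth considering.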
Describe alternatives you've considered
No response
Additional context
No response