-
Environment:

dd if=/zpool/f64/nondedup of=/dev/null bs=64k count=100000 status=progress

The read speed on a deduped filesystem is about 1/5 of that on a non-deduped filesystem. I'm curious what causes such a large difference when the DDT is definitely in the ARC. I could not find any clue so far.

My analysis: I traced the ZFS read logic from the SPL down through the DMU layer, but still could not find any difference. Finally, I traced the ZIO pipeline. The only difference is in zio_ddt_read_start and zio_ddt_read_done: zio_ddt_read_start creates an extra zio. The rest of the zio logic is almost the same as a normal filesystem read.

I then suspected context switches caused by taskq, but after tracing I can say that both the dedup and non-dedup reads have the same taskq trigger, i.e. taskq is only triggered by physical IO completion, and all parent zio(s) run in the stack context in my test case.

The rough dedup read zio pipeline looks like:

The non-dedup read zio pipeline looks like:

Please let me know if there is anything I can dig into further.
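Since the two pipeline listings above did not survive, here is a toy model (not real ZFS code, just an illustration of the structure described in this post) of the one difference found: the dedup path runs the physical read through an extra child zio created in zio_ddt_read_start, while the normal path issues the physical read directly.

```python
# Toy model of the structural difference described above (NOT ZFS code):
# the dedup read path inserts one extra child zio between the logical
# read and the physical read.

def physical_read(bp):
    # stands in for the actual disk/ARC read of block pointer `bp`
    return f"data@{bp}"

def normal_read(bp):
    # non-dedup: the logical zio proceeds straight to the physical read
    return physical_read(bp)

def ddt_read(bp):
    # dedup: zio_ddt_read_start creates an extra zio that goes through
    # the DDT entry before the real read; the parent waits on it
    ddt_entry = {"bp": bp}                           # DDT lookup (in ARC)
    child = lambda: physical_read(ddt_entry["bp"])   # the extra child zio
    return child()                                   # completed in _done

print(normal_read(100))  # data@100
print(ddt_read(100))     # same result, one extra zio in the chain
```

Both paths return the same data; the question in this thread is why that one extra hop costs so much.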
Replies: 2 comments 2 replies
-
The read is synchronous: everything must wait for the DDT sync_read to finish before the async_reads can proceed. Does this align with your observations?
-
If you see no difference on disk (I would not expect any unless you hit an error that triggers recovery), have you looked at CPU usage? Maybe collect CPU profiles and compare them to one another and to your expectations? An alternative idea: something could happen to space allocation in the dedup case that makes reads non-sequential when dedup is enabled. With primarycache=metadata you get no prefetch on read, so you depend maximally on disk latency. You could compare disk read latency and/or read offsets.
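To compare read offsets as suggested, a small helper like the following can score how sequential an I/O stream is. It assumes you can export (offset, length) pairs in issue order from whatever tracer you use (e.g. blktrace output); the sample data below is made up for illustration.

```python
def sequential_fraction(reads):
    """Fraction of reads that start exactly where the previous one ended.
    `reads` is a list of (offset, length) pairs in issue order."""
    seq = 0
    for (prev_off, prev_len), (off, _) in zip(reads, reads[1:]):
        if off == prev_off + prev_len:
            seq += 1
    return seq / (len(reads) - 1)

# Hypothetical samples: 64k reads, one perfectly sequential stream and
# one with a scattered block in the middle.
sequential = [(0, 65536), (65536, 65536), (131072, 65536)]
scattered  = [(0, 65536), (9437184, 65536), (131072, 65536)]
print(sequential_fraction(sequential))  # 1.0
print(sequential_fraction(scattered))   # 0.0
```

Running this over traces from the dedup and non-dedup datasets would show directly whether dedup-era allocation scattered the blocks.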