By default, `dir-content-diff` runs file comparisons sequentially. However, for improved performance when comparing large numbers of files, parallel execution is available using either thread-based or process-based concurrency.

#### Configuration Options

Parallel execution can be configured using the following parameters:

- **`executor_type`**: Controls the type of parallel execution:
  - `"sequential"` (default): No parallel execution; files are compared one by one
  - `"thread"`: Uses `ThreadPoolExecutor` (recommended for I/O-bound tasks)
  - `"process"`: Uses `ProcessPoolExecutor` (recommended for CPU-intensive comparisons)

- **`max_workers`**: Maximum number of worker threads/processes. If `None` (default), it defaults to `min(32, (os.cpu_count() or 1) + 4)`.
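
To make the three modes concrete, here is a minimal sketch of how an `executor_type` value could be mapped onto the standard `concurrent.futures` executors. It is an illustration only, not the library's actual implementation: the helper names `default_max_workers` and `run_tasks` are hypothetical, and the default worker count simply restates the formula given above.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def default_max_workers():
    """Worker count assumed when max_workers is None (formula quoted above)."""
    return min(32, (os.cpu_count() or 1) + 4)


def run_tasks(func, items, executor_type="sequential", max_workers=None):
    """Apply func to every item with the requested mode (hypothetical helper)."""
    if executor_type == "sequential":
        # No pool at all: items are processed one by one
        return [func(item) for item in items]
    if executor_type == "thread":
        executor_cls = ThreadPoolExecutor
    elif executor_type == "process":
        # func and items must be picklable in this mode
        executor_cls = ProcessPoolExecutor
    else:
        raise ValueError(f"Unknown executor_type: {executor_type!r}")
    if max_workers is None:
        max_workers = default_max_workers()
    with executor_cls(max_workers=max_workers) as executor:
        # map() returns results in input order, whatever the completion order
        return list(executor.map(func, items))
```

With `executor_type="process"`, the mapped function must be importable at module level, and on platforms that spawn worker processes the calling code should sit under an `if __name__ == "__main__":` guard.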
234
+
235
+
#### Usage Examples
236
+
237
+
Enable thread-based parallel execution:
238
+
239
+
```python
import dir_content_diff

dir_content_diff.compare_trees(
    "reference_dir",
    "compared_dir",
    executor_type="thread",
    max_workers=8,
)
```

Enable process-based parallel execution with automatic worker count:

```python
import dir_content_diff

dir_content_diff.compare_trees(
    "reference_dir",
    "compared_dir",
    executor_type="process",
)
```

Using a configuration object:

```python
import dir_content_diff

config = dir_content_diff.ComparisonConfig(
    executor_type="thread",
    max_workers=4,
)

dir_content_diff.compare_trees(
    "reference_dir",
    "compared_dir",
    config=config,
)
```

#### Performance Considerations

- **Thread-based execution** (`executor_type="thread"`) is generally recommended for most use cases, as file comparisons are typically I/O-bound operations.
- **Process-based execution** (`executor_type="process"`) may be beneficial when using computationally intensive comparators or when dealing with very large files.
- When only one file needs to be compared, parallel execution is skipped and the comparison runs sequentially.
- The optimal number of workers depends on your system's capabilities and the nature of your files; too many workers can reduce performance because of scheduling and startup overhead.
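
Because the best worker count depends on the machine and the files, measuring is usually more reliable than guessing. The sketch below times a stand-in workload with several thread counts using only the standard library. Both `time_worker_counts` and `fake_comparison` are hypothetical names introduced for illustration; in practice the stand-in would be replaced by a real comparison call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_comparison(_item):
    """Stand-in for an I/O-bound file comparison (hypothetical)."""
    time.sleep(0.01)


def time_worker_counts(func, items, worker_counts=(1, 2, 4, 8)):
    """Return {worker_count: elapsed_seconds} for running func over items."""
    items = list(items)  # allow the same items to be reused for every run
    timings = {}
    for count in worker_counts:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=count) as executor:
            # Consume the iterator so that every task has finished
            list(executor.map(func, items))
        timings[count] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    timings = time_worker_counts(fake_comparison, range(32))
    for count, elapsed in timings.items():
        print(f"{count} workers: {elapsed:.3f} s")
```

The worker count with the smallest elapsed time is a reasonable starting value for `max_workers`. Timings vary between runs, so it is worth repeating the measurement before settling on a value.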
285
+
286
+
206
287
### Export formatted data
207
288
208
289
Some comparators have to format the data before comparing them. For example, if one wants to