@@ -78,6 +78,18 @@ in smaller chunks and process these chunks separately (e.g. on isolated workers)
Consequently, it's important that your **UDF algorithm operates correctly
in such a chunked processing context**.
+ A very common mistake is to index arrays by position rather than by dimension name. Positional indexing
+ assumes that the dimension order of the datacube is fixed, which is not guaranteed, and it also makes your code
+ harder to read. Label-based indexing is a core feature of xarray and should be used whenever possible.
+
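The difference between the two indexing styles can be sketched as follows. This is only an illustration, not code from the openEO backend: the dimension names ``t``, ``bands``, ``y``, ``x``, the band labels and both helper functions are assumptions made up for this example.

```python
import numpy as np
import xarray as xr


def red_by_name(cube: xr.DataArray) -> xr.DataArray:
    # Label-based: select along "bands" wherever that dimension sits.
    return cube.sel(bands="red")


def red_by_position(cube: xr.DataArray) -> xr.DataArray:
    # Position-based: silently assumes "bands" is the second axis
    # and that "red" is its first entry.
    return cube[:, 0]


# Hypothetical chunk with assumed dimension names and band labels.
values = np.arange(2 * 2 * 3 * 4, dtype="float64").reshape(2, 2, 3, 4)
chunk_a = xr.DataArray(
    values,
    dims=("t", "bands", "y", "x"),
    coords={"bands": ["red", "nir"]},
)
# Same data, different dimension order.
chunk_b = chunk_a.transpose("bands", "t", "y", "x")

# The label-based version returns the same band for both layouts.
assert red_by_name(chunk_a).dims == ("t", "y", "x")
assert red_by_name(chunk_b).dims == ("t", "y", "x")
assert np.array_equal(red_by_name(chunk_a).values, red_by_name(chunk_b).values)

# The position-based version picks the first time step of chunk_b instead.
assert red_by_position(chunk_b).dims == ("bands", "y", "x")
```

Note that the position-based helper does not raise an error on the reordered chunk. It quietly returns different data, which is what makes this mistake hard to spot.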
+ As a rule of thumb, the UDF should preserve the dimensions and shape of the input
+ data cube. The datacube chunk that the backend passes to the UDF does not have a fixed
+ specification, so the UDF needs to be able to accommodate different shapes and sizes of the data.
+
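A minimal sketch of a shape-preserving operation, checked against two differently sized chunks. The chunk shapes, the dimension names and the ``rescale`` helper are hypothetical and only serve this illustration; the actual chunking is decided by the backend.

```python
import numpy as np
import xarray as xr


def rescale(cube: xr.DataArray) -> xr.DataArray:
    # Element-wise arithmetic keeps dimensions, shape and coordinates intact.
    return cube * 0.0001


# Hypothetical chunk shapes; the real ones are decided by the backend.
chunk_shapes = [(3, 2, 4, 4), (1, 2, 16, 8)]
results = []
for shape in chunk_shapes:
    chunk = xr.DataArray(np.full(shape, 10000.0), dims=("t", "bands", "y", "x"))
    result = rescale(chunk)
    # The output must match the input in both dimension names and shape.
    assert result.dims == chunk.dims
    assert result.shape == chunk.shape
    results.append(result)
```

Running this kind of check locally with a few different shapes is a cheap way to catch hard-coded size assumptions before submitting a job.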
+ There are important exceptions to this rule, which depend on the context in which the UDF is used.
+ For instance, a UDF used as a reducer should remove the reduced dimension from the
+ output chunk. These details are documented in the next sections.
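As a sketch of the reducer case, the following hypothetical helper reduces an assumed ``t`` dimension by name, so that the dimension no longer appears in the output. The dimension names and the choice of ``max`` are assumptions for this illustration only.

```python
import numpy as np
import xarray as xr


def max_over_time(cube: xr.DataArray) -> xr.DataArray:
    # Reducing by dimension name drops "t" and leaves the other dimensions untouched.
    return cube.max(dim="t")


# Hypothetical chunk: 5 time steps, 2 bands, 4x4 pixels.
rng = np.random.default_rng(0)
chunk = xr.DataArray(rng.random((5, 2, 4, 4)), dims=("t", "bands", "y", "x"))
reduced = max_over_time(chunk)

assert "t" not in reduced.dims
assert reduced.dims == ("bands", "y", "x")
assert reduced.shape == (2, 4, 4)
```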
+
UDFs as apply/reduce "callbacks"
---------------------------------
@@ -347,6 +359,17 @@ the datacube.
{'dimension': 'y', 'value': 8, 'unit': 'px'}
])
+
+
+ .. warning::
+
+     ``apply_neighborhood`` is the most versatile process, but also the most complex one. Keep an eye on the dimensions
+     and the shape of the DataArray returned by your UDF. For instance, a very common error is to accidentally 'flip' the spatial dimensions.
+     Debugging the UDF locally can help, but you will then want to reproduce the input that your UDF receives on the backend.
+     This can typically be achieved by using logging to inspect the DataArrays passed into your UDF on the backend side.
+
+
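One possible way to follow the advice in the warning is to log the dimensions and shape of the input and output, and to restore the input dimension order before returning. The sketch below uses Python's standard ``logging`` module; whether such messages end up in the backend job logs is an assumption, as are the logger name, the helper names, the dimension names and the window size.

```python
import logging

import numpy as np
import xarray as xr

logger = logging.getLogger("udf_debug")  # hypothetical logger name


def describe(cube: xr.DataArray) -> str:
    # Compact summary that is cheap to write to a log.
    return f"dims={cube.dims} shape={cube.shape}"


def process_window(cube: xr.DataArray) -> xr.DataArray:
    logger.info("UDF input: %s", describe(cube))
    # Stand-in for real processing that happens to return x before y.
    flipped = cube.transpose("t", "bands", "x", "y")
    # Restore the dimension order of the input before returning.
    result = flipped.transpose(*cube.dims)
    logger.info("UDF output: %s", describe(result))
    return result


# Hypothetical non-square window, so that a flip shows up in the shape.
window = xr.DataArray(np.zeros((1, 2, 32, 16)), dims=("t", "bands", "y", "x"))
result = process_window(window)

assert result.dims == window.dims
assert result.shape == window.shape
```

Testing locally with a non-square window is a deliberate choice: with a square one, swapped ``x`` and ``y`` dimensions leave the shape unchanged and are easy to miss.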
+
Example: Smoothing timeseries with a user defined function (UDF)
==================================================================