Description
Currently, there is no example of aggregating files AND associated metadata. For instance, in many/most nf-core pipelines the process outputs are something like:
output:
tuple val(meta), path("file.txt")
...but what if one wants to then aggregate all of the file.txt outputs into one table AND include the meta metadata in that output table?
As far as I can tell from scouring the Nextflow Slack channel, one must "embed" the metadata in the file paths and then parse the file paths in the aggregation step. For example:
Per-file process:
output:
tuple val(meta), path("${meta}.txt")
Aggregation process:
input:
path("*")
script:
"""
[somehow parse {meta} from input file path]
"""
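To make the workaround concrete, here is a minimal Python sketch of what that parsing step could look like if the aggregation script were written in Python. It assumes meta is a single plain string (e.g., a sample ID) and that each input file is named <meta>.txt. The function name aggregate and the meta/value column names are hypothetical, chosen only for illustration; they are not taken from any nf-core pipeline.

```python
import csv
from pathlib import Path


def aggregate(paths, out):
    """Write one TSV row per input line, prefixed with the meta value
    recovered from the file name (hypothetical helper, for illustration)."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["meta", "value"])  # assumed column names
    for path in sorted(Path(p) for p in paths):
        # Assumption: the file is named "<meta>.txt", so the stem is the meta value.
        # "sampleA.txt" -> "sampleA"
        meta = path.stem
        for line in path.read_text().splitlines():
            writer.writerow([meta, line])
```

Calling aggregate on the staged input files with an open output file (or sys.stdout) would produce the combined table. Even this simple version only round-trips a single string: a meta with several values, or values containing "/" or other special characters, would not survive the trip through the file name.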
Is there a better way, especially given the substantial limitations of trying to embed metadata into a file path (e.g., dealing with multiple values and special characters in the metadata values)?
I'm sure a lot of pipeline developers would like a best-practices example of how to deal with this situation (without having to decipher how meta is dealt with in aggregation steps of nf-core pipelines).