Example of aggregating file with associated metadata #40

Open
@nick-youngblut

Currently, there is no example of aggregating files AND associated metadata. For instance, in many/most nf-core pipelines the process outputs are something like:

output:
tuple val(meta), path("file.txt")

...but what if one wants to then aggregate all of the file.txt outputs into one table AND include the meta metadata in that output table?

As far as I can tell from scouring the Nextflow Slack channel, one must "embed" the metadata in the file paths and then parse the file paths in the aggregation step. For example:

Per-file process:

output:
tuple val(meta), path("${meta}.txt")

Aggregation process:

input:
path("*")

script:
"""
[somehow parse {meta} from input file path] 
"""
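
A minimal sketch of what that parsing step might look like in Bash, under several assumptions that are not part of the pipeline above: meta is a single simple ID that is safe to use in a file name, each per-file output is named `<id>.txt`, and the output is a two-column TSV. The function name `aggregate_by_filename` and the `id` and `value` column headers are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# aggregate_by_filename OUTPUT FILE...
# Hypothetical helper: recovers an ID from each input file name
# (basename minus the ".txt" suffix) and prepends it, tab-separated,
# to every line of that file. All rows are written to OUTPUT.
aggregate_by_filename() {
    local out="$1"
    shift
    # Header row; the column names are placeholders.
    printf 'id\tvalue\n' > "$out"
    local f id
    for f in "$@"; do
        id="$(basename "$f" .txt)"
        # Pass the ID through the environment so awk does not
        # interpret backslashes in it (as it would with -v).
        ID="$id" awk 'BEGIN { OFS = "\t" } { print ENVIRON["ID"], $0 }' "$f" >> "$out"
    done
}
```

Inside the aggregation process this could be called as `aggregate_by_filename aggregated.tsv *.txt`. In a Nextflow `script:` block written with a double-quoted string, the Bash `$` characters would need to be escaped as `\$`. The sketch only recovers one value per file, so it does not address metadata with multiple fields or characters that cannot appear in a file name.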

Is there a better way, especially given the substantial limitations of embedding metadata in a file path (e.g., handling multiple values and special characters in the metadata values)?

I'm sure a lot of pipeline developers would like a best-practices example of how to deal with this situation (without having to decipher how meta is dealt with in aggregation steps of nf-core pipelines).
