Skip to content

Scatter using Singularity fails creating SIF #5

@atombaby

Description

@atombaby

When using singularity exec within tasks executed within scatter there is a race condition when the Docker/Singularity image isn't in the cache. On NFS mounted home directories this apparently results in an error "stale NFS file handle"

To replicate it's necessary to remove the images from the Singularity cache (~/.singularity). It also is difficult to replicate with simple images (e.g. ubuntu). The Broad's GATK image seems to reproduce this error fairly reliably.

==> shard-0/execution/stderr <==
2020/04/10 07:06:02 debug unpacking entry           path=root/.conda root=/loc/scratch/46802618/rootfs-3371b095-7b34-11ea-ae11-002590e2b58e type=53
2020/04/10 07:06:02 debug unpacking entry           path=root/.conda/pkgs root=/loc/scratch/46802618/rootfs-3371b095-7b34-11ea-ae11-002590e2b58e type=53
2020/04/10 07:06:02 debug unpacking entry           path=root/.conda/pkgs/urls root=/loc/scratch/46802618/rootfs-3371b095-7b34-11ea-ae11-002590e2b58e type=48
2020/04/10 07:06:02 debug unpacking entry           path=root/.conda/pkgs/urls.txt root=/loc/scratch/46802618/rootfs-3371b095-7b34-11ea-ae11-002590e2b58e type=48
2020/04/10 07:06:02 debug unpacking entry           path=root/.gradle root=/loc/scratch/46802618/rootfs-3371b095-7b34-11ea-ae11-002590e2b58e type=53
2020/04/10 07:06:02 debug unpacking entry           path=root/gatk.jar root=/loc/scratch/46802618/rootfs-3371b095-7b34-11ea-ae11-002590e2b58e type=50
2020/04/10 07:06:02 debug unpacking entry           path=root/run_unit_tests.sh root=/loc/scratch/46802618/rootfs-3371b095-7b34-11ea-ae11-002590e2b58e type=48
DEBUG   [U=34152,P=14608]  Full()                        Inserting Metadata
DEBUG   [U=34152,P=14608]  Full()                        Calling assembler
INFO    [U=34152,P=14608]  Assemble()                    Creating SIF file...
DEBUG   [U=34152,P=14608]  cleanUp()                     Cleaning up "/loc/scratch/46802618/rootfs-3371b095-7b34-11ea-ae11-002590e2b58e" and "/loc/scratch/46802618/bundle-temp-539962740"
FATAL   [U=34152,P=14608]  replaceURIWithImage()         Unable to handle docker://broadinstitute/gatk@sha256:0dd5cb7f9321dc5a43e7667ed4682147b1e827d6a3e5f7bf4545313df6d491aa uri: unable to build: while creating SIF: while creating container: writing data object for SIF file: copying data object file to SIF file: write /home/mrg/.singularity/cache/oci-tmp/0dd5cb7f9321dc5a43e7667ed4682147b1e827d6a3e5f7bf4545313df6d491aa/gatk@sha256_0dd5cb7f9321dc5a43e7667ed4682147b1e827d6a3e5f7bf4545313df6d491aa.sif: stale NFS file handle
==> shard-1/execution/stderr <==
2020/04/10 07:06:13 debug unpacking entry           path=root/.conda root=/loc/scratch/46802619/rootfs-4740289f-7b34-11ea-ad57-002590e2b824 type=53
2020/04/10 07:06:13 debug unpacking entry           path=root/.conda/pkgs root=/loc/scratch/46802619/rootfs-4740289f-7b34-11ea-ad57-002590e2b824 type=53
2020/04/10 07:06:13 debug unpacking entry           path=root/.conda/pkgs/urls root=/loc/scratch/46802619/rootfs-4740289f-7b34-11ea-ad57-002590e2b824 type=48
2020/04/10 07:06:13 debug unpacking entry           path=root/.conda/pkgs/urls.txt root=/loc/scratch/46802619/rootfs-4740289f-7b34-11ea-ad57-002590e2b824 type=48
2020/04/10 07:06:13 debug unpacking entry           path=root/.gradle root=/loc/scratch/46802619/rootfs-4740289f-7b34-11ea-ad57-002590e2b824 type=53
2020/04/10 07:06:13 debug unpacking entry           path=root/gatk.jar root=/loc/scratch/46802619/rootfs-4740289f-7b34-11ea-ad57-002590e2b824 type=50
2020/04/10 07:06:13 debug unpacking entry           path=root/run_unit_tests.sh root=/loc/scratch/46802619/rootfs-4740289f-7b34-11ea-ad57-002590e2b824 type=48
DEBUG   [U=34152,P=3126]   Full()                        Inserting Metadata
DEBUG   [U=34152,P=3126]   Full()                        Calling assembler
INFO    [U=34152,P=3126]   Assemble()                    Creating SIF file...
VERBOSE [U=34152,P=3126]   Full()                        Build complete: /home/mrg/.singularity/cache/oci-tmp/0dd5cb7f9321dc5a43e7667ed4682147b1e827d6a3e5f7bf4545313df6d491aa/gatk@sha256_0dd5cb7f9321dc5a43e7667ed4682147b1e827d6a3e5f7bf4545313df6d491aa.sif
DEBUG   [U=34152,P=3126]   cleanUp()                     Cleaning up "/loc/scratch/46802619/rootfs-4740289f-7b34-11ea-ad57-002590e2b824" and "/loc/scratch/46802619/bundle-temp-701530895"
VERBOSE [U=34152,P=3126]   handleOCI()                   Image cached as SIF at /home/mrg/.singularity/cache/oci-tmp/0dd5cb7f9321dc5a43e7667ed4682147b1e827d6a3e5f7bf4545313df6d491aa/gatk@sha256_0dd5cb7f9321dc5a43e7667ed4682147b1e827d6a3e5f7bf4545313df6d491aa.sif
DEBUG   [U=34152,P=3126]   execStarter()                 Checking for encrypted system partition

.... output trimmed- log indicates this shard ran the container....

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions