From 648793eed7ed0e058b6944c6194c331fd2859e7d Mon Sep 17 00:00:00 2001
From: saima-tithi-stjude
Date: Tue, 3 Jun 2025 14:07:27 -0500
Subject: [PATCH 1/3] Added input csv files for case samples

---
 case/annotationFiles.csv    | 3 +++
 case/annotationGDSFiles.csv | 3 +++
 case/genotypeGDSFiles.csv   | 3 +++
 3 files changed, 9 insertions(+)
 create mode 100644 case/annotationFiles.csv
 create mode 100644 case/annotationGDSFiles.csv
 create mode 100644 case/genotypeGDSFiles.csv

diff --git a/case/annotationFiles.csv b/case/annotationFiles.csv
new file mode 100644
index 000000000..6af29684d
--- /dev/null
+++ b/case/annotationFiles.csv
@@ -0,0 +1,3 @@
+chr,vcf,index
+21,https://raw.githubusercontent.com/nf-core/test-datasets/rarevariantburden/case/21.annotated.vcf.gz,https://raw.githubusercontent.com/nf-core/test-datasets/rarevariantburden/case/21.annotated.vcf.gz.tbi
+22,https://raw.githubusercontent.com/nf-core/test-datasets/rarevariantburden/case/22.annotated.vcf.gz,https://raw.githubusercontent.com/nf-core/test-datasets/rarevariantburden/case/22.annotated.vcf.gz.tbi
diff --git a/case/annotationGDSFiles.csv b/case/annotationGDSFiles.csv
new file mode 100644
index 000000000..8b028b0fd
--- /dev/null
+++ b/case/annotationGDSFiles.csv
@@ -0,0 +1,3 @@
+chr,gds
+21,https://raw.githubusercontent.com/nf-core/test-datasets/rarevariantburden/case/21.annotated.vcf.gz.gds
+22,https://raw.githubusercontent.com/nf-core/test-datasets/rarevariantburden/case/22.annotated.vcf.gz.gds
diff --git a/case/genotypeGDSFiles.csv b/case/genotypeGDSFiles.csv
new file mode 100644
index 000000000..c9ea8b931
--- /dev/null
+++ b/case/genotypeGDSFiles.csv
@@ -0,0 +1,3 @@
+chr,gds
+21,https://raw.githubusercontent.com/nf-core/test-datasets/rarevariantburden/case/21.biallelic.leftnorm.ABCheck.vcf.gz.gds
+22,https://raw.githubusercontent.com/nf-core/test-datasets/rarevariantburden/case/22.biallelic.leftnorm.ABCheck.vcf.gz.gds

From e43ceaa231ee93ca00a5c9a6e98ac9844215584d Mon Sep 17 00:00:00 2001
From: saima-tithi-stjude
Date: Wed, 4 Jun 2025 10:32:25 -0500
Subject: [PATCH 2/3] Added input csv files for gnomAD control

---
 control/controlAnnotationGDS.csv | 3 +++
 control/controlGenotypeGDS.csv   | 3 +++
 2 files changed, 6 insertions(+)
 create mode 100644 control/controlAnnotationGDS.csv
 create mode 100644 control/controlGenotypeGDS.csv

diff --git a/control/controlAnnotationGDS.csv b/control/controlAnnotationGDS.csv
new file mode 100644
index 000000000..cf9f85f37
--- /dev/null
+++ b/control/controlAnnotationGDS.csv
@@ -0,0 +1,3 @@
+chr,gds
+21,https://raw.githubusercontent.com/nf-core/test-datasets/rarevariantburden/control/annotation/chr21.annovar.vep.vcf.gz.gds
+22,https://raw.githubusercontent.com/nf-core/test-datasets/rarevariantburden/control/annotation/chr22.annovar.vep.vcf.gz.gds
diff --git a/control/controlGenotypeGDS.csv b/control/controlGenotypeGDS.csv
new file mode 100644
index 000000000..b386d364f
--- /dev/null
+++ b/control/controlGenotypeGDS.csv
@@ -0,0 +1,3 @@
+chr,gds
+21,https://raw.githubusercontent.com/nf-core/test-datasets/rarevariantburden/control/genotypeCount/gnomad.21.vcf.bgz.gds
+22,https://raw.githubusercontent.com/nf-core/test-datasets/rarevariantburden/control/genotypeCount/gnomad.22.vcf.bgz.gds

From 692a1b46db2cfdf06bdd233159d667234451187c Mon Sep 17 00:00:00 2001
From: saima-tithi-stjude
Date: Mon, 16 Jun 2025 11:17:52 -0500
Subject: [PATCH 3/3] updated README to add csv files

---
 README.md | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index 98b52955e..6c987dbba 100644
--- a/README.md
+++ b/README.md
@@ -11,26 +11,36 @@ This branch contains test data to be used for automated testing with the [nf-cor
 
 `case/samples.txt`: One column text file containing list of samples, one sample ID per line. Here we used 25 test samples from 1000 Genomes Project (build GRCh37) as case data. This is the input of pipeline parameter `caseSample`
 
-`case/21.annotated.vcf.gz`: Pre-annotated case VCF file for chr 21, we used the pre-annotated file to skip the annotation steps in the pipelie for testing purpose. This is the input of pipeline parameter `caseAnnotatedVCFPrefix` and `caseAnnotatedVCFSuffix`
+`case/annotationFiles.csv`: List of annotated files for test profile. This is the input of pipeline parameter `caseAnnotatedVCFFileList`
+
+`case/annotationGDSFiles.csv`: List of annotated GDS files for test profile. This is the input of pipeline parameter `caseAnnotationGDSFileList`
+
+`case/genotypeGDSFiles.csv`: List of genotype GDS files for test profile. This is the input of pipeline parameter `caseGenotypeGDSFileList`
+
+`case/21.annotated.vcf.gz`: Pre-annotated case VCF file for chr 21, we used the pre-annotated file to skip the annotation steps in the pipelie for testing purpose.
 
 `case/21.annotated.vcf.gz.tbi`: The tabix index file for the pre-annotated case VCF file for chr 21
 
-`case/21.annotated.vcf.gz.gds`: The GDS format file for the pre-annotated case VCF file for chr 21, we used the GDS file to skip the VCF to GDS conversion steps in the pipelie for testing purpose. This is the input of pipeline parameter `caseAnnotationGDSPrefix` and `caseAnnotationGDSSuffix`
+`case/21.annotated.vcf.gz.gds`: The GDS format file for the pre-annotated case VCF file for chr 21, we used the GDS file to skip the VCF to GDS conversion steps in the pipelie for testing purpose.
 
-`case/22.annotated.vcf.gz`: Pre-annotated case VCF file for chr 22, we used the pre-annotated file to skip the annotation steps in the pipelie for testing purpose. This is the input of pipeline parameter `caseAnnotatedVCFPrefix` and `caseAnnotatedVCFSuffix`
+`case/22.annotated.vcf.gz`: Pre-annotated case VCF file for chr 22, we used the pre-annotated file to skip the annotation steps in the pipelie for testing purpose.
 
 `case/22.annotated.vcf.gz.tbi`: The tabix index file for the pre-annotated case VCF file for chr 22
 
-`case/22.annotated.vcf.gz.gds`: The GDS format file for the pre-annotated case VCF file for chr 22, we used the GDS file to skip the VCF to GDS conversion steps in the pipelie for testing purpose. This is the input of pipeline parameter `caseAnnotationGDSPrefix` and `caseAnnotationGDSSuffix`
+`case/22.annotated.vcf.gz.gds`: The GDS format file for the pre-annotated case VCF file for chr 22, we used the GDS file to skip the VCF to GDS conversion steps in the pipelie for testing purpose.
 
-`case/21.biallelic.leftnorm.ABCheck.vcf.gz.gds`: The GDS format for the left normalized case VCF file for chr 21, we used this to skip the normalization and convert nomalized VCF file to GDS format steps in the pipeline. This is the input of pipeline parameter `caseGenotypeGDSPrefix` and `caseGenotypeGDSSuffix`
+`case/21.biallelic.leftnorm.ABCheck.vcf.gz.gds`: The GDS format for the left normalized case VCF file for chr 21, we used this to skip the normalization and convert nomalized VCF file to GDS format steps in the pipeline.
 
-`case/22.biallelic.leftnorm.ABCheck.vcf.gz.gds`: The GDS format for the left normalized case VCF file for chr 22, we used this to skip the normalization and convert nomalized VCF file to GDS format steps in the pipeline. This is the input of pipeline parameter `caseGenotypeGDSPrefix` and `caseGenotypeGDSSuffix`
+`case/22.biallelic.leftnorm.ABCheck.vcf.gz.gds`: The GDS format for the left normalized case VCF file for chr 22, we used this to skip the normalization and convert nomalized VCF file to GDS format steps in the pipeline.
 
 `case/casePopulation.txt`: The predicted ancestry for each sample, if not specified pipeline will estimate the ancestry/ethnicity of each sample using gnomAD classifier. This is the input of pipeline parameter `casePopulation`
 
 `control/`: Input files needed for the pipeline from the control dataset (here we used gnomAD v2 exome data as control dataset)
 
+`control/controlAnnotationGDS.csv`: List of pre-annotated and GDS converted VCF file from gnomAD v2 exome.
+
+`control/controlGenotypeGDS.csv`: List of normalized and GDS converted VCF file from gnomAD v2 exome.
+
 `control/annotation/chr21.annovar.vep.vcf.gz.gds`: The pre-annotated and GDS converted VCF file from gnomAD v2 exome. Here we used a small exome region for chr 21 as the control data.
 
 `control/annotation/chr22.annovar.vep.vcf.gz.gds`: The pre-annotated and GDS converted VCF file from gnomAD v2 exome. Here we used a small exome region for chr 22 as the control data.
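
The CSV file lists added in this series all follow the same layout: a `chr` column plus one or two URL columns (`vcf` and `index`, or `gds`). As a rough illustration of that layout, the sketch below parses the two data rows of `case/annotationFiles.csv` into a per-chromosome mapping. It is only a sketch under stated assumptions: the helper name `read_file_list` and the duplicate-chromosome check are hypothetical, and nothing here claims to reflect how the pipeline itself reads these files.

```python
import csv
import io

# Rows copied from case/annotationFiles.csv as added in PATCH 1/3.
BASE = "https://raw.githubusercontent.com/nf-core/test-datasets/rarevariantburden/case"
SHEET = (
    "chr,vcf,index\n"
    f"21,{BASE}/21.annotated.vcf.gz,{BASE}/21.annotated.vcf.gz.tbi\n"
    f"22,{BASE}/22.annotated.vcf.gz,{BASE}/22.annotated.vcf.gz.tbi\n"
)


def read_file_list(text, required):
    """Hypothetical helper: map chromosome -> {column: value} for one sheet."""
    reader = csv.DictReader(io.StringIO(text))
    # Fail early if the header lacks any expected column.
    missing = [col for col in required if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = {}
    for row in reader:
        chrom = row["chr"]
        # Assumption for this sketch: each chromosome appears at most once.
        if chrom in rows:
            raise ValueError(f"duplicate chromosome: {chrom}")
        rows[chrom] = {col: row[col] for col in required if col != "chr"}
    return rows


files = read_file_list(SHEET, required=("chr", "vcf", "index"))
print(sorted(files))
print(files["21"]["index"])
```

The same helper would cover the `chr,gds` sheets by passing `required=("chr", "gds")`.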