Hello,
I’m trying to use VarSim to simulate SNPs and SVs in a plant genome (e.g., Arabidopsis thaliana). I have several VCF files from real population data that include both SNPs and SVs, and I want to simulate dozens of random genomes using these VCFs.
However, when I use these VCF files as input to VarSim, the simulated genomes are always exactly the same across runs. I don't understand why this happens, since I was expecting some random sampling or stochasticity when generating each simulated genome.
Here is the command I used:
varsim.py --id ath_${i} --seed $i --simulator_executable ~/software/varsim/opt/ART/art_bin_VanillaIceCream/art_illumina \
--reference ~/reference/ath/upload_vcf/Col-PEK.genome.fasta --sv_num_ins 5000 --sv_num_del 5000 --sv_num_dup 2500 --sv_num_inv 2500 \
--vc_num_snp 500000 --vc_num_ins 20000 --vc_num_del 20000 --vc_num_mnp 5000 --vc_num_complex 2500 \
--vc_min_length_lim 0 --vc_max_length_lim 49 --sv_min_length_lim 50 --sv_max_length_lim 1000000 \
--disable_sim --vc_prop_het 0.6 --sv_prop_het 0.6 --vcfs ~/reference/ath/upload_vcf/72accs.col_pek.* \
--disable_rand_vcf --disable_rand_dgv --out_dir out_sim_${i} --log_dir log_sim_${i} --work_dir work_sim_${i}
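To make "exactly the same" checkable, one option is to compare the variant records of the output VCFs while ignoring header lines, which can differ between runs even when the variants do not. The following is only a minimal sketch: the `out_sim_*/*.vcf` layout and the helper names `records_digest` and `group_identical` are assumptions for illustration, not part of VarSim.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def records_digest(path):
    """Return a SHA-256 digest of the non-header lines of a VCF file."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for line in fh:
            # Skip '##' meta lines and the '#CHROM' line: they may carry
            # run-specific details (IDs, sample names) unrelated to variants.
            if not line.startswith(b"#"):
                h.update(line)
    return h.hexdigest()


def group_identical(paths):
    """Group paths whose variant records are byte-identical."""
    groups = defaultdict(list)
    for p in paths:
        groups[records_digest(p)].append(str(p))
    return sorted(groups.values())


if __name__ == "__main__":
    # Hypothetical layout: uncompressed VCFs under out_sim_<i>/.
    vcfs = sorted(Path(".").glob("out_sim_*/*.vcf"))
    for group in group_identical(vcfs):
        print(len(group), group)
```

If every file lands in a single group, the runs really did produce the same variant set. If there are several groups, the differences are in the records themselves and not only in the headers.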
Thanks!