Dear Evan,
I am really excited about getting Tephra running; it seems to be a beautiful piece of software. I ran into some issues I'd like to solve, though. I am putting them here so everyone can see, but please let me know if this doesn't belong in the issue tracker.
I got the Docker version running after installing Docker:
$ docker run -it --name tephra-con -v $(pwd)/db:/db:Z sestaton/tephra
$ cd /db
$ wget https://raw.githubusercontent.com/sestaton/tephra/master/config/tephra_config.yml
#### changed the "logfile", "genome", "outfile", "repeatdb" (using your sunflower library, thank you for that!).
$ tephra all -c tephra_config.yml
[ERROR]: gene file was not defined in configuration or does not exist. Check input. Exiting.
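The error message covers two cases: the gene file is not defined in the config, or it is defined but missing on disk. A minimal sketch for telling the two apart, run on a small hypothetical sample config in a temporary directory (the `check_key` helper is only an illustration and is not part of Tephra; relative paths are assumed to resolve against the directory holding the config):

```shell
# Hypothetical sample; in practice point `config` at the real tephra_config.yml.
workdir=$(mktemp -d)
config="$workdir/tephra_config.yml"
printf '%s\n' 'all:' \
  '  - genome: scalesia_atractyloides.fasta' \
  '  - genefile: TAIR10_genes.fas' > "$config"
touch "$workdir/scalesia_atractyloides.fasta"   # pretend only the genome file exists

# check_key KEY: report whether KEY is set in the config and whether its file exists.
check_key() {
  value=$(sed -n "s/^[[:space:]]*-[[:space:]]*$1:[[:space:]]*//p" "$config")
  if [ -z "$value" ]; then
    echo "$1: not defined"
  elif [ -e "$workdir/$value" ]; then
    echo "$1: found"
  else
    echo "$1: defined but missing"
  fi
}

genome_status=$(check_key genome)
genefile_status=$(check_key genefile)
echo "$genome_status"
echo "$genefile_status"
```

On this sample the genome is reported as found and the gene file as defined but missing, which matches the situation where the default `genefile` entry is kept but the file was never downloaded.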
I noticed that the new config file has the line below. It may be a recent addition, since it is not mentioned in the manual or the help pages.
- genefile: TAIR10_genes.fas
I deleted it:
$ sed "s/.*genefile.*//; /^$/d" tephra_config.yml > tephra_config2.yml
$ tephra all -c tephra_config2.yml
[ERROR]: 'trnadb' under 'all' is not defined after parsing configuration file.
This indicates there may be a blank line in your configuration file.
Please check your configuration file and try again. Exiting.
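Since the error hints at a blank line, one way to check what the `sed` edit actually produced is to count blank lines and confirm which keys survived. This is a sketch on a small hypothetical sample (with one pre-existing blank line), not on the real config; only standard `sed`, `grep`, and `diff` are assumed:

```shell
# Hypothetical sample with one pre-existing blank line; not the real config.
workdir=$(mktemp -d)
printf '%s\n' 'all:' \
  '  - repeatdb: Ha412v1r1_transposons_v1.0.fasta' \
  '  - genefile: TAIR10_genes.fas' \
  '' \
  '  - trnadb: TephraDB' > "$workdir/tephra_config.yml"

# Same sed expression as above: blank out the genefile line, then drop empty lines.
sed "s/.*genefile.*//; /^$/d" "$workdir/tephra_config.yml" > "$workdir/tephra_config2.yml"

# Count leftover blank (or whitespace-only) lines and check which keys remain.
blank_lines=$(grep -c '^[[:space:]]*$' "$workdir/tephra_config2.yml" || true)
genefile_lines=$(grep -c 'genefile' "$workdir/tephra_config2.yml" || true)
trnadb_lines=$(grep -c 'trnadb' "$workdir/tephra_config2.yml" || true)
echo "blank=$blank_lines genefile=$genefile_lines trnadb=$trnadb_lines"

# Show exactly what changed between the two files.
diff "$workdir/tephra_config.yml" "$workdir/tephra_config2.yml" || true
```

If the real file also shows no blank lines and still contains `trnadb`, the blank-line hint in the error message may not be the actual cause; that part is a guess on my side.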
Q1: I interpret this to mean that it did not like my reformatting of the config file. I was therefore wondering what "TAIR10_genes.fas" is. Is it the gene annotation of Arabidopsis? I checked NCBI, and TAIR10 seems to be an assembly name for this species (https://www.ncbi.nlm.nih.gov/assembly/GCF_000001735.4).
Q2: Is there a way to run the "all" command without specifying the annotations?
See config file below.
$ cat t*yml
## For more information about this file, see:
## https://github.com/sestaton/tephra/wiki/Specifications-and-example-usage.
all:
  - logfile: tephra.log
  - genome: scalesia_atractyloides.fasta
  - outfile: scalesia_atractyloides_thra_transposons.gff3
  - repeatdb: Ha412v1r1_transposons_v1.0.fasta
  - genefile: TAIR10_genes.fas
  - trnadb: TephraDB
  - hmmdb: TephraDB
  - threads: 24
  - clean: YES
  - debug: NO
  - subs_rate: 1e-8
findltrs:
  - dedup: NO
  - tnpfilter: NO
  - domains_required: NO
  - ltrharvest:
    - mintsd: 4
    - maxtsd: 20
    - minlenltr: 100
    - maxlenltr: 1000
    - mindistltr: 1000
    - maxdistltr: 15000
    - seedlength: 30
    - tsdradius: 60
    - xdrop: 5
    - swmat: 2
    - swmis: -2
    - swins: -3
    - swdel: -3
    - overlaps: best
  - ltrdigest:
    - pptradius: 30
    - pptlen: 8 30
    - pptagpr: 0.25
    - uboxlen: 3 30
    - uboxutpr: 0.91
    - pbsradius: 30
    - pbslen: 11 30
    - pbsoffset: 0 5
    - pbstrnaoffset: 0 5
    - pbsmaxeditdist: 1
    - pdomevalue: 1E-6
    - pdomcutoff: NONE
    - maxgaplen: 50
classifyltrs:
  - percentcov: 50
  - percentid: 80
  - hitlen: 80
illrecomb:
  - repeat_pid: 10
ltrage:
  - all: NO
maskref:
  - percentid: 80
  - hitlength: 70
  - splitsize: 5000000
  - overlap: 100
sololtr:
  - percentid: 39
  - percentcov: 80
  - matchlen: 80
  - numfamilies: 20
  - allfamilies: NO
tirage:
  - all: NO