arguments #40

@anyusernamenottaken

Hi team,

I've been trying to get BG7 working for a while (annotating a ~5MB Streptomyces genome), and I think I've found a couple of bugs, along with workarounds for them. I'm working with the latest version as of July/August 2013:

  1. At line 396 of the /bin/bg7 script, where the program copies bg7.jar into the output directory, it can't locate the original. The script looks for "/jar/bg7.jar" while the directory structure for the file is "/jars/bg7.jar", or vice versa. Correcting the directory name still didn't let the script find the file, so I wrote the full path into that line.
  2. Once the run got as far as the PredictGenes step, it failed with an error about expecting 6 inputs, but the /bin/bg7 wrapper script only writes 5 arguments. I had to add a line that also writes the Dif_span:30 argument so PredictGenes could proceed. Related: the default values for the last 3 arguments passed to PredictGenes.jar were 400, true, and 30 (I took the value 30 for dif_span from code buried in the default PredictGenes.jar file). Are these values appropriate for most microbes? Is there any documentation on what the arguments mean in bioinformatic terms? The boolean appears to describe whether the genome is viral, which seems like an odd default to choose.
  3. I've run into other Java errors about heap space and array indexes being out of bounds, but I think your earlier responses about these problems have helped me work around them. The test data you included with the code now runs, as do test runs of my genome when I pare down the reference protein or RNA set.
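
The workaround in item 1 hard-codes the jar path. A less brittle option might be to probe both directory spellings relative to the install root. The Python sketch below only illustrates that idea; it is not part of BG7, and the function name `find_bg7_jar` and the `install_root` parameter are hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_bg7_jar(install_root: str, jar_name: str = "bg7.jar") -> Optional[Path]:
    """Return the first existing jar under <install_root>/jars or <install_root>/jar.

    Both directory names are tried because the issue reports a mismatch
    between "jar" and "jars". Returns None if neither location has the file.
    """
    # Resolve to an absolute path so the result does not depend on the
    # directory the wrapper script was launched from.
    root = Path(install_root).resolve()
    for subdir in ("jars", "jar"):
        candidate = root / subdir / jar_name
        if candidate.is_file():
            return candidate
    return None
```

Resolving the root to an absolute path first may also explain why fixing the directory name alone was not enough: a relative path only works when the script is started from the expected working directory.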

Finally, assuming my latest (long) runs end up working, I'm curious about improving the gene prediction portion of the algorithm. Streptomycetes frequently use GTG as an alternative start codon. RAST catches this, but in the pared-down test runs of my genome, BG7 placed the start of most genes at a nearby ATG codon instead. I can't be certain those calls are wrong, but given other published genomes, I suspect the RAST starts are closer to the truth. Is there any way to alter the BG7 code to bias gene calling towards a known organism-specific codon usage?
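
To make the start-codon question concrete, here is a minimal Python sketch of how one might list alternative in-frame starts upstream of a called ATG. It is not BG7's gene-calling code; the function name, the forward-strand-only handling, and the choice of ATG, GTG, and TTG as candidate starts are illustrative assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_start_candidates(seq, called_start, starts=("ATG", "GTG", "TTG")):
    """List in-frame start codons upstream of a called start (forward strand).

    called_start is the 0-based index of the first base of the called start
    codon. The walk moves back one codon at a time and stops at the first
    in-frame stop codon or at the beginning of the sequence.
    """
    seq = seq.upper()
    candidates = []
    pos = called_start - 3
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # the open reading frame cannot extend past a stop
        if codon in starts:
            candidates.append((pos, codon))
        pos -= 3
    return candidates
```

For example, `upstream_start_candidates("TAAGTGCCCATGAAATGA", 9)` returns `[(3, 'GTG')]`: the ATG at index 9 has an in-frame GTG two codons upstream, and the search stops at the TAA before it. A report like this could be compared against the RAST starts, gene by gene.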

Thanks,
Drew
