
Ab initio gene prediction; 0 genes when creating HMM via SNAP

lucys-world

Dear maker-devel group,


I have an issue with ab initio gene prediction in MAKER (for a new mammalian genome) when creating an HMM via SNAP.

After two MAKER runs I wanted to create a new HMM for a third MAKER run, but the command


fathom genome.ann genome.dna -gene-stats


resulted in 0 genes.


What I have done so far:

  • For the first training run I used only BUSCO and the Swiss-Prot database as references (since no ESTs are available for my species). Additionally, I set protein2genome=1.


  • I was able to create an HMM based on all merged *.gff files, but not many sequences contributed:
    • out of 27,032 scaffolds (sequences), only 280 were used for the HMM; here are the gene-stats:
    • 280 sequences
      0.458676 avg GC fraction (min=0.338014 max=0.708052)
      7445 genes (plus=3192 minus=4253)
      1621 (0.217730) single-exon
      5824 (0.782270) multi-exon
      168.412018 mean exon (min=1 max=5224)
      1464.349243 mean intron (min=30 max=41197)


  • For the second MAKER run I then used this HMM and again the BUSCO+SwissPort.fasta reference file.
    • The gene-stats for the output of the second MAKER run are:
    • 282 sequences
      0.473125 avg GC fraction (min=0.338014 max=0.725131)
      0 genes (plus=0 minus=0)
      0 (-nan) single-exon
      0 (-nan) multi-exon
      -nan mean exon (min=2147483647 max=0)
      -nan mean intron (min=2147483647 max=0)
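
The two summaries above can also be compared programmatically before any training is attempted. The following is a minimal Python sketch, assuming the `fathom -gene-stats` output has exactly the line format shown in this post; the function names `parse_gene_stats` and `trainable` are hypothetical helpers, not part of SNAP or MAKER.

```python
import re


def parse_gene_stats(text):
    """Extract sequence and gene counts from `fathom -gene-stats` style text."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"(\d+) sequences$", line)
        if m:
            stats["sequences"] = int(m.group(1))
            continue
        m = re.match(r"(\d+) genes \(plus=(\d+) minus=(\d+)\)$", line)
        if m:
            stats["genes"] = int(m.group(1))
            stats["plus"] = int(m.group(2))
            stats["minus"] = int(m.group(3))
    return stats


def trainable(stats):
    """An HMM can only be trained if at least one gene model was found."""
    return stats.get("genes", 0) > 0


# Numbers copied from the two runs reported in this post.
first_run = """280 sequences
0.458676 avg GC fraction (min=0.338014 max=0.708052)
7445 genes (plus=3192 minus=4253)
"""
second_run = """282 sequences
0.473125 avg GC fraction (min=0.338014 max=0.725131)
0 genes (plus=0 minus=0)
"""

for name, text in (("first run", first_run), ("second run", second_run)):
    stats = parse_gene_stats(text)
    print(name, stats, "trainable" if trainable(stats) else "nothing to train on")
```

A zero gene count means the input annotation contained no gene models at all, so the `-nan` values are a symptom of the empty input rather than a separate error.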


Would you recommend rerunning everything, e.g. with an additional Augustus gene prediction (species=human), or with ESTs from related species? (If so, how closely related?)


Thank you for your time and help

kind regards

Lucy


_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Re: Ab initio gene prediction; 0 genes when creating HMM via SNAP

Ence,daniel
Hi Lucy, 

What were your settings for the second training run? Did you leave protein2genome=1? 

~Daniel


Re: Ab initio gene prediction; 0 genes when creating HMM via SNAP

Carson Holt-2
In reply to this post by lucys-world
It looks like you have no genes to train with. So you did something wrong on your second run. Either no gene predictor was running or you provided no evidence for the predictor, so you produced no models.

—Carson
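
One way to check whether a run produced any models is to tally the features in the merged GFF3 by source and type. This is a minimal Python sketch under two assumptions: the file is tab-separated GFF3 with nine columns, and final gene models carry the source `maker` in column 2 while evidence and raw predictions carry other sources (for example `protein2genome` or `snap_masked`). The function names and the example lines are hypothetical.

```python
from collections import Counter


def count_features(gff_lines):
    """Count GFF3 features by (source, type), i.e. columns 2 and 3."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed feature line
        counts[(cols[1], cols[2])] += 1
    return counts


def has_gene_models(counts, source="maker"):
    """True if at least one gene feature with the given source exists."""
    return counts[(source, "gene")] > 0


# Hypothetical example: evidence alignments and a raw prediction, but no final models.
example = [
    "##gff-version 3",
    "scaffold_1\tprotein2genome\tprotein_match\t100\t900\t.\t+\t.\tID=hit1",
    "scaffold_1\tsnap_masked\tmatch\t120\t880\t.\t+\t.\tID=pred1",
]

counts = count_features(example)
for (source, ftype), n in sorted(counts.items()):
    print(source, ftype, n)
print("gene models present:", has_gene_models(counts))
```

For a real file, the same function can be called as `count_features(open(path))`. If evidence sources appear but no `maker` genes do, the problem lies in how models are being generated rather than in the SNAP training step.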



Re: Ab initio gene prediction; 0 genes when creating HMM via SNAP

lucys-world

Hello Carson, hello Daniel,


thank you for your fast reply and help.


To Daniel's question:

Yes, unfortunately I had protein2genome=1 in all runs.


To Carson:

After reading through the forum I realized that I had misunderstood ab initio gene prediction. I thought one had to perform three MAKER runs in total: one training run and then two MAKER runs for annotation. But now I think there are only two MAKER runs to perform in total (one training run and then one annotation run). Is that correct?

So after my first run I created an HMM based on the first gene-stats (with 7445 genes) and performed my second run with this HMM. Then I tried to create a new HMM based on the output of my second run. I think that is not necessary, since the output of the second run should be my annotated genome?


I think I have to redo my MAKER runs, and for that I have two questions regarding maker_opts.ctl:


1. Training run: For this I have to give MAKER my genome and my evidence (in my case the BUSCO and Swiss-Prot data sets) and set protein2genome=1. Since that is my only evidence, I don't change anything else? (I don't add anything in the gene prediction section?)


2. Annotation run: With the GFF output of the training run I create my own HMM with SNAP. In maker_opts.ctl I then add my SNAP HMM for this annotation run and set augustus_species to the most closely related species (as recommended in the Augustus manual). Is that correct? Do I also provide my protein evidence, as I did in the training run?
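
The two runs described above differ in only a few lines of maker_opts.ctl, which the following minimal Python sketch makes explicit. It is an illustration under assumptions, not a confirmed recipe: the option names (`genome`, `protein`, `protein2genome`, `snaphmm`, `augustus_species`) are the ones found in maker_opts.ctl, but the template is abbreviated, the file names `genome.fasta` and `my_genome.hmm` are hypothetical, and switching `protein2genome` back to 0 for the annotation run follows the usual MAKER training workflow rather than anything stated in this thread. The helper `set_ctl_options` is hypothetical as well.

```python
import re


def set_ctl_options(ctl_text, updates):
    """Return ctl_text with the given `key=value` lines rewritten.

    Trailing `#comment` text on each line is preserved. Raises KeyError
    if a requested option does not occur in the control file.
    """
    remaining = dict(updates)
    out = []
    for line in ctl_text.splitlines():
        m = re.match(r"^(\w+)=([^#]*)(#.*)?$", line)
        if m and m.group(1) in remaining:
            key = m.group(1)
            comment = m.group(3) or ""
            out.append(f"{key}={remaining.pop(key)} {comment}".rstrip())
        else:
            out.append(line)
    if remaining:
        raise KeyError(f"options not found in control file: {sorted(remaining)}")
    return "\n".join(out) + "\n"


# Abbreviated stand-in for maker_opts.ctl (assumption: the real file has many more options).
template = """genome= #genome sequence (fasta file)
protein= #protein sequence file in fasta format
snaphmm= #SNAP HMM file
augustus_species= #Augustus gene prediction species model
protein2genome=0 #infer predictions from protein homology, 1 = yes, 0 = no
"""

# Run 1 (training): protein evidence only, models inferred directly from it.
training = set_ctl_options(template, {
    "genome": "genome.fasta",
    "protein": "BUSCO+SwissPort.fasta",
    "protein2genome": 1,
})

# Run 2 (annotation): same evidence, ab initio predictors switched on.
annotation = set_ctl_options(template, {
    "genome": "genome.fasta",
    "protein": "BUSCO+SwissPort.fasta",
    "snaphmm": "my_genome.hmm",
    "augustus_species": "human",
    "protein2genome": 0,
})

print(training)
print(annotation)
```

Generating both control files from one template keeps the evidence settings identical between runs, so any difference in output can be traced to the predictor settings alone.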



Thank you very much for your time and help with this!


- Lucy




On March 6, 2017 at 20:48, Carson Holt <[hidden email]> wrote:

It looks like you have no genes to train with. So you did something wrong on your second run. Either no gene predictor was running or you provided no evidence for the predictor, so you produced no models.

—Carson

