Masking is causing problems with exon/CDS boundary predictions

Salim Bougouffa
Hi,

I am having trouble with MAKER/AUGUSTUS annotation.

One recurring problem is shown in the attached Artemis screenshot. The gene of interest is in red. There are three GFF tracks: the top is standalone AUGUSTUS, which I ran on the unmasked scaffold; the second is the MAKER annotation; and the third is the AUGUSTUS prediction (from MAKER) on the masked query.


The gene clearly has four exons and three CDS segments, but the masking seems to cause AUGUSTUS to get the boundaries wrong for the first and third CDS and to skip the second CDS entirely.

How do I fix this problem?

Best,
/SB
artemis.png

_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
Re: Masking is causing problems with exon/CDS boundary predictions

Carson Holt-2
What do your evidence alignments look like, i.e. the assembled mRNA-seq transcripts and the protein homology alignments? I don't mean the mRNA-seq pileup, because MAKER can't see that.

Masking is applied before evidence seeding and the first ab initio run. It is then removed for evidence polishing around splice sites and for the hint-based rerun of AUGUSTUS. So evidence is allowed to extend through masked regions, and AUGUSTUS can make those regions part of the model on the second run if it wants to. Since it didn't, the question becomes: what do the transcript and protein homology alignments from the MAKER run look like?

—Carson
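
One way to answer the question above is to list the evidence and repeat features that MAKER wrote to its GFF3 output for the region around the gene. The following is a minimal sketch, not part of MAKER itself. It assumes the source column (column 2) uses values such as est2genome, protein2genome, blastn, blastx, tblastx, repeatmasker and repeatrunner; check these against your own file. The function name features_in_region and the coordinates in the usage note are hypothetical.

```python
from collections import defaultdict

# Source-column values assumed for MAKER evidence and repeat features.
# This is an assumption: adjust both sets to match what your GFF3 contains.
EVIDENCE_SOURCES = {"est2genome", "protein2genome", "blastn", "blastx", "tblastx"}
REPEAT_SOURCES = {"repeatmasker", "repeatrunner"}


def features_in_region(gff_lines, seqid, start, end):
    """Group evidence and repeat features overlapping seqid:start-end by source.

    gff_lines is any iterable of GFF3 lines (an open file handle works).
    Coordinates are 1-based and inclusive, as in GFF3.
    """
    wanted = EVIDENCE_SOURCES | REPEAT_SOURCES
    hits = defaultdict(list)
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        chrom, source, ftype = cols[0], cols[1], cols[2]
        fstart, fend = int(cols[3]), int(cols[4])
        if chrom != seqid or source not in wanted:
            continue
        if fstart <= end and fend >= start:  # closed-interval overlap test
            hits[source].append((ftype, fstart, fend))
    return dict(hits)
```

For example, calling features_in_region(open("scaffold.gff"), "scaffold_1", 12000, 15000) with a hypothetical file name and coordinates returns a dictionary keyed by source. If the est2genome and protein2genome entries do not cover the missing CDS while a repeatmasker entry does, that suggests the hint-based rerun had no evidence to carry the model through the masked region.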





On Feb 8, 2017, at 12:22 AM, Salim Bougouffa <[hidden email]> wrote:
