99.98% of repeatmasker features on plus strand, anyone else seen this?


Matt Simenc
Hi everybody,

I just noticed that the vast majority of features with type repeatmasker are on the plus strand in my MAKER GFFs. There are a handful on the minus strand. Has anyone else seen that in their MAKER GFFs?

MAKER 2.31.8

I looked at a standalone RepeatMasker run I did and the features are more evenly distributed between the +/- strands.
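For anyone wanting to check the same thing in their own output, here is a minimal sketch that tallies strands for these features. It assumes the `repeatmasker` label sits in the GFF3 source column (column 2) and that the file may end with a `##FASTA` section; the function name and the `column` parameter are made up for illustration and are not part of MAKER.

```python
from collections import Counter

def strand_counts(gff_lines, label="repeatmasker", column=1):
    """Tally column 7 (strand) for GFF3 rows whose `column` equals `label`.

    `column` is a 0-based index; 1 means the source column, which is an
    assumption about where the label lives in the file being checked.
    """
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # everything after this directive is sequence, not features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed feature row
        if cols[column] == label:
            counts[cols[6]] += 1
    return counts

# Hypothetical usage (the file name is a placeholder):
# with open("my_assembly.all.gff") as handle:
#     print(strand_counts(handle))
```

Passing `column=2` would match on the type column instead, if that is where the label appears in a given file.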


Matt

_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Re: 99.98% of repeatmasker features on plus strand, anyone else seen this?

Carson Holt-2
While transposons that encode proteins technically have a strand, simple repeats and many other repeat types do not, so the algorithms used to find them will not necessarily assign one. For this reason the repeats are treated as strandless, since both strands are masked, and they are arbitrarily assigned to the plus strand to avoid issues with genome browsers that cannot handle strandless features.
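If a downstream tool does accept unstranded features, the arbitrary plus sign could be swapped for the `.` that GFF3 uses for features without a strand. The following is only a sketch of that post-processing idea, not something MAKER does; it assumes the repeat rows carry `repeatmasker` in the source column, and the function name is hypothetical.

```python
def mark_strandless(line, source="repeatmasker"):
    """Return `line` with column 7 set to '.' when column 2 equals `source`.

    Comment lines, blank lines and rows from other sources are returned
    unchanged.
    """
    if not line.strip() or line.startswith("#"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    cols = line.rstrip("\n").split("\t")
    if len(cols) == 9 and cols[1] == source:
        cols[6] = "."  # GFF3 notation for a feature that is not stranded
        return "\t".join(cols) + newline
    return line
```

Whether this is worth doing depends entirely on the browser or tool reading the file, since the plus strand was chosen precisely because some of them reject strandless features.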

—Carson



> On Nov 17, 2017, at 6:39 PM, Matt Simenc <[hidden email]> wrote:
>
> Hi everybody,
>
> I just noticed that the vast majority of features with type repeatmasker are on the plus strand in my MAKER GFFs. There are a handful on the minus strand. Has anyone else seen that in their MAKER GFFs?
>
> MAKER 2.31.8
>
> I looked at a standalone RepeatMasker run I did and the features are more evenly distributed between the +/- strands.
>
>
> Matt

Re: 99.98% of repeatmasker features on plus strand, anyone else seen this?

Carson Holt-2
In reply to this post by Matt Simenc
Also, MAKER clusters overlapping repeats to generate the best masking of the assembly. For the GFF3, it then assigns to the feature the name of the repeat encompassing the greatest portion of the cluster (i.e. the best representative). But the cluster is technically built from overlapping repeats on both strands (repeats tend to jump on top of other repeats, so they stack with bits and pieces of other repeats at the edges). That is yet another reason why everything is just assigned to the plus strand.
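To make the clustering idea concrete, here is a rough sketch of how overlapping hits on both strands can collapse into one plus-strand feature named after the largest member. This is an illustration under simplifying assumptions (hits on a single sequence, 1-based inclusive coordinates, plain interval overlap), not MAKER's actual code; the function and the tuple layout are invented for the example.

```python
def cluster_repeats(hits):
    """Merge overlapping hits and name each cluster after its longest member.

    `hits` is a list of (start, end, strand, name) tuples on one sequence.
    Returns a list of (start, end, strand, name) cluster features, all on '+'.
    """
    clusters = []
    for hit in sorted(hits, key=lambda h: (h[0], h[1])):
        if clusters and hit[0] <= clusters[-1]["end"]:
            # Overlaps the current cluster: extend it and remember the member.
            clusters[-1]["end"] = max(clusters[-1]["end"], hit[1])
            clusters[-1]["members"].append(hit)
        else:
            clusters.append({"start": hit[0], "end": hit[1], "members": [hit]})

    features = []
    for cluster in clusters:
        # Every member lies inside the cluster, so the longest member is the
        # one covering the greatest portion of it.
        best = max(cluster["members"], key=lambda h: h[1] - h[0] + 1)
        # Member strands may disagree, so the cluster is simply reported on '+'.
        features.append((cluster["start"], cluster["end"], "+", best[3]))
    return features
```

With hits such as `(100, 300, "+", "Gypsy")` and `(250, 400, "-", "Copia")`, the two merge into one feature spanning 100 to 400 that carries the Gypsy name, and the minus-strand information from the Copia hit is no longer visible.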

—Carson



Re: 99.98% of repeatmasker features on plus strand, anyone else seen this?

Matt Simenc
Ah, OK. A messy problem! I need to approximate strandedness for TE loci if possible, so I will do some post-processing using BLAST/HMMER against Repbase and Dfam. Thanks for the speedy response, Carson!
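A minimal sketch of what the BLAST half of that post-processing might look like, assuming the TE loci are extracted in plus orientation, used as queries in a nucleotide search (blastn) against the consensus library, and reported in the default 12-column tabular format (`-outfmt 6`). In that format a minus-strand hit shows up as subject start greater than subject end. The function name is hypothetical, and HMMER output has a different layout that is not handled here.

```python
def strand_from_blast(tabular_lines):
    """Infer a strand per query from its best-scoring blastn tabular hit.

    Expects the default outfmt 6 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    Returns {query_id: (strand, subject_id)}.
    """
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        sstart, send = int(fields[8]), int(fields[9])
        bitscore = float(fields[11])
        # Reversed subject coordinates mean the query matches the
        # reverse complement of the consensus sequence.
        strand = "+" if sstart <= send else "-"
        if query not in best or bitscore > best[query][2]:
            best[query] = (strand, subject, bitscore)
    return {q: (strand, subject) for q, (strand, subject, _) in best.items()}
```

Taking only the top-scoring hit is a simplification: for clustered loci built from several stacked repeats, hits on both strands are expected, so the result is an approximation of strandedness at best.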

On Fri, Nov 17, 2017 at 6:23 PM, Carson Holt <[hidden email]> wrote:
Also, MAKER clusters overlapping repeats to generate the best masking of the assembly. For the GFF3, it then assigns to the feature the name of the repeat encompassing the greatest portion of the cluster (i.e. the best representative). But the cluster is technically built from overlapping repeats on both strands (repeats tend to jump on top of other repeats, so they stack with bits and pieces of other repeats at the edges). That is yet another reason why everything is just assigned to the plus strand.

—Carson

