Evidences in fasta format but be alignment in every run?

Evidences in fasta format but be alignment in every run?

Quanwei Zhang
Hello:

I am annotating a new genome using MAKER. I have an RNA-seq assembly and protein sequences (from other organisms) in FASTA format. Because I need to train gene finders, I have to run MAKER several times. I think aligning the transcript assembly and the protein sequences to the genome assembly may be time-consuming. Can I save the alignments from the first run and reuse them in the following runs?

Thanks

Best
Quanwei

_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Re: Evidences in fasta format but be alignment in every run?

Carson Holt-2
MAKER is restartable. As long as you run it in the same location each time, it can reuse the existing alignments from the previous run. You also only need to train on ~10 MB of the genome, depending on gene density. The target size should be 300-400 genes.

If you follow this GMOD wiki tutorial, this is demonstrated (you can also watch the video, linked at the top of the page, to see it being done): http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_GMOD_Online_Training_2014#Training_ab_initio_Gene_Predictors

—Carson
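
As a rough illustration of why the ~10 MB figure depends on gene density, here is a back-of-the-envelope sketch in Python. It reads "MB" as megabases, and the function name and example densities are illustrative assumptions, not MAKER settings or measured values.

```python
def sequence_needed_mb(target_genes, genes_per_mb):
    """Approximate amount of sequence (in megabases) expected to contain
    `target_genes` genes at a density of `genes_per_mb` genes per megabase."""
    if genes_per_mb <= 0:
        raise ValueError("genes_per_mb must be positive")
    return target_genes / genes_per_mb


# Hypothetical densities; replace with an estimate for your own organism.
for density in (20, 35, 100):
    low = sequence_needed_mb(300, density)
    high = sequence_needed_mb(400, density)
    print(f"{density} genes/Mb: {low:.1f}-{high:.1f} Mb for 300-400 genes")
```

At roughly 30-40 genes per megabase, 300-400 genes corresponds to about 10 megabases; a sparser genome would need proportionally more sequence.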



(In reply to Quanwei Zhang's message of Feb 10, 2017, 8:50 AM.)

Re: Evidences in fasta format but be alignment in every run?

Quanwei Zhang
Great, many thanks. So I can select part of my genome assembly for the training. Which of the following ways do you think is better? (a) Select the longest contigs. (b) Randomly select contigs of any length.

Thank you!

Best
Quanwei

(In reply to Carson Holt's message of 2017-02-10 11:39 GMT-05:00.)

Re: Evidences in fasta format but be alignment in every run?

Michael Campbell
The longer ones will have more complete genes on them. If you get a set of scaffolds that has about 1,000 genes, you probably have enough for training.
Mike
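
A minimal sketch of option (a), picking the longest contigs or scaffolds until a target amount of sequence is reached, might look like the following. This is not part of MAKER; the function names, the file names in the usage comment, and the 10,000,000 bp target are assumptions for illustration. Combined length is only a proxy for gene count, which is not known until the subset has been annotated.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def select_longest(records, target_bp):
    """Return the longest records, taken in descending length order,
    until their combined length reaches target_bp."""
    selected, total = [], 0
    for header, seq in sorted(records, key=lambda r: len(r[1]), reverse=True):
        if total >= target_bp:
            break
        selected.append((header, seq))
        total += len(seq)
    return selected


def write_fasta(records, path, width=60):
    """Write (header, sequence) pairs to a FASTA file, wrapping lines."""
    with open(path, "w") as out:
        for header, seq in records:
            out.write(f">{header}\n")
            for i in range(0, len(seq), width):
                out.write(seq[i:i + width] + "\n")


# Example usage (hypothetical file names):
# subset = select_longest(read_fasta("genome.fasta"), target_bp=10_000_000)
# write_fasta(subset, "training_subset.fasta")
```

The subset file can then be used as the genome for the training runs, and the target raised if the first pass yields too few gene models.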
(In reply to Quanwei Zhang's message of Feb 10, 2017, 12:04 PM.)