Pseudogene identification

Pseudogene identification

Quanwei Zhang
Hello:

We used Maker2 to annotate a new rodent genome. Using the annotated genes, we ran a gene family expansion analysis and found several gene families that are expanded in the new rodent genome. However, we want to check whether some of the annotated genes are in fact pseudogenes that inflate the apparent expansion. Do you have any suggestions on this?

We found that Maker-P can annotate pseudogenes, but we are not sure whether it is worth repeating our annotation with Maker-P. We are also not sure whether the default parameters of Maker-P are suitable for a rodent species. Moreover, as I understand it, Maker-P identifies pseudogenes in the intergenic space, which I think means the annotated coding genes themselves will not be tested.

Do you have any suggestions for solving our problem? We do not want to identify pseudogenes genome-wide; we only want to check the genes showing expansion, to make sure all those gene copies are really functional.

Many thanks

Best
Quanwei


_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Re: Pseudogene identification

Carson Holt-2
The MAKER-P fork was merged back into standard MAKER with version 2.29 (roughly 3 years ago - a separate download no longer exists). This is because MAKER-P’s functionality is almost entirely in accessory scripts and written protocols. The …/maker/bin/maker called by both MAKER2 and MAKER-P is actually the same script. So no need to rerun, because if you are using version 2.29 or later, you already ran it.

Pseudogene calling is therefore handled by accessory scripts and protocols you can find here —> http://shiulab.plantbiology.msu.edu/wiki/index.php/Protocol:Pseudogene

The other MAKER-P protocols can be found here —> http://www.yandell-lab.org/software/maker-p.html 

--Carson



On Jul 31, 2017, at 5:02 PM, Quanwei Zhang <[hidden email]> wrote:


Re: Pseudogene identification

Quanwei Zhang
Hi Carson:

I took a look at the description of the pseudogene identification pipeline. As I understand it, the pipeline takes the annotated protein-coding genes (predicted by Maker2) as input, i.e., as query sequences, searches the intergenic space (so the annotated genes themselves are not checked), and reports the regions that show a certain level of similarity to the annotated genes.
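As a rough illustration of what "searching the intergenic space" means, here is a minimal Python sketch that derives the intergenic intervals of one scaffold from gene coordinates. It is not part of the Shiu Lab protocol or of MAKER; the function name, the 1-based inclusive coordinates (as in GFF3), and the example numbers are all assumptions made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end), as in GFF3


def intergenic_intervals(genes: List[Interval], scaffold_length: int) -> List[Interval]:
    """Return the regions of one scaffold not covered by any gene interval."""
    gaps: List[Interval] = []
    next_free = 1  # first position not yet covered by a gene
    for start, end in sorted(genes):
        if start > next_free:
            # Uncovered stretch between the previous gene and this one.
            gaps.append((next_free, start - 1))
        # Overlapping or nested genes must not move the pointer backwards.
        next_free = max(next_free, end + 1)
    if next_free <= scaffold_length:
        gaps.append((next_free, scaffold_length))
    return gaps


if __name__ == "__main__":
    # Hypothetical gene coordinates on a 1,000 bp scaffold; two genes overlap.
    genes = [(100, 200), (150, 300), (600, 700)]
    print(intergenic_intervals(genes, 1000))
```

Anything inside a gene interval is excluded from the search space by construction, which is why a search restricted to these regions cannot say whether an annotated gene is itself a pseudogene.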

If my understanding is correct, I think this is not what I want. Based on the coding genes annotated by Maker2, we found that some gene families are expanded in the new species. We want to check that all the gene copies in the expanded families are really functional, i.e., that they are not pseudogenes.

Do you think the pseudogene identification pipeline can help with this goal? Or do you have any other suggestions?

Many thanks

Best
Quanwei

2017-07-31 19:54 GMT-04:00 Carson Holt <[hidden email]>:

Re: Pseudogene identification

Quanwei Zhang
Dear Carson:

Thanks again for all your previous help. I am still trying to identify pseudogenes, and I have two goals:
(1) Check which predicted protein-coding genes are in fact pseudogenes (in particular, genes belonging to the gene families found to be expanded in our genome).
(2) Identify pseudogenes in the intergenic regions.
But the protocol at "http://shiulab.plantbiology.msu.edu/index.php/Protocol:Pseudogene" only describes the identification of pseudogenes in the intergenic regions. I wonder whether I can simply follow the pipeline and run it on the whole genome to achieve both goals, or whether I need to adjust the pipeline.

At first, I thought the pipeline was only useful for identifying pseudogenes in the intergenic regions, but the paper quoted below mentions that the pipeline was applied to the whole genome. So I think I may be able to do the same, but I am not sure whether I can simply run the same pipeline ("http://shiulab.plantbiology.msu.edu/index.php/Protocol:Pseudogene") on the whole genome.

The paper "MAKER-P: A Tool Kit for the Rapid Creation, Management, and Quality Control of Plant Genome Annotations" says: "If the analysis pipeline is applied to the whole genome, 2.5% and 0.6% of currently annotated protein-coding genes are identified as pseudogenes due to the presence of misidentified stops and frame shifts, respectively, indicating that the false-positive rate of our pipeline is 3.1%."
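To make the two features named in that sentence concrete (stops and frame shifts), here is a minimal Python sketch of a first-pass screen over coding sequences. It is not the pipeline from the paper or the Shiu Lab protocol. It assumes the CDS of each gene of interest has already been extracted into a dictionary, uses the standard nuclear stop codons, and treats a CDS length that is not a multiple of three as a crude proxy for a frame problem. The function names and gene IDs are hypothetical.

```python
from typing import Dict, List

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard nuclear genetic code


def screen_cds(cds: str) -> List[str]:
    """Return the problems found in one coding sequence (empty list if none)."""
    seq = cds.upper()
    problems: List[str] = []
    if len(seq) % 3 != 0:
        problems.append("length_not_multiple_of_3")
    # Split into complete codons only; a trailing partial codon is dropped.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The last complete codon is skipped because it is expected to be the stop.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("internal_stop")
    return problems


def flag_candidates(cds_by_gene: Dict[str, str]) -> Dict[str, List[str]]:
    """Keep only the genes for which screen_cds reports at least one problem."""
    flagged: Dict[str, List[str]] = {}
    for gene_id, cds in cds_by_gene.items():
        problems = screen_cds(cds)
        if problems:
            flagged[gene_id] = problems
    return flagged


if __name__ == "__main__":
    # Hypothetical gene IDs and toy sequences, for illustration only.
    family = {
        "geneA": "ATGAAATAG",     # intact: start, one codon, terminal stop
        "geneB": "ATGTAGAAATAA",  # in-frame stop before the end
        "geneC": "ATGAAATA",      # length is not a multiple of three
    }
    print(flag_candidates(family))
```

A screen like this only looks at the annotated CDS itself, so it can be restricted to the members of the expanded families. A gene model that passes it is not thereby shown to be functional; it only lacks these two obvious defects.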

Many thanks for your help.

Best
Quanwei


2017-08-01 9:32 GMT-04:00 Quanwei Zhang <[hidden email]>: