How to explain the maker results?

dcg@cau.edu.cn
Dear sir,
    I've been using MAKER for my genome annotation, but there are a few things I still don't understand:

    1. After assembly I have many contigs. For the first run I set est2genome=1 and protein2genome=1, using my proteins, ESTs and RNA-seq data as evidence. Which of the approaches below is correct?
    1.1 Each contig has its own GFF. I use each contig's own maker_gff file to build a pyu.hmm (for SNAP), and then use that HMM only for that single contig.
    1.2 I merge all the maker_gff files to build one pyu.hmm (for SNAP), and then use this pyu.hmm for all the contigs.
    
    2. The aim of my project is to find new proteins, so I need to guarantee the rigor of my annotation.
        My plan was to require that each predicted protein aligns to UniProt (reviewed proteins, about 30K in total) with 100% identity and 100% coverage.
        However, if I choose method 1.2 above:
        After the first step (est2genome=1 and protein2genome=1), about 1600 proteins align to UniProt at 100%. After 2 rounds of training (est2genome=0 and protein2genome=0), fewer proteins align at 100%.
        Is my test method reasonable? Why doesn't the final result contain more well-aligned proteins?
        After training and fasta_merge, the output files are index_all.log.all.maker.proteins.fasta, index_all.log.all.maker.snap_masked.proteins.fasta and index_all.log.all.maker.non_overlapping_ab_initio.proteins.fasta. Which of these is the final result?

    
     I'm looking forward to hearing from you. Thanks!
Yours sincerely,

     
Chao Chao


_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
Re: How to explain the maker results?

Carson Holt-2
Use the merged GFF3 to train SNAP; otherwise you won’t have enough models.
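
To see why a single contig rarely provides enough models, it can help to count the gene models per contig in the merged GFF3. The following is only a minimal sketch: it assumes that MAKER gene models appear as rows with source `maker` and type `gene`, and the function name `count_gene_models` and the example records are made up for illustration.

```python
from collections import Counter


def count_gene_models(gff3_text):
    """Count MAKER gene features per contig in GFF3 text (illustrative helper)."""
    counts = Counter()
    for line in gff3_text.splitlines():
        # Stop at the embedded sequence section, if the GFF3 has one.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines and comment/directive lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Assumption: gene models are rows with source "maker" and type "gene".
        if len(cols) == 9 and cols[1] == "maker" and cols[2] == "gene":
            counts[cols[0]] += 1
    return counts


# Hypothetical records, only to show the expected input layout.
example = "\n".join([
    "##gff-version 3",
    "contig1\tmaker\tgene\t100\t900\t.\t+\t.\tID=g1",
    "contig1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=g1-mRNA-1;Parent=g1",
    "contig1\tmaker\tgene\t1500\t2600\t.\t-\t.\tID=g2",
    "contig2\tmaker\tgene\t50\t700\t.\t+\t.\tID=g3",
    "contig2\tblastx\tprotein_match\t60\t650\t.\t+\t.\tID=hit1",
    "##FASTA",
    ">contig1",
    "ACGT",
])

per_contig = count_gene_models(example)
total_models = sum(per_contig.values())
for contig, n in sorted(per_contig.items()):
    print(contig, n)
print("total", total_models)
```

On a real assembly, the per-contig counts are usually far smaller than the total, which is the reason for pooling all contigs before training.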


Also, you can find additional detailed info by searching the mailing list archives: http://groups.google.com/group/maker-devel

I’m not sure what you are asking with the last question. Alignment is not a function of training, and will not be affected by the hmm, but 100% coverage and identity is too strict a threshold even for data derived from the same species.
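
The effect of the threshold can be checked by counting how many predicted proteins pass at 100%/100% versus a relaxed cutoff. This is a rough sketch under stated assumptions: the alignment hits are in a tab-separated table with the columns query ID, subject ID, percent identity, query start, query end and query length (an assumed layout, not something taken from the original post), and the relaxed cutoff of 95% identity and 90% coverage is an arbitrary illustrative value, not a recommendation from this thread.

```python
def parse_hits(tsv_text):
    """Parse tab-separated hits: qseqid, sseqid, pident, qstart, qend, qlen (assumed layout)."""
    hits = []
    for line in tsv_text.strip().splitlines():
        qseqid, sseqid, pident, qstart, qend, qlen = line.split("\t")
        hits.append({
            "query": qseqid,
            "subject": sseqid,
            "pident": float(pident),
            "qstart": int(qstart),
            "qend": int(qend),
            "qlen": int(qlen),
        })
    return hits


def query_coverage(hit):
    """Percent of the query protein covered by the aligned region."""
    return 100.0 * (hit["qend"] - hit["qstart"] + 1) / hit["qlen"]


def count_passing(hits, min_identity, min_coverage):
    """Number of distinct query proteins with at least one hit meeting both cutoffs."""
    passing = {
        h["query"]
        for h in hits
        if h["pident"] >= min_identity and query_coverage(h) >= min_coverage
    }
    return len(passing)


# Hypothetical hits, for illustration only.
example_hits = parse_hits(
    "protA\tsp|X1\t100.0\t1\t300\t300\n"
    "protB\tsp|X2\t99.3\t1\t300\t300\n"
    "protC\tsp|X3\t100.0\t5\t300\t300\n"
    "protD\tsp|X4\t62.0\t1\t150\t300\n"
)

strict = count_passing(example_hits, 100.0, 100.0)
relaxed = count_passing(example_hits, 95.0, 90.0)  # illustrative cutoff only
print("strict:", strict, "relaxed:", relaxed)
```

In this toy table a single mismatch (protB) or a few unaligned residues at the start (protC) is enough to fail the strict test, even though both are near-perfect matches.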

—Carson




On May 3, 2017, at 9:29 AM, [hidden email] wrote:
