GFF no longer valid after renaming genes


GFF no longer valid after renaming genes

Glenna Kramer
Hi there,

I am hoping you can help me finish preparing my MAKER-annotated genome for submission. I renamed the genes for GenBank submission using Support Protocol 2 from Campbell et al., "Genome Annotation and Curation Using MAKER and MAKER-P," Curr Protoc Bioinformatics. 2014; 48: 4.11.1–4.11.39 (PMC4286374). I also used Support Protocol 3 from the same paper to assign putative gene functions.

However, I am running into problems converting the GFF file to the tbl format for submission. I have tried scripts from GAG (Genome Annotation Generator) and MAKER (gff32table). Both work wonderfully on the GFF originally output by MAKER, but neither works once I rename the genes for GenBank submission. When I feed my file into a GFF validator, the GFF is valid before renaming but no longer valid afterwards. I have been trying to troubleshoot what happens to my GFF when I rename as in Support Protocol 2, but I am stumped. Has anyone else had a similar issue? I would be very thankful for any insight you can provide!

Best,
Glenna    

Not sure if this will be helpful, but here is an example gene from prior to renaming:

##gff-version 3
ChromoV|quiver|quiver    maker    gene    62081    62650    .    +    .    ID=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9;Name=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9
ChromoV|quiver|quiver    maker    mRNA    62081    62650    .    +    .    ID=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1;Parent=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9;Name=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1;_AED=0.00;_eAED=0.00;_QI=0|-1|0|1|-1|1|1|0|189
ChromoV|quiver|quiver    maker    exon    62081    62650    .    +    .    ID=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1:exon:11978;Parent=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1
ChromoV|quiver|quiver    maker    CDS    62081    62650    .    +    0    ID=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1:cds;Parent=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1

And after renaming:

##gff-version 3
ChromoV|quiver|quiver    maker    gene    62081    62650    .    +    .    ID=A9K44_2555|quiver|quiver-processed-gene-0.9;Name=A9K55_2555|quiver|quiver-processed-gene-0.9;Alias=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9;
ChromoV|quiver|quiver    maker    mRNA    62081    62650    .    +    .    ID=A9K44_2555|A9K55_2555-RA|quiver-processed-gene-0.9-mRNA-1;Parent=A9K55_2555|A9K55_2555-RA|quiver-processed-gene-0.9;Name=A9K55_2555|A9K55_2555-RA|quiver-processed-gene-0.9-mRNA-1;Alias=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1;_AED=0.00;_QI=0|-1|0|1|-1|1|1|0|189;_eAED=0.00;
ChromoV|quiver|quiver    maker    exon    62081    62650    .    +    .    ID=A9K44_2555-RA|quiver|quiver-processed-gene-0.9-mRNA-1:exon:11978;Parent=A9K55_2555-RA|quiver|quiver-processed-gene-0.9-mRNA-1;
ChromoV|quiver|quiver    maker    CDS    62081    62650    .    +    0    ID=A9K44_2555-RA|quiver|quiver-processed-gene-0.9-mRNA-1:cds;Parent=A9K55_2555-RA|quiver|quiver-processed-gene-0.9-mRNA-1;
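
One thing a validator may be reacting to in records like these is a Parent value that matches no ID in the file. The following is a minimal Python sketch of such a check, not part of MAKER or GAG; the function names `dangling_parents` and `row` are made up for illustration, and the example rows are the renamed gene and mRNA lines above with attributes trimmed to ID and Parent and columns joined by tabs.

```python
def dangling_parents(gff_lines):
    """Return (child_id, parent_id) pairs whose Parent matches no ID in the input."""
    ids, links = set(), []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a feature line
        # Column 9 is a ';'-separated list of key=value pairs.
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        if "ID" in attrs:
            ids.add(attrs["ID"])
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                links.append((attrs.get("ID", "?"), parent))
    return [(child, parent) for child, parent in links if parent not in ids]


def row(feature_type, attributes):
    """Build a tab-separated feature line for the example gene (hypothetical helper)."""
    return "\t".join(["ChromoV|quiver|quiver", "maker", feature_type,
                      "62081", "62650", ".", "+", ".", attributes])


# Gene and mRNA records after renaming, attributes trimmed to ID and Parent.
renamed = [
    "##gff-version 3",
    row("gene", "ID=A9K44_2555|quiver|quiver-processed-gene-0.9;"),
    row("mRNA", "ID=A9K44_2555|A9K55_2555-RA|quiver-processed-gene-0.9-mRNA-1;"
                "Parent=A9K55_2555|A9K55_2555-RA|quiver-processed-gene-0.9;"),
]

for child, parent in dangling_parents(renamed):
    print(f"{child}: Parent {parent} matches no ID")
```

Running the same function over the whole file before and after renaming would show whether the renaming step is what breaks the gene/mRNA/exon links.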

The commands I used were:

% maker_map_ids --prefix A9K44_ --justify 4 myfilename.gff > myfilename.map

% map_gff_ids myfilename.map myfilename.gff


_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Re: GFF no longer valid after renaming genes

adf_ncgr
Hi Glenna-
This may be totally off-base, but I have a vague memory that some validators complain about a
semicolon after the last attribute in the column 9 attribute list. It is not clear to me from the specification
that this is truly illegal, but I can imagine why a parser might not like to deal with it. In any case,
you might try removing that terminal semicolon and see whether that resolves the validation complaint.
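
A quick way to test this idea is to strip the trailing semicolon from column 9 and validate again. The sketch below is one possible way to do that in Python; the function name `strip_trailing_semicolon` is made up for illustration, and it assumes tab-separated GFF3 feature lines with nine columns.

```python
def strip_trailing_semicolon(line):
    """Drop any trailing ';' from column 9 of a tab-separated GFF3 feature line."""
    if line.startswith("#"):
        return line  # directives and comments are left untouched
    body = line.rstrip("\n")
    newline = line[len(body):]  # keep the original line ending, if any
    cols = body.split("\t")
    if len(cols) == 9:
        cols[8] = cols[8].rstrip(";")
    return "\t".join(cols) + newline


example = "ChromoV\tmaker\tgene\t62081\t62650\t.\t+\t.\tID=g1;Name=g1;\n"
print(strip_trailing_semicolon(example), end="")
```

Applying the function to every line of the renamed file and validating the result would show whether the semicolon is what the validator objects to.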

Apologies in advance if my dim recollection has misled me into wasting your time...

Andrew Farmer

In reply to Glenna Kramer's message of 3/20/17, 7:37 PM.

-- 
...all concepts in which an entire process is semiotically concentrated
elude definition; only that which has no history is definable.

Friedrich Nietzsche


Re: GFF no longer valid after renaming genes

Carson Holt-2
The problem appears to be the multiple '|' characters in your contig names (ChromoV|quiver|quiver). They end up in the gene IDs, and because '|' has a special meaning in Perl (it is the alternation operator in regular expressions), the renaming produces odd replacement behavior. I've attached two scripts that will fix that.
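
The effect can be reproduced outside MAKER. The sketch below uses Python's `re` module as a stand-in for Perl's regular expressions; it is an illustration only, not the code of the attached scripts, and it assumes the old ID is used as a search pattern without escaping. In Perl, the usual equivalent of `re.escape` is `quotemeta` or `\Q...\E`.

```python
import re

old_id = "augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9"
new_id = "A9K44_2555"
attribute = "ID=" + old_id

# Unescaped: '|' splits the pattern into three alternatives, so the first
# match is just "augustus_masked-ChromoV" and only that part is replaced.
broken = re.sub(old_id, new_id, attribute, count=1)

# Escaped: the whole ID is treated as literal text and replaced in one piece.
fixed = re.sub(re.escape(old_id), new_id, attribute, count=1)

print(broken)  # ID=A9K44_2555|quiver|quiver-processed-gene-0.9
print(fixed)   # ID=A9K44_2555
```

The unescaped result has the same shape as the gene ID in the renamed GFF, with the new name followed by the leftover "|quiver|quiver-processed-gene-0.9".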
Use the attached scripts to replace their counterparts in the …/maker/bin/ and .../maker/src/bin/ directories, then rerun all renaming steps on a fresh GFF3 (not the one you already tried to rename). You may also want to change the IDs in the assembly itself before you release it or use it for analysis, by removing the '|quiver|quiver' tail from every contig name. That tail can cause hidden errors in other downstream tools for the same reason, since '|' characters have special meaning.
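
For the contig names, something along these lines could work. It is a minimal Python sketch, not a MAKER tool: the function name `strip_pipe_tail` is made up, and cutting each ID at the first '|' is an assumption that fits names like ChromoV|quiver|quiver. It is worth checking that the shortened names are still unique before using the result.

```python
def strip_pipe_tail(fasta_lines):
    """Yield FASTA lines with everything from the first '|' removed from each sequence ID."""
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].rstrip("\n")
            # The ID is the first whitespace-delimited word; keep any description.
            seq_id, _, description = header.partition(" ")
            seq_id = seq_id.split("|", 1)[0]
            line = ">" + seq_id + (" " + description if description else "") + "\n"
        yield line


example = [">ChromoV|quiver|quiver\n", "ACGTACGT\n"]
print("".join(strip_pipe_tail(example)), end="")
```

Any GFF produced against the old names would no longer match the renamed assembly, so the annotation would need to be regenerated or its first column updated in the same way.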

Thanks,
Carson







Attachments: map_fasta_ids (1K), map_gff_ids (5K)