I am hoping you can help me finish preparing my MAKER-annotated genome for submission. I have renamed the genes for GenBank submission using Support Protocol 2 in Campbell et al., "Genome Annotation and Curation Using MAKER and MAKER-P," Curr Protoc Bioinformatics. 2014; 48: 4.11.1–4.11.39 (PMC4286374), and I have used Support Protocol 3 from the same paper to assign putative gene functions. However, I run into problems when I try to convert the GFF file to tbl format for submission. I have tried scripts from GAG (Genome Annotation Generator) and MAKER (gff32table). Both work wonderfully on the GFF originally output by MAKER, but neither works once I rename the genes for GenBank submission. A GFF validator reports that my GFF is valid before renaming but no longer valid afterward. I have been trying to work out what happens to my GFF during the renaming in Support Protocol 2, but I am stumped. Has anyone else had a similar issue? I would be very thankful for any insight you can provide!
Not sure if this will be helpful, but here is an example gene from prior to renaming:
ChromoV|quiver|quiver maker gene 62081 62650 . + . ID=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9;Name=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9
ChromoV|quiver|quiver maker mRNA 62081 62650 . + . ID=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1;Parent=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9;Name=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1;_AED=0.00;_eAED=0.00;_QI=0|-1|0|1|-1|1|1|0|189
ChromoV|quiver|quiver maker exon 62081 62650 . + . ID=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1:exon:11978;Parent=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1
ChromoV|quiver|quiver maker CDS 62081 62650 . + 0 ID=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1:cds;Parent=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1
And after renaming:
ChromoV|quiver|quiver maker gene 62081 62650 . + . ID=A9K44_2555|quiver|quiver-processed-gene-0.9;Name=A9K55_2555|quiver|quiver-processed-gene-0.9;Alias=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9;
ChromoV|quiver|quiver maker mRNA 62081 62650 . + . ID=A9K44_2555|A9K55_2555-RA|quiver-processed-gene-0.9-mRNA-1;Parent=A9K55_2555|A9K55_2555-RA|quiver-processed-gene-0.9;Name=A9K55_2555|A9K55_2555-RA|quiver-processed-gene-0.9-mRNA-1;Alias=augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9-mRNA-1;_AED=0.00;_QI=0|-1|0|1|-1|1|1|0|189;_eAED=0.00;
ChromoV|quiver|quiver maker exon 62081 62650 . + . ID=A9K44_2555-RA|quiver|quiver-processed-gene-0.9-mRNA-1:exon:11978;Parent=A9K55_2555-RA|quiver|quiver-processed-gene-0.9-mRNA-1;
ChromoV|quiver|quiver maker CDS 62081 62650 . + 0 ID=A9K44_2555-RA|quiver|quiver-processed-gene-0.9-mRNA-1:cds;Parent=A9K55_2555-RA|quiver|quiver-processed-gene-0.9-mRNA-1;
The commands I used were:
% maker_map_ids --prefix A9K44_ --justify 4 myfilename.gff > myfilename.map
% map_gff_ids myfilename.map myfilename.gff
maker-devel mailing list
This may be totally off-base, but I have a vague memory that some validators complain about a semicolon after the last attribute in the column-nine attribute list. It is not clear to me from the specification that this is truly illegal, but I can imagine why a parser might not like to deal with it. In any case, you might try removing that terminal semicolon and see whether that resolves the validation complaint. Apologies in advance if my dim recollection has misled me into wasting your time...
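In case it is useful for trying the suggestion above, here is a minimal Python sketch that strips a terminal semicolon from column nine. It is not part of MAKER; the function names and file paths are made up for illustration, and it assumes a tab-delimited GFF3 where every feature line has exactly nine columns.

```python
def strip_trailing_semicolon(line: str) -> str:
    """Return a GFF3 line with any terminal ';' removed from column nine."""
    stripped = line.rstrip("\n")
    # Leave blank lines, comments, and directives (e.g. ##gff-version) alone.
    if not stripped or stripped.startswith("#"):
        return stripped
    cols = stripped.split("\t")
    # Anything that is not a nine-column feature line (e.g. embedded FASTA)
    # is passed through unchanged.
    if len(cols) != 9:
        return stripped
    cols[8] = cols[8].rstrip().rstrip(";")
    return "\t".join(cols)


def clean_file(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path, cleaning each line (hypothetical helper)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(strip_trailing_semicolon(line) + "\n")
```

Calling clean_file("renamed.gff", "renamed.nosemi.gff") (both names are placeholders) writes a copy that can be fed back into the validator to see whether the complaint goes away.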
On 3/20/17 7:37 PM, Glenna Kramer wrote:
-- ...all concepts in which an entire process is semiotically concentrated elude definition; only that which has no history is definable. Friedrich Nietzsche
In reply to this post by Glenna Kramer
The problem appears to be the multiple ‘|’ characters in your contig names (ChromoV|quiver|quiver). They end up in the gene ID, and since ‘|’ has a special meaning in Perl, it causes weird replacement behavior. I’ve attached two scripts that fix that.
Use them to replace their counterparts in the …/maker/bin/ and .../maker/src/bin/ directories, then rerun all renaming steps on a new GFF3 (not the one you already tried to rename). You may also want to consider changing the IDs in the assembly itself before you release it or use it for analysis: remove the ‘|quiver|quiver’ tail from every contig name. That tail can cause hidden errors in downstream analyses with other tools for the same reason, since ‘|’ characters have special meaning.
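To illustrate the two points above, here is a rough Python sketch. It is not the code in the MAKER scripts; it only shows, using Python's re module as a stand-in for a Perl substitution, how an unescaped ‘|’ in an ID acts as regex alternation, and one possible way to drop the tail from FASTA headers. The helper strip_pipe_tail is hypothetical, and it assumes that cutting each name at its first ‘|’ still leaves every contig name unique.

```python
import re

old_id = "augustus_masked-ChromoV|quiver|quiver-processed-gene-0.9"
new_id = "A9K44_2555"

# Unescaped, the ID is read as three alternatives separated by '|', so only
# the leading "augustus_masked-ChromoV" piece is matched and replaced.
broken = re.sub(old_id, new_id, old_id, count=1)
print(broken)  # A9K44_2555|quiver|quiver-processed-gene-0.9

# Escaping the ID makes every character literal, so the whole ID is replaced.
fixed = re.sub(re.escape(old_id), new_id, old_id, count=1)
print(fixed)  # A9K44_2555


def strip_pipe_tail(line: str) -> str:
    """Cut a FASTA header name at its first '|'; return other lines as-is."""
    if not line.startswith(">"):
        return line
    # Keep any description that follows the first space unchanged.
    name, sep, rest = line[1:].partition(" ")
    return ">" + name.split("|", 1)[0] + sep + rest


print(strip_pipe_tail(">ChromoV|quiver|quiver"))  # >ChromoV
```

The partially replaced ID printed first has the same shape as the gene ID in the renamed example earlier in this thread. If the headers are cleaned this way, the annotation would need to be rerun or remapped against the renamed contigs so that column one of the GFF matches the assembly.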