Hi Claudia,
I doubt if there is a maker preference to fix this, but I'll cc the maker list just in case.

I also don't know of a script that will do this for you, though it wouldn't be terribly hard to write a perl script that did it (possibly even in one line :-) Anyone want to have a shot at perl golf? :-)

Scott

On Wed, Dec 8, 2010 at 9:47 AM, claudia <[hidden email]> wrote:
> Hi thank you for the quick reply,
> I realize I could manually edit the GFF3, but I have a database full of
> files like this produced from 'Maker', is there any script available, or
> Maker preference to change this?
>
> Claudia
>
> On 08/12/2010 9:18 AM, Scott Cain wrote:
>> Hi Claudia,
>>
>> Questions about Chado are best sent to the schema mailing list, which
>> I cc'ed here.
>>
>> The problem you are having is that the comma has special meaning in
>> column nine of a GFF3 file, indicating more than one value, so that
>> feature really has two names,
>> "30128.m008887#Guanosine-5'-triphosphate" and "3'-diphosphate", which
>> isn't allowed. In order for that to be the name, the comma needs to
>> be URI escaped, which is to say, replaced with "%2C".
>>
>> Scott
>>
>> On Tue, Dec 7, 2010 at 2:54 PM, Dinatale C<[hidden email]> wrote:
>>> To whom it may concern,
>>>
>>> I am attempting to load a preprocessed gff3 file that is a merged gff3
>>> from a maker output and I am getting this response (below) when I use the
>>> gmod bulk load script. Could you shed some light for me in solving this
>>> problem?
>>>
>>> Thank you,
>>>
>>> Claudia DiNatale
>>>
>>> contig00562 blastx protein_match 24 422 187 - .
>>> ID=contig00562:hit:365;Name=30128.m008887#Guanosine-5'-triphosphate,3'-diphosphate;
>>>
>>> A feature may have at most one Name value
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:368
>>> STACK: Bio::FeatureIO::gff::_handle_feature /usr/share/perl5/Bio/FeatureIO/gff.pm:729
>>> STACK: Bio::FeatureIO::gff::next_feature /usr/share/perl5/Bio/FeatureIO/gff.pm:172
>>> STACK: /usr/local/bin/gmod_bulk_load_gff3.pl:777
>>>
>>> _______________________________________________
>>> Gmod-devel mailing list
>>> [hidden email]
>>> https://lists.sourceforge.net/lists/listinfo/gmod-devel

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                          scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)         216-392-3087
Ontario Institute for Cancer Research

_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
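To make the comma problem described in the quoted reply concrete, here is a minimal Perl sketch of how column nine is read under the GFF3 rules. The helper parse_attributes is hypothetical (it is not part of BioPerl or the GMOD loader) and only mimics the split-then-decode order the format prescribes; it may help show why the unescaped line yields two Name values and the escaped one yields a single value.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only (not a BioPerl or GMOD function):
# turn a GFF3 column-nine string into a hash of tag => [values].
sub parse_attributes {
    my ($column9) = @_;
    my %attr;
    for my $pair (split /;/, $column9) {
        next unless length $pair;
        my ($tag, $value) = split /=/, $pair, 2;
        next unless defined $value;
        # An unescaped comma separates multiple values of the same tag.
        my @values = split /,/, $value;
        # Percent escapes such as %2C are decoded only after the split.
        s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge for @values;
        $attr{$tag} = \@values;
    }
    return \%attr;
}

# Column nine from the failing line in the thread.
my $raw = "ID=contig00562:hit:365;Name=30128.m008887#Guanosine-5'-triphosphate,3'-diphosphate;";

# The same string with its single comma replaced by %2C.
(my $fixed = $raw) =~ s/,/%2C/;

printf "raw:   %d Name value(s)\n", scalar @{ parse_attributes($raw)->{Name} };
printf "fixed: %d Name value(s)\n", scalar @{ parse_attributes($fixed)->{Name} };
```

With the example line this reports two Name values for the raw string and one for the escaped string, which is the difference the loader is complaining about.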
Hi Claudia,
As a quick fix, you can use the attached perl file. I think the only assumption I made when writing this (which is perhaps a little more verbose than I would have typically done :-) is that the value of the Name tag ends with a semicolon, as it did in your example line. If the Name value is the last thing on the line, the semicolon isn't required, but it is not unusual for it to be there because of how the file is constructed. If it can't be counted on to be there, the regular expression that finds the commas to replace would have to be changed a little bit.

To use it, do this:

  perl comma-fix.pl problemfile.gff > new_gff_file.gff

which should hopefully do the trick.

Scott
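The attached script itself is not reproduced in this archive, so the following is only a sketch of what a filter along those lines might look like, not the original comma-fix.pl. It assumes tab-delimited GFF3 and uses a hypothetical function, fix_name_commas, whose pattern runs to the next semicolon or to the end of the line, so it does not depend on the trailing semicolon discussed above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the attached script, which is not shown in the thread.
# Replaces every comma inside the Name value of one GFF3 line with %2C.
sub fix_name_commas {
    my ($line) = @_;
    # Name= must start column nine (after a tab) or follow a semicolon.
    # [^;\n]* stops at the next semicolon or at the end of the line,
    # so a trailing semicolon is optional.
    $line =~ s{(^|[;\t])Name=([^;\n]*)}{
        my ($prefix, $value) = ($1, $2);
        $value =~ s/,/%2C/g;
        "${prefix}Name=$value";
    }ge;
    return $line;
}

# Process the files named on the command line; do nothing when none are given.
if (@ARGV) {
    while (my $line = <>) {
        print fix_name_commas($line);
    }
}
```

Saved under a name of your choosing, it would be run the same way as above, with the input file as an argument and the output redirected to a new file. Only the Name tag is touched; tags such as Parent, where a comma legitimately separates several values, are left alone.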
In reply to this post by Scott Cain
cat file.gff | perl -MURI::Escape -ane '$_ =~ s/(Name|ID)=([^\;\n]+)/"$1=".uri_escape($2, ",\x27\#")/ge; print $_' > fixed_file.gff

I am surprised this is not already being escaped in MAKER. Which version are you using?

Thanks,
Carson