gmod_bulk_load_gff3.pl

gmod_bulk_load_gff3.pl

Becksfort, Jared

Hello,

I keep getting an error that says such-and-such CDS or UTR “does not have a parent:  I think that is wrong!”  I have sorted the GFF3 file using the GMOD GFF3 preprocessor, and I have validated the file using http://modencode.oicr.on.ca/cgi-bin/validate_gff3_online.

It only happens for CDS and UTR features.  All of them have a Parent attribute in the ninth column, and the parent exists and appears above its children in the file.  As far as I can tell, there is nothing wrong with the file.  I also tried the --noexon option, but that didn’t work either.

Has this happened to anyone else?  Were you able to solve the problem?  If so, what did you do to solve it?  My GFF file is about 0.5 MB, so I can attach it if I need to (I didn’t this time).

Thanks,

Jared Becksfort

Email Disclaimer: www.stjude.org/emaildisclaimer

------------------------------------------------------------------------------


_______________________________________________
Gmod-schema mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-schema

Re: gmod_bulk_load_gff3.pl

Scott Cain
Hi Jared,

It is possible for the GFF to be valid (passing the validator) and run
through the preprocessor (which might not be needed anyway--I would
suggest you only use that if you know you need it), and still give
this message.  Where did the GFF come from?  Are you sure that the CDS
and UTR Parent attributes refer to a feature that already exists?

On a possibly relevant note: the Chado loader doesn't support grouping
features like CDSes by ID; they must have a Parent attribute that
points at something else, like a transcript or gene.

You can attach the GFF file and send it to the list if you can't sort it out.

Scott
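
A quick way to confirm, independently of the loader, that every Parent attribute names an ID defined on an earlier line is a small script like the one below. This is only a file-level sketch in Python; it does not reproduce the loader's own logic, and the function name `find_orphans` and the sample records are made up for illustration.

```python
import io
from urllib.parse import unquote


def find_orphans(lines):
    """Return (line_number, feature_type, parent_id) for each Parent that
    names an ID not defined on an earlier line."""
    seen_ids = set()
    orphans = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more feature lines
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a nine-column feature line
        attrs = {}
        for pair in cols[8].split(";"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                attrs[key.strip()] = value.strip()
        # A feature may list several parents separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent and unquote(parent) not in seen_ids:
                orphans.append((lineno, cols[2], unquote(parent)))
        if "ID" in attrs:
            seen_ids.add(unquote(attrs["ID"]))
    return orphans


# Hypothetical three-feature file: the second CDS points at a missing parent.
sample = io.StringIO(
    "##gff-version 3\n"
    "chrY\tAceView\tmRNA\t100\t900\t.\t+\t.\tID=mrna1\n"
    "chrY\tAceView\tCDS\t150\t300\t.\t+\t0\tParent=mrna1\n"
    "chrY\tAceView\tCDS\t400\t600\t.\t+\t0\tParent=mrna2\n"
)
for lineno, ftype, parent in find_orphans(sample):
    print(f"line {lineno}: {ftype} refers to unknown Parent {parent}")
```

To run it on a real file, pass an open file handle instead of the sample, for example `find_orphans(open("my.gff3"))`. An empty result means the Parent references are consistent within the file, which points the search toward the loader or the database rather than the GFF.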


On Tue, May 25, 2010 at 3:24 PM, Becksfort, Jared
<[hidden email]> wrote:

> Hello,
>
>
>
> I keep getting an error that says such-and-such CDS or UTR “does not have a
> parent:  I think that is wrong!”  I have sorted the gff3 file using the gmod
> gff3 preprocessor, and I have validated the file using
> http://modencode.oicr.on.ca/cgi-bin/validate_gff3_online.

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


Re: gmod_bulk_load_gff3.pl

Becksfort, Jared
Scott,

Thank you very much for replying.

The attached file is, I believe, a public AceView file for Hg18.  When you ask if I am sure that the parent attributes refer to a feature that already exists, are you asking if they are in the same file or if they are in the database prior to loading this file?  The parents appear in the file on earlier lines, but they are not in the database prior to loading this file.

When the load completes, the feature and featureloc tables are populated, but the feature_relationship table has no rows.  That could be an unrelated problem (or not a problem at all??), but I am trying to eliminate possible causes.  I figure that it is a problem because how else would the parent relationships be captured?

Thanks again!
Jared

-----Original Message-----
From: Scott Cain [mailto:[hidden email]]
Sent: Tuesday, May 25, 2010 3:58 PM
To: Becksfort, Jared
Cc: [hidden email]
Subject: Re: [Gmod-schema] gmod_bulk_load_gff3.pl

Hi Jared,

It is possible for the GFF to be valid (passing the validator) and run
through the preprocessor (which might not be needed anyway--I would
suggest you only use that if you know you need it), and still give
this message.  Where did the GFF come from?  Are you sure that the CDS
and UTR Parent attributes refer to a feature that already exists?

Attachment: AceView_hg18_chrY.gff3.sorted (803K)

Re: gmod_bulk_load_gff3.pl

Scott Cain
Hi Jared,

When I load that file, it loads fine for me (after I added a line at
the top for chrY; you must already have chrY in your database or it
would have complained about that first).

What versions of BioPerl and Chado are you using?  I just released
gmod-1.1 (Chado) on Monday, but if you are using Chado from svn, that
should be fine.  If you are using the gmod-1.0 release, please upgrade
and try again.

Scott


On Wed, May 26, 2010 at 11:51 AM, Becksfort, Jared
<[hidden email]> wrote:

> Scott,
>
> Thank you very much for replying.
>
> The attached file is, I believe, a public AceView file for Hg18.  When you ask if I am sure that the parent attributes refer to a feature that already exists, are you asking if they are in the same file or if they are in the database prior to loading this file?  The parents appear in the file on earlier lines, but they are not in the database prior to loading this file.

Re: gmod_bulk_load_gff3.pl

Becksfort, Jared
Hi Scott,

I upgraded to gmod-1.1.  I ran the bulk loader on the GFF3 file and accidentally used the older schema (because I forgot to update $CHADO_ROOT), and it worked.  Thinking all was good, I then loaded the file into the newer 1.1 schema on a completely new database and got this error:

Loading data into feature table ...
Loading data into featureloc table ...
Loading data into feature_relationship table ...
ERROR:  invalid input syntax for integer: ""
CONTEXT:  COPY feature_relationship, line 4030, column type_id: ""
STATEMENT:  COPY feature_relationship (feature_relationship_id,subject_id,object_id,type_id) FROM STDIN;
DBD::Pg::db pg_endcopy failed: ERROR:  invalid input syntax for integer: ""
CONTEXT:  COPY feature_relationship, line 4030, column type_id: "" at /opt/apps/perl/perl-5.8.9/lib/site_perl/5.8.9/Bio/GMOD/DB/Adapter.pm line 3222, <$fh> line 6575.

I get this whether or not I use --noexons or --recreate_cache.  I noticed on one of the support pages that a similar pg_endcopy error has appeared, but that is not the problem here.  The relationship "part_of" cvterm was not hijacked here.  Just to be sure, I ran the queries and it didn't change anything or fix the problem.

To get my version of BioPerl, I ran " perl -MBio::Root::Version -e 'print $Bio::Root::Version::VERSION,"\n"'" and got:

1.006001

Thanks,
Jared

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: calling endcopy for feature_relationship failed:
STACK: Error::throw
STACK: Bio::Root::Root::throw /opt/apps/perl/perl-5.8.9/lib/site_perl/5.8.9/Bio/Root/Root.pm:368
STACK: Bio::GMOD::DB::Adapter::copy_from_stdin /opt/apps/perl/perl-5.8.9/lib/site_perl/5.8.9/Bio/GMOD/DB/Adapter.pm:3222
STACK: Bio::GMOD::DB::Adapter::load_data /opt/apps/perl/perl-5.8.9/lib/site_perl/5.8.9/Bio/GMOD/DB/Adapter.pm:3144
STACK: /nfs_exports/apps/gnu-apps/ergatis/software/gmod-1.1/load/bin/gmod_bulk_load_gff3.pl:1060

-----Original Message-----
From: Scott Cain [mailto:[hidden email]]
Sent: Wednesday, May 26, 2010 11:54 AM
To: Becksfort, Jared
Cc: [hidden email]
Subject: Re: [Gmod-schema] gmod_bulk_load_gff3.pl

Hi Jared,

When I load that file, it loads fine for me (after I added a line at
the top for chrY; you must already have chrY in your database or it
would have complained about that first).

Re: gmod_bulk_load_gff3.pl

Scott Cain
Hi Jared,

Did you create the schema from scratch, including reloading  
ontologies?  I'm wondering if there is something wrong with your  
relationship ontology.  You could try rerunning the load and supplying  
the --save option which will save your load files. Then you could look  
at the file for feature_relationship and feature to figure out what  
feature is causing the problem. Alternatively, you could send me those  
two files (and the gff file if you haven't sent it already), and I can  
try to diagnose it.

Scott
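
Once the load files have been saved, the offending rows can be found with a short script rather than by eye. The sketch below is in Python and rests on assumptions: that the saved feature_relationship file is tab-delimited, as PostgreSQL's COPY text format is by default, and that its columns follow the order in the COPY statement from the error message. The function name `rows_with_empty_field` and the sample rows are hypothetical.

```python
import io

# Column order taken from the COPY statement in the error message.
COLUMNS = ("feature_relationship_id", "subject_id", "object_id", "type_id")


def rows_with_empty_field(lines, column="type_id"):
    """Yield (line_number, row_dict) for rows whose `column` is empty."""
    index = COLUMNS.index(column)
    for lineno, line in enumerate(lines, start=1):
        # Strip only the newline so a trailing empty field is preserved.
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(COLUMNS):
            continue  # skip anything that is not a four-column data row
        if fields[index] == "":
            yield lineno, dict(zip(COLUMNS, fields))


# Hypothetical sample: the second row is missing its type_id.
sample = io.StringIO("1\t10\t5\t42\n2\t11\t5\t\n")
for lineno, row in rows_with_empty_field(sample):
    print(f"line {lineno}: subject_id={row['subject_id']} has no type_id")
```

The subject_id of each reported row can then be looked up in the saved feature file to see which feature type is failing to get a relationship type.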

On May 27, 2010, at 2:55 PM, "Becksfort, Jared" <[hidden email]
 > wrote:

> Hi Scott,
>
> I upgraded to gmod-1.1.  I ran the bulk loader on the gff3 file and  
> accidentally used the older schema (because I forgot to update  
> $CHADO_ROOT) and it worked.  I thought all was good and would then  
> load it into the newer 1.1 schema on a completely new database, and  
> I get this error:

Re: gmod_bulk_load_gff3.pl

Becksfort, Jared
Scott,

Removing the database, running "make clean", and then doing a complete reinstall of gmod-1.1 with all the make steps seems to have fixed the problem with the CDS rows.

Thank you very much for your help.

Jared

-----Original Message-----
From: Scott Cain [mailto:[hidden email]]
Sent: Thursday, May 27, 2010 3:12 PM
To: Becksfort, Jared
Cc: [hidden email]
Subject: Re: [Gmod-schema] gmod_bulk_load_gff3.pl

Hi Jared,

Did you create the schema from scratch, including reloading  
ontologies?  I'm wondering if there is something wrong with your  
relationship ontology.  You could try rerunning the load and supplying  
the --save option which will save your load files. Then you could look  
at the file for feature_relationship and feature to figure out what  
feature is causing the problem. Alternatively, you could send me those  
two files (and the gff file if you haven't sent it already), and I can  
try to diagnose it.
