Problem in feature sync

P. Ziarsolo
Hi,
When I sync features with Chado using the feature_sync option of
Tripal, I have noticed that some features, although they have an
accession in the database, do not get a Drupal feature node.

Looking deeper into the problem, I have found that the first feature
that fails behaves strangely during the sync. Before the feature
sync:

chado_melon=# select * from feature where name='cu6';
 feature_id | dbxref_id | organism_id | name | uniquename | residues | seqlen | md5checksum | type_id | is_analysis | is_obsolete |      timeaccessioned       |      timelastmodified
------------+-----------+-------------+------+------------+----------+--------+-------------+---------+-------------+-------------+----------------------------+----------------------------
      17381 |           |          13 | cu6  | cu6_0      |          |        |             |     210 | f           | f           | 2010-07-09 09:18:45.242713 | 2010-07-09 09:18:45.242713
(1 row)

After the feature sync:

chado_melon=# select * from feature where name='cu6';
 feature_id | dbxref_id | organism_id | name | uniquename | residues | seqlen | md5checksum | type_id | is_analysis | is_obsolete |      timeaccessioned       |      timelastmodified
------------+-----------+-------------+------+------------+----------+--------+-------------+---------+-------------+-------------+----------------------------+----------------------------
      18260 |           |          13 | cu6  | cu6        |          |      0 |             |     210 | f           | f           | 2010-07-12 08:03:39.067277 | 2010-07-12 08:03:39.067277
      17381 |           |          13 | cu6  | cu6_0      |          |        |             |     210 | f           | f           | 2010-07-09 09:18:45.242713 | 2010-07-09 09:18:45.242713
(2 rows)



I don't know why Tripal duplicates the feature. The only difference
between the two features is the uniquename. Is this a bug or a feature
of Tripal?
p.



------------------------------------------------------------------------------
This SF.net email is sponsored by Sprint
What will you do first with EVO, the first 4G phone?
Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first
_______________________________________________
Gmod-tripal mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-tripal
Re: problem in feature sync

Meg Staton
Does the feature with 17381 as the feature_id disappear from the chado database? 

Tripal should not alter the chado database during a feature sync event.  I've scanned the code, and it looks like only select statements are being used for chado, definitely no inserts.

As for not syncing every feature, that is by design in Tripal.  You can specify the types of features to sync on the feature admin page; all other types are ignored.  Also, Tripal only looks at features that are associated with an organism that has been previously synced.
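
The two sync conditions described above can be sketched in Python. This is only an illustration, not Tripal's code: the function name `features_to_sync`, the dictionary layout, and the ids 999 and 77 are hypothetical; type_id 210 and organism_id 13 are taken from the psql output in the original post.

```python
def features_to_sync(features, synced_type_ids, synced_organism_ids):
    """Keep only features whose type is selected for syncing and whose
    organism has already been synced; everything else is ignored."""
    return [
        f for f in features
        if f["type_id"] in synced_type_ids
        and f["organism_id"] in synced_organism_ids
    ]


# Toy data: only the first row satisfies both conditions.
FEATURES = [
    {"uniquename": "cu6_0", "type_id": 210, "organism_id": 13},
    {"uniquename": "other_type", "type_id": 999, "organism_id": 13},      # type not selected
    {"uniquename": "other_organism", "type_id": 210, "organism_id": 77},  # organism not synced
]

selected = features_to_sync(FEATURES, synced_type_ids={210}, synced_organism_ids={13})
```

A feature that passes both checks, like cu6 here, would be expected to get a Drupal node, so a missing node for it points to a different cause.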

I think I remember that the "_0" suffix is something that the GMOD Perl load scripts add when you load two features of the same name, but I could be wrong.

Thanks,
Meg
 
--
Margaret E. Staton
Clemson University Genomics Institute
[hidden email]
864-656-4643



From: P. Ziarsolo <[hidden email]>
To: GMOD Tripal <[hidden email]>
Sent: Mon, July 12, 2010 4:06:58 AM
Subject: [Gmod-tripal] problem infeature sync

Re: problem in feature sync

P. Ziarsolo

On Mon, 2010-07-12 at 07:05 -0700, Meg Staton wrote:
> Does the feature with 17381 as the feature_id disappear from the chado
> database?  
No, it does not disappear. It gets duplicated.
>
> Tripal should not alter the chado database during a feature sync
> event.  I've scanned the code, and it looks like only select
> statements are being used for chado, definitely no inserts.
>
> As far as not syncing every feature, that is a part of tripal.  You
> can specify the types of features to sync on the feature admin page;
> all other types are ignored.  Also, Tripal only looks at features that
> are associated with an organism that has been previously synced.

The cu6 feature is associated with a previously synced organism and is
of a type that is synced.


>
> I think I remember that the "_0" part is something that the gmod perl
> load scripts do when you load two features of the same name, but I
> could be wrong.
>
When I load my GFF3 file into Chado, features don't get duplicated. It
loads fine. It is only after Tripal's feature sync that the feature gets
duplicated in the Chado database.
The selects I posted previously were run just before and just after the
feature sync.

Thanks
p.



Re: problem in feature sync

Stephen Ficklin
Hi P?,

Meg is correct.  Tripal should not be creating features (or duplicating features) during syncing.  If this is occurring for you then this is incorrect behavior which we have not seen in our other installations of Tripal.  

The feature module does have an insert function for adding new features to Chado.  It gets called under two conditions:  1) when a new feature is added manually through the Drupal interface, or 2) when new nodes are created programmatically.  When Tripal "syncs", it creates new nodes in Drupal programmatically, which triggers an insert on the Chado database.  However, the Tripal code recognizes this and does not try to insert the same feature a second time.  The fact that this is occurring is bizarre.  That an "_0" is added to the uniquename is also bizarre, since we have not programmed this type of behavior.

Would you be willing to share your dataset, or at least a subset, so that we can try to load it at our end to see what's going on?  If that's not permissible, I understand.  If you can share, email me off the mailing list and we can coordinate.

Also, what version of Chado are you using?  Are you using Chado v1.0 or v1.1?

Thanks,
Stephen
________________________________________
From: P. Ziarsolo [[hidden email]]
Sent: Tuesday, July 13, 2010 2:34 AM
To: Meg Staton
Cc: GMOD Tripal
Subject: Re: [Gmod-tripal] problem infeature sync

Re: problem in feature sync

Stephen Ficklin
Looking further at this, I realize I made a mistake in my previous response.  Your original sequence was named 'cu6_0'; the duplicated sequence is just named 'cu6'.  I reversed them in my mind.

Yes, you are correct!  This is a bug in Tripal.  I just realized what is happening.  When the insert function is called in Tripal during syncing, it checks to see if the feature exists, but it mistakenly uses the 'name' when querying the 'uniquename' column.  Since the uniquename and name are different for your feature, it thinks the feature doesn't exist and adds it.
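
The mismatch can be reproduced with a small Python sketch against an in-memory SQLite table. This is an illustration under stated assumptions, not Tripal's actual code: the cut-down table schema, the function names (`exists_buggy`, `exists_fixed`, `sync_insert`), and the choice to store the name as the new row's uniquename are all hypothetical, chosen only to mirror the rows shown in the original report.

```python
import sqlite3


def make_db():
    """Create a toy, in-memory stand-in for the Chado feature table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE feature ("
        "feature_id INTEGER PRIMARY KEY, organism_id INTEGER, "
        "name TEXT, uniquename TEXT, type_id INTEGER)"
    )
    # The row that existed before the sync in the original report.
    db.execute("INSERT INTO feature VALUES (17381, 13, 'cu6', 'cu6_0', 210)")
    return db


def exists_buggy(db, feature):
    """Buggy check: the feature's *name* is matched against uniquename."""
    cur = db.execute(
        "SELECT 1 FROM feature WHERE uniquename = ?", (feature["name"],)
    )
    return cur.fetchone() is not None


def exists_fixed(db, feature):
    """Intended check: uniquename is matched against uniquename."""
    cur = db.execute(
        "SELECT 1 FROM feature WHERE uniquename = ?", (feature["uniquename"],)
    )
    return cur.fetchone() is not None


def sync_insert(db, feature, exists):
    """Insert the feature only if the given check reports it missing."""
    if exists(db, feature):
        return False
    # Assumption: the new row reuses the name as its uniquename, which
    # mirrors the duplicated 'cu6' row in the report.
    db.execute(
        "INSERT INTO feature (organism_id, name, uniquename, type_id) "
        "VALUES (?, ?, ?, ?)",
        (feature["organism_id"], feature["name"],
         feature["name"], feature["type_id"]),
    )
    return True


def count_named(db, name):
    """Count feature rows with the given name."""
    return db.execute(
        "SELECT COUNT(*) FROM feature WHERE name = ?", (name,)
    ).fetchone()[0]


CU6 = {"organism_id": 13, "name": "cu6", "uniquename": "cu6_0", "type_id": 210}
```

Calling `sync_insert(make_db(), CU6, exists_buggy)` adds a second 'cu6' row, while the same call with `exists_fixed` leaves the table with the single original row.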

We will correct this and we'll post a bug fix as quickly as we can.

Stephen
________________________________________
From: Stephen Ficklin [[hidden email]]
Sent: Tuesday, July 13, 2010 11:44 PM
To: P. Ziarsolo; Meg Staton
Cc: GMOD Tripal
Subject: Re: [Gmod-tripal] problem infeature sync
