Storing ontology annotations for stocks


Naama Menda
While migrating our plant accession database to Chado's stock module I've encountered an issue with ontology annotations.
Should these go into stock_dbxref or stock_cvterm ?

We also have evidence codes for each annotation, which should be properties. There is a stock_dbxrefprop table, but no stock_cvtermprop table. If stock_cvterm is more appropriate for ontology annotations, then there is a need for a prop table.

The second issue is with having multiple evidence codes for the same ontology term.
If a plant accession is annotated with PO 'fruit ripening' and has 2 evidence codes (e.g. 'involved in fruit ripening' with evidence 'inferred from mutant phenotype', and 'involved in fruit ripening' with evidence 'traceable_author_statement'),
I can't store both evidences in the prop table pointing to one stock_dbxref, since each annotation evidence has multiple properties (such as date, person who submitted the annotation, etc.).
This means we need either a stock_dbxrefpropprop table (not a good idea), or to drop the unique constraint of stock_dbxref, or, if stock_cvterm is used, to assign a different pub_id to each annotation? (This is feasible, although 2 similar annotations with different evidence codes might have the same pub_id.)

Any thoughts?
-Naama
 



Naama Menda
Boyce Thompson Institute for Plant Research
Tower Rd
Ithaca NY 14853
USA

(607) 254 3569
Sol Genomics Network
http://solgenomics.net/
[hidden email]

------------------------------------------------------------------------------
Colocation vs. Managed Hosting
A question and answer guide to determining the best fit
for your organization - today and in the future.
http://p.sf.net/sfu/internap-sfd2d
_______________________________________________
Gmod-schema mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-schema
Re: Storing ontology annotations for stocks

Scott Cain
Hi Naama,

Comments below.

Scott


On Tue, Mar 15, 2011 at 11:16 AM, Naama Menda <[hidden email]> wrote:
> While migrating our plant accession database to Chado's stock module I've
> encountered an issue with ontology annotations.
> Should these go into stock_dbxref or stock_cvterm ?

I think stock_cvterm.
>
> We also have evidence codes for each annotation, which should be properties.
> There is a stock_dbxrefprop table, but no stock_cvtermprop table. If
> stock_cvterm is more appropriate for ontology annotations, then there is a
> need for a prop table.

Yes, stock_cvtermprop is a reasonable thing to do (analogous to
feature_cvtermprop).

>
> The second issue is with having multiple evidence codes for the same
> ontology term.
> If a plant accession is annotated with PO 'fruit ripening', and has 2
> evidence codes (e.g. 'involved in fruit ripening' evidence: inferred from
> mutant phenotype, and 'involved in fruit ripening' evidence
> 'traceable_author_statement' )
> I can't store both evidences in the prop table pointing to one stock_dbxref,
> since each annotation evidence has multiple properties (such as date, person
> who submitted the annotation, etc.).
> This means we need either a stock_dbxrefpropprop table (not a good idea) or
> drop the unique constraint of stock_dbxref , or if stock_cvterm is used,
> then assign a different pub_id to each annotation? (this is feasible,
> although 2 similar annotations with different evidence codes might have the
> same pub_id )

The feature_cvtermprop table has a rank column, which I think would
fix this problem if included in a stock_cvtermprop table, right?
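
A minimal sketch of that suggestion, assuming a stock_cvtermprop table with the same columns and unique key as feature_cvtermprop. The table does not exist in the schema yet, so the DDL here is hypothetical; foreign keys are left out, the ids are placeholders, and SQLite (via Python) stands in for PostgreSQL. It shows two evidence codes attached to one annotation and told apart by rank:

```python
import sqlite3

# Hypothetical, simplified DDL. Column names mirror Chado's feature_cvtermprop;
# foreign keys are omitted and SQLite stands in for PostgreSQL.
DDL = """
CREATE TABLE stock_cvtermprop (
    stock_cvtermprop_id INTEGER PRIMARY KEY,
    stock_cvterm_id     INTEGER NOT NULL,
    type_id             INTEGER NOT NULL,
    value               TEXT,
    rank                INTEGER NOT NULL DEFAULT 0,
    UNIQUE (stock_cvterm_id, type_id, rank)
);
"""

INSERT = (
    "INSERT INTO stock_cvtermprop (stock_cvterm_id, type_id, value, rank) "
    "VALUES (?, ?, ?, ?)"
)


def demo_ranked_props():
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    evidence_code = 1  # placeholder cvterm_id for an 'evidence_code' type
    annotation = 10    # placeholder stock_cvterm_id
    # Same annotation, same prop type, different rank: both rows are accepted.
    conn.execute(INSERT, (annotation, evidence_code, "inferred from mutant phenotype", 0))
    conn.execute(INSERT, (annotation, evidence_code, "traceable_author_statement", 1))
    # Reusing a rank for the same annotation and type violates the unique key.
    try:
        conn.execute(INSERT, (annotation, evidence_code, "duplicate rank", 1))
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True
    rows = conn.execute(
        "SELECT value, rank FROM stock_cvtermprop ORDER BY rank"
    ).fetchall()
    conn.close()
    return rows, duplicate_rejected
```

This only covers the evidence code itself; it says nothing yet about how a date or submitter would be tied to one particular evidence code.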

>
> Any thoughts?
> -Naama



--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


Re: Storing ontology annotations for stocks

Andy Schroeder
In reply to this post by Naama Menda
Hi Naama,

On your first question I would say stock_cvterm and then yes a
stock_cvtermprop table should be added.  This is analogous to our use of
feature_cvterm and feature_cvtermprop.

On your second issue this is something that we have run across at
FlyBase as well.  Our GO curators would like to be able to have
properties of properties as you have described but we also agree that
adding propprop tables is a bad idea.

When we need to link multiple properties (or attributes of properties)
we generally store them together in the value field and parse them as
necessary.

An alternative would be to keep track of ranks for different types to
stock_cvtermprops so each prop of type evidence_code with rank = 1 would
be matched with prop date with rank = 1 and prop author rank = 1 but IMO
this way lies madness.
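
As a rough illustration of the "store them together in the value field" approach: the actual FlyBase format is not described in this thread, so JSON is purely an assumption here, as are the field names and helper functions.

```python
import json


def pack_evidence(**fields):
    """Serialize one evidence record into a single string for prop.value."""
    return json.dumps(fields, sort_keys=True)


def unpack_evidence(value):
    """Parse a prop.value string back into its evidence fields."""
    return json.loads(value)


# Hypothetical evidence record; IMP = inferred from mutant phenotype.
packed = pack_evidence(evidence_code="IMP", date="2011-03-15", person="curator_1")
```

All components of one evidence stay in one row, but the database can no longer check the individual fields, so validation has to happen in the code that packs and parses the value.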

cheers,
Andy


Re: Storing ontology annotations for stocks

Siddhartha Basu
In reply to this post by Scott Cain

Hi Naama,
Overall, this could follow the same pattern used for storing Gene Ontology
annotations with the feature_cvterm/feature_cvtermprop tables. As
Scott proposed, you would use stock_cvterm and create a stock_cvtermprop table for that.

On Tue, 15 Mar 2011, Scott Cain wrote:

> Hi Naama,
>
> Comments below.
>
> Scott
>
>
> On Tue, Mar 15, 2011 at 11:16 AM, Naama Menda <[hidden email]> wrote:
> > While migrating our plant accession database to Chado's stock module I've
> > encountered an issue with ontology annotations.
> > Should these go into stock_dbxref or stock_cvterm ?
>
> I think stock_cvterm.
> >
> > We also have evidence codes for each annotation, which should be properties.

I would suggest making an ontology out of your evidence codes and storing
it in the cvterm module. Then link to it using the stock_cvtermprop table. This
would be similar to having the evidence_code
ontology (http://www.obofoundry.org/cgi-bin/detail.cgi?id=evidence_code) for GO annotations.

> > There is a stock_dbxrefprop table, but no stock_cvtermprop table. If
> > stock_cvterm is more appropriate for ontology annotations, then there is a
> > need for a prop table.
>
> Yes, stock_cvtermprop is a reasonable thing to do (analogous to
> feature_cvtermprop).
>
> >
> > The second issue is with having multiple evidence codes for the same
> > ontology term.
> > If a plant accession is annotated with PO 'fruit ripening', and has 2
> > evidence codes (e.g. 'involved in fruit ripening' evidence: inferred from
> > mutant phenotype, and 'involved in fruit ripening' evidence
> > 'traceable_author_statement' )
> > I can't store both evidences in the prop table pointing to one stock_dbxref,
> > since each annotation evidence has multiple properties (such as date, person
> > who submitted the annotation, etc.).
> > This means we need either a stock_dbxrefpropprop table (not a good idea) or
> > drop the unique constraint of stock_dbxref , or if stock_cvterm is used,
> > then assign a different pub_id to each annotation? (this is feasible,
> > although 2 similar annotations with different evidence codes might have the
> > same pub_id )
>
> The feature_cvtermprop table has a rank column, which I think would
> fix this problem if included in a stock_cvtermprop table, right?

In addition, feature_cvterm itself has a rank column, which allows you to
store 2 similar GO annotations with different evidence codes. I don't
know if it is possible here, but it might be worth considering.

thanks,
-siddhartha


Re: Storing ontology annotations for stocks

Naama Menda
In reply to this post by Scott Cain
Hi Scott,
I'll add the stock_cvtermprop DDL.

As for the rank, it would not solve the problem, since the evidence is composed of multiple properties. It usually has a relationship and an evidence_code, and might also have evidence_description, evidence_with, reference, person, and date.
If we rank the props, will all props of rank = 1 belong to evidence 1 and all props of rank = 2 belong to evidence 2? This might work only if the application knows how to fetch the evidence count and increment the rank on all the evidence components. This means that if evidence 1 has no evidence_with prop but evidence 2 does, the evidence_with prop of evidence 2 will have rank = 2, although there is no evidence_with prop with rank = 1.
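
To make the concern concrete, here is a small sketch of what the application would have to do if rank were used to group props into evidences. The rows, values, and helper names are made up for illustration; nothing here is existing Chado or Bio::Chado::Schema code.

```python
from collections import defaultdict


def group_by_rank(props):
    """Group (type_name, value, rank) prop rows into {rank: {type_name: value}}."""
    evidences = defaultdict(dict)
    for type_name, value, rank in props:
        evidences[rank][type_name] = value
    return dict(evidences)


def next_evidence_rank(props):
    """Rank for a new evidence: one more than the highest rank of ANY prop type."""
    return max((rank for _, _, rank in props), default=0) + 1


# Hypothetical props for one annotation with two evidences.
props = [
    ("evidence_code", "IMP", 1),
    ("person", "curator_a", 1),
    ("evidence_code", "TAS", 2),
    ("evidence_with", "some_accession", 2),  # no evidence_with exists at rank 1
    ("person", "curator_b", 2),
]
```

The rank has to be computed across all prop types of the annotation rather than per type, and nothing in the schema enforces that, which is why this feels fragile.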

-Naama





Re: Storing ontology annotations for stocks

Naama Menda
In reply to this post by Andy Schroeder
Hi Andy,

I agree collecting evidence properties by rank is not a good idea. It's shaky and error-prone.

Concatenating the props into one value ensures all components of the evidence remain together, but it makes maintaining data integrity more difficult.
I'd really like to have a way of storing properties for evidence codes, but with the current schema it seems to me that the only way is to drop the stock_cvterm unique constraint.

Thanks,
-Naama




Re: Storing ontology annotations for stocks

Jonathan "Duke" Leto
Howdy,

> I agree collecting evidence properties by rank is not a good idea. It's
> shaky and error prone.

Agreed.

> Concatenating the props into one value ensures all components of the
> evidence remain together, but it makes keeping data integrity more
> difficult.
> I'd really like to have a way of storing properties for evidence codes, but
> with the current schema it seems to me like the only way is to drop the
> stock_cvterm unique constraint.

+1 to dropping the unique constraint on stock_cvterm

Duke


--
Jonathan "Duke" Leto
[hidden email]
http://leto.net


Re: Storing ontology annotations for stocks

Naama Menda
If we go this way, we need to drop the unique constraint of feature_cvterm as well.
Any objections?

-Naama



Re: Storing ontology annotations for stocks

Andy Schroeder
I assume you mean the feature_cvtermprop and stock_cvtermprop tables and not
feature_cvterm and stock_cvterm?

FlyBase relies on non-primary-key unique constraints on tables, so we
wouldn't be dropping these constraints.  If people really want to drop
the constraint then maybe there should be an option to do so, but I don't
think this should be made the default.

cheers,
Andy


Re: Storing ontology annotations for stocks

Siddhartha Basu
In reply to this post by Naama Menda
Which constraint in feature_cvterm are you referring to? Is it the
'feature_id', 'cvterm_id', 'pub_id', 'rank' unique constraint? It is
quite accurate for storing GO annotations and preserving their integrity. Why
should it be dropped? How would that help maintain the integrity of GO
annotations? Could you give a concrete example?

-siddhartha  


Re: Storing ontology annotations for stocks

Naama Menda
In reply to this post by Andy Schroeder
Hi Andy,

No, I mean stock_cvterm and feature_cvterm; I was thinking about how to allow storing multiple evidence codes with properties of their own.

I'm not sure I understand how FlyBase relies on the unique constraints,
but if this introduces issues, I'd go with your approach of concatenating the properties of the evidences into the prop.value field.

Since we use Bio::Chado::Schema, and distribute code based on it, I can't remove the constraint locally.

Does anyone else use Chado for ontology annotation evidences?

-Naama

 


Re: Storing ontology annotations for stocks

Naama Menda
In reply to this post by Siddhartha Basu
The constraint is
 "feature_cvterm_c1" UNIQUE, btree (feature_id, cvterm_id, pub_id)

This is fine for storing GO annotations, but if you want to have 2 similar annotations with the same reference but different evidence codes, I would load 2 similar annotations in feature_cvterm, each having a different set of feature_cvtermprop rows (relationship, evidence_code, evidence description, person, date, etc.). As Andy noted, these are really properties of the evidence property.
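
A small sketch of the effect, using a stripped-down feature_cvterm table in SQLite (via Python) rather than the real PostgreSQL schema. Foreign keys are omitted, the ids are placeholders, and the helper name is made up. It tries to load two annotations with the same feature, term, and pub under two different unique keys: the three-column key shown above, and the four-column key including rank that Siddhartha mentioned.

```python
import sqlite3


def count_accepted(unique_cols, ranks):
    """Insert one annotation per rank with identical feature/cvterm/pub ids
    and return how many rows the given unique key lets through."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE feature_cvterm ("
        " feature_cvterm_id INTEGER PRIMARY KEY,"
        " feature_id INTEGER NOT NULL,"
        " cvterm_id INTEGER NOT NULL,"
        " pub_id INTEGER NOT NULL,"
        " rank INTEGER NOT NULL DEFAULT 0,"
        f" UNIQUE ({unique_cols}))"
    )
    accepted = 0
    for rank in ranks:
        try:
            conn.execute(
                "INSERT INTO feature_cvterm (feature_id, cvterm_id, pub_id, rank)"
                " VALUES (1, 2, 3, ?)",
                (rank,),
            )
            accepted += 1
        except sqlite3.IntegrityError:
            pass  # rejected by the unique key
    conn.close()
    return accepted


without_rank = count_accepted("feature_id, cvterm_id, pub_id", [0, 1])
with_rank = count_accepted("feature_id, cvterm_id, pub_id, rank", [0, 1])
```

With the three-column key only one of the two rows is stored; with rank in the key both are, and each row could then carry its own set of props.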

Scott suggested incrementing the rank in the prop table, but I think it's a shaky solution (especially since you don't always have the same property types for each evidence). Andy noted that in FlyBase they concatenate the evidence props into a string that goes into the prop.value field, and this string is parsed by their software.

I suppose each methodology has pros and cons.

-Naama



On Wed, Mar 16, 2011 at 11:54 AM, Siddhartha Basu <[hidden email]> wrote:
Which constraint in feature_cvterm are you referring to? Is it the
'feature_id', 'cvterm_id', 'pub_id', 'rank' unique constraint? It is
quite accurate for storing GO annotations and preserving their integrity.
Why should it be dropped? How would that help maintain the integrity of GO
annotations? Could you give a concrete example?

-siddhartha

On Wed, 16 Mar 2011, Naama Menda wrote:

>    If we go this way, we need to drop the unique constraint of feature_cvterm
>    as well.
>    Any objections?
>
>    -Naama
>
>    On Wed, Mar 16, 2011 at 9:54 AM, Jonathan "Duke" Leto <[hidden email]>
>    wrote:
>
>      Howdy,
> > I agree collecting evidence properties by rank is not a good idea.
>      It's
> > shaky and error prone.
>
>      Agreed.
> > Concatenating the props into one value ensures all components of the
> > evidence remain together, but it makes keeping data integrity more
> > difficult.
> > I'd really like to have a way of storing properties for evidence
>      codes, but
> > with the current schema it seems to me like the only way is to drop
>      the
> > stock_cvterm unique constraint.
>
>      +1 to dropping the unique constraint on stock_cvterm
>
>      Duke
>
>      --
>      Jonathan "Duke" Leto
>      [hidden email]
>      http://leto.net


Re: Storing ontology annotations for stocks

Siddhartha Basu
In reply to this post by Naama Menda

Hi Naama,

On Wed, 16 Mar 2011, Naama Menda wrote:

>    hi Andy,
>
>    no, I mean stock_cvterm and feature_cvterm, thinking about how to allow
>    storing multiple evidence codes with properties of their own.

I am not sure how your data in stock_cvterm influences the constraint on
feature_cvterm. Could you explain your use case?
We store GO annotations in Chado along with the GO and evidence code
ontologies (mentioned a couple of times in previous e-mails). The unique
constraint helps prevent duplicate records from being loaded, and it is
built around the specification of the GAF
format (http://www.geneontology.org/GO.format.gaf-2_0.shtml). Annotations
with identical gene_id, pub_id, go_id and evidence code are considered
duplicates and should be avoided. The unique constraint on feature_cvterm
enforces that concept. The evidence code is linked via
feature_cvtermprop to its own evidence code ontology (again mentioned in a
previous e-mail), so that part of the uniqueness has to be checked at the
application level. However, it is possible to have different evidence codes
with identical gene_id, pub_id and go_id, and in that case you need to bump
the rank to satisfy the constraint. So, annotations differing only in
evidence codes are considered to be separate annotations. Overall, I do
see a need for that constraint, based on the GAF spec and data integrity.
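
To make the rank idea concrete, here is a minimal sketch in Python using an in-memory SQLite table as a stand-in for feature_cvterm. The table is simplified (no foreign keys), the ids are made up, and the add_annotation helper is hypothetical; it only illustrates how the four-column unique constraint behaves.

```python
import sqlite3

# Simplified stand-in for Chado's feature_cvterm; the real table has foreign
# keys to feature, cvterm and pub, which are omitted here.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE feature_cvterm (
        feature_cvterm_id INTEGER PRIMARY KEY,
        feature_id INTEGER NOT NULL,
        cvterm_id  INTEGER NOT NULL,
        pub_id     INTEGER NOT NULL,
        rank       INTEGER NOT NULL DEFAULT 0,
        UNIQUE (feature_id, cvterm_id, pub_id, rank)
    )
""")

def add_annotation(conn, feature_id, cvterm_id, pub_id):
    """Hypothetical helper: insert at the next free rank and return that rank."""
    (next_rank,) = conn.execute(
        "SELECT COALESCE(MAX(rank) + 1, 0) FROM feature_cvterm "
        "WHERE feature_id = ? AND cvterm_id = ? AND pub_id = ?",
        (feature_id, cvterm_id, pub_id),
    ).fetchone()
    conn.execute(
        "INSERT INTO feature_cvterm (feature_id, cvterm_id, pub_id, rank) "
        "VALUES (?, ?, ?, ?)",
        (feature_id, cvterm_id, pub_id, next_rank),
    )
    return next_rank

# Same gene, term and publication, but two different evidence codes.
first = add_annotation(conn, 1, 10, 100)
second = add_annotation(conn, 1, 10, 100)

# Re-using an occupied rank is still rejected by the constraint.
try:
    conn.execute(
        "INSERT INTO feature_cvterm (feature_id, cvterm_id, pub_id, rank) "
        "VALUES (1, 10, 100, 0)"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The evidence code itself would still live in feature_cvtermprop, so checking that two ranks do not carry the same evidence code remains an application-level job.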

>
>    I'm not sure I understand how FlyBase relies on the unique constraints,
>    but if this introduces issues, I'd go with your approach of concatenating
>    the properties of the evidences into the prop.value field.
>
>    Since we use Bio::Chado::Schema, and distribute code based on it,  I can't
>    remove the constraint locally.
If you are writing to feature_cvterm without the constraint, your data might
not be exchangeable with a Chado instance that has the constraints. Here are
a couple of options ...
* You could provide an option in your code to turn the constraint off before
  the main body of code runs.
* Either have your code make that the default, or expect it to be the default,
  but in all cases provide some documentation explaining what it expects.
* You could run the turn-off SQL in BCS's on_connect_do hook.
* You could detect the constraint in your code and refuse to run
  unless it has been removed. Again, on_connect hooks could be your
  friend here.

thanks,
-siddhartha


>
>    Does anyone else use Chado for ontology annotation evidences?

>
>    -Naama
>
>    
>
>    On Wed, Mar 16, 2011 at 11:32 AM, Andy Schroeder <[hidden email]>
>    wrote:
>
>      I assume you mean feature_cvtermprop and stock_cvtermprop tables and not
>      feature_cvterm and stock_cvterm?
>
>      FlyBase relies on a non-primary key unique constraints on tables so we
>      wouldn't be dropping these constraints.  If people really want to drop
>      the constraint then maybe there should be an option to do so but I don't
>      think this should be made default.
>
>      cheers,
>      Andy

Re: Storing ontology annotations for stocks

Jonathan "Duke" Leto
In reply to this post by Andy Schroeder
Howdy,

> FlyBase relies on a non-primary key unique constraints on tables so we
> wouldn't be dropping these constraints.

Can you explain this in more detail? You seem to be saying that columns
that are not primary keys must have unique constraints, which doesn't
quite make sense to me.

Duke

--
Jonathan "Duke" Leto
[hidden email]
http://leto.net


Re: Storing ontology annotations for stocks

Hilmar Lapp-3
In principle, every table should have a natural primary key that is
enforced through a unique key constraint.

Otherwise the only way to uniquely identify a row is through the
surrogate primary key, and you can't do that if you don't have it in
hand, such as when checking whether a DML statement should be an update
or an insert.

Obviously, there are some tables for which updates don't make sense
(because the natural primary key would include all columns except the
surrogate primary key). But ORMs do typically get irritated if you
can't give the unique identification for a row in a table.
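
As a rough illustration of the update-or-insert check, here is a sketch in Python against an in-memory SQLite stand-in for feature_cvterm. The table is simplified, the save_annotation helper is hypothetical, and is_not is stored as an integer only because SQLite has no boolean type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE feature_cvterm (
        feature_cvterm_id INTEGER PRIMARY KEY,        -- surrogate key
        feature_id INTEGER NOT NULL,
        cvterm_id  INTEGER NOT NULL,
        pub_id     INTEGER NOT NULL,
        rank       INTEGER NOT NULL DEFAULT 0,
        is_not     INTEGER NOT NULL DEFAULT 0,
        UNIQUE (feature_id, cvterm_id, pub_id, rank)  -- natural key
    )
""")

def save_annotation(conn, feature_id, cvterm_id, pub_id, rank, is_not):
    """Hypothetical helper: find the row by its natural key, then update or insert."""
    key = (feature_id, cvterm_id, pub_id, rank)
    row = conn.execute(
        "SELECT feature_cvterm_id FROM feature_cvterm "
        "WHERE feature_id = ? AND cvterm_id = ? AND pub_id = ? AND rank = ?",
        key,
    ).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO feature_cvterm "
            "(feature_id, cvterm_id, pub_id, rank, is_not) VALUES (?, ?, ?, ?, ?)",
            key + (is_not,),
        )
        return "insert"
    conn.execute(
        "UPDATE feature_cvterm SET is_not = ? WHERE feature_cvterm_id = ?",
        (is_not, row[0]),
    )
    return "update"

# The caller never needs to know the surrogate id.
actions = [
    save_annotation(conn, 1, 10, 100, 0, 0),
    save_annotation(conn, 1, 10, 100, 0, 1),
]
row_count = conn.execute("SELECT COUNT(*) FROM feature_cvterm").fetchone()[0]
```

Without the unique constraint, the SELECT could match several rows and the helper would have no principled way to pick which one to update.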

        -hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================






Re: Storing ontology annotations for stocks

Andy Schroeder
Exactly.

cheers,
Andy



Re: Storing ontology annotations for stocks

Siddhartha Basu
In reply to this post by Naama Menda


On Wed, 16 Mar 2011, Naama Menda wrote:

>    The constraint is
>     "feature_cvterm_c1" UNIQUE, btree (feature_id, cvterm_id, pub_id)

You are missing the rank column in the constraint.
It's also in BCS ...
http://cpansearch.perl.org/src/RBUELS/Bio-Chado-Schema-0.08002/lib/Bio/Chado/Schema/Result/Sequence/FeatureCvterm.pm

>
>    This is fine for storing GO annotations, but if you want to have 2 similar
>    annotations, with the same reference, but with different evidence codes, I
>    would load 2 similar annotations in feature_cvterm , each having a
>    different set of feature_cvtermprop (relationship, evidence_code, evidence
>    description, person, date, etc.). As Andy noted, these are really
>    properties of the evidence property

You just bump the rank column in feature_cvterm. Annotations with different
evidence codes are considered to be separate annotations.
Could the spec of your annotation format include this feature?

>
>    Scott suggested incrementing the rank in the prop table, but I think it's
>    a shaky solution (especially since you don't always have the same property
>    types for each evidence) , and Andy noted that in FlyBase they concatenate
>    the evidence props into a string that goes into the prop.value field, and
>    this string is parsed by their software.

Yes, if they are of an identical property type we bump the rank (which is a
very well-defined Chado pattern; correct me if I am wrong), for example in
the case of multiple "with" fields. And if they have different property
types, then use a different type_id to avoid bumping the rank, for example
in the case of the date and source columns of GAF.
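
A small sketch of that pattern, in Python with an in-memory SQLite stand-in for feature_cvtermprop. The type ids, the property values and the add_prop helper are all made up for illustration; in a real Chado the type_id would point at cvterm rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE feature_cvtermprop (
        feature_cvtermprop_id INTEGER PRIMARY KEY,
        feature_cvterm_id INTEGER NOT NULL,
        type_id INTEGER NOT NULL,
        value   TEXT,
        rank    INTEGER NOT NULL DEFAULT 0,
        UNIQUE (feature_cvterm_id, type_id, rank)
    )
""")

# Hypothetical cvterm ids standing in for the property types.
DATE, SOURCE, WITH = 1, 2, 3

def add_prop(conn, feature_cvterm_id, type_id, value):
    """Hypothetical helper: use rank 0 for a new type, bump it for a repeated type."""
    (rank,) = conn.execute(
        "SELECT COALESCE(MAX(rank) + 1, 0) FROM feature_cvtermprop "
        "WHERE feature_cvterm_id = ? AND type_id = ?",
        (feature_cvterm_id, type_id),
    ).fetchone()
    conn.execute(
        "INSERT INTO feature_cvtermprop (feature_cvterm_id, type_id, value, rank) "
        "VALUES (?, ?, ?, ?)",
        (feature_cvterm_id, type_id, value, rank),
    )
    return rank

# All four props belong to the same annotation row (feature_cvterm_id 7).
ranks = [
    add_prop(conn, 7, DATE, "20110316"),    # different types: rank stays 0
    add_prop(conn, 7, SOURCE, "example_db"),
    add_prop(conn, 7, WITH, "with_value_a"),
    add_prop(conn, 7, WITH, "with_value_b"),  # same type again: rank is bumped
]
```

Only the second "with" value needs a new rank; date and source each sit at rank 0 under their own type.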

>
>    I suppose each methodology has pros and cons.

I believe that's almost true for everything; however, I am trying to
understand the part of your annotation spec where the standard feature_cvterm
model fails in such a way that it warrants changing the constraint.

thanks,
-siddhartha




Re: Storing ontology annotations for stocks

Naama Menda
Yes, you are right. My db is missing the feature_cvterm.rank column.

The rank column does solve the problem of adding multiple evidence codes for the same annotation.
Since each evidence code is a grouping of several props (such as evidence code description, date, person, etc.), it would otherwise be hard to figure out which prop belongs to which annotation evidence.

I'll add a rank column to stock_cvterm too.

What is the 'is_not' column in feature_cvterm used for? Obsoleting annotations?

Thanks
-Naama





Re: Storing ontology annotations for stocks

Siddhartha Basu
On Wed, 16 Mar 2011, Naama Menda wrote:

>    yes, you are right. my db is missing the feature_cvterm.rank column.
>
>    The rank column does solve the problem of adding multiple evidence codes
>    for the same annotation.
>    Since these evidence codes are a grouping of several props (such as
>    evidence code description, date, person, etc) it is hard to figure out
>    which prop belongs to which annotation evidence.

Great that it helps with your situation! Overall, it has been a great
discussion.

>    I'll add a rank column to stock_cvterm too.
>
>    What is the 'is_not' column in feature_cvterm used for? Obsoleting
>    annotations?

Technically it is for negation: specifically, for cases where a particular
piece of evidence explicitly proves that a particular gene/protein is not
associated with a particular GO (or other) term.
It maps to the Qualifier column of the GAF
format, specifically to the NOT value. IMHO, obsoleting either the
feature or the cvterm implicitly obsoletes the annotation.
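
For illustration only, here is a sketch in Python with an in-memory SQLite stand-in for feature_cvterm. The ids are made up, is_not is an integer here because SQLite has no boolean type, and gaf_qualifier is a hypothetical helper that handles only the NOT value, not the other GAF qualifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE feature_cvterm (
        feature_cvterm_id INTEGER PRIMARY KEY,
        feature_id INTEGER NOT NULL,
        cvterm_id  INTEGER NOT NULL,
        pub_id     INTEGER NOT NULL,
        is_not     INTEGER NOT NULL DEFAULT 0,
        rank       INTEGER NOT NULL DEFAULT 0,
        UNIQUE (feature_id, cvterm_id, pub_id, rank)
    )
""")
conn.executemany(
    "INSERT INTO feature_cvterm (feature_id, cvterm_id, pub_id, is_not) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, 10, 100, 0),  # gene 1 IS associated with term 10
        (1, 11, 100, 1),  # gene 1 is explicitly NOT associated with term 11
    ],
)

def gaf_qualifier(is_not):
    """Hypothetical helper: render is_not as the GAF Qualifier value."""
    return "NOT" if is_not else ""

# Negated annotations are still live rows, so queries must filter on is_not.
positive = [r[0] for r in conn.execute(
    "SELECT cvterm_id FROM feature_cvterm "
    "WHERE feature_id = 1 AND is_not = 0 ORDER BY cvterm_id")]
negated = [r[0] for r in conn.execute(
    "SELECT cvterm_id FROM feature_cvterm "
    "WHERE feature_id = 1 AND is_not = 1 ORDER BY cvterm_id")]
qualifiers = [gaf_qualifier(r[0]) for r in conn.execute(
    "SELECT is_not FROM feature_cvterm ORDER BY cvterm_id")]
```

A negated row is a positive statement of non-association, which is why it is kept rather than deleted or marked obsolete.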

thanks,
-siddhartha



Re: Storing ontology annotations for stocks

Andy Schroeder
In reply to this post by Naama Menda
Hi Naama,

> The rank column does solve the problem of adding multiple evidence codes
> for the same annotation.
> Since these evidence codes are a grouping of several props (such as
> evidence code description, date, person, etc) it is hard to figure out
> which prop belongs to which annotation evidence.

If you use a different type for each of these attributes of the evidence
code then you can store each of them (including the evidence code
itself) as individual properties of the stock_cvterm.  Think of the
evidence code name as an attribute of the evidence as a whole with other
attributes being the date, person, etc. each of which can be stored as a
property of the annotation.
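
Here is a sketch of how that might look when reading the evidence back, in Python with in-memory SQLite tables. Both tables are assumptions: stock_cvterm is shown with the rank column proposed earlier in this thread, stock_cvtermprop is a hypothetical prop table, and the type is a plain string here instead of a type_id pointing at cvterm. The person and date values are made up.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock_cvterm (
        stock_cvterm_id INTEGER PRIMARY KEY,
        stock_id  INTEGER NOT NULL,
        cvterm_id INTEGER NOT NULL,
        pub_id    INTEGER NOT NULL,
        rank      INTEGER NOT NULL DEFAULT 0,
        UNIQUE (stock_id, cvterm_id, pub_id, rank)
    );
    CREATE TABLE stock_cvtermprop (
        stock_cvtermprop_id INTEGER PRIMARY KEY,
        stock_cvterm_id INTEGER NOT NULL REFERENCES stock_cvterm (stock_cvterm_id),
        type  TEXT NOT NULL,
        value TEXT,
        rank  INTEGER NOT NULL DEFAULT 0,
        UNIQUE (stock_cvterm_id, type, rank)
    );
    -- Two annotations of the same stock, term and pub, told apart by rank.
    INSERT INTO stock_cvterm (stock_cvterm_id, stock_id, cvterm_id, pub_id, rank)
    VALUES (1, 1, 10, 100, 0), (2, 1, 10, 100, 1);
    -- Each attribute of the evidence is its own prop with its own type.
    INSERT INTO stock_cvtermprop (stock_cvterm_id, type, value) VALUES
        (1, 'evidence_code', 'inferred from mutant phenotype'),
        (1, 'person', 'curator_a'),
        (2, 'evidence_code', 'traceable author statement'),
        (2, 'date', '2011-03-16');
""")

# Group the props by the annotation row they hang off.
evidence = defaultdict(dict)
for stock_cvterm_id, prop_type, value in conn.execute(
    "SELECT stock_cvterm_id, type, value FROM stock_cvtermprop "
    "ORDER BY stock_cvterm_id, type, rank"
):
    evidence[stock_cvterm_id][prop_type] = value
```

Because every prop points at exactly one stock_cvterm row, the two evidences can carry different sets of property types without getting mixed up.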
>
> What is the 'is_not' column in feature_cvterm used for? Obsoleting
> annotations?

See: http://www.geneontology.org/GO.annotation.conventions.shtml#not

cheers,
Andy

>
> Thanks
> -Naama
>
>
>
> On Wed, Mar 16, 2011 at 3:15 PM, Siddhartha Basu <[hidden email]> wrote:
>
>
>
>     On Wed, 16 Mar 2011, Naama Menda wrote:
>
>      > The constraint is
>      >   "feature_cvterm_c1" UNIQUE, btree (feature_id, cvterm_id, pub_id)
>
>     You are missing the rank column in the constraint.
>     It's also in BCS ...
>     http://cpansearch.perl.org/src/RBUELS/Bio-Chado-Schema-0.08002/lib/Bio/Chado/Schema/Result/Sequence/FeatureCvterm.pm
>
>      >
>      > This is fine for storing GO annotations, but if you want to have
>      > 2 similar annotations, with the same reference, but with different
>      > evidence codes, I would load 2 similar annotations in
>      > feature_cvterm, each having a different set of feature_cvtermprop
>      > (relationship, evidence_code, evidence description, person, date,
>      > etc.). As Andy noted, these are really properties of the evidence
>      > property.
>
>     You just bump the rank column in feature_cvterm. Annotations with
>     different evidence codes are considered to be separate annotations.
>     Could the spec of your annotation format include this feature?
>
>      >
>      > Scott suggested incrementing the rank in the prop table, but I
>      > think it's a shaky solution (especially since you don't always
>      > have the same property types for each evidence), and Andy noted
>      > that in FlyBase they concatenate the evidence props into a string
>      > that goes into the prop.value field, and this string is parsed by
>      > their software.
>
>     Yes, if they are identical property types we bump the rank (which is
>     a very well defined Chado pattern; correct me if I'm wrong), for
>     example in the case of multiple 'with' fields. And if they have
>     different property types, then use a different type_id to avoid
>     bumping the rank, for example for the date and source columns of
>     GAF.
>
>      >
>      > I suppose each methodology has pros and cons.
>
>     I believe that's almost true for everything; however, I am trying to
>     understand the part of your annotation spec where the standard
>     feature_cvterm model fails in such a way that it warrants changing
>     the constraint.
>
>     thanks,
>     -siddhartha
>
>
>      >
>      > -Naama
>      >
>      > On Wed, Mar 16, 2011 at 11:54 AM, Siddhartha Basu <[hidden email]> wrote:
>      >
>      >   Which constraint in feature_cvterm are you referring to? Is it the
>      >   'feature_id', 'cvterm_id', 'pub_id', 'rank' unique constraint? It is
>      >   quite accurate in storing GO annotations and their integrity. Why
>      >   should it be dropped? How would that help to maintain the integrity
>      >   of GO annotations? Could you give a concrete example?
>      >   -siddhartha
>      >   On Wed, 16 Mar 2011, Naama Menda wrote:
>      >
>      > >   If we go this way, we need to drop the unique constraint of
>      > >   feature_cvterm as well.
>      > >   Any objections?
>      > >
>      > >   -Naama
>      > >
>      > >   On Wed, Mar 16, 2011 at 9:54 AM, Jonathan "Duke" Leto <[hidden email]> wrote:
>      > >
>      > >     Howdy,
>      > > > I agree collecting evidence properties by rank is not a good idea.
>      > > > It's shaky and error prone.
>      > >
>      > >     Agreed.
>      > > > Concatenating the props into one value ensures all components of the
>      > > > evidence remain together, but it makes keeping data integrity more
>      > > > difficult.
>      > > > I'd really like to have a way of storing properties for evidence
>      > > > codes, but with the current schema it seems to me like the only way
>      > > > is to drop the stock_cvterm unique constraint.
>      > >
>      > >     +1 to dropping the unique constraint on stock_cvterm
>      > >
>      > >     Duke
>      > >
>      > >     --
>      > >     Jonathan "Duke" Leto
>      > >     [hidden email]
>      > >     http://leto.net
