Fwd: Load rpkm with cv_type

girlwithglasses

Forwarding to the GMOD schema mailing list (for Chado-related questions).

---------- Forwarded message ----------
From: Maximo Rivarola <[hidden email]>
Date: Thu, Jan 10, 2013 at 8:55 PM
Subject: Load rpkm with cv_type
To: [hidden email]


Hi,
Thanks in advance for the Chado schema, it works well!
I have been stuck on one problem:

We have new RNA-seq data and would like to store the RPKM values in the "feature_expressionprop" table.
However, this requires that I use a cv_type related to the RPKM value I want to store.

Which cv_type corresponds to the RPKM values obtained from RNA-seq experiments?

Basically, my question is: how do you load RPKM values for genes stored in a Chado database?

RPKM: reads per kilobase of exon model per million mapped reads
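
For reference, the definition above can be written out as a small calculation. This is only a minimal sketch in Python; the function name, parameter names, and the example numbers are illustrative assumptions and are not part of Chado or of any RNA-seq tool.

```python
def rpkm(read_count: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon model per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    kilobases = exon_length_bp / 1_000               # exon model length in kb
    millions_mapped = total_mapped_reads / 1_000_000  # library size in millions
    return read_count / (kilobases * millions_mapped)


# Hypothetical example: 500 reads on a 2,000 bp exon model,
# in a library with 10 million mapped reads.
example_value = rpkm(500, 2_000, 10_000_000)
```

Because the value depends on both the exon model length and the total mapped reads of the experiment, both need to be available whenever the value is recomputed.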


Cheers and thanks!
Maximo

--
----------------------------------------------------------------------------

Maximo Rivarola PhD.
Inv. Asistente CONICET. Grupo de Bioinformatica, Instituto de Biotecnologia
Centro de Investigaciones en Ciencias Veterinarias y Agronómicas (CICVyA)
INTA-Castelar
Calle Repetto y De Los Reseros s/n; (C.P. 1686) 
Hurlingham- Provincia de Buenos Aires
Tel: (0054-11) 4621-1278 interno: 173
http://www.linkedin.com/pub/maximo-rivarola/30/659/80b

[hidden email]
[hidden email]

----------------------------------------------------------------------------





--
Amelia Ireland
GMOD Community Support || http://gmod.org

_______________________________________________
Gmod-schema mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-schema

Re: Fwd: Load rpkm with cv_type

David Emmert
Hi Maximo,

FlyBase has extended the library module and we store RPKM values (linked to gene features) as attributes of library_feature relationships in a table called library_featureprop.

Metadata for each RNA-Seq experiment (provenance, attributes like read length and total reads) are stored in tables library, libraryprop, library_pub, etc, with xrefs to primary sequence archives linked via library_dbxref. 

Expression information is stored in expression module tables, linked to each experiment via library_expression.
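
The layout described above can be sketched roughly as follows. This is a simplified illustration, not the FlyBase DDL: it uses Python's built-in sqlite3 module instead of PostgreSQL, the column names follow the usual Chado "prop table" pattern and are assumptions, and the cvterm name 'RPKM' and the library and gene names are placeholders. The authoritative definitions are the ones in the FlyBase Chado dump and the GMOD SVN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced stand-ins for the Chado tables involved (assumed column layout).
conn.executescript("""
CREATE TABLE cvterm  (cvterm_id  INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, uniquename TEXT NOT NULL UNIQUE);
CREATE TABLE library (library_id INTEGER PRIMARY KEY, uniquename TEXT NOT NULL UNIQUE);
CREATE TABLE library_feature (
    library_feature_id INTEGER PRIMARY KEY,
    library_id INTEGER NOT NULL REFERENCES library (library_id),
    feature_id INTEGER NOT NULL REFERENCES feature (feature_id),
    UNIQUE (library_id, feature_id)
);
CREATE TABLE library_featureprop (
    library_featureprop_id INTEGER PRIMARY KEY,
    library_feature_id INTEGER NOT NULL REFERENCES library_feature (library_feature_id),
    type_id INTEGER NOT NULL REFERENCES cvterm (cvterm_id),
    value TEXT,
    rank INTEGER NOT NULL DEFAULT 0,
    UNIQUE (library_feature_id, type_id, rank)
);
""")


def store_rpkm(conn, library_name, gene_name, value):
    """Attach one RPKM value to a (library, gene) pair as a property."""
    cur = conn.cursor()
    # 'RPKM' is a placeholder term name, not a confirmed FlyBase cvterm.
    cur.execute("INSERT OR IGNORE INTO cvterm (name) VALUES ('RPKM')")
    cur.execute("INSERT OR IGNORE INTO library (uniquename) VALUES (?)", (library_name,))
    cur.execute("INSERT OR IGNORE INTO feature (uniquename) VALUES (?)", (gene_name,))
    cur.execute(
        "INSERT OR IGNORE INTO library_feature (library_id, feature_id) "
        "SELECT l.library_id, f.feature_id FROM library l, feature f "
        "WHERE l.uniquename = ? AND f.uniquename = ?",
        (library_name, gene_name),
    )
    cur.execute(
        "INSERT INTO library_featureprop (library_feature_id, type_id, value) "
        "SELECT lf.library_feature_id, t.cvterm_id, ? "
        "FROM library_feature lf "
        "JOIN library l ON l.library_id = lf.library_id "
        "JOIN feature f ON f.feature_id = lf.feature_id "
        "JOIN cvterm t ON t.name = 'RPKM' "
        "WHERE l.uniquename = ? AND f.uniquename = ?",
        (str(value), library_name, gene_name),
    )
    conn.commit()


store_rpkm(conn, "example_rnaseq_library", "example_gene_1", 25.0)

# Read the stored values back, one row per (library, gene) pair.
rows = conn.execute(
    "SELECT l.uniquename, f.uniquename, p.value "
    "FROM library_featureprop p "
    "JOIN library_feature lf ON lf.library_feature_id = p.library_feature_id "
    "JOIN library l ON l.library_id = lf.library_id "
    "JOIN feature f ON f.feature_id = lf.feature_id "
    "JOIN cvterm t ON t.cvterm_id = p.type_id "
    "WHERE t.name = 'RPKM' "
    "ORDER BY l.uniquename, f.uniquename"
).fetchall()
```

In this sketch the value hangs off the library-to-feature link rather than off the feature itself, so the same gene can carry a separate RPKM value for each experiment.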

We're managing RPKM data for multiple fly genomes from over 190 individual RNA-Seq experiments in our production Chado instance now, and this implementation seems to be scaling well. Getting the metadata squared away is key, especially if you'll be recalculating RPKM values as gene models are updated.

If you want to follow our implementation, I'd suggest you consider downloading an instance of FlyBase Chado (from flybase.org, here: ftp://ftp.flybase.net/releases/FB2012_06/psql/) and pattern your implementation after what you'll find there. I'll check our library module extension into GMOD SVN on Monday so you can get the DDL.

Hope this helps!

Best,

-Dave

David Emmert
FlyBase - Harvard
[hidden email]




Re: Fwd: Load rpkm with cv_type

Maximo Rivarola
Hi Dave,
Thanks so much for your reply!
I will try to implement it this week and let you know.
Cheers,
Maximo
In reply to David Emmert's message of Sun, Jan 13, 2013 at 8:30 PM.

Re: Fwd: Load rpkm with cv_type

Maximo Rivarola
In reply to David Emmert's post
Hi David,
Thanks again for your reply.
I have a follow-up question:
Given that we use chado-build-schema.pl (it has worked well for our previous databases),
we need to include the following two FlyBase tables to load RPKM values:

- library_expression
( public | library_expression | table | postgres)

- library_featureprop
( public | library_featureprop | table | postgres)

Is that correct?
Once I have these, I hope I can use the same strategy. ;)

I plan to download the full FlyBase psql dump and use it to work out how the fields of these two tables link to the rest of the schema.
From there, I will add the tables to the file in the library directory that the script uses as input.
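
One way to pull a single table definition out of a plain-text SQL dump is sketched below. It is only an illustration: the helper name and the sample dump text are made up for the example, the sample columns are assumptions rather than the real FlyBase definitions, and the regular expression assumes unquoted table names and a closing ");" as in typical pg_dump output.

```python
import re
from typing import Optional


def extract_create_table(dump_sql: str, table_name: str) -> Optional[str]:
    """Return the CREATE TABLE statement for table_name, or None if absent."""
    pattern = re.compile(
        # Optional schema prefix, then the exact table name followed by "(".
        r"CREATE TABLE\s+(?:\w+\.)?" + re.escape(table_name) + r"\s*\(.*?\);",
        re.IGNORECASE | re.DOTALL,
    )
    match = pattern.search(dump_sql)
    return match.group(0) if match else None


# Hypothetical dump fragment, used only to exercise the helper.
sample_dump = """
CREATE TABLE library_feature (
    library_feature_id integer NOT NULL,
    library_id integer NOT NULL,
    feature_id integer NOT NULL
);

CREATE TABLE library_featureprop (
    library_featureprop_id integer NOT NULL,
    library_feature_id integer NOT NULL,
    type_id integer NOT NULL,
    value text,
    rank integer DEFAULT 0 NOT NULL
);
"""

ddl = extract_create_table(sample_dump, "library_featureprop")
```

pg_dump usually writes foreign keys as separate ALTER TABLE ... ADD CONSTRAINT statements, so the linking fields would need a similar search over those statements.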

Do you think this will work?
It is easy for us to build a new database with chado-build-schema.pl, and we have many projects.

Hope this works!
Cheers,
Maximo




Re: Fwd: Load rpkm with cv_type

David Emmert
Hi Maximo,

I think those tables will give you the minimum of what you need to get RPKM into your Chado instance, but we've made refinements to the library and expression modules which you may find useful (you'd need them if you wanted to follow the FB implementation closely), and some additional changes to the organism and interaction modules (which you can probably do without).

I've never used chado-build-schema.pl, so I can't comment on how well that works...

I spent most of the afternoon updating the DDL for these modules, but find myself unable to commit my changes to the GMOD SVN.   What I thought was my password doesn't work.   Amelia or Scott, could one of you help me out?  

Best,

-Dave


In reply to Maximo Rivarola's message of Mon, Jan 14, 2013 at 3:29 PM.

Re: Fwd: Load rpkm with cv_type

David Emmert
Hi Maximo, other Interested Parties,

I've just committed changes to the library, expression, interaction, and organism Chado modules to the GMOD SVN repository. Maximo, you should now be able to get the DDL you need for the new library module tables from SVN.

The expression module changes are largely cosmetic changes to some of the constraint declarations. The two tables added to the interaction module are a convenience that allows grouping of multiple interactions involving the same players. The extensions and implementations of these modules were largely produced by Andy Schroeder here at FlyBase.

The extensions to the library module consist mostly of additional relations linking to other modules, plus some refinements such as the addition of relationships between libraries.

The extensions to the organism module consist mostly of additional tables for managing strain information. Kathleen Falls, also of FlyBase, played a lead role in the extension of the library and organism modules.

All of these schema changes are backwards compatible: they consist of new tables, with no changes to existing tables.

Best,

Dave


In reply to David Emmert's message of Mon, Jan 14, 2013 at 6:00 PM.

Re: Fwd: Load rpkm with cv_type

Maximo Rivarola
Thanks, David!
I am downloading the newly updated modules now:
https://gmod.svn.sourceforge.net/svnroot/gmod/schema/trunk/chado/modules/

We are grateful for your time and quick responses.
Thanks again and cheers,
Maximo

In reply to David Emmert's message of Tue, 2013-01-15 at 11:00 -0500.