storing feature sequences - best practices ?

ganeshS
Hi Members

I am trying to load prokaryotic and viral GFF features into a local Chado installation. I generated the GFF files from GenBank files (.gb) using a conversion tool, and these GFF files don't contain the sequences of the features.
So my guess is that the residues column will be empty for these features?

What is the best practice in Chado for storing the actual sequences/residues for features and for the whole genome?
I need to be able to pull them out for a future interface and for comparing annotations.

Thanks
Ganesh

_______________________________________________
Gmod-schema mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-schema
Re: storing feature sequences - best practices ?

Scott Cain
Hi Ganesh,

Generally, only the sequence of the srcfeature (i.e., the chromosome or full viral sequence) is stored in the residues column of that feature, since the sequence of any subfeature can be calculated by taking the substring given by its coordinates.
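
As a rough illustration of the substring idea, here is a minimal Python sketch. It assumes the coordinates follow Chado's featureloc convention (interbase fmin/fmax plus a strand of 1 or -1) and that the srcfeature residues are already in memory as a string. The function name subfeature_sequence is made up for this example and is not part of Chado or any GMOD tool.

```python
# Hypothetical helper for illustration; not part of Chado or any GMOD tool.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def subfeature_sequence(src_residues, fmin, fmax, strand=1):
    """Return the sequence of a subfeature located on a srcfeature.

    Assumes featureloc-style interbase coordinates: fmin is 0-based and
    fmax is exclusive, so a GFF3 start/end pair maps to (start - 1, end).
    """
    if not 0 <= fmin <= fmax <= len(src_residues):
        raise ValueError("coordinates fall outside the source sequence")
    seq = src_residues[fmin:fmax]
    if strand == -1:
        # Minus-strand features are read as the reverse complement.
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq


genome = "ATGCCGTAA"  # toy sequence, stands in for the residues column
print(subfeature_sequence(genome, 0, 3))      # plus strand: ATG
print(subfeature_sequence(genome, 3, 9, -1))  # minus strand: TTACGG
```

Because only the srcfeature carries residues, a helper along these lines would be applied at query time rather than storing a copy of the sequence with every gene or CDS.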

If your GFF doesn't have sequence, then yes, the residues column will be empty. You can use the --fastafile argument of gmod_bulk_load_gff3.pl to load a FASTA sequence after you've loaded its GFF. Just be sure that the name of the sequence in the FASTA file matches the feature ID in the GFF file.
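
One way to check the names before loading is sketched below in Python. It is only a sanity check under an assumption: that the name to match is the GFF3 seqid (column 1), which is usually also the ID of the reference feature. The function names and the toy records are invented for this example.

```python
# Hypothetical pre-load check; names and sample records are illustrative only.

def fasta_names(lines):
    """Collect sequence names: the first word after '>' on each header line."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gff_seqids(lines):
    """Collect the values in column 1 (seqid) of GFF3 annotation lines."""
    ids = set()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section ends the annotation lines
        if line.strip() and not line.startswith("#"):
            ids.add(line.split("\t")[0])
    return ids


def unmatched_names(fasta_lines, gff_lines):
    """Return names that appear in only one of the two inputs."""
    return fasta_names(fasta_lines) ^ gff_seqids(gff_lines)


fasta = [">virus1 toy genome", "ATGCCGTAA"]
gff = ["##gff-version 3", "virus1\tGenBank\tgene\t1\t9\t.\t+\t.\tID=gene1"]
print(unmatched_names(fasta, gff))  # an empty set means every name matched
```

For real files, the two lists would come from reading the FASTA and GFF files line by line, for example with open(path) as the lines argument.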

Scott



On Mon, Apr 29, 2013 at 11:04 AM, Srinivasamoorthy, Ganesh - INTL <[hidden email]> wrote:
Hi Members

I am trying to load prokaryotic and viral GFF features into a local Chado installation. I generated the GFF files from GenBank files (.gb) using a conversion tool, and these GFF files don't contain the sequences of the features.
So my guess is that the residues column will be empty for these features?

What is the best practice in Chado for storing the actual sequences/residues for features and for the whole genome?
I need to be able to pull them out for a future interface and for comparing annotations.

Thanks
Ganesh

--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research
