Re: (no subject)

Re: (no subject)

Scott Cain
Hi Andrew,

Since this email is mostly related to databases and Chado, I'm going
to cc it to the schema mailing list.  Future responses can trim the
help email address off the cc list.  I'm going to do my best to answer
in line below.

Scott


On Thu, Feb 7, 2013 at 1:24 PM, Block,Andrew <[hidden email]> wrote:

> Hello GMOD,
>
> My lab is starting to create a database for our gene of interest.  Since I
> am new to databases, I looked for models of great databases, which led me
> to GMOD.  We study RNA-dependent RNA polymerases (RdRp) and have collected
> many different mutants for different virus species.  The goal of the
> database is to analyze structure, sequence, and molecular data and keep
> track of storage through a webpage.  I have created a database in
> PostgreSQL using Chado's general, controlled vocabulary, and stock modules.
> I have also created a simplified version of the sequence module with just
> feature, location and relationship.  I have several questions on the next
> steps for my database.
>
> The RdRps have several motifs (like introns and exons) that we would like
> to have new sequences automatically align to.  Would Apollo or any of the
> other software offered be able to do this?  Can I use the simplified
> database or would I have to use the full Chado database?  What would be the
> easiest or simplest way to go about setting this up?

I'm not sure what you mean by "automatically align to."  When you're
talking about doing anything automatically, I would think you're talking
about some sort of pipeline to do the analysis and then parse the
results into some usable format (like GFF3).  You might want to look
at MAKER for that, though since you're working with viral sequences,
it may not be appropriate.  Once you have GFF3, you can load it into
Chado using either Tripal (if you're using that) or the GFF bulk
loader that comes with Chado, gmod_bulk_load_gff3.pl.
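
As a rough illustration of the GFF3 such a pipeline would need to emit,
here is a minimal Python sketch that formats one motif hit as a GFF3
line (nine tab-separated columns, 1-based inclusive coordinates).  The
sequence name, source tag, feature type and attribute values are all
hypothetical placeholders, and the type assumes coordinates on a protein
sequence; use whatever Sequence Ontology term fits your data.

```python
from urllib.parse import quote


def gff3_line(seqid, source, ftype, start, end,
              strand="+", score=".", phase=".", **attrs):
    """Format one GFF3 feature line (nine tab-separated columns)."""
    if start < 1 or end < start:
        raise ValueError("GFF3 needs 1 <= start <= end")
    # Percent-encode reserved characters (';', '=', ',', tabs) in values.
    attr_text = ";".join(
        f"{key}={quote(str(value), safe=' ')}" for key, value in attrs.items()
    )
    columns = [seqid, source, ftype, str(start), str(end),
               str(score), strand, str(phase), attr_text or "."]
    return "\t".join(columns)


# Hypothetical motif hit on a hypothetical mutant sequence.
motif = gff3_line("RdRp_mutant_01", "motif_scan", "polypeptide_motif",
                  120, 135, strand=".", ID="motifA_1", Name="motif A")
print("##gff-version 3")
print(motif)
```

A file built from lines like this is the kind of input the loaders
mentioned above expect.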

If it's not really automatic, meaning you need a human to look at the
sequence and assign features, then yes, Apollo can do it.  A new
editor, WebApollo, was just released; it is focused on protein coding
genes in its initial release, but that seems like it would work fine
for you; setting it up is still a bit of a chore, but is certainly
doable (I set up a simple WebApollo instance in an afternoon last
week--but that was without it being connected to Chado).  The
Apollo/WebApollo mailing list is [hidden email].

Also, when you're talking about alignments (and thus computed
results), in Chado you need to store them in the companalysis module.
While I applaud your efforts to make something "simplified" from
Chado, it may make more sense to just use Chado if you want to use
other GMOD tools with it.  Having several empty tables won't hurt
anything.

Finally, I am working on a new instance of GMOD in the Cloud, which is
several GMOD components (Chado, GBrowse, JBrowse, Tripal, and
WebApollo) installed on an Amazon Web Services AMI.  While this
requires some money to run, it saves much of the hassle of setting up
and maintaining software (and makes it so you don't have to buy new
server hardware).  The current version is 1.3, but that doesn't have
Apollo or WebApollo on it.  People frequently worry about costs
getting out of control with AWS, but I can tell you that WormBase, a
large and heavily used model organism database, uses AWS and it costs
them around $2000/year.

>
> We are looking at single nucleotide polymorphisms (SNPs).  Can GBrowse or
> any of the other visualization programs display SNPs?  We would like to find
> the same SNP across many different species of viruses.  Do any of your
> software packages have the ability to do this, or do I need to custom code
> this?

Anything that can be localized on a reference sequence with
coordinates can be displayed by either GBrowse or JBrowse.  Lots of
people use them for displaying SNPs.  GMOD doesn't really do
"analysis", except to wrap analysis programs written by other groups
into pipelines, like MAKER does, so no, there isn't a tool provided by
GMOD to do that, but I wouldn't be surprised if something like that
already exists.  Once you have the results relating SNPs, you
could store that in Chado, and we can talk about how best to do that
in another email.  It would almost certainly require custom code to
load the data into the database.
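
To make "the same SNP across species" concrete, here is a small Python
sketch of the kind of custom code that might be involved.  It is not a
GMOD tool.  It assumes the sequences have already been aligned against a
common reference by some external program (equal-length rows, '-' for
gaps), and the strain names and sequences are made up.  It reports
mismatches in reference coordinates, which is what a browser track or a
Chado feature location would need.

```python
from collections import defaultdict


def call_snps(ref_aln, query_aln):
    """Return {ref_position: (ref_base, query_base)} for single-base mismatches.

    Positions are 1-based coordinates on the ungapped reference.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    snps = {}
    ref_pos = 0
    for ref_base, query_base in zip(ref_aln.upper(), query_aln.upper()):
        if ref_base == "-":
            continue  # insertion relative to the reference: no coordinate
        ref_pos += 1
        if query_base != "-" and query_base != ref_base:
            snps[ref_pos] = (ref_base, query_base)
    return snps


def shared_snps(ref_aln, strains):
    """Map (position, ref_base, alt_base) to strain names, for SNPs seen in 2+ strains."""
    seen = defaultdict(list)
    for name, aln in strains.items():
        for pos, (ref_base, alt) in call_snps(ref_aln, aln).items():
            seen[(pos, ref_base, alt)].append(name)
    return {snp: names for snp, names in seen.items() if len(names) > 1}


# Made-up alignment rows for illustration only.
reference = "ATG-CCGTA"
strains = {
    "virus_A": "ATGACTGTA",
    "virus_B": "ATG-CTGTT",
    "virus_C": "A-G-CCGTA",
}
print(shared_snps(reference, strains))
```

Each shared SNP comes back with a reference position, so it can be
written out as a one-base feature for display or loading.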

>
> The lab is also interested in the structure of each RdRp.  I have looked
> through your list of websites using GMOD and could not find any structural
> databases, so I do not know how familiar you are with a database like this.
> The goal would be to create a database similar to the RCSB Protein Data
> Bank, but one that can link to the sequence database.  I would like to go
> from GBrowse to the structure side with one link.  Would this be possible?

Yes, GBrowse is extremely flexible in how it's configured, and having
it generate links from features to anywhere else on the web is fairly
straightforward.  Take a look at the GBrowse tutorial for this:
generating simple links is one of the first sections, and links of
arbitrary complexity can be created with Perl callbacks in the
GBrowse configuration file.  More questions on this
can be directed to the GBrowse mailing list,
[hidden email].
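
The logic behind such a link is simple: take an identifier stored on the
feature and substitute it into a URL template.  The Python sketch below
only illustrates that mapping; in GBrowse itself it would live in the
configuration file, not in Python.  The `pdb_id` attribute name, the
example ID and the RCSB URL pattern are assumptions made for the
example; a lab's own structure site would supply its own template.

```python
from urllib.parse import quote, unquote

# Assumed link target; replace with the structure site's own URL pattern.
STRUCTURE_URL = "https://www.rcsb.org/structure/{pdb_id}"


def parse_attributes(column9):
    """Split a GFF3 attributes column into a dict of decoded tag -> value."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[unquote(key)] = unquote(value)
    return attrs


def structure_link(column9, template=STRUCTURE_URL):
    """Return a structure-page URL for a feature, or None if it has no pdb_id."""
    pdb_id = parse_attributes(column9).get("pdb_id")
    if not pdb_id:
        return None
    return template.format(pdb_id=quote(pdb_id.upper()))


# "1abc" is a placeholder ID, not a statement about any real structure.
print(structure_link("ID=rdrp_01;Name=RdRp;pdb_id=1abc"))
print(structure_link("ID=rdrp_02;Name=RdRp"))
```

Features without the attribute simply get no link, which keeps the
browser usable while structures are still being added.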

>
> I would use Tripal and Chado::AutoDBI to help create the website.  Is there
> any other software you would recommend for our database?
>

I would stick with Tripal and avoid Chado::AutoDBI unless you really
need it for something.  While I still generate the classes for
Chado::AutoDBI, it is rather archaic technology and I don't think many
people use it anymore.  Tripal is extremely flexible and likely to do
everything you need it to do (and the Tripal developers are quite
clever--if you can't figure out how to do something, ask them and
you're likely to get a good answer).

>
> Thank you,
>
> Andrew Block
>
> Research Associate
> -------------------------------------------------------
> Colorado State University
> Dept. of Biochemistry & Molecular Biology
> Molecular and Radiological Biosciences 139
> 1870 Campus Delivery
> Colorado State University
> Ft. Collins, CO  80523-1870
> -------------------------------------------------------
> (970) 491-0271    Lab
> [hidden email]
> -------------------------------------------------------
>



--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

_______________________________________________
Gmod-schema mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-schema
Re: (no subject)

Suzanna Lewis
Just a brief update. I believe that Carson has made some new additions that have MAKER automatically set up and launch Apollo in JBrowse mode (i.e. read/view only) at the end of its run. Then you just have to set up permissions to enable editing. Carson could give more details better than I can, though.

-S

On Feb 8, 2013, at 9:11 AM, Scott Cain <[hidden email]> wrote:

> [...]

