[x-post] GIS Indexing for Chado

[x-post] GIS Indexing for Chado

॥ स्वक्ष ॥
Hello,

I applied for the "GIS Indexing for Chado"[0] GSoC project and, as
per my off-list mail to Scott and Robin yesterday, I propose to work
on the same project outside the GSoC framework. The proposal is
available on Melange[1]; please take a look and provide feedback.
I would like to know whether mentoring outside GSoC is feasible, or
whether this suggestion sounds impractical.

[0] http://gmod.org/wiki/GSoC#IDEA_8:_GIS_Indexing_for_Chado
[1] http://www.google-melange.com/gsoc/proposal/review/google/gsoc2011/o/1

There is one more requirement. Within GSoC, Google usually
handles administrative matters (acceptance letter, CPT certificate,
etc.). My university has strict project requirements: it needs a
formal letter (akin to a CPT) from the organization, including the
assigned mentor's CV, and so on. I am not sure how to go about this
outside of GSoC. Hence this email to the gmod-schema and gmod-devel
lists, to ask for your opinions and to request mentors.

Thanks for reading.

Regards,
vid ॥ http://svaksha.com ॥

------------------------------------------------------------------------------
WhatsUp Gold - Download Free Network Management Software
The most intuitive, comprehensive, and cost-effective network
management toolset available today.  Delivers lowest initial
acquisition cost and overall TCO of any competing solution.
http://p.sf.net/sfu/whatsupgold-sd
_______________________________________________
Gmod-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-devel
Re: [x-post] GIS Indexing for Chado

Jason Stajich-4
vid -


I did try this approach briefly with a short-read-db project I was building, which I have since abandoned. I found that building the indexes took so much space and time that it was slower than other indexing solutions, and prohibitive. Perhaps, since the number of features in this type of DB is a smaller order of magnitude, loading the data and building the indexes won't be so slow.

The code is here:
https://github.com/hyphaltip/short-read-db
I basically just added a location field to a feature table, and queries used it in place of the start/stop fields used before.

This was built around Bio::DB::SeqFeature::Store, not Chado, so I'll definitely be curious to hear how it goes. The range queries with the GIS features are easy once you understand the syntax.

Jason
On Apr 27, 2011, at 2:29 AM, ॥ स्वक्ष ॥ wrote:

Re: [Gmod-schema] [x-post] GIS Indexing for Chado

Scott Cain
Hi Jason,

I was thinking about trying the same thing with SeqFeature::Store's Pg
adaptor, as an option, to see if it would bump its speed relative to
the MySQL adaptor.  Do you think it would be worth pursuing?

Scott


On Wed, Apr 27, 2011 at 11:39 AM, Jason Stajich <[hidden email]> wrote:




--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

Re: [Gmod-schema] [x-post] GIS Indexing for Chado

Scott Cain
In reply to this post by ॥ स्वक्ष ॥
Hi Vidya,

I'm not sure how this would work--I'm not technically part of any
academic institution; I'm a contractor for the Ontario Institute for
Cancer Research.  Robin might be able to provide more insight on this.

Additionally (and this is somewhat embarrassing), it seems that the
range functions that implement the GIS indexing do still work in
current (8.4) versions of PostgreSQL (I haven't tried 9.0 yet).  I
don't recall what made me think they weren't working, but when I looked
at the table description for the featureloc table (\d featureloc)
in a current instance of Chado, I saw this:

    "binloc_boxrange" gist (boxrange(fmin, fmax))
    "binloc_boxrange_src" gist (boxrange(srcfeature_id, fmin, fmax))

That renders this particular project moot.  I'm very sorry about that.  I
do still want to pursue other methods of speeding up the Chado GBrowse
adaptor (Bio::DB::Das::Chado) that I wrote about previously, like
creating and storing serialized BioPerl objects.  I would be willing
to help you with that project if you're interested.
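
For readers unfamiliar with the idea behind those two indexes, here is a minimal Python sketch of how a range query can be expressed as a box-overlap test. It assumes each (fmin, fmax) interval is mapped to a flat, zero-height box on the x-axis; that mapping, and the names Box, interval_to_box, boxes_overlap and overlapping, are illustrative assumptions and not Chado's actual boxrange() definition. The sketch scans every feature, whereas a GiST index exists precisely to avoid that scan.

```python
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned rectangle given by two opposite corners."""
    x1: int
    y1: int
    x2: int
    y2: int


def interval_to_box(fmin: int, fmax: int) -> Box:
    # Hypothetical mapping: lay the interval along the x-axis with zero height.
    return Box(fmin, 0, fmax, 0)


def boxes_overlap(a: Box, b: Box) -> bool:
    # Two boxes overlap when their extents intersect on both axes.
    return a.x1 <= b.x2 and b.x1 <= a.x2 and a.y1 <= b.y2 and b.y1 <= a.y2


def overlapping(features: List[Tuple[str, int, int]], qmin: int, qmax: int) -> List[str]:
    """Return names of features whose (fmin, fmax) range overlaps the query range."""
    query = interval_to_box(qmin, qmax)
    return [
        name
        for name, fmin, fmax in features
        if boxes_overlap(interval_to_box(fmin, fmax), query)
    ]


if __name__ == "__main__":
    # Made-up example features: (name, fmin, fmax).
    feats = [("geneA", 100, 200), ("geneB", 150, 400), ("geneC", 500, 900)]
    print(overlapping(feats, 180, 450))
```

With intervals stored as boxes, "which features overlap this region" becomes a single geometric predicate, which is the kind of operator a spatial index can accelerate.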

Scott


On Wed, Apr 27, 2011 at 5:29 AM, ॥ स्वक्ष ॥ <[hidden email]> wrote:

Re: [Gmod-schema] [x-post] GIS Indexing for Chado

Fields, Christopher J
In reply to this post by Scott Cain
Scott, Jason,

Re: using GIS-based storage, I'm assuming this is R-tree based?  If so, one could use Pg, MySQL, or SQLite.  We should loop George Hartzell into this; he may have some additional thoughts, as he has used R-trees quite a bit in his work, I believe more on the SQLite end.

On a related note, I'm not sure whether it's worth pursuing, but I have thought for a while now that it would be nice to abstract out the coordinate indexing in Bio::DB::SeqFeature::Store to allow binning, R-tree, NCL, or whatever (with the particular scheme recorded in meta), perhaps as a separate set of modules that just creates the tables, using an appropriate RDBMS loader for the lower-level stuff.  Is this essentially the same thing being proposed for the GIS-Chado project?
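
As a rough illustration of the binning option, the Python sketch below assigns each feature to the smallest bin that fully contains it and, for a query range, lists the bins that need to be searched. The bin sizes, the number of levels, and the function names are assumptions chosen for the example; they are not the constants or code used by Bio::DB::SeqFeature::Store. Coordinates are treated as half-open ranges [start, end) with end > start.

```python
from typing import List, Tuple

# Illustrative constants (assumptions): bin sizes of 1e3, 1e4, 1e5 and 1e6 bases.
MIN_BIN_SIZE = 1000
LEVELS = 4
FACTOR = 10

Bin = Tuple[int, int]  # (level, index within that level)


def smallest_bin(start: int, end: int) -> Bin:
    """Return the smallest bin that fully contains [start, end)."""
    size = MIN_BIN_SIZE
    for level in range(LEVELS):
        if start // size == (end - 1) // size:
            return level, start // size
        size *= FACTOR
    # Features too large for any sized level go into one catch-all bin.
    return LEVELS, 0


def candidate_bins(start: int, end: int) -> List[Bin]:
    """Return every bin that could hold a feature overlapping [start, end)."""
    bins: List[Bin] = []
    size = MIN_BIN_SIZE
    for level in range(LEVELS):
        first, last = start // size, (end - 1) // size
        bins.extend((level, idx) for idx in range(first, last + 1))
        size *= FACTOR
    bins.append((LEVELS, 0))  # the catch-all bin is always searched
    return bins


if __name__ == "__main__":
    print(smallest_bin(1200, 1800))
    print(candidate_bins(1500, 2500))
```

A loader would store the bin alongside each feature, and a range query would then filter on the candidate bins first and on start/end second. Swapping this for an R-tree or NCL would change only these two functions and the table they fill, which is the kind of separation described above.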

chris

On Apr 27, 2011, at 11:05 AM, Scott Cain wrote:

Re: [Gmod-schema] [x-post] GIS Indexing for Chado

Jason Stajich-4
What I did a while ago was via the PostGIS plugin for Postgres (http://postgis.org/); I'm not sure whether more GIS functionality is integrated into Postgres itself now.  I think it might be too much overhead for what you are trying to achieve, and the R-tree might be more lightweight: all the GIS locations we'd be querying would effectively be lines, never polygons, so I don't know whether it is worth the full-blown GIS.  I could be wrong, though, so it is probably still worth some exploration.

Jason
On Apr 27, 2011, at 9:37 AM, Chris Fields wrote:
