Set Theory Operations


Set Theory Operations

Chris Mitchell
Hey Everyone,

Is there a way or plugin within GBrowse that lets you perform set-theory-like functions?  An example would be to show the union, intersection, etc. of chosen tracks.  This could be useful for applications such as the union of an exon track, an experimental RNAseq track, and an experimental SNP/indel/etc. track.  This is somewhat trivial to do outside of GBrowse, but end users with less know-how of the backend would find these features useful.
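To make the request concrete, here is a minimal sketch of what the union and intersection of two tracks could mean at the coordinate level. It is an illustration only, not GBrowse code: the function names and the plain (start, end) tuples with inclusive coordinates are assumptions made for the example.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # hypothetical (start, end) pair, inclusive coordinates


def merge(intervals: List[Interval]) -> List[Interval]:
    """Collapse overlapping intervals into a sorted, non-overlapping list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval, so extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def union(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Regions covered by at least one of the two tracks."""
    return merge(a + b)


def intersection(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Regions covered by both tracks."""
    a, b = merge(a), merge(b)
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start <= end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out
```

With exons at (100, 200) and (300, 400) and reads at (150, 350) and (500, 600), the union would cover 100-400 and 500-600, and the intersection 150-200 and 300-350. Adjacent but non-overlapping intervals are deliberately left unmerged in this sketch.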

Thanks,
Chris

_______________________________________________
Gmod-gbrowse mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse

Re: Set Theory Operations

Scott Cain
Hi Chris,

There is nothing built into GBrowse that can do this.  It has
occurred to me before that a plugin to do these sorts of operations
would be really cool, but none exists as far as I know.
It should be possible to write a plugin like that, though.

Scott


--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


Re: Set Theory Operations

Lincoln Stein
In reply to this post by Chris Mitchell
The short answer is no. The functionality has been on the wish list for years (the user interface would be to drag and drop one track on top of another, and then to choose the desired set operation from a menu that appears). I am unlikely to add this feature myself, but if some adventurous person wants to work on it, I'm happy to provide guidance.

Lincoln

--
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <[hidden email]>


Re: Set Theory Operations

Lincoln Stein
In reply to this post by Chris Mitchell
How about the ability to overlay one track on another, making each one semi-transparent? I've been working on doing this for xy plot tracks, and I think that this can go into production soon, but it is designed for a single track that contains multiple features, as shown in the attached screenshot (there are actually overlays of 9 different replicates here). A certain amount of UI work would be needed to make it generic for arbitrary sets of tracks - probably a couple of weeks.

Lincoln

Attachment: overlay_track.png (16K)

Re: Set Theory Operations

Chris Mitchell
Hey Lincoln,

I think the overlay would be great for combining the xyplots for RNAseq/WGS data.  Is this semi-transparent overlay going to extend beyond xy plots?  I think the semi-transparency may be a good way to visualize overlapping data on a small scale, but in a zoomed-out view it would probably be useless because the glyphs would be smashed together.  I'm running some SQL queries right now on my GBrowse MySQL database to find the overlap of the sets I'm interested in.  When I make them more efficient, I'll see if I can port them into a Perl module that can work as the backend (however, I'm a complete hack with Perl and work mostly with C++/Python/Lua).

Chris


Re: Set Theory Operations

Lincoln Stein
Hi Chris,

Yes, the transparent overlay works with any set of glyphs but, as you say, it is quite useless for viewing large regions. Set operations would be a terrific feature.

Lincoln


Re: Set Theory Operations

Chris Mitchell
Hey everyone,

I've run into a bit of a wall with set operations that consider overlapping features.  For features that match exactly (equal start/end coordinates), the operation is fast and trivial (just use f1.start=f2.start AND f1.end=f2.end).
However, for cases where we are looking for overlapping features, it takes a nearly infeasible amount of time on large datasets.
Here's an example query that shows this on a sizeable database (my database1 is 5 million rows, database2 is 7 million).

SELECT f1.id 'Experimental ID', f1.typeid 'Type',
       f2.id 'Annotation ID', f2.typeid,
       f1.start 'f1 Start', f1.end 'f1 END',
       f2.start 'F2 Start', f2.end 'F2 End',
       il.seqname, gl.seqname
FROM Database1.feature f1
JOIN Database2.feature f2
LEFT JOIN Database1.locationlist il ON il.id = f1.seqid
LEFT JOIN Database2.locationlist gl ON gl.id = f2.seqid
WHERE f1.typeid = 5 AND f2.typeid = 4
  AND il.seqname = gl.seqname
  AND f1.strand = f2.strand
  AND NOT (f1.end < f2.start OR f1.start > f2.end)

I added a hash index for typeid and a btree index for start/end to try to speed this up, without any luck. I considered converting my database to use spatial indices so I could use the built-in relationships that MySQL defines (if this problem remains unresolved, I'll see if this has some merit).  In any case, I'm hoping there is some MySQL prodigy on this list who might have some insight into making queries which find overlapping features tractable.
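For readers following the predicate, here is a small sketch that spells out the two conditions used above and checks that the negated form in the WHERE clause is equivalent to a positive form with two inequalities. The function names are assumptions for illustration. One plausible reason a btree index helps little here is that the overlap test constrains two different columns with range comparisons, whereas the exact-match test is a pair of equalities.

```python
from itertools import product


def exact_match(s1, e1, s2, e2):
    # Equal start/end coordinates: f1.start=f2.start AND f1.end=f2.end
    return s1 == s2 and e1 == e2


def overlaps_as_written(s1, e1, s2, e2):
    # Predicate from the query: NOT (f1.end < f2.start OR f1.start > f2.end)
    return not (e1 < s2 or s1 > e2)


def overlaps_positive(s1, e1, s2, e2):
    # De Morgan rewrite: f1.end >= f2.start AND f1.start <= f2.end
    return e1 >= s2 and s1 <= e2


def forms_agree(limit=12):
    """Brute-force check that both overlap forms agree on small intervals."""
    coords = range(limit)
    intervals = [(s, e) for s, e in product(coords, coords) if s <= e]
    return all(
        overlaps_as_written(*a, *b) == overlaps_positive(*a, *b)
        for a, b in product(intervals, intervals)
    )
```

Under this predicate, features that merely touch at a shared coordinate (one ending at 200, the other starting at 200) count as overlapping, which matches inclusive coordinates.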

Chris


Re: Set Theory Operations

Scott Cain
Hi Chris,

I can't really help, but I have implemented GIS indexing on the Chado
featureloc table, and in this one case, Chado may perform better than
SeqFeature::Store.  I don't think btree indexes are particularly well
suited to solving this sort of problem.

Scott



Re: Set Theory Operations

Lincoln Stein
In reply to this post by Chris Mitchell
Are you sure you want to do this at the SQL level? The other approach would be to use the adapter API and then to perform the set operations on the BioPerl features.
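As a rough illustration of doing the work on feature objects instead of in SQL, the sketch below finds overlapping pairs between two in-memory feature lists with a sort-and-sweep pass. It is written in Python, not against the BioPerl adaptor API; the Feature tuple and the overlapping_pairs function are hypothetical stand-ins for features fetched from each track.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for a feature object returned by a database adaptor.
Feature = namedtuple("Feature", "id seqname strand start end")


def overlapping_pairs(track_a, track_b):
    """Return (a.id, b.id) for features on the same seqname and strand
    whose inclusive coordinate ranges overlap."""
    groups = defaultdict(lambda: ([], []))
    for f in track_a:
        groups[(f.seqname, f.strand)][0].append(f)
    for f in track_b:
        groups[(f.seqname, f.strand)][1].append(f)

    pairs = []
    for feats_a, feats_b in groups.values():
        feats_a.sort(key=lambda f: f.start)
        feats_b.sort(key=lambda f: f.start)
        active = []  # b features already reached that may still overlap
        j = 0
        for a in feats_a:
            # Admit every b feature that starts at or before the end of a.
            while j < len(feats_b) and feats_b[j].start <= a.end:
                active.append(feats_b[j])
                j += 1
            # b features ending before a starts cannot overlap a or any
            # later a, because the a features are sorted by start.
            active = [b for b in active if b.end >= a.start]
            # Re-check the start bound: an earlier, longer a may have
            # admitted b features that begin after this a ends.
            pairs.extend((a.id, b.id) for b in active if b.start <= a.end)
    return pairs
```

Sorting each group once and sweeping avoids comparing every feature in one track against every feature in the other, which is what the cross join in the SQL version effectively asks the database to do.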

Lincoln 

On Mon, Jan 9, 2012 at 3:42 PM, Chris Mitchell <[hidden email]> wrote:
Hey everyone,

I've run into a bit of a wall with set operations which consider overlapping features.  For sets which are equal matches (equal start/end coordinates), the operation is fast and trivial (just use f1.start=f2.start AND f1.end=f2.end).
However, for cases which we are looking for overlapping features, on large datasets it takes a nearly infeasible amount of time. 
Here's an example query which can show this on a sizeable database (my database1 is 5 million rows, database2 is 7 million). 

SELECT f1.id 'Experimental ID', f1.typeid 'Type', f2.id 'Annotation ID', f2.typeid, f1.start 'f1 Start', f1.end 'f1 END', f2.start 'F2 Start', f2.end 'F2 End', il.seqname, gl.seqname
FROM Database1.feature f1
JOIN Database2.feature f2
left join Database1.locationlist il
on il.id=f1.seqid
left join Database2.locationlist gl
on gl.id=f2.seqid
WHERE f1.typeid=5 AND f2.typeid=4 AND il.seqname=gl.seqname AND f1.strand=f2.strand and not (f1.end<f2.start OR f1.start>f2.end)

I added a hash index for typeid and a btree for the end/start to try and speed this up without any luck as well. I debated on converting my database to use spatial indices to use the built-in relationships that MySQL defines (If this problem remains unresolved, I'll see if this has some merit).  In anycase, I'm hoping there is some MySQL prodigy on this list who might have some insight into making queries which find overlapping features tractable.

Chris


On Thu, Jan 5, 2012 at 2:44 PM, Lincoln Stein <[hidden email]> wrote:
Hi Chris,

Yes, the transparent overlap works with any set of glyphs, but as you say is quite useless for viewing large regions. Set operations would be a terrific feature.

Lincoln


On Thu, Jan 5, 2012 at 1:59 PM, Chris Mitchell <[hidden email]> wrote:
Hey Lincoln,

I think the overlay would be great for combining the xyplots for RNAseq/WGS data.  Is this semi-transparent overlay going to extend beyond xy plots?  I think the semi-transparency may be a good way to visualize overlapping data on a small scale, but from a zoomed out version it would probably be useless because the glyphs would be smashed together.  I'm running some SQL queries right now on my GB MySql database to find the overlap of the sets I'm interested in.  When I make them more efficient I'll see if I can port them into a Perl module which can work as the backend (However, I'm a complete hack with Perl and work mostly with C++/Python/Lua).

Chris


On Thu, Jan 5, 2012 at 12:44 PM, Lincoln Stein <[hidden email]> wrote:
How about the ability to overlay one track on another, making each one semi-transparent? I've been working on doing this for xy plot tracks, and think that this can go into production soon, but it designed for a single track that contains multiple features, as shown in the attached screenshot (there are actually overlays of 9 different replicates here). A certain amount of UI work would be needed to make it generic for arbitrary sets of tracks - probably a couple of weeks.

Lincoln

--
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <[hidden email]>


Re: Set Theory Operations

Adam Witney
In reply to this post by Chris Mitchell

Does MySQL have an equivalent of PostgreSQL's EXPLAIN or EXPLAIN ANALYSE? I.e., can you tell whether it is actually using your indexes or resorting to full table scans?


On 9 Jan 2012, at 20:42, Chris Mitchell wrote:

> Hey everyone,
>
> I've run into a bit of a wall with set operations which consider overlapping features.  For sets which are equal matches (equal start/end coordinates), the operation is fast and trivial (just use f1.start=f2.start AND f1.end=f2.end).
> However, when we are looking for overlapping features, the query takes a nearly infeasible amount of time on large datasets.
> Here's an example query which can show this on a sizeable database (my database1 is 5 million rows, database2 is 7 million).  
>
> SELECT f1.id 'Experimental ID', f1.typeid 'Type', f2.id 'Annotation ID', f2.typeid, f1.start 'f1 Start', f1.end 'f1 END', f2.start 'F2 Start', f2.end 'F2 End', il.seqname, gl.seqname
> FROM Database1.feature f1
> JOIN Database2.feature f2
> left join Database1.locationlist il
> on il.id=f1.seqid
> left join Database2.locationlist gl
> on gl.id=f2.seqid
> WHERE f1.typeid=5 AND f2.typeid=4 AND il.seqname=gl.seqname AND f1.strand=f2.strand and not (f1.end<f2.start OR f1.start>f2.end)
>
> I added a hash index for typeid and a btree index for end/start to try to speed this up, but without any luck. I considered converting my database to use spatial indices so that I could use the built-in spatial relationships that MySQL defines (if this problem remains unresolved, I'll see whether this has some merit).  In any case, I'm hoping there is some MySQL prodigy on this list who might have some insight into making queries which find overlapping features tractable.
>
> Chris
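
For anyone wanting to try the overlap test outside the database, here is a minimal Python sketch. It is not part of GBrowse or BioPerl: the names `overlaps` and `overlap_pairs` and the `(id, start, end)` tuple layout are made up for illustration, and it assumes the features have already been grouped by sequence name and strand. The predicate is the same one used in the WHERE clause above; the `bisect` window lookup is a swapped-in technique to avoid comparing every pair.

```python
import bisect
from typing import List, Tuple

# Hypothetical layout: (feature id, start, end), inclusive coordinates.
Feature = Tuple[int, int, int]


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """Same test as the SQL: NOT (a.end < b.start OR a.start > b.end)."""
    return not (a_end < b_start or a_start > b_end)


def overlap_pairs(left: List[Feature], right: List[Feature]) -> List[Tuple[int, int]]:
    """Return (left_id, right_id) for every overlapping pair.

    `right` is sorted by start once. Any right feature that overlaps a
    left feature must start no later than the left feature's end and no
    earlier than the left feature's start minus the longest right feature,
    so only that window of the sorted list is examined.
    """
    right_sorted = sorted(right, key=lambda f: f[1])
    starts = [f[1] for f in right_sorted]
    max_len = max((f[2] - f[1] for f in right_sorted), default=0)

    pairs = []
    for left_id, left_start, left_end in left:
        lo = bisect.bisect_left(starts, left_start - max_len)
        hi = bisect.bisect_right(starts, left_end)
        for right_id, right_start, right_end in right_sorted[lo:hi]:
            if overlaps(left_start, left_end, right_start, right_end):
                pairs.append((left_id, right_id))
    return pairs
```

The window is bounded by the longest feature in `right`, so a single very long feature widens every lookup; this is a sketch of the idea rather than a tuned implementation.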

Re: Set Theory Operations

Josh Goodman
http://dev.mysql.com/doc/refman/5.0/en/using-explain.html


Re: Set Theory Operations

Chris Mitchell
EXPLAIN/EXPLAIN EXTENDED only gives the 'possible' indices, among which my indices are listed.  There are ways to force MySQL to use an index, which I'll look into.

As for BioPerl, I imagined that would be much slower than accessing the data directly.  Is there any reason you would expect BioPerl to outperform SQL?


Re: Set Theory Operations

Fields, Christopher J
In reply to this post by Adam Witney
As Lincoln alludes to, an API exists to do this, though it may not be
terribly optimal.  One can do simple set operations with the BioPerl API
directly, see Bio::Range/Bio::RangeI for union/intersect/subtract; all
Bio::SeqFeatureI are also Bio::RangeI.

Also, Bio::DB::SF::Store uses binning for storing locations (see
Bio::DB::SF::Store::DBI::mysql's _location_sql() method), maybe that
could be used?

chris
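
The binning idea can be sketched roughly as follows. This is a generic single-level scheme with a made-up bin size of 1000 bp, written in Python for illustration; it is not the scheme that `_location_sql()` implements, and `BinIndex`, `bins_for`, and `BIN_SIZE` are hypothetical names. Each feature is filed under every bin it touches, so an overlap query only has to look at the bins covered by the query interval and then apply the exact overlap test.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Set, Tuple

BIN_SIZE = 1000  # hypothetical bin width in base pairs


def bins_for(start: int, end: int, bin_size: int = BIN_SIZE) -> range:
    """Bin numbers touched by the inclusive interval [start, end]."""
    return range(start // bin_size, end // bin_size + 1)


class BinIndex:
    """Toy in-memory index: bin number -> features touching that bin."""

    def __init__(self, bin_size: int = BIN_SIZE) -> None:
        self.bin_size = bin_size
        self._bins: Dict[int, List[Tuple[Hashable, int, int]]] = defaultdict(list)

    def add(self, feature_id: Hashable, start: int, end: int) -> None:
        # File the feature under every bin it spans.
        for b in bins_for(start, end, self.bin_size):
            self._bins[b].append((feature_id, start, end))

    def query(self, start: int, end: int) -> Set[Hashable]:
        """Ids of features overlapping the inclusive interval [start, end]."""
        hits: Set[Hashable] = set()
        for b in bins_for(start, end, self.bin_size):
            # .get() avoids creating empty bins during lookups.
            for feature_id, f_start, f_end in self._bins.get(b, ()):
                # Bins only narrow the candidates; the exact test still applies.
                if not (f_end < start or f_start > end):
                    hits.add(feature_id)
        return hits
```

In a database the same idea would become an indexed bin column that is matched by equality or a small range before the start/end comparison is applied; how well that works depends on the bin size relative to typical feature lengths.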


Re: Set Theory Operations

Fields, Christopher J
In reply to this post by Chris Mitchell
Re: BioPerl vs. SQL-based set methods: have you actually tried using
BioPerl's methods?  The BioPerl API is already implemented; you can
always work out optimisations later, and if you can come up with a
better SQL-based method, then it makes sense to use that instead.

chris
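
To make the union/intersect/subtract operations mentioned earlier in the thread concrete, here is a small Python sketch on closed integer ranges. It is only an illustration and not the Bio::Range API: the function names, the `(start, end)` tuple representation, and the choice to return a list of merged ranges from `union` are assumptions, and BioPerl's own methods may behave differently (for example, in what they return for disjoint ranges).

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical representation: inclusive (start, end) with start <= end.
Range = Tuple[int, int]


def intersection(a: Range, b: Range) -> Optional[Range]:
    """Shared part of two ranges, or None if they do not overlap."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def union(ranges: Iterable[Range]) -> List[Range]:
    """Merge overlapping or directly adjacent ranges into a sorted list."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        # "+ 1" treats (1, 5) and (6, 10) as one run of integer positions.
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract(a: Range, b: Range) -> List[Range]:
    """Parts of `a` not covered by `b` (zero, one, or two pieces)."""
    if intersection(a, b) is None:
        return [a]
    pieces: List[Range] = []
    if a[0] < b[0]:
        pieces.append((a[0], b[0] - 1))
    if a[1] > b[1]:
        pieces.append((b[1] + 1, a[1]))
    return pieces
```

For example, `subtract((100, 200), (120, 150))` leaves the two flanking pieces, which is the sort of result a "features in track A but not in track B" view would need to draw.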

On 01/09/2012 03:31 PM, Chris Mitchell wrote:

> Explain/explain extended only gives 'possible' indices, of which my
> indices are listed. There are ways to force SQL to use an index which
> I'll look into.
>
> For the Bioperl, I imagined that would be much slower than accessing the
> data directly. Is there any reason you would think Bioperl would
> outperform SQL?
>
> On Mon, Jan 9, 2012 at 4:12 PM, Josh Goodman <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
>
>     On Mon, Jan 9, 2012 at 4:07 PM, Adam Witney <[hidden email]
>     <mailto:[hidden email]>> wrote:
>      >
>      > Does mysql have an equivalent of the EXPLAIN or EXPLAIN ANALYSE
>     from PostgreSQL? i.e. can you tell if it is actually using your
>     indexes or resorting to full table scans?
>      >
>      >
>      > On 9 Jan 2012, at 20:42, Chris Mitchell wrote:
>      >
>      >> Hey everyone,
>      >>
>      >> I've run into a bit of a wall with set operations which consider
>     overlapping features. For sets which are equal matches (equal
>     start/end coordinates), the operation is fast and trivial (just use
>     f1.start=f2.start AND f1.end=f2.end).
>      >> However, for cases which we are looking for overlapping
>     features, on large datasets it takes a nearly infeasible amount of time.
>      >> Here's an example query which can show this on a sizeable
>     database (my database1 is 5 million rows, database2 is 7 million).
>      >>
>      >> SELECT f1.id 'Experimental ID', f1.typeid 'Type',
>      >>        f2.id 'Annotation ID', f2.typeid,
>      >>        f1.start 'f1 Start', f1.end 'f1 END',
>      >>        f2.start 'F2 Start', f2.end 'F2 End',
>      >>        il.seqname, gl.seqname
>      >> FROM Database1.feature f1
>      >> JOIN Database2.feature f2
>      >> left join Database1.locationlist il
>      >> on il.id=f1.seqid
>      >> left join Database2.locationlist gl
>      >> on gl.id=f2.seqid
>      >> WHERE f1.typeid=5 AND f2.typeid=4 AND il.seqname=gl.seqname AND
>      >>       f1.strand=f2.strand and not (f1.end<f2.start OR f1.start>f2.end)
>      >>
>      >> I added a hash index for typeid and a btree for the end/start to
>     try and speed this up without any luck as well. I debated on
>     converting my database to use spatial indices to use the built-in
>     relationships that MySQL defines (If this problem remains
>     unresolved, I'll see if this has some merit). In any case, I'm hoping
>     there is some MySQL prodigy on this list who might have some insight
>     into making queries which find overlapping features tractable.
>      >>
>      >> Chris
>      >>
>      >> On Thu, Jan 5, 2012 at 2:44 PM, Lincoln Stein <[hidden email]> wrote:
>      >> Hi Chris,
>      >>
>      >> Yes, the transparent overlap works with any set of glyphs, but
>     as you say is quite useless for viewing large regions. Set
>     operations would be a terrific feature.
>      >>
>      >> Lincoln
>      >>
>      >>
>      >> On Thu, Jan 5, 2012 at 1:59 PM, Chris Mitchell <[hidden email]> wrote:
>      >> Hey Lincoln,
>      >>
>      >> I think the overlay would be great for combining the xyplots for
>     RNAseq/WGS data. Is this semi-transparent overlay going to extend
>     beyond xy plots? I think the semi-transparency may be a good way to
>     visualize overlapping data on a small scale, but from a zoomed out
>     version it would probably be useless because the glyphs would be
>     smashed together. I'm running some SQL queries right now on my GB
>     MySql database to find the overlap of the sets I'm interested in.
>     When I make them more efficient I'll see if I can port them into a
>     Perl module which can work as the backend (However, I'm a complete
>     hack with Perl and work mostly with C++/Python/Lua).
>      >>
>      >> Chris
>      >>
>      >>
>      >> On Thu, Jan 5, 2012 at 12:44 PM, Lincoln Stein <[hidden email]> wrote:
>      >> How about the ability to overlay one track on another, making
>     each one semi-transparent? I've been working on doing this for xy
>     plot tracks, and think that this can go into production soon, but it
>     is designed for a single track that contains multiple features, as
>     shown in the attached screenshot (there are actually overlays of 9
>     different replicates here). A certain amount of UI work would be
>     needed to make it generic for arbitrary sets of tracks - probably a
>     couple of weeks.
>      >>
>      >> Lincoln
>      >>

Re: Set Theory Operations

Eccles, David
In reply to this post by Chris Mitchell
On 09/01/12 21:42, Chris Mitchell wrote:
> However, for cases which we are looking for overlapping features, on large
> datasets it takes a nearly infeasible amount of time.
> Here's an example query which can show this on a sizeable database (my
> database1 is 5 million rows, database2 is 7 million).

Looking for overlaps, I think the most efficient way to do it is to sort by the start indexes. I
have some R code floating round here somewhere that looks for overlapping regions:

   ## calculate window regions
   track.stats <- track.stats[order(track.stats$minX),];
   region.stats <- data.frame(region = integer(0), minX = numeric(0), maxX = numeric(0));
   current.region <- track.stats[1,c("minX","maxX")];
   region.id <- 1;
   for(current.track in 1:(dim(track.stats)[1])){
     if(all(sign(track.stats[current.track,c("minX","maxX")] - rev(current.region)) == c(-1,1)) ||
        all(sign(track.stats[current.track,c("minX","maxX")] - rev(current.region)) == c(0,0))){
       current.region["minX"] <- min(current.region["minX"],track.stats$minX[current.track]);
       current.region["maxX"] <- max(current.region["maxX"],track.stats$maxX[current.track]);
       track.stats$region[current.track] <- region.id;
     } else {
       region.stats[region.id,] <-
         data.frame(region = region.id, minX = current.region["minX"], maxX = current.region["maxX"]);
       current.region <- track.stats[current.track,c("minX","maxX")];
       region.id <- region.id + 1;
       track.stats$region[current.track] <- region.id;
     }
   }

Here's a rough idea of how it works:
1) order by feature start point
2) store a variable indicating the current region range
3) loop through the ordered list...
3a) extend the current region if the next feature starts before the end of the current region range,
and finishes after the start of it (the region's end then moves out to the feature's end if that is further)
3b) otherwise, declare the end of a region, and start a new region

- David
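
The steps above can also be sketched in Python. This is a minimal illustration and not a port of the R snippet: the function name `merge_regions` and the plain `(start, end)` tuples are assumptions made for the example, and coordinates are assumed to be inclusive, so a feature that starts exactly on the last base of the current region is treated as overlapping (the R code uses a strict comparison there).

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive coordinates (assumption)


def merge_regions(features: List[Interval]) -> List[Interval]:
    """Collapse overlapping features into regions with one sorted sweep."""
    regions: List[Interval] = []
    for start, end in sorted(features):          # step 1: order by start point
        if regions and start <= regions[-1][1]:
            # step 3a: the feature starts inside the current region, so
            # push the region's end out if the feature reaches further
            cur_start, cur_end = regions[-1]
            regions[-1] = (cur_start, max(cur_end, end))
        else:
            # step 3b: no overlap, so the old region is finished and a
            # new one starts at this feature
            regions.append((start, end))
    return regions


example = merge_regions([(5, 8), (1, 3), (2, 6), (10, 12)])
```

Because the last region lives in the list from the moment it is opened, nothing has to be flushed after the loop. The cost is dominated by the sort, so the sweep is O(n log n) rather than the all-against-all comparison of a naive join.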


Re: Set Theory Operations

Chris Mitchell
I managed to get my query time down significantly by adding the following index to my databases:
create index `typeAndLocation` ON db.table (`typeid`, `strand`, `start`, `end`)

A query returning about a million overlapping features took around 10 minutes to complete without optimizing my storage structure.  Do you have any comparable numbers using your approach, David?  I can live with my current run time, but I'm interested if you have any benchmarks.

Chris




Re: Set Theory Operations

Chris Mitchell
Oops, I forgot the location id in that index.  The proper keys are:
`typeid`,`strand`,`seqid`,`start`,`end`
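
For anyone who wants to experiment with this kind of index without a MySQL server, here is a small sketch using Python's built-in sqlite3 module as a stand-in. It is an illustration under assumptions, not the actual GBrowse schema: the single in-memory `feature` table and the sample rows are made up, both feature sets live in one table, and sequences are matched on `seqid` directly instead of joining through a locationlist table. The overlap test is the same predicate as in the original query, rewritten with De Morgan's law as `f2.start <= f1.end AND f2.end >= f1.start`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE feature (
        id INTEGER PRIMARY KEY, typeid INTEGER, strand INTEGER,
        seqid INTEGER, start INTEGER, "end" INTEGER
    );
    -- same column order as the corrected index in the post
    CREATE INDEX typeAndLocation
        ON feature (typeid, strand, seqid, start, "end");
""")

# Made-up rows: (id, typeid, strand, seqid, start, end)
rows = [
    (1, 5, 1, 1, 100, 200),   # experimental feature
    (2, 5, 1, 1, 500, 600),   # experimental feature
    (3, 4, 1, 1, 150, 250),   # annotation overlapping feature 1
    (4, 4, 1, 1, 700, 800),   # annotation overlapping nothing
    (5, 4, -1, 1, 150, 250),  # same span as 3, opposite strand
    (6, 4, 1, 2, 500, 600),   # same span as 2, different sequence
]
conn.executemany("INSERT INTO feature VALUES (?, ?, ?, ?, ?, ?)", rows)

# Equality columns come first so they can match the index prefix;
# the range conditions on start/end come last.
overlaps = conn.execute("""
    SELECT f1.id, f2.id
    FROM feature f1
    JOIN feature f2
      ON f2.typeid = 4
     AND f2.strand = f1.strand
     AND f2.seqid = f1.seqid
     AND f2.start <= f1."end"
     AND f2."end" >= f1.start
    WHERE f1.typeid = 5
    ORDER BY f1.id, f2.id
""").fetchall()
```

Whether a given database engine actually picks the composite index still has to be checked with its own EXPLAIN output; the sketch only shows the shape of the index and the query.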


Re: Set Theory Operations

Lincoln Stein
In reply to this post by Fields, Christopher J
I'm thinking that the set operation tracks are done dynamically as needed:

  1. fetch features from track A across visible region
  2. fetch features from track B across visible region
  3. perform set operation, and store result as a temporary database
  4. display contents of temporary database in synthetic track

Lincoln
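
The four steps above could look roughly like this in Python. It is only a sketch under assumptions: tracks are plain lists of inclusive `(start, end)` tuples, the helper names (`fetch`, `flatten`, `intersect`, `union`, `synthetic_track`) are hypothetical and not part of GBrowse or BioPerl, and the returned list stands in for the temporary database that would back the synthetic track.

```python
from typing import Callable, List, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive (assumption)
Track = List[Interval]


def fetch(track: Track, lo: int, hi: int) -> Track:
    """Steps 1 and 2: features of one track that overlap the visible region."""
    return sorted((s, e) for s, e in track if s <= hi and e >= lo)


def flatten(features: Track) -> Track:
    """Merge overlapping features so each track is a list of disjoint spans."""
    merged: Track = []
    for s, e in sorted(features):
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def union(a: Track, b: Track) -> Track:
    """Step 3 (union): every position covered by either track."""
    return flatten(a + b)


def intersect(a: Track, b: Track) -> Track:
    """Step 3 (intersection): positions covered by both tracks."""
    a, b = flatten(a), flatten(b)
    out: Track = []
    i = j = 0
    while i < len(a) and j < len(b):
        s, e = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if s <= e:
            out.append((s, e))
        # advance whichever span ends first; it cannot overlap anything later
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def synthetic_track(track_a: Track, track_b: Track, lo: int, hi: int,
                    op: Callable[[Track, Track], Track]) -> Track:
    """Steps 1 to 4: fetch both tracks, combine them, return the result."""
    return op(fetch(track_a, lo, hi), fetch(track_b, lo, hi))


track_a = [(100, 200), (150, 300), (900, 950)]
track_b = [(250, 400), (920, 1000), (5000, 6000)]
both = synthetic_track(track_a, track_b, 0, 1000, intersect)
either = synthetic_track(track_a, track_b, 0, 1000, union)
```

Since only the features in the visible region are fetched, the cost of the set operation depends on the size of the view and not on the size of the whole database.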

On Mon, Jan 9, 2012 at 4:48 PM, Chris Fields <[hidden email]> wrote:
As Lincoln alludes to, an API exists to do this, though it may not be
terribly optimal.  One can do simple set operations with the BioPerl API
directly, see Bio::Range/Bio::RangeI for union/intersect/subtract; all
Bio::SeqFeatureI are also Bio::RangeI.

Also, Bio::DB::SeqFeature::Store uses binning for storing locations (see
Bio::DB::SeqFeature::Store::DBI::mysql's _location_sql() method); maybe that
could be used?

chris
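
To illustrate what binning buys for overlap queries, here is a generic Python sketch. It is not a reproduction of the scheme behind _location_sql(); the fixed bin width, the class name `BinnedIndex`, and its methods are all assumptions made for the example. Each feature is registered in every fixed-width bin it touches, so an overlap query only inspects the bins that the query range touches.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

BIN_SIZE = 1000  # hypothetical bin width in bases


class BinnedIndex:
    """Toy fixed-width binning index over inclusive (start, end) features."""

    def __init__(self, bin_size: int = BIN_SIZE) -> None:
        self.bin_size = bin_size
        # bin number -> list of (start, end, feature_id)
        self.bins: DefaultDict[int, List[Tuple[int, int, str]]] = defaultdict(list)

    def _bin_range(self, start: int, end: int) -> range:
        return range(start // self.bin_size, end // self.bin_size + 1)

    def add(self, feature_id: str, start: int, end: int) -> None:
        # register the feature in every bin it touches
        for b in self._bin_range(start, end):
            self.bins[b].append((start, end, feature_id))

    def overlapping(self, start: int, end: int) -> List[str]:
        hits = set()
        for b in self._bin_range(start, end):
            for s, e, fid in self.bins.get(b, ()):
                # exact check: sharing a bin does not guarantee an overlap
                if s <= end and e >= start:
                    hits.add(fid)
        return sorted(hits)


index = BinnedIndex()
index.add("a", 100, 250)
index.add("b", 900, 1100)
index.add("c", 5000, 5100)
```

In a database, the bin number would be an indexed integer column, which turns part of the range comparison into an equality lookup.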

--
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <[hidden email]>


Re: Set Theory Operations

Chris Mitchell
I was hoping to implement it to support either variant.  For a persistent track, I would want to pay the cost of the desired set operations only once.  For dynamic viewing, computing the overlap of ~6 million features on chromosome 1 took about 30 s, and 1 million features on chromosome 15 took 1.5 s.  I have no idea how these times scale without the new index I created, however.  I'll start digging through the plugins to see what code I could adapt for use in this -- any suggestions are welcome.

Chris

On Tue, Jan 10, 2012 at 12:05 PM, Lincoln Stein <[hidden email]> wrote:
I'm thinking that the set operation tracks are done dynamically as needed:

  1. fetch features from track A across visible region
  2. fetch features from track B across visible region
  3. perform set operation, and store result as a temporary database
  4. display contents of temporary database in synthetic track
Lincoln

On Mon, Jan 9, 2012 at 4:48 PM, Chris Fields <[hidden email]> wrote:
As Lincoln alludes to, an API exists to do this, though it may not be
terribly optimal.  One can do simple set operations with the BioPerl API
directly, see Bio::Range/Bio::RangeI for union/intersect/subtract; all
Bio::SeqFeatureI are also Bio::RangeI.

Also, Bio::DB::SF::Store uses binning for storing locations (see
Bio::DB::SF::Store::DBI::mysql's _location_sql() method), maybe that
could be used?

chris

On 01/09/2012 03:07 PM, Adam Witney wrote:
>
> Does mysql have an equivalent of the EXPLAIN or EXPLAIN ANALYSE from PostgreSQL? i.e. can you tell if it is actually using your indexes or resorting to full table scans?
>
>
> On 9 Jan 2012, at 20:42, Chris Mitchell wrote:
>
>> Hey everyone,
>>
>> I've run into a bit of a wall with set operations which consider overlapping features.  For sets which are equal matches (equal start/end coordinates), the operation is fast and trivial (just use f1.start=f2.start AND f1.end=f2.end).
>> However, for cases which we are looking for overlapping features, on large datasets it takes a nearly infeasible amount of time.
>> Here's an example query which can show this on a sizeable database (my database1 is 5 million rows, database2 is 7 million).
>>
>> SELECT f1.id 'Experimental ID', f1.typeid 'Type', f2.id 'Annotation ID', f2.typeid, f1.start 'f1 Start', f1.end 'f1 END', f2.start 'F2 Start', f2.end 'F2 End', il.seqname, gl.seqname
>> FROM Database1.feature f1
>> JOIN Database2.feature f2
>> left join Database1.locationlist il
>> on il.id=f1.seqid
>> left join Database2.locationlist gl
>> on gl.id=f2.seqid
>> WHERE f1.typeid=5 AND f2.typeid=4 AND il.seqname=gl.seqname AND f1.strand=f2.strand and not (f1.end<f2.start OR f1.start>f2.end)
>>
>> I added a hash index for typeid and a btree for the end/start to try and speed this up without any luck as well. I debated on converting my database to use spatial indices to use the built-in relationships that MySQL defines (If this problem remains unresolved, I'll see if this has some merit).  In anycase, I'm hoping there is some MySQL prodigy on this list who might have some insight into making queries which find overlapping features tractable.
>>
>> Chris
>>
>> On Thu, Jan 5, 2012 at 2:44 PM, Lincoln Stein<[hidden email]>  wrote:
>> Hi Chris,
>>
>> Yes, the transparent overlap works with any set of glyphs, but as you say is quite useless for viewing large regions. Set operations would be a terrific feature.
>>
>> Lincoln
>>
>>
>> On Thu, Jan 5, 2012 at 1:59 PM, Chris Mitchell<[hidden email]>  wrote:
>> Hey Lincoln,
>>
>> I think the overlay would be great for combining the xyplots for RNAseq/WGS data.  Is this semi-transparent overlay going to extend beyond xy plots?  I think the semi-transparency may be a good way to visualize overlapping data on a small scale, but from a zoomed out version it would probably be useless because the glyphs would be smashed together.  I'm running some SQL queries right now on my GB MySql database to find the overlap of the sets I'm interested in.  When I make them more efficient I'll see if I can port them into a Perl module which can work as the backend (However, I'm a complete hack with Perl and work mostly with C++/Python/Lua).
>>
>> Chris
>>
>>
>> On Thu, Jan 5, 2012 at 12:44 PM, Lincoln Stein<[hidden email]>  wrote:
>> How about the ability to overlay one track on another, making each one semi-transparent? I've been working on doing this for xy plot tracks, and think that this can go into production soon, but it designed for a single track that contains multiple features, as shown in the attached screenshot (there are actually overlays of 9 different replicates here). A certain amount of UI work would be needed to make it generic for arbitrary sets of tracks - probably a couple of weeks.
>>
>> Lincoln
>>
>> On Thu, Jan 5, 2012 at 12:31 PM, Chris Mitchell<[hidden email]>  wrote:
>> Hey Everyone,
>>
>> Is there a way/plugin within GBrowse which allows you to perform set theory like functions?  An example would be to show union/intersection/etc of chosen tracks.  This could be useful for applications such as the union of an exon track, an experimental RNAseq track, and an experimental SNP/indel/etc. track.  This is somewhat trivial to do outside of GBrowse, but for end users with less know-how of the backend would find these features useful.
>>
>> Thanks,
>> Chris
>>

Re: Set Theory Operations

Lincoln Stein
That's really nice performance. I'm pleasantly surprised.

Lincoln

On Tue, Jan 10, 2012 at 12:59 PM, Chris Mitchell <[hidden email]> wrote:
I was hoping to implement it to support either variant.  For a persistent track, I would want to pay the cost of the desired set operations only once.  For dynamic viewing, computing the overlap of ~6 million features on chromosome 1 took about 30 s, and 1 million features on chromosome 15 took 1.5 s.  However, I have no idea how these times scale without the new index I created.  I'll start digging through the plugins to see what code I could adapt for this; any suggestions are welcome.

Chris


On Tue, Jan 10, 2012 at 12:05 PM, Lincoln Stein <[hidden email]> wrote:
I'm thinking that the set operation tracks are done dynamically as needed:

  1. fetch features from track A across visible region
  2. fetch features from track B across visible region
  3. perform set operation, and store result as a temporary database
  4. display contents of temporary database in synthetic track

Lincoln
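
The four steps above can be sketched roughly as follows. This is only an illustration in Python, not GBrowse code: the tracks are plain lists of (start, end) tuples, the "temporary database" is an in-memory SQLite database, and the names fetch_features, intersect, and build_synthetic_track are hypothetical. The set operation shown is intersection, done with a simple quadratic loop for clarity.

```python
import sqlite3

def fetch_features(track, region_start, region_end):
    """Steps 1 and 2: keep features that overlap the visible region."""
    return [(s, e) for (s, e) in track
            if not (e < region_start or s > region_end)]

def intersect(feats_a, feats_b):
    """Step 3a: pairwise intersection of two feature lists (inclusive coordinates)."""
    out = []
    for a_start, a_end in feats_a:
        for b_start, b_end in feats_b:
            lo, hi = max(a_start, b_start), min(a_end, b_end)
            if lo <= hi:  # the two features share at least one base
                out.append((lo, hi))
    return sorted(out)

def build_synthetic_track(track_a, track_b, region_start, region_end):
    """Step 3b: store the result in a temporary (in-memory) database."""
    feats_a = fetch_features(track_a, region_start, region_end)
    feats_b = fetch_features(track_b, region_start, region_end)
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE feature (fstart INTEGER, fend INTEGER)")
    db.executemany("INSERT INTO feature VALUES (?, ?)",
                   intersect(feats_a, feats_b))
    db.commit()
    return db

# Step 4: a renderer would read the synthetic track back out of the database.
track_a = [(100, 200), (300, 400), (900, 950)]
track_b = [(150, 350), (940, 1000)]
db = build_synthetic_track(track_a, track_b, 0, 500)
rows = db.execute(
    "SELECT fstart, fend FROM feature ORDER BY fstart").fetchall()
print(rows)
```

Because only the visible region is fetched, the cost of the set operation depends on how many features are on screen rather than on the size of the whole track.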

On Mon, Jan 9, 2012 at 4:48 PM, Chris Fields <[hidden email]> wrote:
As Lincoln alludes to, an API exists to do this, though it may not be
terribly optimal.  One can do simple set operations with the BioPerl API
directly, see Bio::Range/Bio::RangeI for union/intersect/subtract; all
Bio::SeqFeatureI are also Bio::RangeI.
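
To make the range operations mentioned above concrete, here is a small Python sketch of union/intersect/subtract on inclusive ranges. It is not the BioPerl API and does not claim to match how Bio::Range treats edge cases; the Range type and the function names are assumptions made for illustration. In particular, union here returns None for ranges that do not overlap, which is a choice made for this sketch.

```python
from typing import List, NamedTuple, Optional

class Range(NamedTuple):
    start: int
    end: int  # inclusive, like typical feature coordinates

def overlaps(a: Range, b: Range) -> bool:
    # Same predicate as "NOT (a.end < b.start OR a.start > b.end)".
    return not (a.end < b.start or a.start > b.end)

def intersection(a: Range, b: Range) -> Optional[Range]:
    if not overlaps(a, b):
        return None
    return Range(max(a.start, b.start), min(a.end, b.end))

def union(a: Range, b: Range) -> Optional[Range]:
    # Assumption for this sketch: only overlapping ranges are merged.
    if not overlaps(a, b):
        return None
    return Range(min(a.start, b.start), max(a.end, b.end))

def subtract(a: Range, b: Range) -> List[Range]:
    """Return the parts of a that are not covered by b."""
    if not overlaps(a, b):
        return [a]
    pieces = []
    if a.start < b.start:
        pieces.append(Range(a.start, b.start - 1))
    if a.end > b.end:
        pieces.append(Range(b.end + 1, a.end))
    return pieces

exon = Range(100, 200)
read = Range(150, 250)
print(intersection(exon, read), union(exon, read), subtract(exon, read))
```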

Also, Bio::DB::SF::Store uses binning for storing locations (see
Bio::DB::SF::Store::DBI::mysql's _location_sql() method), maybe that
could be used?

chris
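
As a rough illustration of why binning helps with overlap queries, here is a Python sketch of a generic fixed-width binning scheme. It is not the scheme used by Bio::DB::SF::Store, which is not reproduced here; BIN_SIZE and all function names are hypothetical. The idea is that candidate pairs are found by an exact match on bin number, and the range comparison is then applied only to those candidates.

```python
from collections import defaultdict

BIN_SIZE = 1000  # hypothetical bin width, in bases

def bins_for(start, end, bin_size=BIN_SIZE):
    """All bin numbers touched by an inclusive [start, end] feature."""
    return range(start // bin_size, end // bin_size + 1)

def build_bin_index(features, bin_size=BIN_SIZE):
    """Map each bin number to the features that touch it."""
    index = defaultdict(list)
    for feat in features:
        for b in bins_for(feat[0], feat[1], bin_size):
            index[b].append(feat)
    return index

def overlapping_pairs(features_a, features_b, bin_size=BIN_SIZE):
    """Find overlapping (a, b) pairs, comparing only features sharing a bin."""
    index_b = build_bin_index(features_b, bin_size)
    pairs = set()  # a pair can share several bins, so de-duplicate
    for a in features_a:
        for b in bins_for(a[0], a[1], bin_size):
            for cand in index_b.get(b, ()):
                if not (a[1] < cand[0] or a[0] > cand[1]):
                    pairs.add((a, cand))
    return sorted(pairs)

features_a = [(100, 2500), (5000, 5100)]
features_b = [(2400, 2600), (2501, 2700), (5050, 5060), (9000, 9100)]
pairs = overlapping_pairs(features_a, features_b)
print(pairs)
```

In a SQL setting the same idea would mean storing a bin column and joining on it with an ordinary index, leaving the start/end comparison as a filter on the much smaller set of candidate rows.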

On 01/09/2012 03:07 PM, Adam Witney wrote:
>
> Does mysql have an equivalent of the EXPLAIN or EXPLAIN ANALYSE from PostgreSQL? i.e. can you tell if it is actually using your indexes or resorting to full table scans?
>
>
> On 9 Jan 2012, at 20:42, Chris Mitchell wrote:
>
>> Hey everyone,
>>
>> I've run into a bit of a wall with set operations that consider overlapping features.  For exact matches (equal start/end coordinates), the operation is fast and trivial (just use f1.start=f2.start AND f1.end=f2.end).
>> However, when we are looking for overlapping features on large datasets, it takes a nearly infeasible amount of time.
>> Here's an example query that demonstrates this on a sizeable database (my Database1 has 5 million rows, Database2 has 7 million).
>>
>> SELECT f1.id 'Experimental ID', f1.typeid 'Type', f2.id 'Annotation ID', f2.typeid,
>>        f1.start 'f1 Start', f1.end 'f1 END', f2.start 'F2 Start', f2.end 'F2 End',
>>        il.seqname, gl.seqname
>> FROM Database1.feature f1
>> JOIN Database2.feature f2
>> LEFT JOIN Database1.locationlist il
>>   ON il.id = f1.seqid
>> LEFT JOIN Database2.locationlist gl
>>   ON gl.id = f2.seqid
>> WHERE f1.typeid = 5 AND f2.typeid = 4
>>   AND il.seqname = gl.seqname
>>   AND f1.strand = f2.strand
>>   AND NOT (f1.end < f2.start OR f1.start > f2.end)
>>
>> I added a hash index on typeid and a B-tree index on start/end to try to speed this up, without any luck. I considered converting my database to use spatial indices so I could use the built-in spatial relationships that MySQL defines (if this problem remains unresolved, I'll see whether that has some merit).  In any case, I'm hoping there is some MySQL prodigy on this list who might have some insight into making queries that find overlapping features tractable.
>>
>> Chris
>>
>> On Thu, Jan 5, 2012 at 2:44 PM, Lincoln Stein<[hidden email]>  wrote:
>> Hi Chris,
>>
>> Yes, the transparent overlap works with any set of glyphs, but as you say is quite useless for viewing large regions. Set operations would be a terrific feature.
>>
>> Lincoln
>>
>>
>> On Thu, Jan 5, 2012 at 1:59 PM, Chris Mitchell<[hidden email]>  wrote:
>> Hey Lincoln,
>>
>> I think the overlay would be great for combining the xyplots for RNAseq/WGS data.  Is this semi-transparent overlay going to extend beyond xy plots?  I think the semi-transparency may be a good way to visualize overlapping data on a small scale, but in a zoomed-out view it would probably be useless because the glyphs would be smashed together.  I'm running some SQL queries right now on my GB MySQL database to find the overlap of the sets I'm interested in.  Once I make them more efficient, I'll see if I can port them into a Perl module that can work as the backend (however, I'm a complete hack with Perl and work mostly with C++/Python/Lua).
>>
>> Chris
>>
>>
>> On Thu, Jan 5, 2012 at 12:44 PM, Lincoln Stein<[hidden email]>  wrote:
>> How about the ability to overlay one track on another, making each one semi-transparent? I've been working on doing this for xy plot tracks, and think that this can go into production soon, but it is designed for a single track that contains multiple features, as shown in the attached screenshot (there are actually overlays of 9 different replicates here). A certain amount of UI work would be needed to make it generic for arbitrary sets of tracks - probably a couple of weeks.
>>
>> Lincoln
>>
>> On Thu, Jan 5, 2012 at 12:31 PM, Chris Mitchell<[hidden email]>  wrote:
>> Hey Everyone,
>>
>> Is there a way/plugin within GBrowse which allows you to perform set-theory-like functions?  An example would be to show the union/intersection/etc. of chosen tracks.  This could be useful for applications such as the union of an exon track, an experimental RNAseq track, and an experimental SNP/indel/etc. track.  This is somewhat trivial to do outside of GBrowse, but end users with less know-how of the backend would find these features useful.
>>
>> Thanks,
>> Chris
>>

--
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <[hidden email]>
