[Gmod-ajax] Speed up VCF loads?

Hans Vasquez-Gross
Hello JBrowsers,


Is there any estimate of how long an indexed VCF file should take to load?  I have the following VCFs, which are enabled as default tracks when a user clicks a link from my BLAST results page.  It's taking anywhere from 3-10 seconds to load the data.


10M   Kronos.mapspart2.HetMinCov4HetMinPer15.ktc1-129_ktc133-203.removed.vep.vcf.gz
7.6M  Kronos.mapspart2.HetMinCov5HetMinPer15.ktc1-129_ktc133-203.removed.vep.vcf.gz
80M   Kronos.mapspart2.HetMinCov6HetMinPer15.ktc1-129_ktc133-203.vep.vcf.gz


Is there any way to speed up the process?

Cheers,
-Hans

_______________________________________________
Gmod-ajax mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-ajax

Re: Speed up VCF loads?

Colin
Hi Hans,

Any updates since posting? If you can describe which part seems frozen, or have an instance available for testing, that could help.

Also, it's hard to recommend specific things without testing, but I stumbled on an interesting section in the JBrowse config guide yesterday that might be useful. It suggests breaking up the VCF by refseq and then using the {refseq} replacement in the urlTemplate, e.g.


"urlTemplate": "myvcf_{refseq}.vcf.gz",
"tbiUrlTemplate": "myvcf_{refseq}.vcf.gz.tbi"


This way, each tabix index stays smaller, since it only covers one chromosome at a time.
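
As a rough sketch of the splitting step (not taken from the config guide), the following Python groups the records of a contig-sorted VCF by their CHROM column so that each refseq can be written to its own file. The function names and the `myvcf` prefix are illustrative assumptions, and the outputs are plain text: each one would still need to be compressed with `bgzip` and indexed with `tabix -p vcf` before it matches the templates above.

```python
import gzip
import itertools
from typing import Iterable, Iterator, List, Tuple


def split_vcf_by_refseq(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (refseq, vcf_lines) for each run of records sharing a CHROM value.

    Assumes header lines come first and that the records of each contig are
    contiguous, which a coordinate-sorted VCF should satisfy.
    """
    header: List[str] = []

    def records() -> Iterator[str]:
        for line in lines:
            if line.startswith("#"):
                header.append(line)  # keep the header to repeat in every output
            else:
                yield line

    by_chrom = itertools.groupby(records(), key=lambda rec: rec.split("\t", 1)[0])
    for refseq, group in by_chrom:
        yield refseq, header + list(group)


def write_split_vcfs(in_path: str, out_prefix: str = "myvcf") -> None:
    """Write one uncompressed VCF per refseq, named <out_prefix>_<refseq>.vcf."""
    with gzip.open(in_path, "rt") as handle:
        for refseq, vcf_lines in split_vcf_by_refseq(handle):
            with open(f"{out_prefix}_{refseq}.vcf", "w") as out:
                out.writelines(vcf_lines)
```

Only one output file is open at a time, which matters when the reference has a very large number of contigs.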

Let me know if that helps at all, but of course there could be other things happening!

-Colin

Re: Speed up VCF loads?

Hans Vasquez-Gross
Hi Colin,

No update, but your suggestion should definitely speed up the loads.  Basically, the JBrowse interface loads quickly, but then when fetching and loading the VCF files for different tracks, users have to wait anywhere from 3-8 seconds.  So, my PI asked if there was any way to speed it up ... and now it sounds like there is a way!  

Thank you.  I'll try implementing a test case this weekend.  The one caveat of working with an incomplete reference is that we don't have chromosomes.  We have 7,389,869 contigs/'refseqs'.  So, with our 7 VCF files, this will generate roughly 7 x 7 million files if I split by refseq.
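
For a sense of scale, the arithmetic works out as below. This is an upper bound that assumes every contig has at least one call in every VCF; contigs without variants would not need a file at all. Doubling the count for the `.tbi` indexes is an added assumption (one index per split file).

```python
contigs = 7_389_869  # contigs/'refseqs' in the reference
vcf_tracks = 7       # VCF files to split

vcf_files = contigs * vcf_tracks  # one .vcf.gz per contig per track
with_indexes = vcf_files * 2      # plus one .tbi per .vcf.gz

print(f"{vcf_files:,} VCF files, {with_indexes:,} files including indexes")
```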

Either way, I'm excited that there is a possible solution.  Thanks again.

Cheers,
-Hans

Re: Speed up VCF loads?

vkrishna
Hi Colin,

Thanks for reporting this functionality; I wasn't aware that the documentation said so! BTW, is this the section of the config guide you were referring to? http://gmod.org/wiki/JBrowse_Configuration_Guide#Configuring_track_locations_with_Apache

On a related note, would this work with BAM files as well? Or is this implementation specific to the VCFTabix store? (Hans, sorry for hijacking your thread!)

Thank you.

Regards,
Vivek

Re: Speed up VCF loads?

Colin
I believe that would work fine on BAM files too!
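
A minimal sketch of how the same pattern might look for a BAM track, written as a Python dict purely for illustration. The file names and label are hypothetical, and `baiUrlTemplate` is assumed to play the role that `tbiUrlTemplate` plays for VCF; `expand` only mimics the {refseq} substitution that JBrowse performs itself.

```python
import json

# Hypothetical BAM track definition pointing at per-refseq files.
bam_track = {
    "label": "mybam",
    "storeClass": "JBrowse/Store/SeqFeature/BAM",
    "urlTemplate": "mybam_{refseq}.bam",
    "baiUrlTemplate": "mybam_{refseq}.bam.bai",
}


def expand(template: str, refseq: str) -> str:
    """Mimic the {refseq} substitution for a single reference sequence."""
    return template.replace("{refseq}", refseq)


print(json.dumps(bam_track, indent=2))
print(expand(bam_track["urlTemplate"], "contig1"))
```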





-Colin

Re: Speed up VCF loads?

Hans Vasquez-Gross
Hi Colin,

It took me a while, but I finally got around to implementing this change.  It has definitely sped up load times for most of the tracks.


However, for some tracks where we don't have data on a particular {refseq}, I get the following error message.  Basically, it's complaining that a file doesn't exist (which it shouldn't, since there are no called mutations for that particular {refseq} on that track).  Is there a way to default to a blank track or squelch the error message if the VCF file doesn't exist?

Inline image 1

Thank you,
-Hans

Re: Speed up VCF loads?

Colin
Not sure if there's a good fix here. In NCList, for example, where using the "{refseq}" urlTemplate parameter is much more common (it is how flatfile-to-json operates by default), there is a check for a 404, and the track is simply left empty in that case.

The VCF store doesn't have that check, though, so you get this error message.
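
The 404 check described above can be sketched roughly as follows. This is not JBrowse's code (which is JavaScript); it is a Python illustration of the idea, and `features_or_empty` and the `fetch` callable are hypothetical names.

```python
from typing import Callable, List
from urllib.error import HTTPError


def features_or_empty(fetch: Callable[[str], List[str]], url: str) -> List[str]:
    """Return the features at `url`, treating a missing file as an empty track.

    `fetch` is any callable that downloads and parses the file and raises
    urllib.error.HTTPError on a failed request.
    """
    try:
        return fetch(url)
    except HTTPError as err:
        if err.code == 404:
            # No file for this refseq: there is simply nothing to draw.
            return []
        raise  # other failures (403, 500, ...) should still surface
```

Only a 404 is swallowed, so genuine server or permission errors remain visible.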


-Colin

Re: Speed up VCF loads?

Hans Vasquez-Gross
Hi Colin,

There has to be a way to mimic that process.  Is there any way to make the VCF store behave like NCList and default to that behaviour?  Would I need to develop a plugin to get this to work?

The speed increase is pretty significant.  It goes from loading in about 90 seconds to loading in 5 seconds, so I would really like to get this {refseq} VCF loading working.

Cheers,
-Hans

Re: Speed up VCF loads?

Colin
I totally agree that the load time gains are quite good when the tabix gets broken up. I thought a quick plugin might be able to help with the problem so I tested this example out (really quickly) with a VCF track. Let me know if it helps!




-Colin

Re: Speed up VCF loads?

Hans Vasquez-Gross
Wow, thanks Colin.  I'll test this out today.  I'll follow up with my thoughts.

Cheers,
-Hans

Re: Speed up VCF loads?

Hans Vasquez-Gross
I'm trying this out now.  In addition to changing the storeClass line, do I also need to change the type?

from: 
type = JBrowse/View/Track/CanvasVariants
to:
type = CanvasVariants


Cheers,
-Hans

Re: Speed up VCF loads?

Hans Vasquez-Gross
Hi JBrowsers,

I just wanted to comment saying that this plugin is working great for VCFs!  Thanks Colin.

Cheers,
-Hans
