[biomart-users] problems of getting SNPs within a genome range

Zhaoming Wu
Dear all,

I'm using biomaRt to extract SNPs within a range in the chicken genome.
What I found is that when I search the beginning of a chromosome, even over a large range, the results return in a few seconds, e.g.
> getBM(attributes = c('snp','refsnp_id'),filters = c('chr_name','start','end'),values = list(15,1,20000),mart = gal5_snp)

But when it comes to my region of interest (15:12291177-12296403):
> getBM(attributes = c('snp','refsnp_id'),filters = c('chr_name','start','end'),values = list(15,12291177,12296403),mart = gal5_snp)

Although it's only a range of about 5,000 bp, it takes forever and always ends with an error:
Error in value[[3L]](cond) : 
  Request to BioMart web service failed. Verify if you are still connected to the internet.  Alternatively the BioMart web service is temporarily down.

Any idea how to resolve this issue?

Thank you & best regards,

Zhaoming

--
You received this message because you are subscribed to the Google Groups "biomart-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
Visit this group at https://groups.google.com/group/biomart-users.
For more options, visit https://groups.google.com/d/optout.
Re: [biomart-users] problems of getting SNPs within a genome range

Thomas Maurel
Dear Zhaoming,


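I’ve tried your query on the www.ensembl.org browser and the result comes back in seconds: http://www.ensembl.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=ggallus_snp.default.snp.refsnp_id|ggallus_snp.default.snp.refsnp_source|ggallus_snp.default.snp.chr_name|ggallus_snp.default.snp.chrom_start|ggallus_snp.default.snp.chrom_end&FILTERS=ggallus_snp.default.filters.chromosomal_region.”15:12291177:12296403"&VISIBLEPANEL=resultspanel

I also get a timeout when using the biomaRt package. I’ve cc'd Mike Smith as he might be able to help you further.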

Kind Regards,
Thomas

--
Thomas Maurel
Bioinformatician - Ensembl Production Team
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom

Re: [biomart-users] problems of getting SNPs within a genome range

Mike Smith
Hi Thomas and Zhaoming,

I think there's a little confusion regarding what attributes are being requested here.  Zhaoming is asking for 'snp' and 'refsnp_id', but the link you provided, Thomas, returns 'refsnp_id' and 'refsnp_source'.

If I try searching for the 'snp' attribute in the browser session, I sometimes get results and sometimes get a timeout and the message 'There was a problem with the request', e.g.

http://www.ensembl.org/biomart/martview/e9db7c908e9f459b4eeae9abf723cc6b?VIRTUALSCHEMANAME=default&ATTRIBUTES=ggallus_snp.default.sequences.refsnp_id|ggallus_snp.default.sequences.snp&FILTERS=ggallus_snp.default.filters.chr_name.%2215%22&VISIBLEPANEL=resultspanel

The 'snp' attribute is actually asking for the genomic sequence, and you can extend it to more than one base with the upstream/downstream flank filters.  Presumably querying for multiple sequences takes a long time, although I have no idea why the performance decreases as you move along a chromosome.

As a workaround, since you seem to only be interested in the alleles for the SNP locations, we can get around the problem by asking for the 'allele' attribute instead, e.g.

library(biomaRt)
gal5_snp <- useMart(biomart = "ENSEMBL_MART_SNP", 
                   host = "www.ensembl.org",
                   dataset = "ggallus_snp")

getBM(attributes = c('refsnp_id', 'allele'),
      filters = c('chr_name','start','end'),
      values = list(15,12291177,12296403),
      mart = gal5_snp)


Cheers,

Mike


On Friday, 21 April 2017 11:51:05 UTC+2, Thomas Maurel wrote:
Dear Zhaoming,

I’ve tried your query on the www.ensembl.org browser and the result comes back in seconds: http://www.ensembl.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=ggallus_snp.default.snp.refsnp_id|ggallus_snp.default.snp.refsnp_source|ggallus_snp.default.snp.chr_name|ggallus_snp.default.snp.chrom_start|ggallus_snp.default.snp.chrom_end&FILTERS=ggallus_snp.default.filters.chromosomal_region.”15:12291177:12296403"&VISIBLEPANEL=resultspanel

Re: [biomart-users] problems of getting SNPs within a genome range

Thomas Maurel
Hi Mike and Zhaoming,

On 24 Apr 2017, at 12:44, Mike Smith <[hidden email]> wrote:

Hi Thomas and Zhaoming,

I think there's a little confusion regarding what attributes are being requested here.  Zhaoming is asking for 'snp' and 'refsnp_id', but the link you provided, Thomas, returns 'refsnp_id' and 'refsnp_source'.
Very well spotted, sorry, you are right.

If I try searching for the 'snp' attribute in the browser session, I sometimes get results and sometimes get a timeout and the message 'There was a problem with the request', e.g.

http://www.ensembl.org/biomart/martview/e9db7c908e9f459b4eeae9abf723cc6b?VIRTUALSCHEMANAME=default&ATTRIBUTES=ggallus_snp.default.sequences.refsnp_id|ggallus_snp.default.sequences.snp&FILTERS=ggallus_snp.default.filters.chr_name.%2215%22&VISIBLEPANEL=resultspanel

The 'snp' attribute is actually asking for the genomic sequence, and you can extend it to more than one base with the upstream/downstream flank filters.  Presumably querying for multiple sequences takes a long time, although I have no idea why the performance decreases as you move along a chromosome.
Requesting variation sequences without upstream/downstream flank filters is quite resource intensive, especially in a region with many variants.

As a workaround, since you seem to only be interested in the alleles for the SNP locations, we can get around the problem by asking for the 'allele' attribute instead, e.g.

library(biomaRt)
gal5_snp <- useMart(biomart = "ENSEMBL_MART_SNP", 
                   host = "www.ensembl.org",
                   dataset = "ggallus_snp")

getBM(attributes = c('refsnp_id', 'allele'),
      filters = c('chr_name','start','end'),
      values = list(15,12291177,12296403),
      mart = gal5_snp)


Sounds good. Zhaoming, if you are interested in variant sequences then I would suggest adding both the upstream and downstream flank filters.
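
For illustration, a query with both flank filters might look roughly like the sketch below. It is not taken from the thread: the filter names 'upstream_flank' and 'downstream_flank', the 20 bp flank size, the hypothetical helper flank_query_args(), and the use of checkFilters = FALSE are all assumptions, as is combining the flank filters with the chr_name/start/end region filters, so please check them against your biomaRt and Ensembl versions.

```r
# Hypothetical helper: builds the getBM() arguments for a variant-sequence
# query with flanking sequence on both sides of each variant.
flank_query_args <- function(chr, start, end, flank = 20) {
  list(
    attributes = c("refsnp_id", "snp"),
    # 'upstream_flank' / 'downstream_flank' are assumed filter names
    filters = c("chr_name", "start", "end",
                "upstream_flank", "downstream_flank"),
    values = list(chr, start, end, flank, flank),
    # assumption: the flank filters may not be listed by listFilters(),
    # so the filter-name check is switched off
    checkFilters = FALSE
  )
}

args <- flank_query_args(15, 12291177, 12296403, flank = 20)

# The actual request needs biomaRt and a network connection, so it only
# runs in an interactive session where the package is installed.
if (interactive() && requireNamespace("biomaRt", quietly = TRUE)) {
  gal5_snp <- biomaRt::useMart(biomart = "ENSEMBL_MART_SNP",
                               host = "www.ensembl.org",
                               dataset = "ggallus_snp")
  res <- do.call(biomaRt::getBM, c(args, list(mart = gal5_snp)))
  print(head(res))
}
```

Keeping the argument list separate from the request makes it easy to inspect the filters and values before sending anything to the server.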

Kind Regards,
Thomas