[BioMart Users] BioMart 0.8 RC3/4 Bug - Query re-generation from URL

Esposito, Anthony

Hi everyone,

Here’s a bug we found, tested in RC3 and RC4 code on our end, as well as at the main DCC site. This is kind of long, but the issue is a little complex and I wanted to document everything we tried and found.

Overview

The problem is with how the front end regenerates the query XML from a URL before POSTing it to the server. This happens on a page refresh, or when someone opens a results URL that was emailed to them by a colleague.

If I generate a query in, e.g., Advanced search, the URL that shows up on the final results page has all the attributes in it.  So does the XML if I click “View XML”. 

However, if I try to send this query to someone as a link, or simply hit F5 to refresh the page, stuff starts to break. 

It looks like when you go to such a URL, an asynchronous POST is made with the query XML after the static content on the page loads, and the response to this query is what gets loaded as the results you see on the page. On a page refresh, or on a load from copy/pasting the URL, some attributes in the original query are lost: even though they remain in the URL, the XML that is posted does not contain them, nor does the XML shown in “View XML”.
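To make the expected behaviour concrete, here is a minimal Python sketch of the URL-to-XML step. It is not the BioMart front-end code (which runs in the browser); the helper name `query_xml_from_url` and the shortened `EXAMPLE_URL` are made up for illustration, and the mapping of the `mart` parameter to the `config` attribute is inferred from the URLs and XML in this report.

```python
from urllib.parse import parse_qs, urlsplit
import xml.etree.ElementTree as ET

# Shortened form of a results URL (two attributes only, for readability).
EXAMPLE_URL = (
    "http://dcc.icgc.org/martwizard/#!/SimpleMutation"
    "?mart=hsapiens_gene_ensembl_7_1&step=4"
    "&datasets=hsapiens_gene_ensembl_hopkinsBreast3"
    "&attributes=external_gene_id%2Csimple_somatic_mutation__dm_sample_id"
)


def query_xml_from_url(url: str) -> str:
    """Hypothetical helper: rebuild the Query XML from a results URL."""
    # The query state lives in the fragment (after '#'), not in the
    # regular query string, so it has to be split out by hand.
    fragment = urlsplit(url).fragment
    _, _, state = fragment.partition("?")
    params = parse_qs(state)  # percent-decodes values, so %2C becomes ','

    # Assumption: processor/limit/header values are simply fixed here.
    query = ET.Element(
        "Query",
        {"client": "true", "processor": "TSV", "limit": "-1", "header": "1"},
    )
    dataset = ET.SubElement(
        query,
        "Dataset",
        {"name": params["datasets"][0], "config": params["mart"][0]},
    )
    # Every attribute named in the URL should end up in the XML.
    for name in params["attributes"][0].split(","):
        ET.SubElement(dataset, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


print(query_xml_from_url(EXAMPLE_URL))
```

The point of the sketch is the invariant: the dataset list and the attribute list in the posted XML should match what is in the URL, one for one. The bug is that the XML generated on reload does not satisfy this.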

Full details and reproduction of bug:

From the main DCC page (http://dcc.icgc.org), select Database Search --> Advanced --> Genes

Dataset:

Breast Cancer (JHU, US)

Select Filters:

(No filters)

Select Attributes:

Gene: Gene Symbol

Experimental Data: Tumour sample ID, Mutation type, Amino acid mutation

Donor: Donor ID

Diagnosis: Diagnosis ID, ICD 10

View Results (which show up as expected):

Gene Symbol  Tumour Sample ID  Mutation type             Amino acid mutation  Donor ID  Diagnosis ID  ICD10
CHD5         HCC1008           single base substitution  p.D119N              HCC1599   HCC1599       Breast

Then, copy/paste the URL or simply hit F5 to reload the page based on the generated URL:

(http://dcc.icgc.org/martwizard/#!/SimpleMutation?mart=hsapiens_gene_ensembl_7_1&step=4&datasets=hsapiens_gene_ensembl_hopkinsBreast3&attributes=external_gene_id%2Csimple_somatic_mutation__dm_sample_id%2Csimple_somatic_mutation__dm_mutation_type%2Chsapiens_gene_ensembl__simple_somatic_mutation__dm__aa_mutation%2Cdonor_id_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3%2Cdiagnosis_id_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3%2Csample__diagnosis__main__icd_10_copy_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3)

The results now show up missing all Experimental Data (Simple Somatic Mutation) fields (Tumour Sample ID, Mutation type, Amino acid mutation):

Gene Symbol  Donor ID  Diagnosis ID  ICD10
CHD5         HCC1599   HCC1599       Breast

Click the blue “Back” button to go back to the Attributes selection page, and indeed the Experimental Data section is gone, and the respective Attributes do not show up on the Summary list on the right side.

The XML shown in “View XML” changes accordingly as well. At first it has all the fields selected (first XML blurb below), then the SSM-specific ones disappear after refreshing the page or copy/pasting the URL to a new window (second XML blurb below). Also of note, and this might be of significance, in the second query the TCGA AML dataset is being added (the dataset name goes from “hsapiens_gene_ensembl_hopkinsBreast3” to “hsapiens_gene_ensembl_tcgaLAML,hsapiens_gene_ensembl_hopkinsBreast3”). Since the TCGA AML dataset does not have any SSM data, this is likely forcing those fields to be dropped from the initial query. The TCGA AML dataset does have methylation data, so if I try a methylation query with a non-AML dataset, the addition of the TCGA AML dataset to the query on page refresh results in reordered, yet correct, columns in the results.

        Original:

      <?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>
      <Query client="true" processor="TSVX" limit="1000" header="1">
        <Dataset name="hsapiens_gene_ensembl_hopkinsBreast3" config="hsapiens_gene_ensembl_7_1">
          <Attribute name="external_gene_id"/>
          <Attribute name="simple_somatic_mutation__dm_sample_id"/>
          <Attribute name="simple_somatic_mutation__dm_mutation_type"/>
          <Attribute name="hsapiens_gene_ensembl__simple_somatic_mutation__dm__aa_mutation"/>
          <Attribute name="donor_id_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3"/>
          <Attribute name="diagnosis_id_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3"/>
          <Attribute name="sample__diagnosis__main__icd_10_copy_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3"/>
        </Dataset>
      </Query>

        Generated on page refresh:

      <?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>
      <Query client="true" processor="TSV" limit="-1" header="1">
        <Dataset name="hsapiens_gene_ensembl_tcgaLAML,hsapiens_gene_ensembl_hopkinsBreast3" config="hsapiens_gene_ensembl_7_1">
          <Attribute name="external_gene_id"/>
          <Attribute name="donor_id_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3"/>
          <Attribute name="diagnosis_id_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3"/>
          <Attribute name="sample__diagnosis__main__icd_10_copy_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3"/>
        </Dataset>
      </Query>
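For anyone who wants to check other queries for the same problem, the two blurbs above can be compared mechanically. The following is a small Python sketch, not part of BioMart; the function names `summarize` and `diff_queries` are made up here, and the XML strings are the two queries from this report.

```python
import xml.etree.ElementTree as ET

ORIGINAL = """<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>
<Query client="true" processor="TSVX" limit="1000" header="1">
<Dataset name="hsapiens_gene_ensembl_hopkinsBreast3" config="hsapiens_gene_ensembl_7_1">
<Attribute name="external_gene_id"/>
<Attribute name="simple_somatic_mutation__dm_sample_id"/>
<Attribute name="simple_somatic_mutation__dm_mutation_type"/>
<Attribute name="hsapiens_gene_ensembl__simple_somatic_mutation__dm__aa_mutation"/>
<Attribute name="donor_id_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3"/>
<Attribute name="diagnosis_id_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3"/>
<Attribute name="sample__diagnosis__main__icd_10_copy_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3"/>
</Dataset></Query>"""

REFRESHED = """<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>
<Query client="true" processor="TSV" limit="-1" header="1">
<Dataset name="hsapiens_gene_ensembl_tcgaLAML,hsapiens_gene_ensembl_hopkinsBreast3" config="hsapiens_gene_ensembl_7_1">
<Attribute name="external_gene_id"/>
<Attribute name="donor_id_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3"/>
<Attribute name="diagnosis_id_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3"/>
<Attribute name="sample__diagnosis__main__icd_10_copy_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3"/>
</Dataset></Query>"""


def summarize(query_xml: str):
    """Return (dataset names, attribute names) for a single-Dataset Query."""
    # Parse from bytes so the encoding declaration in the XML is honoured.
    root = ET.fromstring(query_xml.encode("utf-8"))
    dataset = root.find("Dataset")
    datasets = dataset.get("name").split(",")
    attributes = [a.get("name") for a in dataset.findall("Attribute")]
    return datasets, attributes


def diff_queries(original_xml: str, regenerated_xml: str) -> dict:
    """Report what the regenerated query added or lost versus the original."""
    orig_ds, orig_attrs = summarize(original_xml)
    new_ds, new_attrs = summarize(regenerated_xml)
    return {
        "added_datasets": [d for d in new_ds if d not in orig_ds],
        "dropped_attributes": [a for a in orig_attrs if a not in new_attrs],
    }


result = diff_queries(ORIGINAL, REFRESHED)
print("added datasets:    ", result["added_datasets"])
print("dropped attributes:", result["dropped_attributes"])
```

Run against the two queries above, this reports the tcgaLAML dataset as added and the three simple_somatic_mutation attributes as dropped, which matches what shows up on the results page.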

Finally, the

tj
                                                                                                 _____
T.J. Esposito
La Jolla Research Business Technology - Bioinformatics & Innovation
Please consider the environment before printing email.

_____________________________________________
From: Estrella, Heather
Sent: Wednesday, February 02, 2011 8:09 AM
To: Esposito, Anthony
Subject: RE: contact info -- problem with copy/paste

Hi TJ,

Thanks for fixing the donor/diagnosis table problem.

There still seems to be a problem with copy/pasting URLs for queries that contain donor/diagnosis fields. I’ve tested this using quick search. For example, I can run the following query and the table comes back great.

 << OLE Object: Picture (Device Independent Bitmap) >>

I then copy and paste the URL into a new browser and it just keeps spinning (no results shown), even after several minutes.

 << OLE Object: Picture (Device Independent Bitmap) >>

http://biomart-stg.pfizer.com/martwizard/#!/SimpleMutation?mart=hsapiens_gene_ensembl_7_1&datasets=hsapiens_gene_ensembl_pfeBoe&step=4&attributes=simple_somatic_mutation__dm_sample_id%2Csimple_somatic_mutation__dm_mutation_type%2Chsapiens_gene_ensembl__simple_somatic_mutation__dm__aa_mutation%2Csimple_somatic_mutation__dm_chromosome%2Csimple_somatic_mutation__dm_consequence_type%2Cdonor_id_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3%2Cdiagnosis_id_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3%2Csample__diagnosis__main__icd_10_copy_sample_garvanPanc3_hsapiens_gene_ensembl_tcgaOV3

On a similar note, when I run a query without donor/diagnosis the table again comes back fine.

 << OLE Object: Picture (Device Independent Bitmap) >>

I copy/paste the URL into a new browser and get back a list of goofy genes.

http://biomart-stg.pfizer.com/martwizard/#!/SimpleMutation?mart=hsapiens_gene_ensembl_7_1&datasets=hsapiens_gene_ensembl_pfeBoe&step=4&attributes=external_gene_id%2Csimple_somatic_mutation__dm_sample_id%2Csimple_somatic_mutation__dm_mutation_type%2Chsapiens_gene_ensembl__simple_somatic_mutation__dm__aa_mutation

 << OLE Object: Picture (Device Independent Bitmap) >>

The URL stays the same, but when I look at the XML, I see that it’s adding in the tcgaLAML dataset (the first dataset in the list) and only shows “external_gene_id” as the attribute to return. That’s probably because it’s the only field in common between the two datasets. I just don’t know why it’s adding in the additional dataset.

<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query><Query client="true" processor="TSV" limit="-1" header="1"><Dataset name="hsapiens_gene_ensembl_tcgaLAML,hsapiens_gene_ensembl_pfeBoe" config="hsapiens_gene_ensembl_7_1"><Attribute name="external_gene_id"/></Dataset></Query>
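The "only field in common" guess can be sketched as a simple intersection. This is a toy Python model of the hypothesis, not the actual BioMart logic: the function `common_attributes` and the `CATALOG` contents are hypothetical, and the catalog only lists the attributes relevant to this query (tcgaLAML is given just the gene symbol, since it has no SSM data).

```python
def common_attributes(requested, datasets, catalog):
    """Keep only the requested attributes that every selected dataset offers."""
    return [a for a in requested if all(a in catalog[d] for d in datasets)]


# HYPOTHETICAL catalog of which dataset offers which attribute.
SSM_ATTRIBUTES = [
    "simple_somatic_mutation__dm_sample_id",
    "simple_somatic_mutation__dm_mutation_type",
    "hsapiens_gene_ensembl__simple_somatic_mutation__dm__aa_mutation",
]
CATALOG = {
    "hsapiens_gene_ensembl_pfeBoe": {"external_gene_id", *SSM_ATTRIBUTES},
    "hsapiens_gene_ensembl_tcgaLAML": {"external_gene_id"},  # no SSM data
}

# Attributes named in the second URL above.
requested = ["external_gene_id", *SSM_ATTRIBUTES]

# With only the dataset from the URL, nothing is lost.
print(common_attributes(requested, ["hsapiens_gene_ensembl_pfeBoe"], CATALOG))

# With tcgaLAML added on reload, only the shared attribute survives.
print(common_attributes(
    requested,
    ["hsapiens_gene_ensembl_tcgaLAML", "hsapiens_gene_ensembl_pfeBoe"],
    CATALOG,
))
```

If this model is right, the dropped attributes are only a symptom, and the thing to track down is why the first dataset in the list gets added to the query when the page is loaded from a URL.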

I’m hoping that there is a common problem that, when fixed, will solve both of these issues. Let me know how I can help.

Thanks,

Heather

_____________________________________________
From: Esposito, Anthony
Sent: Tuesday, February 01, 2011 5:49 PM
To: Estrella, Heather
Cc: Ching, Keith
Subject: RE: contact info

Heather, Keith-

Everything should be working now!

The solution wasn’t what we did last time (I still forget what that was). This time there was a missing comma in one of the classes that handles subquery generation, which generated SQL errors that in turn caused the queries to fail in general. I can now select donor/diagnosis information in addition to SNP/CNV info from the Genes advanced search.

Let me know if anything else weird comes up!

tj

_____________________________________________
From: Estrella, Heather
Sent: Tuesday, February 01, 2011 5:00 PM
To: Esposito, Anthony
Subject: contact info

Hi TJ,

Thanks for looking into this. I have to head out soon, but can dial in from home if needed. I showed Keith a workaround for now (i.e. not selecting any attributes under donor, diagnosis or sample).

Thanks!

Heather


_______________________________________________
Users mailing list
[hidden email]
https://lists.biomart.org/mailman/listinfo/users

Re: [BioMart Users] BioMart 0.8 RC3/4 Bug - Query re-generation from URL

Syed Haider
Thanks TJ, quite right, let us fix this and update the servers as well.

Syed

On 02/02/2011 20:46, Esposito, Anthony wrote:

> [...]