[BioMart Users] Fwd: Weird issue with multi-select filter

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

[BioMart Users] Fwd: Weird issue with multi-select filter

Darren Oakley
Hi All,

Hope you can help me with this one...

I have a filter on one of our marts (www.knockoutmouse.org, idcc_targ_rep dataset) configured as a multi-select like so:

---
    <FilterCollection displayName="Pipeline" internalName="pipeline">
      <FilterDescription displayName="Pipeline" displayType="list" field="pipeline" internalName="pipeline" key="id_1022_key" legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" tableConstraint="main" type="list">
        <Option displayName="KOMP-CSD" internalName="KOMP-CSD" isSelectable="true" value="KOMP-CSD"/>
        <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" isSelectable="true" value="KOMP-Regeneron"/>
        <Option displayName="EUCOMM" internalName="EUCOMM" isSelectable="true" value="EUCOMM"/>
        <Option displayName="NorCOMM" internalName="NorCOMM" isSelectable="true" value="NorCOMM"/>
        <Option displayName="mirKO" internalName="mirKO" isSelectable="true" value="mirKO"/>
      </FilterDescription>
    </FilterCollection>
---

But, it's producing unexpected results when you query for either of the pipelines separately, or together...

i.e. the following xml:

---
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query  virtualSchemaName = "default" formatter = "CSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6" >

  <Dataset name = "idcc_targ_rep" interface = "default" >
      <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/>
      <Attribute name = "escell_clone" />
  </Dataset>
</Query>
---

retrieves 129816 rows of data, but when you do the different pipelines separately, they do not add up to this number...

KOMP-CSD on its own is 59440
EUCOMM on its own is 76076

So, combined they give 135516 rows of results - 5700 more than the first query!

The attribute i'm selecting ('escell_clone') is unique for every row in the database, so it's not that i'm getting duplicates for the separate queries.

Any ideas as to why i'm getting this shortfall in the first query?  Is there something wrong with the way I set up the filter?

Any help is much appreciated. :)

Thanks,

Daz

_______________________________________________
Users mailing list
[hidden email]
https://lists.biomart.org/mailman/listinfo/users

smime.p7s (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [BioMart Users] Fwd: Weird issue with multi-select filter

Junjun Zhang
Re: [BioMart Users] Fwd: Weird issue with multi-select filter Hi Darren,

It may help if you turn on the log to exam the actual SQL queries being executed. To do that please modify log4perl.conf file under conf folder, set:

log4perl.logger = INFO,logFile
log4perl.appender.logFile.filename = /full/path/to/the/log/file.log [put the real full path to the log file as your choice]

Restart Apache and fire three queries with filter set to one of the follows respectively:

<Filter name = "pipeline" value = "KOMP-CSD"/>
or
<Filter name = "pipeline" value = "EUCOMM"/>
or
<Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/>

Then look into the log file for SQL statements which should be able to give you some more information.

Help this helps!

Junjun



On 11-01-31 6:19 AM, "Darren Oakley" <do2@...> wrote:

Hi All,

Hope you can help me with this one...

I have a filter on one of our marts (www.knockoutmouse.org <http://www.knockoutmouse.org/> , idcc_targ_rep dataset) configured as a multi-select like so:

---
    <FilterCollection displayName="Pipeline" internalName="pipeline">
      <FilterDescription displayName="Pipeline" displayType="list" field="pipeline" internalName="pipeline" key="id_1022_key" legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" tableConstraint="main" type="list">
        <Option displayName="KOMP-CSD" internalName="KOMP-CSD" isSelectable="true" value="KOMP-CSD"/>
        <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" isSelectable="true" value="KOMP-Regeneron"/>
        <Option displayName="EUCOMM" internalName="EUCOMM" isSelectable="true" value="EUCOMM"/>
        <Option displayName="NorCOMM" internalName="NorCOMM" isSelectable="true" value="NorCOMM"/>
        <Option displayName="mirKO" internalName="mirKO" isSelectable="true" value="mirKO"/>
      </FilterDescription>
    </FilterCollection>
---

But, it's producing unexpected results when you query for either of the pipelines separately, or together...

i.e. the following xml:

---
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query  virtualSchemaName = "default" formatter = "CSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6" >

  <Dataset name = "idcc_targ_rep" interface = "default" >
      <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/>
      <Attribute name = "escell_clone" />
  </Dataset>
</Query>
---

retrieves 129816 rows of data, but when you do the different pipelines separately, they do not add up to this number...

KOMP-CSD on its own is 59440
EUCOMM on its own is 76076

So, combined they give 135516 rows of results - 5700 more than the first query!

The attribute i'm selecting ('escell_clone') is unique for every row in the database, so it's not that i'm getting duplicates for the separate queries.

Any ideas as to why i'm getting this shortfall in the first query?  Is there something wrong with the way I set up the filter?

Any help is much appreciated. :)

Thanks,

Daz


_______________________________________________
Users mailing list
[hidden email]
https://lists.biomart.org/mailman/listinfo/users
Reply | Threaded
Open this post in threaded view
|

Re: [BioMart Users] Fwd: Weird issue with multi-select filter

Syed Haider
In reply to this post by Darren Oakley
Hi Daz,

Please turn on the logging in log file:

conf/log4perl.conf

by setting:

log4perl.logger = DEBUG,Screen

Please check if the SQL compilation is as per your expectation. That
will give us more pointers as in how to debug it further.

Best,
Syed

On 31/01/2011 11:19, Darren Oakley wrote:

> Hi All,
>
> Hope you can help me with this one...
>
> I have a filter on one of our marts (www.knockoutmouse.org<http://www.knockoutmouse.org/>, idcc_targ_rep dataset) configured as a multi-select like so:
>
> ---
>      <FilterCollection displayName="Pipeline" internalName="pipeline">
>        <FilterDescription displayName="Pipeline" displayType="list" field="pipeline" internalName="pipeline" key="id_1022_key" legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" tableConstraint="main" type="list">
>          <Option displayName="KOMP-CSD" internalName="KOMP-CSD" isSelectable="true" value="KOMP-CSD"/>
>          <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" isSelectable="true" value="KOMP-Regeneron"/>
>          <Option displayName="EUCOMM" internalName="EUCOMM" isSelectable="true" value="EUCOMM"/>
>          <Option displayName="NorCOMM" internalName="NorCOMM" isSelectable="true" value="NorCOMM"/>
>          <Option displayName="mirKO" internalName="mirKO" isSelectable="true" value="mirKO"/>
>        </FilterDescription>
>      </FilterCollection>
> ---
>
> But, it's producing unexpected results when you query for either of the pipelines separately, or together...
>
> i.e. the following xml:
>
> ---
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE Query>
> <Query  virtualSchemaName = "default" formatter = "CSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6">
>
>    <Dataset name = "idcc_targ_rep" interface = "default">
>        <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/>
>        <Attribute name = "escell_clone" />
>    </Dataset>
> </Query>
> ---
>
> retrieves 129816 rows of data, but when you do the different pipelines separately, they do not add up to this number...
>
> KOMP-CSD on its own is 59440
> EUCOMM on its own is 76076
>
> So, combined they give 135516 rows of results - 5700 more than the first query!
>
> The attribute i'm selecting ('escell_clone') is unique for every row in the database, so it's not that i'm getting duplicates for the separate queries.
>
> Any ideas as to why i'm getting this shortfall in the first query?  Is there something wrong with the way I set up the filter?
>
> Any help is much appreciated. :)
>
> Thanks,
>
> Daz
>
_______________________________________________
Users mailing list
[hidden email]
https://lists.biomart.org/mailman/listinfo/users
Reply | Threaded
Open this post in threaded view
|

Re: [BioMart Users] Fwd: Weird issue with multi-select filter

Darren Oakley
Hi,

Sorry for the slow response on this one - the help is much appreciated. :)

Here's the output from the logs for the three separate queries.  It all looks sensible to me...

Cheers,

Daz

----

KOMP-CSD Only:

2011/02/02 09:07:21 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="KOMP-CSD" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 09:07:21 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep escell_clone main
2011/02/02 09:07:21 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep pipeline main
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200,200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 400,400
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 800,800
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 1600,1600
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 3200,3200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 6400,6400
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 12800,12800
2011/02/02 09:07:22 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 25600,25600
2011/02/02 09:07:23 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 51200,50000

----

EUCOMM Only:

2011/02/02 09:07:29 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="EUCOMM" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 09:07:29 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep escell_clone main
2011/02/02 09:07:29 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep pipeline main
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200,200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 400,400
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 800,800
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 1600,1600
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 3200,3200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 6400,6400
2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 12800,12800
2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 25600,25600
2011/02/02 09:07:31 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 51200,50000
2011/02/02 09:07:33 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 101200,50000
2011/02/02 09:07:34 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 117710,50000

----

Both projects together:

2011/02/02 08:59:14 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="KOMP-CSD,EUCOMM" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 08:59:14 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep escell_clone main
2011/02/02 08:59:14 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep pipeline main
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200
2011/02/02 08:59:14 BioMart.Configuration.URLLocation:100:WARN> RESPONSE:  0.6
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200,200
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 400,400
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 800,800
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 1600,1600
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 3200,3200
2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 6400,6400
2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 12800,12800
2011/02/02 08:59:16 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 25600,25600
2011/02/02 08:59:17 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 51200,50000
2011/02/02 08:59:20 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 101200,50000
2011/02/02 08:59:24 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 151200,50000
2011/02/02 08:59:27 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 201200,50000
2011/02/02 08:59:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 218452,50000

-----

On 2 Feb 2011, at 00:23, Syed Haider wrote:

> Hi Daz,
>
> Please turn on the logging in log file:
>
> conf/log4perl.conf
>
> by setting:
>
> log4perl.logger = DEBUG,Screen
>
> Please check if the SQL compilation is as per your expectation. That will give us more pointers as in how to debug it further.
>
> Best,
> Syed
>
> On 31/01/2011 11:19, Darren Oakley wrote:
>> Hi All,
>>
>> Hope you can help me with this one...
>>
>> I have a filter on one of our marts (www.knockoutmouse.org<http://www.knockoutmouse.org/>, idcc_targ_rep dataset) configured as a multi-select like so:
>>
>> ---
>>     <FilterCollection displayName="Pipeline" internalName="pipeline">
>>       <FilterDescription displayName="Pipeline" displayType="list" field="pipeline" internalName="pipeline" key="id_1022_key" legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" tableConstraint="main" type="list">
>>         <Option displayName="KOMP-CSD" internalName="KOMP-CSD" isSelectable="true" value="KOMP-CSD"/>
>>         <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" isSelectable="true" value="KOMP-Regeneron"/>
>>         <Option displayName="EUCOMM" internalName="EUCOMM" isSelectable="true" value="EUCOMM"/>
>>         <Option displayName="NorCOMM" internalName="NorCOMM" isSelectable="true" value="NorCOMM"/>
>>         <Option displayName="mirKO" internalName="mirKO" isSelectable="true" value="mirKO"/>
>>       </FilterDescription>
>>     </FilterCollection>
>> ---
>>
>> But, it's producing unexpected results when you query for either of the pipelines separately, or together...
>>
>> i.e. the following xml:
>>
>> ---
>> <?xml version="1.0" encoding="UTF-8"?>
>> <!DOCTYPE Query>
>> <Query  virtualSchemaName = "default" formatter = "CSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6">
>>
>>   <Dataset name = "idcc_targ_rep" interface = "default">
>>       <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/>
>>       <Attribute name = "escell_clone" />
>>   </Dataset>
>> </Query>
>> ---
>>
>> retrieves 129816 rows of data, but when you do the different pipelines separately, they do not add up to this number...
>>
>> KOMP-CSD on its own is 59440
>> EUCOMM on its own is 76076
>>
>> So, combined they give 135516 rows of results - 5700 more than the first query!
>>
>> The attribute i'm selecting ('escell_clone') is unique for every row in the database, so it's not that i'm getting duplicates for the separate queries.
>>
>> Any ideas as to why i'm getting this shortfall in the first query?  Is there something wrong with the way I set up the filter?
>>
>> Any help is much appreciated. :)
>>
>> Thanks,
>>
>> Daz
>>

_______________________________________________
Users mailing list
[hidden email]
https://lists.biomart.org/mailman/listinfo/users

smime.p7s (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [BioMart Users] Fwd: Weird issue with multi-select filter

Syed Haider
Oh Daz,

this might be because your attribute "escell_clone" has nulls for
certain values, not sure though. Why dont you run all three queries
against your database server as it is and see whats the row count you
get, BTW, is it a public visible database ?

Syed

On 02/02/2011 09:10, Darren Oakley wrote:

> Hi,
>
> Sorry for the slow response on this one - the help is much appreciated. :)
>
> Here's the output from the logs for the three separate queries.  It all looks sensible to me...
>
> Cheers,
>
> Daz
>
> ----
>
> KOMP-CSD Only:
>
> 2011/02/02 09:07:21 martservice:1016:WARN>  RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE Query>
> <Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
>    <Dataset name="idcc_targ_rep" interface="default">
>      <Filter value="KOMP-CSD" name="pipeline"/>
>      <Attribute name="escell_clone"/>
>    </Dataset>
> </Query>
>
> 2011/02/02 09:07:21 BioMart.QueryRunner:163:WARN>  ATTRIBUTE: idcc_targ_rep escell_clone main
> 2011/02/02 09:07:21 BioMart.QueryRunner:174:WARN>  FILTER TABLE: idcc_targ_rep pipeline main
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200,200
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 400,400
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 800,800
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 1600,1600
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 3200,3200
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 6400,6400
> 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 12800,12800
> 2011/02/02 09:07:22 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 25600,25600
> 2011/02/02 09:07:23 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 51200,50000
>
> ----
>
> EUCOMM Only:
>
> 2011/02/02 09:07:29 martservice:1016:WARN>  RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE Query>
> <Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
>    <Dataset name="idcc_targ_rep" interface="default">
>      <Filter value="EUCOMM" name="pipeline"/>
>      <Attribute name="escell_clone"/>
>    </Dataset>
> </Query>
>
> 2011/02/02 09:07:29 BioMart.QueryRunner:163:WARN>  ATTRIBUTE: idcc_targ_rep escell_clone main
> 2011/02/02 09:07:29 BioMart.QueryRunner:174:WARN>  FILTER TABLE: idcc_targ_rep pipeline main
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200,200
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 400,400
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 800,800
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 1600,1600
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 3200,3200
> 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 6400,6400
> 2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 12800,12800
> 2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 25600,25600
> 2011/02/02 09:07:31 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 51200,50000
> 2011/02/02 09:07:33 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 101200,50000
> 2011/02/02 09:07:34 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 117710,50000
>
> ----
>
> Both projects together:
>
> 2011/02/02 08:59:14 martservice:1016:WARN>  RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE Query>
> <Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
>    <Dataset name="idcc_targ_rep" interface="default">
>      <Filter value="KOMP-CSD,EUCOMM" name="pipeline"/>
>      <Attribute name="escell_clone"/>
>    </Dataset>
> </Query>
>
> 2011/02/02 08:59:14 BioMart.QueryRunner:163:WARN>  ATTRIBUTE: idcc_targ_rep escell_clone main
> 2011/02/02 08:59:14 BioMart.QueryRunner:174:WARN>  FILTER TABLE: idcc_targ_rep pipeline main
> 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200
> 2011/02/02 08:59:14 BioMart.Configuration.URLLocation:100:WARN>  RESPONSE:  0.6
> 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200,200
> 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 400,400
> 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 800,800
> 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 1600,1600
> 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 3200,3200
> 2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 6400,6400
> 2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 12800,12800
> 2011/02/02 08:59:16 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 25600,25600
> 2011/02/02 08:59:17 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 51200,50000
> 2011/02/02 08:59:20 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 101200,50000
> 2011/02/02 08:59:24 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 151200,50000
> 2011/02/02 08:59:27 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 201200,50000
> 2011/02/02 08:59:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 218452,50000
>
> -----
>
> On 2 Feb 2011, at 00:23, Syed Haider wrote:
>
>> Hi Daz,
>>
>> Please turn on the logging in log file:
>>
>> conf/log4perl.conf
>>
>> by setting:
>>
>> log4perl.logger = DEBUG,Screen
>>
>> Please check if the SQL compilation is as per your expectation. That will give us more pointers as in how to debug it further.
>>
>> Best,
>> Syed
>>
>> On 31/01/2011 11:19, Darren Oakley wrote:
>>> Hi All,
>>>
>>> Hope you can help me with this one...
>>>
>>> I have a filter on one of our marts (www.knockoutmouse.org<http://www.knockoutmouse.org/>, idcc_targ_rep dataset) configured as a multi-select like so:
>>>
>>> ---
>>>      <FilterCollection displayName="Pipeline" internalName="pipeline">
>>>        <FilterDescription displayName="Pipeline" displayType="list" field="pipeline" internalName="pipeline" key="id_1022_key" legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" tableConstraint="main" type="list">
>>>          <Option displayName="KOMP-CSD" internalName="KOMP-CSD" isSelectable="true" value="KOMP-CSD"/>
>>>          <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" isSelectable="true" value="KOMP-Regeneron"/>
>>>          <Option displayName="EUCOMM" internalName="EUCOMM" isSelectable="true" value="EUCOMM"/>
>>>          <Option displayName="NorCOMM" internalName="NorCOMM" isSelectable="true" value="NorCOMM"/>
>>>          <Option displayName="mirKO" internalName="mirKO" isSelectable="true" value="mirKO"/>
>>>        </FilterDescription>
>>>      </FilterCollection>
>>> ---
>>>
>>> But, it's producing unexpected results when you query for either of the pipelines separately, or together...
>>>
>>> i.e. the following xml:
>>>
>>> ---
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <!DOCTYPE Query>
>>> <Query  virtualSchemaName = "default" formatter = "CSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6">
>>>
>>>    <Dataset name = "idcc_targ_rep" interface = "default">
>>>        <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/>
>>>        <Attribute name = "escell_clone" />
>>>    </Dataset>
>>> </Query>
>>> ---
>>>
>>> retrieves 129816 rows of data, but when you do the different pipelines separately, they do not add up to this number...
>>>
>>> KOMP-CSD on its own is 59440
>>> EUCOMM on its own is 76076
>>>
>>> So, combined they give 135516 rows of results - 5700 more than the first query!
>>>
>>> The attribute i'm selecting ('escell_clone') is unique for every row in the database, so it's not that i'm getting duplicates for the separate queries.
>>>
>>> Any ideas as to why i'm getting this shortfall in the first query?  Is there something wrong with the way I set up the filter?
>>>
>>> Any help is much appreciated. :)
>>>
>>> Thanks,
>>>
>>> Daz
>>>
>
_______________________________________________
Users mailing list
[hidden email]
https://lists.biomart.org/mailman/listinfo/users
Reply | Threaded
Open this post in threaded view
|

Re: [BioMart Users] Fwd: Weird issue with multi-select filter

Junjun Zhang
In reply to this post by Darren Oakley
Re: [BioMart Users] Fwd: Weird issue with multi-select filter Hi Daz,

The inconsistent results might be caused by batching. You mentioned escell_clone are unique in your database, query through batching on certain RDBMS platform might result in rows being skipped or being duplicated.

Find out possible duplicates or missing rows in the output files should help figuring out the problem.

sort result1.txt |uniq -d  ### this will give you possible duplicates

You can use sort and diff to find out possible missing items.

BTW, which RDBMS platform are you using?

Hope this helps,

Junjun


On 11-02-02 4:10 AM, "Darren Oakley" <do2@...> wrote:

Hi,

Sorry for the slow response on this one - the help is much appreciated. :)

Here's the output from the logs for the three separate queries.  It all looks sensible to me...

Cheers,

Daz

----

KOMP-CSD Only:

2011/02/02 09:07:21 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="KOMP-CSD" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 09:07:21 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep      escell_clone    main
2011/02/02 09:07:21 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep   pipeline        main
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200,200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 400,400
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 800,800
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 1600,1600
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 3200,3200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 6400,6400
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 12800,12800
2011/02/02 09:07:22 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 25600,25600
2011/02/02 09:07:23 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 51200,50000

----

EUCOMM Only:

2011/02/02 09:07:29 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="EUCOMM" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 09:07:29 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep      escell_clone    main
2011/02/02 09:07:29 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep   pipeline        main
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200,200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 400,400
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 800,800
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 1600,1600
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 3200,3200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 6400,6400
2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 12800,12800
2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 25600,25600
2011/02/02 09:07:31 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 51200,50000
2011/02/02 09:07:33 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 101200,50000
2011/02/02 09:07:34 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 117710,50000

----

Both projects together:

2011/02/02 08:59:14 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="KOMP-CSD,EUCOMM" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 08:59:14 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep      escell_clone    main
2011/02/02 08:59:14 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep   pipeline        main
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200
2011/02/02 08:59:14 BioMart.Configuration.URLLocation:100:WARN> RESPONSE:  0.6
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200,200
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 400,400
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 800,800
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 1600,1600
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 3200,3200
2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 6400,6400
2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 12800,12800
2011/02/02 08:59:16 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 25600,25600
2011/02/02 08:59:17 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 51200,50000
2011/02/02 08:59:20 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 101200,50000
2011/02/02 08:59:24 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 151200,50000
2011/02/02 08:59:27 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 201200,50000
2011/02/02 08:59:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 218452,50000

-----

On 2 Feb 2011, at 00:23, Syed Haider wrote:

> Hi Daz,
>
> Please turn on the logging in log file:
>
> conf/log4perl.conf
>
> by setting:
>
> log4perl.logger = DEBUG,Screen
>
> Please check if the SQL compilation is as per your expectation. That will give us more pointers as in how to debug it further.
>
> Best,
> Syed
>
> On 31/01/2011 11:19, Darren Oakley wrote:
>> Hi All,
>>
>> Hope you can help me with this one...
>>
>> I have a filter on one of our marts (www.knockoutmouse.org<http://www.knockoutmouse.org/>, idcc_targ_rep dataset) configured as a multi-select like so:
>>
>> ---
>>     <FilterCollection displayName="Pipeline" internalName="pipeline">
>>       <FilterDescription displayName="Pipeline" displayType="list" field="pipeline" internalName="pipeline" key="id_1022_key" legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" tableConstraint="main" type="list">
>>         <Option displayName="KOMP-CSD" internalName="KOMP-CSD" isSelectable="true" value="KOMP-CSD"/>
>>         <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" isSelectable="true" value="KOMP-Regeneron"/>
>>         <Option displayName="EUCOMM" internalName="EUCOMM" isSelectable="true" value="EUCOMM"/>
>>         <Option displayName="NorCOMM" internalName="NorCOMM" isSelectable="true" value="NorCOMM"/>
>>         <Option displayName="mirKO" internalName="mirKO" isSelectable="true" value="mirKO"/>
>>       </FilterDescription>
>>     </FilterCollection>
>> ---
>>
>> But, it's producing unexpected results when you query for either of the pipelines separately, or together...
>>
>> i.e. the following xml:
>>
>> ---
>> <?xml version="1.0" encoding="UTF-8"?>
>> <!DOCTYPE Query>
>> <Query  virtualSchemaName = "default" formatter = "CSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6">
>>
>>   <Dataset name = "idcc_targ_rep" interface = "default">
>>       <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/>
>>       <Attribute name = "escell_clone" />
>>   </Dataset>
>> </Query>
>> ---
>>
>> retrieves 129816 rows of data, but when you do the different pipelines separately, they do not add up to this number...
>>
>> KOMP-CSD on its own is 59440
>> EUCOMM on its own is 76076
>>
>> So, combined they give 135516 rows of results - 5700 more than the first query!
>>
>> The attribute i'm selecting ('escell_clone') is unique for every row in the database, so it's not that i'm getting duplicates for the separate queries.
>>
>> Any ideas as to why i'm getting this shortfall in the first query?  Is there something wrong with the way I set up the filter?
>>
>> Any help is much appreciated. :)
>>
>> Thanks,
>>
>> Daz
>>



_______________________________________________
Users mailing list
[hidden email]
https://lists.biomart.org/mailman/listinfo/users
Reply | Threaded
Open this post in threaded view
|

Re: [BioMart Users] Fwd: Weird issue with multi-select filter

Darren Oakley
In reply to this post by Syed Haider
Hi Syed,

Yep, i've ran the queries against the database (both the original database and the one that the mart reads from) and it gives me counts the same as the single queries.

Yep, it's a public biomart:

http://www.knockoutmouse.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=idcc_targ_rep.default.attributes.escell_clone&FILTERS=idcc_targ_rep.default.filters.pipeline."KOMP-CSD,EUCOMM"&VISIBLEPANEL=resultspanel

Thanks,

Daz

On 2 Feb 2011, at 12:58, Syed Haider wrote:

Oh Daz,

this might be because your attribute "escell_clone" has nulls for certain values, not sure though. Why dont you run all three queries against your database server as it is and see whats the row count you get, BTW, is it a public visible database ?

Syed

On 02/02/2011 09:10, Darren Oakley wrote:
Hi,

Sorry for the slow response on this one - the help is much appreciated. :)

Here's the output from the logs for the three separate queries.  It all looks sensible to me...

Cheers,

Daz

----

KOMP-CSD Only:

2011/02/02 09:07:21 martservice:1016:WARN>  RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="KOMP-CSD" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 09:07:21 BioMart.QueryRunner:163:WARN>  ATTRIBUTE: idcc_targ_rep escell_clone main
2011/02/02 09:07:21 BioMart.QueryRunner:174:WARN>  FILTER TABLE: idcc_targ_rep pipeline main
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200,200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 400,400
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 800,800
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 1600,1600
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 3200,3200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 6400,6400
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 12800,12800
2011/02/02 09:07:22 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 25600,25600
2011/02/02 09:07:23 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 51200,50000

----

EUCOMM Only:

2011/02/02 09:07:29 martservice:1016:WARN>  RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="EUCOMM" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 09:07:29 BioMart.QueryRunner:163:WARN>  ATTRIBUTE: idcc_targ_rep escell_clone main
2011/02/02 09:07:29 BioMart.QueryRunner:174:WARN>  FILTER TABLE: idcc_targ_rep pipeline main
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200,200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 400,400
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 800,800
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 1600,1600
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 3200,3200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 6400,6400
2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 12800,12800
2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 25600,25600
2011/02/02 09:07:31 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 51200,50000
2011/02/02 09:07:33 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 101200,50000
2011/02/02 09:07:34 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 117710,50000

----

Both projects together:

2011/02/02 08:59:14 martservice:1016:WARN>  RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="KOMP-CSD,EUCOMM" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 08:59:14 BioMart.QueryRunner:163:WARN>  ATTRIBUTE: idcc_targ_rep escell_clone main
2011/02/02 08:59:14 BioMart.QueryRunner:174:WARN>  FILTER TABLE: idcc_targ_rep pipeline main
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200
2011/02/02 08:59:14 BioMart.Configuration.URLLocation:100:WARN>  RESPONSE:  0.6
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200,200
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 400,400
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 800,800
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 1600,1600
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 3200,3200
2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 6400,6400
2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 12800,12800
2011/02/02 08:59:16 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 25600,25600
2011/02/02 08:59:17 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 51200,50000
2011/02/02 08:59:20 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 101200,50000
2011/02/02 08:59:24 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 151200,50000
2011/02/02 08:59:27 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 201200,50000
2011/02/02 08:59:29 BioMart.Dataset.TableSet:736:INFO>  QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 218452,50000

-----

On 2 Feb 2011, at 00:23, Syed Haider wrote:

Hi Daz,

Please turn on the logging in log file:

conf/log4perl.conf

by setting:

log4perl.logger = DEBUG,Screen

Please check if the SQL compilation is as per your expectation. That will give us more pointers as in how to debug it further.

Best,
Syed

On 31/01/2011 11:19, Darren Oakley wrote:
Hi All,

Hope you can help me with this one...

I have a filter on one of our marts (www.knockoutmouse.org<http://www.knockoutmouse.org/>, idcc_targ_rep dataset) configured as a multi-select like so:

---
    <FilterCollection displayName="Pipeline" internalName="pipeline">
      <FilterDescription displayName="Pipeline" displayType="list" field="pipeline" internalName="pipeline" key="id_1022_key" legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" tableConstraint="main" type="list">
        <Option displayName="KOMP-CSD" internalName="KOMP-CSD" isSelectable="true" value="KOMP-CSD"/>
        <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" isSelectable="true" value="KOMP-Regeneron"/>
        <Option displayName="EUCOMM" internalName="EUCOMM" isSelectable="true" value="EUCOMM"/>
        <Option displayName="NorCOMM" internalName="NorCOMM" isSelectable="true" value="NorCOMM"/>
        <Option displayName="mirKO" internalName="mirKO" isSelectable="true" value="mirKO"/>
      </FilterDescription>
    </FilterCollection>
---

But, it's producing unexpected results when you query for either of the pipelines separately, or together...

i.e. the following xml:

---
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query  virtualSchemaName = "default" formatter = "CSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6">

  <Dataset name = "idcc_targ_rep" interface = "default">
      <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/>
      <Attribute name = "escell_clone" />
  </Dataset>
</Query>
---

retrieves 129816 rows of data, but when you do the different pipelines separately, they do not add up to this number...

KOMP-CSD on its own is 59440
EUCOMM on its own is 76076

So, combined they give 135516 rows of results - 5700 more than the first query!

The attribute i'm selecting ('escell_clone') is unique for every row in the database, so it's not that i'm getting duplicates for the separate queries.

Any ideas as to why i'm getting this shortfall in the first query?  Is there something wrong with the way I set up the filter?

Any help is much appreciated. :)

Thanks,

Daz




_______________________________________________
Users mailing list
[hidden email]
https://lists.biomart.org/mailman/listinfo/users

smime.p7s (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [BioMart Users] Fwd: Weird issue with multi-select filter

Darren Oakley
In reply to this post by Junjun Zhang
Hi Junjun,

Okey doke, will run through the files as suggested.  Back with some results soon. :)

It's running on MySQL 5.1.39.

Cheers,

Daz

On 2 Feb 2011, at 14:30, Junjun Zhang wrote:

Hi Daz,

The inconsistent results might be caused by batching. You mentioned escell_clone are unique in your database, query through batching on certain RDBMS platform might result in rows being skipped or being duplicated.

Find out possible duplicates or missing rows in the output files should help figuring out the problem.

sort result1.txt |uniq -d  ### this will give you possible duplicates

You can use sort and diff to find out possible missing items.

BTW, which RDBMS platform are you using?

Hope this helps,

Junjun


On 11-02-02 4:10 AM, "Darren Oakley" <<a href="x-msg://124/do2@sanger.ac.uk">do2@...> wrote:

Hi,

Sorry for the slow response on this one - the help is much appreciated. :)

Here's the output from the logs for the three separate queries.  It all looks sensible to me...

Cheers,

Daz

----

KOMP-CSD Only:

2011/02/02 09:07:21 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="KOMP-CSD" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 09:07:21 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep      escell_clone    main
2011/02/02 09:07:21 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep   pipeline        main
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200,200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 400,400
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 800,800
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 1600,1600
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 3200,3200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 6400,6400
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 12800,12800
2011/02/02 09:07:22 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 25600,25600
2011/02/02 09:07:23 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 51200,50000

----

EUCOMM Only:

2011/02/02 09:07:29 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="EUCOMM" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 09:07:29 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep      escell_clone    main
2011/02/02 09:07:29 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep   pipeline        main
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200,200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 400,400
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 800,800
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 1600,1600
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 3200,3200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 6400,6400
2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 12800,12800
2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 25600,25600
2011/02/02 09:07:31 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 51200,50000
2011/02/02 09:07:33 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 101200,50000
2011/02/02 09:07:34 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 117710,50000

----

Both projects together:

2011/02/02 08:59:14 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="KOMP-CSD,EUCOMM" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 08:59:14 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep      escell_clone    main
2011/02/02 08:59:14 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep   pipeline        main
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200
2011/02/02 08:59:14 BioMart.Configuration.URLLocation:100:WARN> RESPONSE:  0.6
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200,200
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 400,400
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 800,800
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 1600,1600
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 3200,3200
2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 6400,6400
2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 12800,12800
2011/02/02 08:59:16 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 25600,25600
2011/02/02 08:59:17 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 51200,50000
2011/02/02 08:59:20 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 101200,50000
2011/02/02 08:59:24 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 151200,50000
2011/02/02 08:59:27 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 201200,50000
2011/02/02 08:59:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 218452,50000

-----

On 2 Feb 2011, at 00:23, Syed Haider wrote:

> Hi Daz,
>
> Please turn on the logging in log file:
>
> conf/log4perl.conf
>
> by setting:
>
> log4perl.logger = DEBUG,Screen
>
> Please check if the SQL compilation is as per your expectation. That will give us more pointers as in how to debug it further.
>
> Best,
> Syed
>
> On 31/01/2011 11:19, Darren Oakley wrote:
>> Hi All,
>>
>> Hope you can help me with this one...
>>
>> I have a filter on one of our marts (www.knockoutmouse.org<http://www.knockoutmouse.org/>, idcc_targ_rep dataset) configured as a multi-select like so:
>>
>> ---
>>     <FilterCollection displayName="Pipeline" internalName="pipeline">
>>       <FilterDescription displayName="Pipeline" displayType="list" field="pipeline" internalName="pipeline" key="id_1022_key" legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" tableConstraint="main" type="list">
>>         <Option displayName="KOMP-CSD" internalName="KOMP-CSD" isSelectable="true" value="KOMP-CSD"/>
>>         <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" isSelectable="true" value="KOMP-Regeneron"/>
>>         <Option displayName="EUCOMM" internalName="EUCOMM" isSelectable="true" value="EUCOMM"/>
>>         <Option displayName="NorCOMM" internalName="NorCOMM" isSelectable="true" value="NorCOMM"/>
>>         <Option displayName="mirKO" internalName="mirKO" isSelectable="true" value="mirKO"/>
>>       </FilterDescription>
>>     </FilterCollection>
>> ---
>>
>> But, it's producing unexpected results when you query for either of the pipelines separately, or together...
>>
>> i.e. the following xml:
>>
>> ---
>> <?xml version="1.0" encoding="UTF-8"?>
>> <!DOCTYPE Query>
>> <Query  virtualSchemaName = "default" formatter = "CSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6">
>>
>>   <Dataset name = "idcc_targ_rep" interface = "default">
>>       <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/>
>>       <Attribute name = "escell_clone" />
>>   </Dataset>
>> </Query>
>> ---
>>
>> retrieves 129816 rows of data, but when you do the different pipelines separately, they do not add up to this number...
>>
>> KOMP-CSD on its own is 59440
>> EUCOMM on its own is 76076
>>
>> So, combined they give 135516 rows of results - 5700 more than the first query!
>>
>> The attribute i'm selecting ('escell_clone') is unique for every row in the database, so it's not that i'm getting duplicates for the separate queries.
>>
>> Any ideas as to why i'm getting this shortfall in the first query?  Is there something wrong with the way I set up the filter?
>>
>> Any help is much appreciated. :)
>>
>> Thanks,
>>
>> Daz
>>




_______________________________________________
Users mailing list
[hidden email]
https://lists.biomart.org/mailman/listinfo/users

smime.p7s (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [BioMart Users] Fwd: Weird issue with multi-select filter

Darren Oakley
In reply to this post by Junjun Zhang
Hi Junjun,

Unfortunately that didn't give me anything to go on... :(

[ do2@guest178 ~/projects/personal-scripts/biomart ] sort EUCOMM.txt | uniq -d 
[ do2@guest178 ~/projects/personal-scripts/biomart ] sort KOMP-CSD.txt | uniq -d 
[ do2@guest178 ~/projects/personal-scripts/biomart ] sort TWO-FILTERS.txt | uniq -d 
[ do2@guest178 ~/projects/personal-scripts/biomart ] sort DIFF.txt | uniq -d
[ do2@guest178 ~/projects/personal-scripts/biomart ] wc -l *.txt
   5760 DIFF.txt
  76216 EUCOMM.txt
  59486 KOMP-CSD.txt
 129942 TWO-FILTERS.txt
[ do2@guest178 ~/projects/personal-scripts/biomart ]

I've attached the output files I get:

EUCOMM.txt - query with only the "EUCOMM" filter
KOMP-CSD.txt - query with only the "KOMP-CSD" filter
TWO-FILTERS.txt - query with both "EUCOMM" and "KOMP-CSD" selected in the multi-select
DIFF.txt - what I have determined is missing from the TWO-FILTERS.txt response (weirdly, these all fall into the KOMP-CSD pipeline - could this mean anything?)

Cheers,

Daz



On 2 Feb 2011, at 14:30, Junjun Zhang wrote:

Hi Daz,

The inconsistent results might be caused by batching. You mentioned escell_clone are unique in your database, query through batching on certain RDBMS platform might result in rows being skipped or being duplicated.

Find out possible duplicates or missing rows in the output files should help figuring out the problem.

sort result1.txt |uniq -d  ### this will give you possible duplicates

You can use sort and diff to find out possible missing items.

BTW, which RDBMS platform are you using?

Hope this helps,

Junjun


On 11-02-02 4:10 AM, "Darren Oakley" <<a href="x-msg://126/do2@sanger.ac.uk">do2@...> wrote:

Hi,

Sorry for the slow response on this one - the help is much appreciated. :)

Here's the output from the logs for the three separate queries.  It all looks sensible to me...

Cheers,

Daz

----

KOMP-CSD Only:

2011/02/02 09:07:21 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="KOMP-CSD" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 09:07:21 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep      escell_clone    main
2011/02/02 09:07:21 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep   pipeline        main
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200,200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 400,400
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 800,800
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 1600,1600
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 3200,3200
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 6400,6400
2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 12800,12800
2011/02/02 09:07:22 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 25600,25600
2011/02/02 09:07:23 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 51200,50000

----

EUCOMM Only:

2011/02/02 09:07:29 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="EUCOMM" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 09:07:29 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep      escell_clone    main
2011/02/02 09:07:29 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep   pipeline        main
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200,200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 400,400
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 800,800
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 1600,1600
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 3200,3200
2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 6400,6400
2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 12800,12800
2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 25600,25600
2011/02/02 09:07:31 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 51200,50000
2011/02/02 09:07:33 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 101200,50000
2011/02/02 09:07:34 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 117710,50000

----

Both projects together:

2011/02/02 08:59:14 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6">
  <Dataset name="idcc_targ_rep" interface="default">
    <Filter value="KOMP-CSD,EUCOMM" name="pipeline"/>
    <Attribute name="escell_clone"/>
  </Dataset>
</Query>

2011/02/02 08:59:14 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep      escell_clone    main
2011/02/02 08:59:14 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep   pipeline        main
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200
2011/02/02 08:59:14 BioMart.Configuration.URLLocation:100:WARN> RESPONSE:  0.6
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200,200
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 400,400
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 800,800
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 1600,1600
2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 3200,3200
2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 6400,6400
2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 12800,12800
2011/02/02 08:59:16 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 25600,25600
2011/02/02 08:59:17 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 51200,50000
2011/02/02 08:59:20 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 101200,50000
2011/02/02 08:59:24 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 151200,50000
2011/02/02 08:59:27 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 201200,50000
2011/02/02 08:59:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL:  SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 218452,50000

-----

On 2 Feb 2011, at 00:23, Syed Haider wrote:

> Hi Daz,
>
> Please turn on the logging in log file:
>
> conf/log4perl.conf
>
> by setting:
>
> log4perl.logger = DEBUG,Screen
>
> Please check if the SQL compilation is as per your expectation. That will give us more pointers as in how to debug it further.
>
> Best,
> Syed
>
> On 31/01/2011 11:19, Darren Oakley wrote:
>> Hi All,
>>
>> Hope you can help me with this one...
>>
>> I have a filter on one of our marts (www.knockoutmouse.org<http://www.knockoutmouse.org/>, idcc_targ_rep dataset) configured as a multi-select like so:
>>
>> ---
>>     <FilterCollection displayName="Pipeline" internalName="pipeline">
>>       <FilterDescription displayName="Pipeline" displayType="list" field="pipeline" internalName="pipeline" key="id_1022_key" legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" tableConstraint="main" type="list">
>>         <Option displayName="KOMP-CSD" internalName="KOMP-CSD" isSelectable="true" value="KOMP-CSD"/>
>>         <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" isSelectable="true" value="KOMP-Regeneron"/>
>>         <Option displayName="EUCOMM" internalName="EUCOMM" isSelectable="true" value="EUCOMM"/>
>>         <Option displayName="NorCOMM" internalName="NorCOMM" isSelectable="true" value="NorCOMM"/>
>>         <Option displayName="mirKO" internalName="mirKO" isSelectable="true" value="mirKO"/>
>>       </FilterDescription>
>>     </FilterCollection>
>> ---
>>
>> But, it's producing unexpected results when you query for either of the pipelines separately, or together...
>>
>> i.e. the following xml:
>>
>> ---
>> <?xml version="1.0" encoding="UTF-8"?>
>> <!DOCTYPE Query>
>> <Query  virtualSchemaName = "default" formatter = "CSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6">
>>
>>   <Dataset name = "idcc_targ_rep" interface = "default">
>>       <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/>
>>       <Attribute name = "escell_clone" />
>>   </Dataset>
>> </Query>
>> ---
>>
>> retrieves 129816 rows of data, but when you do the different pipelines separately, they do not add up to this number...
>>
>> KOMP-CSD on its own is 59440
>> EUCOMM on its own is 76076
>>
>> So, combined they give 135516 rows of results - 5700 more than the first query!
>>
>> The attribute i'm selecting ('escell_clone') is unique for every row in the database, so it's not that i'm getting duplicates for the separate queries.
>>
>> Any ideas as to why i'm getting this shortfall in the first query?  Is there something wrong with the way I set up the filter?
>>
>> Any help is much appreciated. :)
>>
>> Thanks,
>>
>> Daz
>>




_______________________________________________
Users mailing list
[hidden email]
https://lists.biomart.org/mailman/listinfo/users

output.zip (963K) Download Attachment
smime.p7s (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [BioMart Users] Fwd: Weird issue with multi-select filter

Junjun Zhang
Hi Daz,

It looks like there are rows missed in the third query, the one with both items selected in the dropdown list.

Can you try to run all the three SQL queries directly against mysql db? This time do it in one sql statement without 'limit' at all.

Now compare the above results with what you get through biomart (where single query breaks into multiple sql queries).

Let us know what you get.

Junjun

 
From: Darren Oakley [mailto:[hidden email]]
Sent: Thursday, February 03, 2011 08:53 AM
To: Junjun Zhang
Cc: Syed Haider; [hidden email] <[hidden email]>
Subject: Re: [BioMart Users] Fwd: Weird issue with multi-select filter
 
Hi Junjun,

Unfortunately that didn't give me anything to go on... :(

[ do2@guest178 ~/projects/personal-scripts/biomart ] sort EUCOMM.txt | uniq -d 
[ do2@guest178 ~/projects/personal-scripts/biomart ] sort KOMP-CSD.txt | uniq -d 
[ do2@guest178 ~/projects/personal-scripts/biomart ] sort TWO-FILTERS.txt | uniq -d 
[ do2@guest178 ~/projects/personal-scripts/biomart ] sort DIFF.txt | uniq -d
[ do2@guest178 ~/projects/personal-scripts/biomart ] wc -l *.txt
   5760 DIFF.txt
  76216 EUCOMM.txt
  59486 KOMP-CSD.txt
 129942 TWO-FILTERS.txt
[ do2@guest178 ~/projects/personal-scripts/biomart ]

I've attached the output files I get:

EUCOMM.txt - query with only the "EUCOMM" filter
KOMP-CSD.txt - query with only the "KOMP-CSD" filter
TWO-FILTERS.txt - query with both "EUCOMM" and "KOMP-CSD" selected in the multi-select
DIFF.txt - what I have determined is missing from the TWO-FILTERS.txt response (weirdly, these all fall into the KOMP-CSD pipeline - could this mean anything?)

Cheers,

Daz


_______________________________________________
Users mailing list
[hidden email]
https://lists.biomart.org/mailman/listinfo/users
Reply | Threaded
Open this post in threaded view
|

Re: [BioMart Users] Fwd: Weird issue with multi-select filter

Darren Oakley
Hi Junjun,

No probs - running the following queries:

1 - select count(`escell_clone`) from `idcc_targ_rep__idcc_targ_rep__main` where `pipeline` = "EUCOMM"
2 - select count(`escell_clone`) from `idcc_targ_rep__idcc_targ_rep__main` where `pipeline` = "KOMP-CSD"
3 - select count(`escell_clone`) from `idcc_targ_rep__idcc_targ_rep__main` where `pipeline` in ("EUCOMM","KOMP-CSD")

gives:

1 - 76216 (same as the martservice call)
2 - 59486 (same as the martservice call)
3 - 135702 (different to the martservice call)

So the data is there...  This must be something in the batching causing the miss?

Cheers,

Daz

On 4 Feb 2011, at 03:27, Junjun Zhang wrote:

Hi Daz,

It looks like there are rows missed in the third query, the one with both items selected in the dropdown list.

Can you try to run all the three SQL queries directly against mysql db? This time do it in one sql statement without 'limit' at all.

Now compare the above results with what you get through biomart (where single query breaks into multiple sql queries).

Let us know what you get.

Junjun

 
From: Darren Oakley [mailto:[hidden email]]
Sent: Thursday, February 03, 2011 08:53 AM
To: Junjun Zhang
Cc: Syed Haider; [hidden email] <[hidden email]>
Subject: Re: [BioMart Users] Fwd: Weird issue with multi-select filter
 
Hi Junjun,

Unfortunately that didn't give me anything to go on... :(

[ do2@guest178 ~/projects/personal-scripts/biomart ] sort EUCOMM.txt | uniq -d 
[ do2@guest178 ~/projects/personal-scripts/biomart ] sort KOMP-CSD.txt | uniq -d 
[ do2@guest178 ~/projects/personal-scripts/biomart ] sort TWO-FILTERS.txt | uniq -d 
[ do2@guest178 ~/projects/personal-scripts/biomart ] sort DIFF.txt | uniq -d
[ do2@guest178 ~/projects/personal-scripts/biomart ] wc -l *.txt
   5760 DIFF.txt
  76216 EUCOMM.txt
  59486 KOMP-CSD.txt
 129942 TWO-FILTERS.txt
[ do2@guest178 ~/projects/personal-scripts/biomart ]

I've attached the output files I get:

EUCOMM.txt - query with only the "EUCOMM" filter
KOMP-CSD.txt - query with only the "KOMP-CSD" filter
TWO-FILTERS.txt - query with both "EUCOMM" and "KOMP-CSD" selected in the multi-select
DIFF.txt - what I have determined is missing from the TWO-FILTERS.txt response (weirdly, these all fall into the KOMP-CSD pipeline - could this mean anything?)

Cheers,

Daz



_______________________________________________
Users mailing list
[hidden email]
https://lists.biomart.org/mailman/listinfo/users

smime.p7s (2K) Download Attachment