Hi All,
Hope you can help me with this one... I have a filter on one of our marts (www.knockoutmouse.org, idcc_targ_rep dataset) configured as a multi-select like so: --- <FilterCollection displayName="Pipeline" internalName="pipeline"> <FilterDescription displayName="Pipeline" displayType="list" field="pipeline" internalName="pipeline" key="id_1022_key" legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" tableConstraint="main" type="list"> <Option displayName="KOMP-CSD" internalName="KOMP-CSD" isSelectable="true" value="KOMP-CSD"/> <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" isSelectable="true" value="KOMP-Regeneron"/> <Option displayName="EUCOMM" internalName="EUCOMM" isSelectable="true" value="EUCOMM"/> <Option displayName="NorCOMM" internalName="NorCOMM" isSelectable="true" value="NorCOMM"/> <Option displayName="mirKO" internalName="mirKO" isSelectable="true" value="mirKO"/> </FilterDescription> </FilterCollection> --- But, it's producing unexpected results when you query for either of the pipelines separately, or together... i.e. the following xml: --- <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE Query> <Query virtualSchemaName = "default" formatter = "CSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6" > <Dataset name = "idcc_targ_rep" interface = "default" > <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/> <Attribute name = "escell_clone" /> </Dataset> </Query> --- retrieves 129816 rows of data, but when you do the different pipelines separately, they do not add up to this number... KOMP-CSD on its own is 59440 EUCOMM on its own is 76076 So, combined they give 135516 rows of results - 5700 more than the first query! The attribute i'm selecting ('escell_clone') is unique for every row in the database, so it's not that i'm getting duplicates for the separate queries. Any ideas as to why i'm getting this shortfall in the first query? Is there something wrong with the way I set up the filter? Any help is much appreciated. :) Thanks, Daz _______________________________________________ Users mailing list [hidden email] https://lists.biomart.org/mailman/listinfo/users |
It may help if you turn on the log to exam the actual SQL queries being executed. To do that please modify log4perl.conf file under conf folder, set: log4perl.logger = INFO,logFile log4perl.appender.logFile.filename = /full/path/to/the/log/file.log [put the real full path to the log file as your choice] Restart Apache and fire three queries with filter set to one of the follows respectively: <Filter name = "pipeline" value = "KOMP-CSD"/> or <Filter name = "pipeline" value = "EUCOMM"/> or <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/> Then look into the log file for SQL statements which should be able to give you some more information. Help this helps! Junjun On 11-01-31 6:19 AM, "Darren Oakley" <do2@...> wrote: Hi All, _______________________________________________ Users mailing list [hidden email] https://lists.biomart.org/mailman/listinfo/users |
In reply to this post by Darren Oakley
Hi Daz,
Please turn on the logging in log file: conf/log4perl.conf by setting: log4perl.logger = DEBUG,Screen Please check if the SQL compilation is as per your expectation. That will give us more pointers as in how to debug it further. Best, Syed On 31/01/2011 11:19, Darren Oakley wrote: > Hi All, > > Hope you can help me with this one... > > I have a filter on one of our marts (www.knockoutmouse.org<http://www.knockoutmouse.org/>, idcc_targ_rep dataset) configured as a multi-select like so: > > --- > <FilterCollection displayName="Pipeline" internalName="pipeline"> > <FilterDescription displayName="Pipeline" displayType="list" field="pipeline" internalName="pipeline" key="id_1022_key" legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" tableConstraint="main" type="list"> > <Option displayName="KOMP-CSD" internalName="KOMP-CSD" isSelectable="true" value="KOMP-CSD"/> > <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" isSelectable="true" value="KOMP-Regeneron"/> > <Option displayName="EUCOMM" internalName="EUCOMM" isSelectable="true" value="EUCOMM"/> > <Option displayName="NorCOMM" internalName="NorCOMM" isSelectable="true" value="NorCOMM"/> > <Option displayName="mirKO" internalName="mirKO" isSelectable="true" value="mirKO"/> > </FilterDescription> > </FilterCollection> > --- > > But, it's producing unexpected results when you query for either of the pipelines separately, or together... > > i.e. the following xml: > > --- > <?xml version="1.0" encoding="UTF-8"?> > <!DOCTYPE Query> > <Query virtualSchemaName = "default" formatter = "CSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6"> > > <Dataset name = "idcc_targ_rep" interface = "default"> > <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/> > <Attribute name = "escell_clone" /> > </Dataset> > </Query> > --- > > retrieves 129816 rows of data, but when you do the different pipelines separately, they do not add up to this number... > > KOMP-CSD on its own is 59440 > EUCOMM on its own is 76076 > > So, combined they give 135516 rows of results - 5700 more than the first query! > > The attribute i'm selecting ('escell_clone') is unique for every row in the database, so it's not that i'm getting duplicates for the separate queries. > > Any ideas as to why i'm getting this shortfall in the first query? Is there something wrong with the way I set up the filter? > > Any help is much appreciated. :) > > Thanks, > > Daz > Users mailing list [hidden email] https://lists.biomart.org/mailman/listinfo/users |
Hi,
Sorry for the slow response on this one - the help is much appreciated. :) Here's the output from the logs for the three separate queries. It all looks sensible to me... Cheers, Daz ---- KOMP-CSD Only: 2011/02/02 09:07:21 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE Query> <Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6"> <Dataset name="idcc_targ_rep" interface="default"> <Filter value="KOMP-CSD" name="pipeline"/> <Attribute name="escell_clone"/> </Dataset> </Query> 2011/02/02 09:07:21 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep escell_clone main 2011/02/02 09:07:21 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep pipeline main 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200,200 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 400,400 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 800,800 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 1600,1600 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 3200,3200 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 6400,6400 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 12800,12800 2011/02/02 09:07:22 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 25600,25600 2011/02/02 09:07:23 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 51200,50000 ---- EUCOMM Only: 2011/02/02 09:07:29 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE Query> <Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6"> <Dataset name="idcc_targ_rep" interface="default"> <Filter value="EUCOMM" name="pipeline"/> <Attribute name="escell_clone"/> </Dataset> </Query> 2011/02/02 09:07:29 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep escell_clone main 2011/02/02 09:07:29 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep pipeline main 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200,200 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 400,400 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 800,800 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 1600,1600 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 3200,3200 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 6400,6400 2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 12800,12800 2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 25600,25600 2011/02/02 09:07:31 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 51200,50000 2011/02/02 09:07:33 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 101200,50000 2011/02/02 09:07:34 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 117710,50000 ---- Both projects together: 2011/02/02 08:59:14 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE Query> <Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6"> <Dataset name="idcc_targ_rep" interface="default"> <Filter value="KOMP-CSD,EUCOMM" name="pipeline"/> <Attribute name="escell_clone"/> </Dataset> </Query> 2011/02/02 08:59:14 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep escell_clone main 2011/02/02 08:59:14 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep pipeline main 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200 2011/02/02 08:59:14 BioMart.Configuration.URLLocation:100:WARN> RESPONSE: 0.6 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200,200 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 400,400 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 800,800 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 1600,1600 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 3200,3200 2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 6400,6400 2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 12800,12800 2011/02/02 08:59:16 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 25600,25600 2011/02/02 08:59:17 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 51200,50000 2011/02/02 08:59:20 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 101200,50000 2011/02/02 08:59:24 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 151200,50000 2011/02/02 08:59:27 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 201200,50000 2011/02/02 08:59:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 218452,50000 ----- On 2 Feb 2011, at 00:23, Syed Haider wrote: > Hi Daz, > > Please turn on the logging in log file: > > conf/log4perl.conf > > by setting: > > log4perl.logger = DEBUG,Screen > > Please check if the SQL compilation is as per your expectation. That will give us more pointers as in how to debug it further. > > Best, > Syed > > On 31/01/2011 11:19, Darren Oakley wrote: >> Hi All, >> >> Hope you can help me with this one... >> >> I have a filter on one of our marts (www.knockoutmouse.org<http://www.knockoutmouse.org/>, idcc_targ_rep dataset) configured as a multi-select like so: >> >> --- >> <FilterCollection displayName="Pipeline" internalName="pipeline"> >> <FilterDescription displayName="Pipeline" displayType="list" field="pipeline" internalName="pipeline" key="id_1022_key" legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" tableConstraint="main" type="list"> >> <Option displayName="KOMP-CSD" internalName="KOMP-CSD" isSelectable="true" value="KOMP-CSD"/> >> <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" isSelectable="true" value="KOMP-Regeneron"/> >> <Option displayName="EUCOMM" internalName="EUCOMM" isSelectable="true" value="EUCOMM"/> >> <Option displayName="NorCOMM" internalName="NorCOMM" isSelectable="true" value="NorCOMM"/> >> <Option displayName="mirKO" internalName="mirKO" isSelectable="true" value="mirKO"/> >> </FilterDescription> >> </FilterCollection> >> --- >> >> But, it's producing unexpected results when you query for either of the pipelines separately, or together... >> >> i.e. the following xml: >> >> --- >> <?xml version="1.0" encoding="UTF-8"?> >> <!DOCTYPE Query> >> <Query virtualSchemaName = "default" formatter = "CSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6"> >> >> <Dataset name = "idcc_targ_rep" interface = "default"> >> <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/> >> <Attribute name = "escell_clone" /> >> </Dataset> >> </Query> >> --- >> >> retrieves 129816 rows of data, but when you do the different pipelines separately, they do not add up to this number... >> >> KOMP-CSD on its own is 59440 >> EUCOMM on its own is 76076 >> >> So, combined they give 135516 rows of results - 5700 more than the first query! >> >> The attribute i'm selecting ('escell_clone') is unique for every row in the database, so it's not that i'm getting duplicates for the separate queries. >> >> Any ideas as to why i'm getting this shortfall in the first query? Is there something wrong with the way I set up the filter? >> >> Any help is much appreciated. :) >> >> Thanks, >> >> Daz >> _______________________________________________ Users mailing list [hidden email] https://lists.biomart.org/mailman/listinfo/users |
Oh Daz,
this might be because your attribute "escell_clone" has nulls for certain values, not sure though. Why dont you run all three queries against your database server as it is and see whats the row count you get, BTW, is it a public visible database ? Syed On 02/02/2011 09:10, Darren Oakley wrote: > Hi, > > Sorry for the slow response on this one - the help is much appreciated. :) > > Here's the output from the logs for the three separate queries. It all looks sensible to me... > > Cheers, > > Daz > > ---- > > KOMP-CSD Only: > > 2011/02/02 09:07:21 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?> > <!DOCTYPE Query> > <Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6"> > <Dataset name="idcc_targ_rep" interface="default"> > <Filter value="KOMP-CSD" name="pipeline"/> > <Attribute name="escell_clone"/> > </Dataset> > </Query> > > 2011/02/02 09:07:21 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep escell_clone main > 2011/02/02 09:07:21 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep pipeline main > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 200,200 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 400,400 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 800,800 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 1600,1600 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 3200,3200 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 6400,6400 > 2011/02/02 09:07:21 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 12800,12800 > 2011/02/02 09:07:22 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 25600,25600 > 2011/02/02 09:07:23 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'KOMP-CSD') LIMIT 51200,50000 > > ---- > > EUCOMM Only: > > 2011/02/02 09:07:29 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?> > <!DOCTYPE Query> > <Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6"> > <Dataset name="idcc_targ_rep" interface="default"> > <Filter value="EUCOMM" name="pipeline"/> > <Attribute name="escell_clone"/> > </Dataset> > </Query> > > 2011/02/02 09:07:29 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep escell_clone main > 2011/02/02 09:07:29 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep pipeline main > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200 > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 200,200 > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 400,400 > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 800,800 > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 1600,1600 > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 3200,3200 > 2011/02/02 09:07:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 6400,6400 > 2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 12800,12800 > 2011/02/02 09:07:30 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 25600,25600 > 2011/02/02 09:07:31 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 51200,50000 > 2011/02/02 09:07:33 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 101200,50000 > 2011/02/02 09:07:34 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline = 'EUCOMM') LIMIT 117710,50000 > > ---- > > Both projects together: > > 2011/02/02 08:59:14 martservice:1016:WARN> RECEIVED QUERY:<?xml version="1.0" encoding="UTF-8"?> > <!DOCTYPE Query> > <Query formatter="TSV" header="0" virtualSchemaName="default" count="" uniqueRows="1" datasetConfigVersion="0.6"> > <Dataset name="idcc_targ_rep" interface="default"> > <Filter value="KOMP-CSD,EUCOMM" name="pipeline"/> > <Attribute name="escell_clone"/> > </Dataset> > </Query> > > 2011/02/02 08:59:14 BioMart.QueryRunner:163:WARN> ATTRIBUTE: idcc_targ_rep escell_clone main > 2011/02/02 08:59:14 BioMart.QueryRunner:174:WARN> FILTER TABLE: idcc_targ_rep pipeline main > 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200 > 2011/02/02 08:59:14 BioMart.Configuration.URLLocation:100:WARN> RESPONSE: 0.6 > 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 200,200 > 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 400,400 > 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 800,800 > 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 1600,1600 > 2011/02/02 08:59:14 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 3200,3200 > 2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 6400,6400 > 2011/02/02 08:59:15 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 12800,12800 > 2011/02/02 08:59:16 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 25600,25600 > 2011/02/02 08:59:17 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 51200,50000 > 2011/02/02 08:59:20 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 101200,50000 > 2011/02/02 08:59:24 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 151200,50000 > 2011/02/02 08:59:27 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 201200,50000 > 2011/02/02 08:59:29 BioMart.Dataset.TableSet:736:INFO> QUERY SQL: SELECT main.escell_clone FROM ikmc_mart_alt.idcc_targ_rep__idcc_targ_rep__main main WHERE (main.pipeline IN('KOMP-CSD','EUCOMM')) LIMIT 218452,50000 > > ----- > > On 2 Feb 2011, at 00:23, Syed Haider wrote: > >> Hi Daz, >> >> Please turn on the logging in log file: >> >> conf/log4perl.conf >> >> by setting: >> >> log4perl.logger = DEBUG,Screen >> >> Please check if the SQL compilation is as per your expectation. That will give us more pointers as in how to debug it further. >> >> Best, >> Syed >> >> On 31/01/2011 11:19, Darren Oakley wrote: >>> Hi All, >>> >>> Hope you can help me with this one... >>> >>> I have a filter on one of our marts (www.knockoutmouse.org<http://www.knockoutmouse.org/>, idcc_targ_rep dataset) configured as a multi-select like so: >>> >>> --- >>> <FilterCollection displayName="Pipeline" internalName="pipeline"> >>> <FilterDescription displayName="Pipeline" displayType="list" field="pipeline" internalName="pipeline" key="id_1022_key" legal_qualifiers="=" multipleValues="1" qualifier="=" style="menu" tableConstraint="main" type="list"> >>> <Option displayName="KOMP-CSD" internalName="KOMP-CSD" isSelectable="true" value="KOMP-CSD"/> >>> <Option displayName="KOMP-Regeneron" internalName="KOMP-Regeneron" isSelectable="true" value="KOMP-Regeneron"/> >>> <Option displayName="EUCOMM" internalName="EUCOMM" isSelectable="true" value="EUCOMM"/> >>> <Option displayName="NorCOMM" internalName="NorCOMM" isSelectable="true" value="NorCOMM"/> >>> <Option displayName="mirKO" internalName="mirKO" isSelectable="true" value="mirKO"/> >>> </FilterDescription> >>> </FilterCollection> >>> --- >>> >>> But, it's producing unexpected results when you query for either of the pipelines separately, or together... >>> >>> i.e. the following xml: >>> >>> --- >>> <?xml version="1.0" encoding="UTF-8"?> >>> <!DOCTYPE Query> >>> <Query virtualSchemaName = "default" formatter = "CSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6"> >>> >>> <Dataset name = "idcc_targ_rep" interface = "default"> >>> <Filter name = "pipeline" value = "KOMP-CSD,EUCOMM"/> >>> <Attribute name = "escell_clone" /> >>> </Dataset> >>> </Query> >>> --- >>> >>> retrieves 129816 rows of data, but when you do the different pipelines separately, they do not add up to this number... >>> >>> KOMP-CSD on its own is 59440 >>> EUCOMM on its own is 76076 >>> >>> So, combined they give 135516 rows of results - 5700 more than the first query! >>> >>> The attribute i'm selecting ('escell_clone') is unique for every row in the database, so it's not that i'm getting duplicates for the separate queries. >>> >>> Any ideas as to why i'm getting this shortfall in the first query? Is there something wrong with the way I set up the filter? >>> >>> Any help is much appreciated. :) >>> >>> Thanks, >>> >>> Daz >>> > Users mailing list [hidden email] https://lists.biomart.org/mailman/listinfo/users |
In reply to this post by Darren Oakley
The inconsistent results might be caused by batching. You mentioned escell_clone are unique in your database, query through batching on certain RDBMS platform might result in rows being skipped or being duplicated. Find out possible duplicates or missing rows in the output files should help figuring out the problem. sort result1.txt |uniq -d ### this will give you possible duplicates You can use sort and diff to find out possible missing items. BTW, which RDBMS platform are you using? Hope this helps, Junjun On 11-02-02 4:10 AM, "Darren Oakley" <do2@...> wrote: Hi, _______________________________________________ Users mailing list [hidden email] https://lists.biomart.org/mailman/listinfo/users |
In reply to this post by Syed Haider
Hi Syed,
Yep, i've ran the queries against the database (both the original database and the one that the mart reads from) and it gives me counts the same as the single queries. Yep, it's a public biomart: http://www.knockoutmouse.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=idcc_targ_rep.default.attributes.escell_clone&FILTERS=idcc_targ_rep.default.filters.pipeline."KOMP-CSD,EUCOMM"&VISIBLEPANEL=resultspanel Thanks, Daz On 2 Feb 2011, at 12:58, Syed Haider wrote:
_______________________________________________ Users mailing list [hidden email] https://lists.biomart.org/mailman/listinfo/users |
In reply to this post by Junjun Zhang
Hi Junjun, Okey doke, will run through the files as suggested. Back with some results soon. :) It's running on MySQL 5.1.39. Cheers, Daz On 2 Feb 2011, at 14:30, Junjun Zhang wrote:
_______________________________________________ Users mailing list [hidden email] https://lists.biomart.org/mailman/listinfo/users |
In reply to this post by Junjun Zhang
Hi Junjun,
Unfortunately that didn't give me anything to go on... :( [ do2@guest178 ~/projects/personal-scripts/biomart ] sort EUCOMM.txt | uniq -d [ do2@guest178 ~/projects/personal-scripts/biomart ] sort KOMP-CSD.txt | uniq -d [ do2@guest178 ~/projects/personal-scripts/biomart ] sort TWO-FILTERS.txt | uniq -d [ do2@guest178 ~/projects/personal-scripts/biomart ] sort DIFF.txt | uniq -d [ do2@guest178 ~/projects/personal-scripts/biomart ] wc -l *.txt 5760 DIFF.txt 76216 EUCOMM.txt 59486 KOMP-CSD.txt 129942 TWO-FILTERS.txt [ do2@guest178 ~/projects/personal-scripts/biomart ] I've attached the output files I get: EUCOMM.txt - query with only the "EUCOMM" filter KOMP-CSD.txt - query with only the "KOMP-CSD" filter TWO-FILTERS.txt - query with both "EUCOMM" and "KOMP-CSD" selected in the multi-select DIFF.txt - what I have determined is missing from the TWO-FILTERS.txt response (weirdly, these all fall into the KOMP-CSD pipeline - could this mean anything?) Cheers, Daz On 2 Feb 2011, at 14:30, Junjun Zhang wrote:
_______________________________________________ Users mailing list [hidden email] https://lists.biomart.org/mailman/listinfo/users |
Hi Daz,
It looks like there are rows missed in the third query, the one with both items selected in the dropdown list. Can you try to run all the three SQL queries directly against mysql db? This time do it in one sql statement without 'limit' at all. Now compare the above results with what you get through biomart (where single query breaks into multiple sql queries). Let us know what you get. Junjun
From: Darren Oakley [mailto:[hidden email]]
Hi Junjun,Sent: Thursday, February 03, 2011 08:53 AM To: Junjun Zhang Cc: Syed Haider; [hidden email] <[hidden email]> Subject: Re: [BioMart Users] Fwd: Weird issue with multi-select filter Unfortunately that didn't give me anything to go on... :( [ do2@guest178 ~/projects/personal-scripts/biomart ] sort EUCOMM.txt | uniq -d [ do2@guest178 ~/projects/personal-scripts/biomart ] sort KOMP-CSD.txt | uniq -d [ do2@guest178 ~/projects/personal-scripts/biomart ] sort TWO-FILTERS.txt | uniq -d [ do2@guest178 ~/projects/personal-scripts/biomart ] sort DIFF.txt | uniq -d [ do2@guest178 ~/projects/personal-scripts/biomart ] wc -l *.txt 5760 DIFF.txt 76216 EUCOMM.txt 59486 KOMP-CSD.txt 129942 TWO-FILTERS.txt [ do2@guest178 ~/projects/personal-scripts/biomart ] I've attached the output files I get: EUCOMM.txt - query with only the "EUCOMM" filter KOMP-CSD.txt - query with only the "KOMP-CSD" filter TWO-FILTERS.txt - query with both "EUCOMM" and "KOMP-CSD" selected in the multi-select DIFF.txt - what I have determined is missing from the TWO-FILTERS.txt response (weirdly, these all fall into the KOMP-CSD pipeline - could this mean anything?) Cheers, Daz _______________________________________________ Users mailing list [hidden email] https://lists.biomart.org/mailman/listinfo/users |
Hi Junjun,
No probs - running the following queries: 1 - select count(`escell_clone`) from `idcc_targ_rep__idcc_targ_rep__main` where `pipeline` = "EUCOMM" 2 - select count(`escell_clone`) from `idcc_targ_rep__idcc_targ_rep__main` where `pipeline` = "KOMP-CSD" 3 - select count(`escell_clone`) from `idcc_targ_rep__idcc_targ_rep__main` where `pipeline` in ("EUCOMM","KOMP-CSD") gives: 1 - 76216 (same as the martservice call) 2 - 59486 (same as the martservice call) 3 - 135702 (different to the martservice call) So the data is there... This must be something in the batching causing the miss? Cheers, Daz On 4 Feb 2011, at 03:27, Junjun Zhang wrote:
_______________________________________________ Users mailing list [hidden email] https://lists.biomart.org/mailman/listinfo/users |
Free forum by Nabble | Edit this page |