Re: [BioMart Users] biomaRt confusion

Elena Rivkin
Hi Natasha,
I repeated your query using BioMart Martview, with the following query
syntax:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query  virtualSchemaName = "default" formatter = "TSV" header = "0"
uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >
                       
        <Dataset name = "hsapiens_gene_ensembl" interface = "default" >
                <Filter name = "hgnc_symbol" value = "GAGE12F,GAGE12G,Gage12I"/>
                <Attribute name = "entrezgene" />
                <Attribute name = "hgnc_symbol" />
                <Attribute name = "chromosome_name" />
                <Attribute name = "illumina_humanht_12" />
                <Attribute name = "ensembl_gene_id" />
        </Dataset>
</Query>
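
For readers working from R, the same query can probably be expressed with biomaRt's getBM(). This is a minimal sketch, not something that was run in this thread. The helper name query_gage is made up for illustration, the host and dataset are the ones used in the R code quoted further down, and the attribute names (for example entrezgene) are copied from the XML above and may differ in other Ensembl releases.

```r
# Hypothetical helper (not from the thread): runs the MartView query above via biomaRt.
# Assumes the biomaRt package is installed and `mart` was created with useMart().
query_gage <- function(mart) {
  biomaRt::getBM(
    attributes = c("entrezgene", "hgnc_symbol", "chromosome_name",
                   "illumina_humanht_12", "ensembl_gene_id"),
    filters    = "hgnc_symbol",
    values     = c("GAGE12F", "GAGE12G", "Gage12I"),  # same values as the XML filter
    mart       = mart
  )
}

# Usage (needs network access, so it is left commented out):
# ensembl <- biomaRt::useMart("ENSEMBL_MART_ENSEMBL", host = "www.ensembl.org",
#                             dataset = "hsapiens_gene_ensembl")
# res <- query_gage(ensembl)
```

Note that getBM() drops duplicate rows by default (uniqueRows = TRUE), whereas the XML above sets uniqueRows = "0", so row counts may differ between the two.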


I can confirm that every Entrez Gene ID appears to be returned for each of
the three genes (GAGE12I, GAGE12G, GAGE12F).
For example, GAGE12G should return only Entrez Gene ID 645073, but it
returns 2576, 2577, 2578, etc. I don't see any overlap between the Ensembl
Gene IDs (there are only three, one for each of the HGNC symbols), so I
think the issue is only with the Entrez Gene IDs.
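
The pattern described above can be checked in base R using only the rows quoted later in this thread. The sketch below counts distinct Entrez Gene IDs per Ensembl gene, then keeps the BioMart rows whose Entrez ID and symbol both agree with the array annotation. That second step is a workaround that assumes the array manifest's pairing is correct. It does not fix the underlying mapping, and the variable names are illustrative.

```r
# BioMart rows copied from the output quoted in this thread (IDs kept as strings).
bm <- data.frame(
  entrezgene      = rep(c("100008586", "26748", "645073"), each = 3),
  ensembl_gene_id = rep(c("ENSG00000241465", "ENSG00000236362",
                          "ENSG00000215269"), times = 3),
  hgnc_symbol     = rep(c("GAGE12I", "GAGE12F", "GAGE12G"), times = 3),
  stringsAsFactors = FALSE
)

# Array annotation rows copied from the Illumina HT12 v4 output in this thread.
array_annot <- data.frame(
  Entrez_Gene_ID = c("26748", "100008586", "645073"),
  Symbol         = c("GAGE12I", "GAGE12F", "GAGE12G"),
  stringsAsFactors = FALSE
)

# Distinct Entrez IDs per Ensembl gene: more than 1 signals the cross-mapping.
n_entrez <- tapply(bm$entrezgene, bm$ensembl_gene_id,
                   function(x) length(unique(x)))
print(n_entrez)

# Keep only rows where Entrez ID and symbol both match the array annotation.
mapped <- merge(array_annot, bm,
                by.x = c("Entrez_Gene_ID", "Symbol"),
                by.y = c("entrezgene", "hgnc_symbol"))
print(mapped)
```

With these rows, each Ensembl gene is paired with all three Entrez IDs, and the merge leaves one row per symbol.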

I am forwarding this email to the Ensembl helpdesk. They should be able to
help with this. We will take a look at this at our end as well.

Thank you.


Elena Rivkin, PhD
Outreach and Training Coordinator, Informatics and Bio-computing

Ontario Institute for Cancer Research
MaRS Centre, South Tower
101 College Street, Suite 800
Toronto, Ontario, Canada M5G 0A3

Tel: 647-258-4316
Toll-free: 1-866-678-6427
www.oicr.on.ca

This message and any attachments may contain confidential and/or
privileged information for the sole use of the intended recipient. Any
review or distribution by anyone other than the person for whom it was
originally intended is strictly prohibited. If you have received this
message in error, please contact the sender and delete all copies.
Opinions, conclusions or other information contained in this message may
not be that of the organization.
 





On 11-08-23 8:23 AM, "Natasha Sahgal" <[hidden email]> wrote:

>Dear Elena,
>
>I am hoping you could help me with another biomaRt question (or rather,
>a confusion I have).
>
>I queried the BioMart website and extracted the following attributes from
>the Homo sapiens dataset (GRCh37):
>Ensembl Gene Id
>Associated gene name
>Entrez Gene id
>HGNC Symbol
>
>I am using an Illumina HT12 v4 microarray (so I can't extract these from
>BioMart, as I was told the BioMart probes are mapped to the HT12 v3 chip).
>
>I wanted to extract Ensembl gene IDs for my array data, but the output
>does not make sense. The Entrez gene IDs are not unique, and the same
>gene has multiple Ensembl gene IDs and Entrez gene IDs.
>
>As an example: the GAGE12F, GAGE12G and GAGE12I genes
>
>Microarray: Illumina HT12 v4 output:
>
> Entrez_Gene_ID  Symbol Chromosome     Probe_Id Probe_Type Cytoband
>          26748 GAGE12I          X ILMN_1691563          A Xp11.23b
>      100008586 GAGE12F          X ILMN_3242920          S Xp11.23b
>         645073 GAGE12G          X ILMN_1664660          S Xp11.23b
>                                     Definition
> Homo sapiens G antigen 12I (GAGE12I), mRNA.
> Homo sapiens G antigen 12F (GAGE12F), mRNA.
> Homo sapiens G antigen 12G (GAGE12G), mRNA.
>
>
>Biomart output:
>
>     entrezgene ensembl_gene_id hgnc_symbol
>1     100008586 ENSG00000241465     GAGE12I
>2     100008586 ENSG00000236362     GAGE12F
>3     100008586 ENSG00000215269     GAGE12G
>1022      26748 ENSG00000241465     GAGE12I
>1023      26748 ENSG00000236362     GAGE12F
>1024      26748 ENSG00000215269     GAGE12G
>2392     645073 ENSG00000241465     GAGE12I
>2393     645073 ENSG00000236362     GAGE12F
>2394     645073 ENSG00000215269     GAGE12G
>
>So please help me understand why there are multiple results rather than
>unique ones.
>
>
>Best Regards,
>Natasha
>
>-----Original Message-----
>From: Elena Rivkin [mailto:[hidden email]]
>Sent: 19 August 2011 14:18
>To: Natasha Sahgal
>Subject: Re: [BioMart Users] biomaRt query [Ensembl #228923] biomaRt query
>
>Unfortunately no; however, another Ensembl ticket was created and they
>should reply to it shortly.
>I'll be happy to assist you if you would like to query the BioMart
>interface (MartView). From there you can also generate the XML syntax
>that you can send to the Ensembl helpdesk for assistance.
>
>Regards,
>
>Elena Rivkin, PhD
>
>On 11-08-19 9:02 AM, "Natasha Sahgal" <[hidden email]> wrote:
>
>>Hi Elena,
>>
>>I was told by them to get in touch with you. (ticket # 228923).
>>
>>I don't suppose you would then know which version of the HT12 chip is
>>used either?
>>
>>Many Thanks,
>>Natasha
>>
>>
>>On 19/08/2011 13:58, Elena Rivkin wrote:
>>> Hi Natasha,
>>>
>>> 1. These are two different filters that give you the following options:
>>> The first filter 'with_illumina_humanht_12' restricts the output to all
>>> the genes with 'illumina_humanht_12' identifier. The second filter
>>> 'illumina_humanht_12' allows you to restrict your output to specific
>>> 'illumina_humanht_12' identifiers, which you need to specify.
>>>
>>> 2. It is best to address these data specific questions to the Ensembl
>>> helpdesk: [hidden email]. I am cc'ing them in my response.
>>>
>>> Thank you.
>>> Elena Rivkin, PhD
>>>
>>> On 11-08-19 8:48 AM, "Natasha Sahgal"<[hidden email]>  wrote:
>>>
>>>> Hello,
>>>>
>>>> I am new to using biomaRt, and I use it via the R/Bioconductor package.
>>>>
>>>> However, there are some things I do not understand.
>>>>
>>>> 1) What is the difference between: "with_illumina_humanht_12" and
>>>> "illumina_humanht_12" in the filters,
>>>>
>>>> 2) What version of the Illumina HT12 chip is used? This is unclear.
>>>> For the WG6 chips, V1, V2 or V3 is stated, but HT12 has had 4 versions
>>>> so far, so which version does the information in BioMart correspond to?
>>>>
>>>> R code:
>>>> listMarts(host="www.ensembl.org")
>>>> ensembl =  useMart("ENSEMBL_MART_ENSEMBL", host="www.ensembl.org",
>>>> dataset="hsapiens_gene_ensembl")
>>>> filt = listFilters(ensembl)
>>>>
>>>> filt[grep("illumina", filt[[1]], ignore.case=TRUE), ]
>>>>     name                        description
>>>> 12  with_illumina_humanwg_6_v1  with Illumina HumanWG 6 v1 ID(s))
>>>> 13  with_illumina_humanwg_6_v2  with Illumina HumanWG 6 v2 ID(s)
>>>> 14  with_illumina_humanwg_6_v3  with Illumina HumanWG 6 v3 ID(s)
>>>> 15  with_illumina_humanht_12    with Illumina Human HT 12 ID(s)
>>>> 125 illumina_humanwg_6_v1       Illumina HumanWG 6 V1 ID(s) [e.g. 0000940471]
>>>> 126 illumina_humanwg_6_v2       Illumina HumanWG 6 V2 ID(s) [e.g. ILMN_1748182]
>>>> 127 illumina_humanwg_6_v3       Illumina HumanWG 6 v3 ID(s) [e.g. ILMN_2103362]
>>>> 128 illumina_humanht_12         Illumina Human HT 12 ID(s) [e.g. ILMN_1672925]
>>>>
>>>>
>>>> Many Thanks in advance,
>>>> Natasha
>>>>
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> [hidden email]
>>>> https://lists.biomart.org/mailman/listinfo/users
>>
>>--
>>
>>Natasha Sahgal, PhD
>>
>>Research Associate in Functional Genomics
>>Bioinformatics & Statistical Genetics Core
>>Wellcome Trust Centre for Human Genetics
>>University of Oxford
>>Roosevelt Drive
>>Oxford OX3 7BN
>>U.K.
>>
>>Tel: + 44 -(0)1865 287 609
>>Fax: + 44 -(0)1865 287 664
>>
>>-------------------------------------------
>>
>
