Intermine (flymine) bug report

Owen Marshall-2
Hi,

Apologies for the email, but a bug in case matching / primaryIdentifier matching appears to have been recently introduced into the InterMine API service.

The issue is that generating a new list from symbols no longer matches on case, leading to ambiguous matches in the Drosophila gene annotations that then do not get incorporated into the final processed list.  The FlyMine web service has an option to match on case; the InterMine API (as far as I can tell) lacks this option, but used to match on case automatically.  It no longer does.
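To make the symptom concrete, here is a toy sketch in plain Python (it makes no InterMine calls). The gene records, the `FBgn_TOY_*` identifiers and the `resolve` helper are all invented for illustration; the only point is that two symbols differing in case alone collapse into one ambiguous match once case is ignored.

```python
from collections import defaultdict

# Hypothetical records: (primaryIdentifier, symbol). These IDs and symbols
# are invented for illustration and are not real FlyBase entries.
GENES = [
    ("FBgn_TOY_1", "Abc"),
    ("FBgn_TOY_2", "abc"),
    ("FBgn_TOY_3", "xyz"),
]


def resolve(symbols, case_sensitive):
    """Split input symbols into unique matches and ambiguous matches."""
    # Index every record by its symbol, folding case if requested.
    index = defaultdict(list)
    for primary_id, symbol in GENES:
        key = symbol if case_sensitive else symbol.lower()
        index[key].append(primary_id)

    matched, ambiguous = {}, {}
    for symbol in symbols:
        key = symbol if case_sensitive else symbol.lower()
        hits = index.get(key, [])
        if len(hits) == 1:
            matched[symbol] = hits[0]      # exactly one gene: safe to add
        elif len(hits) > 1:
            ambiguous[symbol] = hits       # several genes: left out of the list
    return matched, ambiguous


if __name__ == "__main__":
    print(resolve(["Abc", "xyz"], case_sensitive=True))
    print(resolve(["Abc", "xyz"], case_sensitive=False))
```

With case-sensitive matching both inputs resolve; with case folded, "Abc" hits two toy records and is dropped, which is the behaviour I am now seeing from the API.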

I thought I had worked around such issues by using resolve_ids to generate a list of primaryIdentifiers, which should have been robust against them.  Unfortunately, even this has recently ceased to work!  The problem is that some Drosophila genes appear to be annotated such that their symbol annotation contains both the genuine symbol and the primaryIdentifier.  Take a look at FBgn0003317 (sax) as an illustration of this.

The end result is that creating a new list of primaryIdentifiers fails to match around half the genes (because the multiple symbol/primaryIdentifier annotations seem to lead to rejection -- I'm actually not sure why these should be rejected, as the ID is still unique to the gene in question, but that's very clearly what's happening).
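A toy sketch of what I suspect is going on, again in plain Python with no InterMine calls. It rests on two assumptions that I have not confirmed: that the lookup searches several identifier fields at once, and that any input hitting more than one record is rejected. The record layout, the second record and the `FBgn_TOY_9` entry are invented; only FBgn0003317 / sax comes from the real data.

```python
# Hypothetical records. Record 2 is an assumed stray entry whose symbol
# field holds the primaryIdentifier of record 1.
RECORDS = [
    {"record": 1, "primaryIdentifier": "FBgn0003317", "symbol": "sax"},
    {"record": 2, "primaryIdentifier": None, "symbol": "FBgn0003317"},
    {"record": 3, "primaryIdentifier": "FBgn_TOY_9", "symbol": "toy"},
]

# Assumption: the lookup checks all of these fields, not just one.
SEARCH_FIELDS = ("primaryIdentifier", "symbol")


def lookup(identifier):
    """Return the record numbers in which any searched field equals identifier."""
    return [
        rec["record"]
        for rec in RECORDS
        if any(rec[field] == identifier for field in SEARCH_FIELDS)
    ]


def build_list(identifiers):
    """Keep identifiers that hit exactly one record; report the rest."""
    kept, rejected = [], {}
    for identifier in identifiers:
        hits = lookup(identifier)
        if len(hits) == 1:
            kept.append(identifier)
        else:
            rejected[identifier] = hits
    return kept, rejected


if __name__ == "__main__":
    print(build_list(["FBgn0003317", "FBgn_TOY_9"]))
```

Under those assumptions FBgn0003317 is rejected even though it names only one real gene, because it is found in two records.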

Is there a way around this problem, please?  Any suggestions or fixes would be greatly appreciated!

Best regards,
Owen


Owen Marshall, PhD
Wellcome Trust/Cancer Research UK Gurdon Institute
The Henry Wellcome Building of Cancer and Developmental Biology
University of Cambridge
Tennis Court Road
Cambridge CB2 1QN
UK

T: +44 (0)1223 334 140
F: +44 (0)1223 334 089


_______________________________________________
dev mailing list
[hidden email]
http://mail.intermine.org/cgi-bin/mailman/listinfo/dev

Re: Intermine (flymine) bug report

Justin Clark-Casey
Hi Owen.  On which release of InterMine did this change for you?

--
Justin Clark-Casey, Synbiomine/InterMine Developer
http://synbiomine.org
http://twitter.com/justincc

On 04/01/16 18:12, Owen Marshall wrote:

> [...]


Re: Intermine (flymine) bug report

Julie Sullivan
In reply to this post by Owen Marshall-2
Owen

This is my fault: it seems the genes (e.g. FBgn0003317) didn't merge
properly during data loading.

I hope to release a fixed version of the database on Friday. I'll let
you know. Apologies!

Julie
