Re: m4m grant II : data loaded so far

Re: m4m grant II : data loaded so far

Vallejos, Andrew
I cannot find an ortholog.txt.table file from TreeFam.  Where is that
available?

Also, TreeFam uses Ensembl.  Are there docs on how to use the ID resolver
to map these IDs to the RGD IDs that RatMine uses?

Thanks,

Andrew

-----Original Message-----
From: Julie Sullivan [mailto:[hidden email]]
Sent: Tuesday, June 15, 2010 10:14 AM
To: Sierra Moxon; Vallejos, Andrew; Kalpana Karra
Subject: m4m grant II : data loaded so far

Hi devs,

I am just filling in my chart for the new grant and I was hoping to add
a little more color.

Do you plan on loading TreeFam any time soon?

Andrew, I know you were having issues with BioGRID's identifiers.  Is
that ever going to be resolved?

Cheers,
Julie

_______________________________________________
dev mailing list
[hidden email]
http://mail.intermine.org/cgi-bin/mailman/listinfo/dev
TreeFam data sources, ID resolvers for ensembl --> RGD ID

Julie Sullivan
Andrew,

I took the file directly from a table in their MySQL database:

        ftp://ftp.sanger.ac.uk/pub/treefam/release-7.0/MySQL/

Sorry that wasn't clear.  I've updated the documentation with this info:

        http://intermine.org/wiki/TreeFam

If you have other work to keep you busy (ha ha), please wait to load TreeFam
until after the next release of InterMine.  Currently the TreeFam converter
loads all of the orthologs for ALL of the organisms you specify in your
project.xml, which takes a really long time as there is a lot of data.

Sierra pointed out that she was just interested in Fish <--> Rat and Fish <-->
Fly orthologs, NOT Fly <--> Rat orthologs (also true of interactions).  So I am
going to update the converter to take an `organism` parameter (rat) and a
`homolog organisms` parameter (fish, fly, etc.).  This will cut down on the
amount of (unwanted) data loaded and speed up the build process.  0.94 should be
ready in a few weeks.
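
To make the planned filtering concrete, here is a minimal sketch in Java of
keeping only the ortholog pairs that link an `organism` to one of its
`homolog organisms`.  This is not the TreeFam converter's code: the class,
method, and gene names are hypothetical, and organisms are assumed to be
given as NCBI taxon IDs (7955 zebrafish, 10116 rat, 7227 fly).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only; names and structure are assumptions,
// not the real InterMine TreeFam converter.
final class OrthologFilterSketch {

    // One ortholog pair: two gene identifiers plus the taxon ID of each gene's organism.
    static final class Pair {
        final String gene1;
        final int taxon1;
        final String gene2;
        final int taxon2;

        Pair(String gene1, int taxon1, String gene2, int taxon2) {
            this.gene1 = gene1;
            this.taxon1 = taxon1;
            this.gene2 = gene2;
            this.taxon2 = taxon2;
        }
    }

    // Keep a pair only if one side is an organism of interest and the
    // other side is one of the requested homologue organisms.
    static boolean keep(Pair p, Set<Integer> organisms, Set<Integer> homologues) {
        return (organisms.contains(p.taxon1) && homologues.contains(p.taxon2))
            || (organisms.contains(p.taxon2) && homologues.contains(p.taxon1));
    }

    // Return the pairs that pass keep(), preserving input order.
    static List<Pair> filter(List<Pair> pairs, Set<Integer> organisms, Set<Integer> homologues) {
        List<Pair> kept = new ArrayList<>();
        for (Pair p : pairs) {
            if (keep(p, organisms, homologues)) {
                kept.add(p);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Placeholder gene names, not real identifiers.
        List<Pair> pairs = List.of(
            new Pair("fishGeneA", 7955, "ratGeneA", 10116),
            new Pair("fishGeneB", 7955, "flyGeneB", 7227),
            new Pair("flyGeneC", 7227, "ratGeneC", 10116));

        // organism = fish; homologue organisms = rat and fly.
        List<Pair> kept = filter(pairs, Set.of(7955), Set.of(10116, 7227));
        for (Pair p : kept) {
            System.out.println(p.gene1 + " <--> " + p.gene2);
        }
    }
}
```

With fish as the organism and rat and fly as the homologue organisms, the
Fly <--> Rat pair is skipped, since neither side is the organism of interest.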

~~~

There isn't any documentation yet on ID resolvers.  Here is the javadoc:

http://www.flymine.org/api/index.html?org/intermine/objectstore/ObjectStoreFactory.html

Specifically, you want to copy the FlyBase ID resolver factory:

http://www.intermine.org/browser/trunk/bio/core/main/src/org/intermine/bio/dataconversion/FlyBaseIdResolverFactory.java

You want to populate a data structure that consists of taxonId,
primaryIdentifier, and a list of other identifiers.  All of our converters
implement the FlyBase ID resolver; let me know if you have questions getting
this to work.
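
As a rough illustration of that data structure, the sketch below stores, per
taxonId, a lookup from any known identifier to the primaryIdentifier(s) it
belongs to.  It is not the FlyBase ID resolver's actual API; the class and
method names are hypothetical, and the identifiers are made-up placeholders
rather than real RGD or Ensembl IDs.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only; this does not mirror the real
// InterMine IdResolver classes.
final class IdResolverSketch {

    // taxonId -> (any identifier -> primary identifiers it may refer to)
    private final Map<String, Map<String, Set<String>>> lookup = new HashMap<>();

    // Register one entry: a primary identifier and its other identifiers.
    void addEntry(String taxonId, String primaryIdentifier, List<String> otherIdentifiers) {
        Map<String, Set<String>> byId = lookup.computeIfAbsent(taxonId, k -> new HashMap<>());
        // A primary identifier always resolves to itself.
        byId.computeIfAbsent(primaryIdentifier, k -> new LinkedHashSet<>()).add(primaryIdentifier);
        for (String other : otherIdentifiers) {
            byId.computeIfAbsent(other, k -> new LinkedHashSet<>()).add(primaryIdentifier);
        }
    }

    // Return every primary identifier the given identifier maps to.
    // An empty set means unknown; more than one element means ambiguous.
    Set<String> resolve(String taxonId, String identifier) {
        Map<String, Set<String>> byId = lookup.get(taxonId);
        if (byId == null) {
            return Collections.emptySet();
        }
        return byId.getOrDefault(identifier, Collections.emptySet());
    }

    public static void main(String[] args) {
        IdResolverSketch resolver = new IdResolverSketch();
        // Placeholder identifiers for rat (taxon 10116), not a real mapping.
        resolver.addEntry("10116", "RGD:EXAMPLE1", List.of("ENSRNOG_EXAMPLE1"));
        System.out.println(resolver.resolve("10116", "ENSRNOG_EXAMPLE1"));
    }
}
```

Returning a set rather than a single value leaves room for identifiers that
map to several primary identifiers; a caller could then skip anything that
does not resolve to exactly one.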

Cheers,
Julie
