Organisms XML

Paulo Nuin
Hi

I am having some problems with organisms.xml generation and getting the data from Entrez. I have a couple of species that don't have a taxonid and I think they are causing the error. These are my questions:

- at what stage is the organism table created in the build process?
- how do I set the species that make up this table?
- is it possible to modify it manually?
- also, past versions of InterMine allowed for a manual setup of an organism.xml file in the entrez-organisms folder. Is this still possible?

Thanks in advance

Paulo

PS I am getting a 503 error trying to access the list's archive online.

_______________________________________________
dev mailing list
[hidden email]
https://lists.intermine.org/mailman/listinfo/dev

Re: Organisms XML

Paul Browne-2
Hi Paulo,

Unfortunately I can't answer the main body of your query, except for
that last part;

On 2016-05-06 16:17, Paulo Nuin wrote:
>
> PS I am getting a 503 error trying to access the list's archive
> online.

The list archive recently moved to a new URL. It can now be found at:

https://lists.intermine.org/pipermail/dev/

Thanks,
Paul

--
*******************
Paul Browne
Systems Administrator
Cambridge Systems Biology Centre
80 Tennis Court Road
University of Cambridge
Cambridge
United Kingdom
CB2 1GA
*******************

Re: Organisms XML

Paulo Nuin
Thanks, I found the solution. And the URL does not work for the lists.

Cheers
Paulo


> On May 8, 2016, at 1:09 PM, [hidden email] wrote:
>
> The archive for this dev list can be found at this link;
>
> https://lists.intermine.org/pipermail/dev/
>
> as the list archive recently migrated URL.


Re: Organisms XML

Paul Browne-2
On 2016-05-09 16:39, Paulo Nuin wrote:
> Thanks, I found the solution. And the URL does not work for the lists.
>
> Cheers
> Paulo
>
>

You may have had the bad luck of checking during today's scheduled downtime
for the server room that houses the mail server hosting the list. If you
check that link now, it does work.

Cheers,
Paul

Re: Organisms XML

Justin Clark-Casey-2
In reply to this post by Paulo Nuin
On 06/05/16 16:17, Paulo Nuin wrote:
> Hi
>
> I am having some problems with organisms.xml generation and getting the data from Entrez. I have a couple of species that don't have a taxonid and I think they are causing the error. These are my questions:
>
> - at what stage is the organism table created in the build process?

The organism entries are created by different sources during the build process.  For instance, with Synbiomine, the GFF3 sources are the first to create
organism entries.  The entrez-organism source then does the following:

1.  In the -pre-retrieve build step, it looks for all the Organism entries created by previous sources in the object store.
2.  It pulls data from Entrez for each organism and writes it to organisms.xml.
3.  In the -retrieve-tgt-from-xml-file step, it reads organisms.xml and integrates that data into the object store.

The log text for -pre-retrieve was misleading (organisms.xml is being written at this stage, not read) - I have corrected it.
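
As a rough illustration of why organisms without a taxon ID are a problem for the lookup step, here is a minimal Python sketch. It is not InterMine's actual code: the list of dicts, the function names and the "Unclassified isolate" entry are all made up for the example, and the only real thing in it is the public NCBI ESummary endpoint with db=taxonomy. Whether EntrezOrganismRetriever builds its request exactly like this is an assumption.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESummary endpoint. That InterMine calls it with
# exactly these parameters is an assumption made for this sketch.
ESUMMARY_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def split_by_taxon_id(organisms):
    """Separate organisms that have a taxon ID from those that do not.

    `organisms` is a hypothetical list of dicts standing in for the Organism
    entries that earlier sources left in the object store.
    """
    with_id, without_id = [], []
    for organism in organisms:
        if organism.get("taxonId") is None:
            without_id.append(organism)
        else:
            with_id.append(organism)
    return with_id, without_id


def esummary_url(taxon_ids):
    """Build one ESummary request URL for a batch of taxonomy IDs."""
    query = urlencode({
        "db": "taxonomy",
        "id": ",".join(str(taxon_id) for taxon_id in taxon_ids),
    })
    return f"{ESUMMARY_URL}?{query}"


organisms = [
    {"name": "Drosophila melanogaster", "taxonId": 7227},
    {"name": "Unclassified isolate", "taxonId": None},  # made-up entry
    {"name": "Escherichia coli", "taxonId": 562},
]

queryable, skipped = split_by_taxon_id(organisms)
print(esummary_url([o["taxonId"] for o in queryable]))
print("No taxon ID, not sent to Entrez:", [o["name"] for o in skipped])
```

Only entries with a taxon ID can be looked up this way, so anything in the second list would need its data supplied by some other source.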

> - how do I set the species that make up this table?

As above, they are set from different sources in the integration process.

> - is it possible to modify it manually?

I'm not aware of a way to do this at the moment, other than making sure some previous source sets up the organism entry.  Others please correct me if I'm wrong!

Do you have organisms that don't have any previous data sources, and you simply want the Entrez data for them?

> - also, past versions of InterMine allowed for a manual setup of an organism.xml file in the entrez-organisms folder. Is this still possible?

One hacky way to do this would be to comment out the following in bio/sources/entrez-organism/build.xml to stop the -pre-retrieve step from taking place:

<!--
   <target name="-pre-retrieve" depends="source.-pre-retrieve">
     <property name="cp" refid="task.class.path"/>

     <taskdef name="retrieve-organisms"
              classname="org.intermine.bio.dataconversion.EntrezOrganismRetriever">
       <classpath refid="task.class.path"/>
     </taskdef>

     <echo message="Writing organism Entrez entries to ${src.data.file}"/>

     <retrieve-organisms osAlias="os.production" outputFile="${src.data.file}"/>
   </target>
-->

and then supply one's own <mine>/integrate/build/organisms.xml.  But it would be a pain to maintain and I suspect you don't really want this.

-- Justin