loading custom datasource data


Sofia Robb
Hello,

I am hoping someone can give me a hint as to what I have done wrong.

I created my own BLAST data source.

Here is my blast_additions.xml:

<classes>
<!-- add any <class> elements here -->
  <class name="BlastResult" is-interface="true">
    <attribute name="queryName" type="java.lang.String"/>
    <attribute name="database" type="java.lang.String"/>
    <attribute name="algorithm" type="java.lang.String"/>
  </class>
  <class name="BlastHit" extends="BlastResult" is-interface="true">
    <attribute name="hitName" type="java.lang.String"/>
    <attribute name="hitDesc" type="java.lang.String"/>
    <attribute name="hitEvalue" type="java.lang.Double"/>
  </class>
  <class name="BlastHSP" extends="BlastHit" is-interface="true">
    <attribute name="hspEvalue" type="java.lang.Double"/>
    <attribute name="hsp_hStart" type="java.lang.String"/>
    <attribute name="hsp_hEnd" type="java.lang.String"/>
    <attribute name="hsp_qStart" type="java.lang.String"/>
    <attribute name="hsp_qEnd" type="java.lang.String"/>
  </class>
</classes>

Here is a sample of my BLAST InterMine XML:

<items>
        <item id="organism.0" class="Organism">
                <attribute name="genus" value="Schmidtea"/>
                <attribute name="species" value="mediterranea"/>
                <attribute name="taxonId" value="79327"/>
        </item>
        <item id="datasource.0" class="DataSource">
                <attribute name="name" value="SmedSxl_v3.1_MAKER_082015"/>
                <reference name="Organism" ref_id="organism.0" />
        </item>
        <item id="datasource.1" class="DataSource">
                <attribute name="name" value="uniprot_sprot.fasta_2015_08"/>
        </item>
        <item id="dataset.0" class="DataSet">
                <attribute name="name" value="SmedSxl_v3.1_MAKER_082015_VS_uniprot"/>
                <collection name="dataSources">
                        <reference name="DataSource" ref_id="datasource.0" />
                        <reference name="DataSource" ref_id="datasource.1" />
                </collection>
        </item>
        <item id="1" class="BlastResult">
                <attribute name="queryName" value="mk5-SmedSxl-v31.022420-0.7-1"/>
                <attribute name="database" value="Uniprot_swisprot"/>
                <attribute name="algorithm" value="blastx"/>
        </item>
        <item id="1.1" class="BlastHit">
                <attribute name="hitName" value="sp|Q52PG9|KCND1_BOVIN"/>
                <attribute name="hitEvalue" value="1e-64"/>
                <attribute name="hitDesc" value="Potassium voltage-gated channel subfamily D member 1 OS=Bos taurus GN=KCND1 PE=2 SV=1"/>
                <reference name="BlastResult" ref_id="1" />
        </item>
        <item id="1.1.1" class="BlastHSP">
                <attribute name="hspEvalue" value="1e-64"/>
                <attribute name="hsp_hStart" value="283"/>
                <attribute name="hsp_hEnd" value="451"/>
                <attribute name="hsp_qStart" value="28"/>
                <attribute name="hsp_qEnd" value="630"/>
                <reference name="BlastHit" ref_id="1.1" />
        </item>
...

Here is the error I get when I run the integrate command:

BUILD FAILED
/var/lib/pgsql/other_data/intermine/imbuild/integrate.xml:54: The following error occurred while executing this line:
/var/lib/pgsql/other_data/intermine/imbuild/source.xml:201: Exception while reading from: /data/organisms/planaria/Smed/SmedSxl/genome/v31/Swissprot_blast/SmedSxl_v3.1_MAKER_082015_vs_swissprot.1.intermine.xml
at org.intermine.dataloader.XmlDataLoaderTask.execute(XmlDataLoaderTask.java:170)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
at org.apache.tools.ant.Task.perform(Task.java:348)
at org.apache.tools.ant.Target.execute(Target.java:435)
at org.apache.tools.ant.Target.performTasks(Target.java:456)
at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1393)
at org.apache.tools.ant.helper.SingleCheckExecutor.executeTargets(SingleCheckExecutor.java:38)
at org.apache.tools.ant.Project.executeTargets(Project.java:1248)
at org.apache.tools.ant.taskdefs.Ant.execute(Ant.java:440)
at org.intermine.task.Integrate.performAction(Integrate.java:223)
at org.intermine.task.Integrate.performAction(Integrate.java:135)
at org.intermine.task.Integrate.execute(Integrate.java:127)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
at org.apache.tools.ant.Task.perform(Task.java:348)
at org.apache.tools.ant.Target.execute(Target.java:435)
at org.apache.tools.ant.Target.performTasks(Target.java:456)
at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1393)
at org.apache.tools.ant.Project.executeTarget(Project.java:1364)
at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
at org.apache.tools.ant.Project.executeTargets(Project.java:1248)
at org.apache.tools.ant.Main.runBuild(Main.java:851)
at org.apache.tools.ant.Main.startAnt(Main.java:235)
at org.apache.tools.ant.launch.Launcher.run(Launcher.java:280)
at org.apache.tools.ant.launch.Launcher.main(Launcher.java:109)
Caused by: org.intermine.InterMineException: Error during unmarshalling
at org.intermine.util.XmlBinding.unmarshal(XmlBinding.java:64)
at org.intermine.dataloader.XmlDataLoader.processXml(XmlDataLoader.java:68)
at org.intermine.dataloader.XmlDataLoaderTask.execute(XmlDataLoaderTask.java:160)
... 31 more
Caused by: java.lang.NullPointerException
at java.lang.Class.isAssignableFrom(Native Method)
at org.intermine.xml.full.FullParser.populateObject(FullParser.java:213)
at org.intermine.xml.full.FullParser.realiseObjects(FullParser.java:135)
at org.intermine.xml.full.FullParser.realiseObjects(FullParser.java:78)
at org.intermine.util.XmlBinding.unmarshal(XmlBinding.java:62)
... 33 more

Total time: 16 seconds


Thank you,
Sofia

_______________________________________________
dev mailing list
[hidden email]
http://mail.intermine.org/cgi-bin/mailman/listinfo/dev
Re: loading custom datasource data

Joel Richardson-2

Hi Sofia,

I'm not 100% sure, but I think your item ids need to use underbars rather than periods, e.g. "organism_0" rather than "organism.0".

Joel

-- 
Joel E. Richardson, Ph.D.
Sr. Research Scientist
Mouse Genome Informatics
The Jackson Laboratory
600 Main Street
Bar Harbor, Maine 04609
207-288-6435

Re: loading custom datasource data

Sofia Robb
Hi Joel, 

Thank you for the suggestion. I tried changing all my '.' to '_' but it did not help :(

Sofia

Re: loading custom datasource data

vkrishna
Hi Sofia,

I have a comment about how your BLAST XML is structured relative to your proposed data model.

Within the BlastHit and BlastHSP item stanzas, there appear to be references back to the BlastResult and BlastHit classes, respectively. This contradicts the data model, where the BlastHit and BlastHSP class definitions do not declare any such reference. The classes just extend one another, which I believe is not the same as a reference.
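
The mismatch described above can be checked mechanically before running the integrate step. The following is only a rough sketch in Python, not part of InterMine: the helper names `declared_fields` and `undeclared_fields` are made up for illustration, the model and items are abbreviated copies of the ones in the original post, and classes from the core model (Organism, DataSource, DataSet) are left out because their definitions are not shown in this thread. Whether this mismatch is what actually causes the NullPointerException in FullParser.populateObject is an assumption, not something the thread confirms.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the model from the original post.
MODEL_XML = """
<classes>
  <class name="BlastResult" is-interface="true">
    <attribute name="queryName" type="java.lang.String"/>
  </class>
  <class name="BlastHit" extends="BlastResult" is-interface="true">
    <attribute name="hitName" type="java.lang.String"/>
  </class>
  <class name="BlastHSP" extends="BlastHit" is-interface="true">
    <attribute name="hspEvalue" type="java.lang.Double"/>
  </class>
</classes>
"""

# Abbreviated copy of the items from the original post.
ITEMS_XML = """
<items>
  <item id="1" class="BlastResult">
    <attribute name="queryName" value="mk5-SmedSxl-v31.022420-0.7-1"/>
  </item>
  <item id="1.1" class="BlastHit">
    <attribute name="hitName" value="sp|Q52PG9|KCND1_BOVIN"/>
    <reference name="BlastResult" ref_id="1"/>
  </item>
  <item id="1.1.1" class="BlastHSP">
    <attribute name="hspEvalue" value="1e-64"/>
    <reference name="BlastHit" ref_id="1.1"/>
  </item>
</items>
"""


def declared_fields(model_xml):
    """Map each class name to the field names it declares or inherits."""
    classes = {c.get("name"): c for c in ET.fromstring(model_xml)}

    def fields_of(name):
        cls = classes[name]
        names = {child.get("name") for child in cls}
        # "extends" may list several space-separated parent classes.
        for parent in (cls.get("extends") or "").split():
            if parent in classes:
                names |= fields_of(parent)
        return names

    return {name: fields_of(name) for name in classes}


def undeclared_fields(model_xml, items_xml):
    """Return (item id, field name) pairs that the model does not declare."""
    fields = declared_fields(model_xml)
    problems = []
    for item in ET.fromstring(items_xml):
        known = fields.get(item.get("class"), set())
        for child in item:
            if child.get("name") not in known:
                problems.append((item.get("id"), child.get("name")))
    return problems


print(undeclared_fields(MODEL_XML, ITEMS_XML))
```

With the abbreviated inputs above, the two `<reference>` elements are reported, because neither "BlastResult" nor "BlastHit" is declared as a field of the class that uses it, even though both exist as class names.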

Considering how BLAST data is normally structured, the following alteration to the data model might be better suited to your requirements:
<classes>
<!-- add any <class> elements here -->
  <class name="BlastResult" is-interface="true">
    <attribute name="queryName" type="java.lang.String"/>
    <attribute name="database" type="java.lang.String"/>
    <attribute name="algorithm" type="java.lang.String"/>
    <collection name="blastHits" referenced-type="BlastHit" />
  </class>
  <class name="BlastHit" is-interface="true">
    <attribute name="hitName" type="java.lang.String"/>
    <attribute name="hitDesc" type="java.lang.String"/>
    <attribute name="hitEvalue" type="java.lang.Double"/>
    <collection name="blastHSPs" referenced-type="BlastHsp" />
    <reference name="blastResult" referenced-type="BlastResult" />
  </class>
  <class name="BlastHsp" is-interface="true">
    <attribute name="hspEvalue" type="java.lang.Double"/>
    <attribute name="hsp_hStart" type="java.lang.String"/>
    <attribute name="hsp_hEnd" type="java.lang.String"/>
    <attribute name="hsp_qStart" type="java.lang.String"/>
    <attribute name="hsp_qEnd" type="java.lang.String"/>
    <reference name="blastHit" referenced-type="BlastHit"/>
  </class>
</classes>

In this altered data model, the following relationships are enforced:
1. A BlastResult can contain a collection of BlastHits
2. A BlastHit can contain a collection of BlastHsps, and each BlastHit has a reference back to a BlastResult
3. A BlastHsp has a reference back to a particular BlastHit
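
To make the relationships above concrete, here is a rough Python sketch that writes items XML for the revised model, with each `<reference>` named after the field declared in the model (`blastResult`, `blastHit`) rather than after the class. The `add_item` helper is hypothetical, and the item ids ("1", "1_1", "1_1_1") simply follow the numbering from the original post with underscores instead of periods; neither is taken from InterMine itself. Only the single-valued references are written here. The collections (`blastHits`, `blastHSPs`) are not filled in, and this sketch does not show whether they would need to be.

```python
import xml.etree.ElementTree as ET


def add_item(items, item_id, class_name, attributes, references=None):
    """Append one <item> with its attributes and single-valued references."""
    item = ET.SubElement(items, "item", {"id": item_id, "class": class_name})
    for name, value in attributes.items():
        ET.SubElement(item, "attribute", {"name": name, "value": value})
    for name, ref_id in (references or {}).items():
        ET.SubElement(item, "reference", {"name": name, "ref_id": ref_id})
    return item


items = ET.Element("items")

add_item(items, "1", "BlastResult", {
    "queryName": "mk5-SmedSxl-v31.022420-0.7-1",
    "database": "Uniprot_swisprot",
    "algorithm": "blastx",
})

# The reference name is the model field "blastResult", not the class name.
add_item(items, "1_1", "BlastHit", {
    "hitName": "sp|Q52PG9|KCND1_BOVIN",
    "hitEvalue": "1e-64",
}, references={"blastResult": "1"})

# Likewise, "blastHit" is the field declared on BlastHsp in the revised model.
add_item(items, "1_1_1", "BlastHsp", {
    "hspEvalue": "1e-64",
    "hsp_hStart": "283",
    "hsp_hEnd": "451",
    "hsp_qStart": "28",
    "hsp_qEnd": "630",
}, references={"blastHit": "1_1"})

# ET.indent is only available in newer Python versions; skip it otherwise.
if hasattr(ET, "indent"):
    ET.indent(items)

print(ET.tostring(items, encoding="unicode"))
```

Note that the class name here is "BlastHsp", as in the revised model, whereas the original items used "BlastHSP"; the items and the model need to agree on whichever spelling is chosen.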

Does this make sense in the context of your BLAST alignment data?

Thank you.
Vivek

Re: loading custom datasource data

Sofia Robb
Hi Vivek,

Yes, this makes sense, and it helps me start to get a handle on the InterMine XML syntax.

Should I be including a reference to my organism and data sources in my BlastResult as well?

Thanks,
Sofia

On Tue, Oct 6, 2015 at 2:25 PM, Krishnakumar, Vivek <[hidden email]> wrote:
Hi Sofia,

I had a comment about the structuring of your BLAST XML in reference to your proposed data model.

Within the BlastHit and BlastHSP item stanzas, there look to be references back to the BlastResult and BlastHit classes, respectively. This contradicts the data model, where the BlastHit and BlastHSP class definitions do not declare any such reference. The classes just extend one another, which I believe is not the same as a reference.

Considering how BLAST data is normally structured, the following alteration to the data model might be better suited to your requirements:
<classes>
<!-- add any <class> elements here -->
  <class name="BlastResult" is-interface="true">
    <attribute name="queryName" type="java.lang.String"/>
    <attribute name="database" type="java.lang.String"/>
    <attribute name="algorithm" type="java.lang.String"/>
    <collection name="blastHits" referenced-type="BlastHit" />
  </class>
  <class name="BlastHit" is-interface="true">
    <attribute name="hitName" type="java.lang.String"/>
    <attribute name="hitDesc" type="java.lang.String"/>
    <attribute name="hitEvalue" type="java.lang.Double"/>
    <collection name="blastHSPs" referenced-type="BlastHsp" />
    <reference name="blastResult" referenced-type="BlastResult" />
  </class>
  <class name="BlastHsp" is-interface="true">
    <attribute name="hspEvalue" type="java.lang.Double"/>
    <attribute name="hsp_hStart" type="java.lang.String"/>
    <attribute name="hsp_hEnd" type="java.lang.String"/>
    <attribute name="hsp_qStart" type="java.lang.String"/>
    <attribute name="hsp_qEnd" type="java.lang.String"/>
    <reference name="blastHit" referenced-type="BlastHit"/>
  </class>
</classes>

In this altered data model, the following relationships are enforced:
1. A BlastResult can contain a collection of BlastHits
2. A BlastHit can contain a collection of BlastHsps, and has a reference back to a BlastResult
3. A BlastHsp has a reference back to a particular BlastHit
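
For illustration only, here is a sketch of how the sample items from the original post might look against this altered model. It assumes that the name attribute of a <reference> has to be a field name declared in the model (blastResult, blastHit) rather than a class name, and that the <reference> elements nested inside a <collection> carry only a ref_id. Neither assumption is confirmed in this thread, so please check both against an items file produced by an existing converter. Item ids and attribute values are copied from the original sample; the class name BlastHsp follows the altered model rather than the original BlastHSP.

```xml
<items>
  <item id="1" class="BlastResult">
    <attribute name="queryName" value="mk5-SmedSxl-v31.022420-0.7-1"/>
    <attribute name="database" value="Uniprot_swisprot"/>
    <attribute name="algorithm" value="blastx"/>
    <!-- collection name matches the "blastHits" field declared on BlastResult -->
    <collection name="blastHits">
      <reference ref_id="1.1"/>
    </collection>
  </item>
  <item id="1.1" class="BlastHit">
    <attribute name="hitName" value="sp|Q52PG9|KCND1_BOVIN"/>
    <attribute name="hitEvalue" value="1e-64"/>
    <!-- field name "blastResult", not the class name "BlastResult" -->
    <reference name="blastResult" ref_id="1"/>
    <collection name="blastHSPs">
      <reference ref_id="1.1.1"/>
    </collection>
  </item>
  <item id="1.1.1" class="BlastHsp">
    <attribute name="hspEvalue" value="1e-64"/>
    <attribute name="hsp_hStart" value="283"/>
    <attribute name="hsp_hEnd" value="451"/>
    <!-- field name "blastHit", not the class name "BlastHit" -->
    <reference name="blastHit" ref_id="1.1"/>
  </item>
</items>
```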

Does this make sense in the context of your BLAST alignment data?

Thank you.
Vivek


Re: loading custom datasource data

vkrishna
Hi Sofia,

Glad this helps!

It might be worthwhile studying the InterMine core data model (https://github.com/intermine/intermine/blob/beta/bio/core/core.xml) to see what classes you can reuse.

For example, BLAST hits have HSPs, which are features with coordinates, albeit with two sets of ranges (query and target). In such a case, you can potentially extend the class SequenceFeature, which contains references to locations (start, end, strand, score, etc.) and also implicitly extends BioEntity (which is roughly the base class in InterMine for biological features). Doing so will automatically bring attributes like organism, datasets, etc. into your part of the data model, which you can then use appropriately.
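
As a rough sketch of that suggestion, an additions file could declare the HSP class as a subclass of SequenceFeature. The attribute set below is trimmed from the model earlier in the thread. Which fields SequenceFeature actually contributes should be checked in core.xml, and whether an HSP is a sensible SequenceFeature for your mine is a modelling decision this sketch does not settle.

```xml
<classes>
  <!-- Hypothetical variant: BlastHsp inherits from the core SequenceFeature class
       instead of standing alone. Inherited fields are not redeclared here. -->
  <class name="BlastHsp" extends="SequenceFeature" is-interface="true">
    <attribute name="hspEvalue" type="java.lang.Double"/>
    <reference name="blastHit" referenced-type="BlastHit"/>
  </class>
</classes>
```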

Regarding your question about referencing organisms in your BlastResult, it would definitely be useful to do so if you plan to store BLAST data pertaining to multiple organisms which form part of your InterMine instance.
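
A minimal sketch of what that could look like, assuming BlastResult stays a standalone class (if it extended BioEntity instead, an organism reference should already be inherited and this addition would be unnecessary). The field name organism is an illustrative choice; Organism is the class used in the original sample items, and the two fragments below belong to two different files.

```xml
<!-- Fragment of the model additions file: declare the reference on BlastResult -->
<class name="BlastResult" is-interface="true">
  <attribute name="queryName" type="java.lang.String"/>
  <reference name="organism" referenced-type="Organism"/>
</class>

<!-- Fragment of the items XML: point the result at the organism item
     (id "organism.0") defined at the top of the original sample -->
<item id="1" class="BlastResult">
  <attribute name="queryName" value="mk5-SmedSxl-v31.022420-0.7-1"/>
  <reference name="organism" ref_id="organism.0"/>
</item>
```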

Regards,
Vivek


Re: loading custom datasource data

Sofia Robb
Thank you again. I have been having a challenging time orienting myself in the system that InterMine uses. I will look at SequenceFeature and BioEntity.

Sofia
