FileDirectDataLoaderTask.java


joe carlson
Hi Julie (and others),

I’m still having trouble with direct data loading in the Gradle build system. I can get a loader to run, but it never actually processes the files I supply to it. I think my problem is that I don’t see how to set the fileset parameter.

When I step through the code, I see that addFileSet in FileDirectDataLoaderTask.java is getting a FileSet with essentially no data in it (and I see an exception, ‘cannot evaluate org.apache.tools.ant.types.FileSet.toString()’). Later, in the process method, ds.getIncludedFiles() returns nothing.

I’ve tried various permutations of setting parameter values in project.xml. I had been using <property name="src.data.dir" location="the dir" /> and <property name="src.data.dir.includes" value="the pattern" />

Through trial and error, it looks like setting only src.data.dir will pass the directory on to the FileSet object, but with the name set to “null”. (Not a null value, but the string “null”. If I name my file “null”, it gets processed! That is a workaround, I guess.) What parameter is used to set the include pattern?

In the Ant version, the build.xml for this source shows how Ant constructs the fileset:

>   <target name="load" depends="-init-loader-classname, init, -init-deps">
>     <taskdef name="load-data-direct"
>       classname="${direct-demo.loaderClassName}"
>       classpathref="task.class.path"/>
>     <load-data-direct integrationWriterAlias="integration.production"
>                 sourceName="${source.name}"
>                 sourceType="${source.type}">
>       <fileset dir="${src.data.dir}" includes="${src.data.dir.includes}"/>
>     </load-data-direct>
>   </target>


I don’t see what the corresponding bit of code is in the Gradle system. Is there something that needs to be set in the properties file for the source?

Thanks,

Joe
_______________________________________________
dev mailing list
[hidden email]
https://lists.intermine.org/mailman/listinfo/dev
Re: FileDirectDataLoaderTask.java

Julie Sullivan-2
Hi Joe!

The syntax for "includes" (for direct data loaders ONLY) is:

     source.type + ".includes"

So something like this:

     <property name="fasta.includes" value="*.fasta"/>

Here's the exact line where this is set:

https://github.com/intermine/intermine/blob/dev/plugin/src/main/groovy/org/intermine/plugin/integrate/IntegrateUtils.groovy#L259

You can see there's a special IF statement around it, so this applies ONLY
to direct data loaders. This was done for legacy reasons because the FASTA
source is a direct data loader.

So to get your source working, change your "includes" parameter to match
your source type.
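
As a rough sketch of what that could look like in project.xml: this assumes
your source type is "direct-demo" (a guess based on the
${direct-demo.loaderClassName} reference in your build.xml). The source name,
directory, and file pattern below are placeholders, not values from your mine.

```xml
<!-- Sketch only: source name, type, directory, and pattern are placeholders. -->
<source name="my-direct-demo" type="direct-demo">
  <!-- Directory that holds the input files. -->
  <property name="src.data.dir" location="/path/to/data"/>
  <!-- Include pattern: the key is the source type plus ".includes",
       so src.data.dir.includes is not what gets read here. -->
  <property name="direct-demo.includes" value="*.txt"/>
</source>
```

Any other attributes or properties your source needs are left out here; the
only point is that the prefix of the "includes" key has to match the type.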

I'll add some docs about direct data loaders as well.

Cheers
Julie



On 28/11/2018 21:57, Joe Carlson wrote:

> [...]
Re: FileDirectDataLoaderTask.java

joe carlson

On 11/29/18 5:03 AM, Julie Sullivan wrote:

> Hi Joe!
>
> The syntax for "includes" (for direct data loaders ONLY) is:
>
>     source.type + ".includes"
>
> So something like this:
>
>     <property name="fasta.includes" value="*.fasta"/>
>
> Here's the exact line where this is set:
>
> https://github.com/intermine/intermine/blob/dev/plugin/src/main/groovy/org/intermine/plugin/integrate/IntegrateUtils.groovy#L259 
>
>
> You can see there's a special IF statement around so this is ONLY for
> direct data loaders. This was done for legacy reasons because the
> FASTA source is a direct data loader.
>
> So to get your source working, change your "includes" parameter to
> match your source type.
>
Thanks, Julie. This was the part I was missing. It works now. I'd seen
'fasta.includes' in the XML files in BioTestMine, but did not make the
connection with the source type.

In the past I've been using a "Db direct data loader" that really sped
up loading from Chado. (Our mine is up to 175 organisms now.) I'm going
to get that working again, and you may see a pull request for it.

Joe

