[galaxy-dev] Amazon EC2 install dataset Upload problems


Dennis Gascoigne-2
Hi;

We are working our way through the cloud install and cannot get datasets to upload from the file system. Should this functionality work? Or is there a problem with cross-instance access to the file system? We made sure we used the galaxyData mount but got a Python script error.

/mnt/galaxyData/upload_store/

Traceback (most recent call last):
  File "/mnt/galaxyTools/galaxy-central/tools/data_source/upload.py", line 343, in
    __main__()
  File "/mnt/galaxyTools/galaxy-central/tools/data_source/upload.py", line 335, in __main__
    add_file( dataset
error

--


_______________________________________________
galaxy-dev mailing list
[hidden email]
http://lists.bx.psu.edu/listinfo/galaxy-dev

Re: [galaxy-dev] Amazon EC2 install dataset Upload problems

Enis Afgan-2
Hi Dennis,
Could you please elaborate a bit on what you are trying to do? When you say you're trying to get a dataset from the file system, did you manually scp the dataset to the master EC2 instance and are now trying to register it with Galaxy, or is it something else? If you're manually copying the datasets first, is there a reason why you're not using the upload tool directly from a remote site? That approach has worked for us without trouble thus far.

In principle, /mnt/galaxyData is NFS-mounted across all worker instances, so they should all have access to that file system.
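
One way to confirm this on a given instance is a small check like the one below. It is only a sketch: the function name check_shared_dir is made up for illustration and is not part of Galaxy, and it tests nothing more than whether the directory exists and whether the current user can list it and read the files in it.

```python
import os


def check_shared_dir(path):
    """Return a list of problems found with `path` (empty list = looks usable).

    Hypothetical helper, not part of Galaxy. It only checks what the
    current user can see, so run it as the user Galaxy jobs run as.
    """
    if not os.path.isdir(path):
        return ["%s is missing or not a directory" % path]
    if not os.access(path, os.R_OK | os.X_OK):
        return ["%s cannot be listed by the current user" % path]

    problems = []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        # Only regular files are checked; subdirectories are skipped.
        if os.path.isfile(full) and not os.access(full, os.R_OK):
            problems.append("%s is not readable by the current user" % full)
    return problems
```

Calling check_shared_dir("/mnt/galaxyData/upload_store/") on the master and on a worker, and comparing the two results, would show whether the directory is visible and readable from both.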

Enis

On Mon, May 24, 2010 at 5:36 AM, Dennis Gascoigne <[hidden email]> wrote:
Hi;

We are working our way through the cloud install and cannot get datasets to upload from the file system. Should this functionality work? Or is there a problem with cross-instance access to the file system? We made sure we used the galaxyData mount but got a Python script error.

/mnt/galaxyData/upload_store/

Traceback (most recent call last):
  File "/mnt/galaxyTools/galaxy-central/tools/data_source/upload.py", line 343, in
    __main__()
  File "/mnt/galaxyTools/galaxy-central/tools/data_source/upload.py", line 335, in __main__
    add_file( dataset
error

Re: [galaxy-dev] Amazon EC2 install dataset Upload problems

Enis Afgan-2
We'll have to take a look at the data library upload on the cloud. I have not played with that yet, so I don't have an answer for you just yet. It should operate normally because the Galaxy running on the cloud is stock Galaxy, but we'll have to see why it's not working.

As for customizing the cloud instance: at this time, there is no additional documentation about the cloud other than the how-to-get-started wiki page. This is on the to-do list, however.
In the meantime, if you'd like to customize your cloud instance: all of the Galaxy tools and the Galaxy application itself reside in /mnt/galaxyTools, which is an EBS volume. You can modify those tools and Galaxy itself there as you like. Note that you should make symlinks to any added tools in the directory /mnt/galaxyTools/tools/bin so that Galaxy can find them (you can use stow to do this). Once you're done modifying Galaxy and adding tools:

1. Unmount that file system, detach the EBS volume, and create a snapshot of it (through the AWS console).
2. In your S3 account, find the bucket that corresponds to the cluster. Its name will be gc-<hash>; if you have more than one cluster, let me know and we'll work on figuring out which bucket you should modify.
3. In that bucket, there is a file persistent-volumes-latest.txt that contains a line TOOLS=snap-<ID>|<volume size>. Edit that file so that the line references the snapshot you just created.
4. Terminate the cluster (including the master instance) and start it back up.

You should now have access to your tools and customized Galaxy.
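
The edit to persistent-volumes-latest.txt can be sketched in Python as follows. This is only an illustration: the function name set_tools_snapshot is made up, the only format assumed is the TOOLS=snap-<ID>|<volume size> line described above, and downloading the file from the bucket and uploading it again are not shown.

```python
def set_tools_snapshot(text, new_snapshot_id):
    """Return `text` with the snapshot ID on the TOOLS= line replaced.

    Hypothetical helper. Assumes the line looks like
    TOOLS=snap-<ID>|<volume size>; the volume size and every other
    line are left exactly as they were.
    """
    out = []
    replaced = False
    for line in text.splitlines(True):  # True keeps the line endings
        body = line.rstrip("\r\n")
        ending = line[len(body):]
        if body.startswith("TOOLS="):
            # Split "snap-<ID>|<volume size>" at the first "|".
            _old_id, sep, size = body[len("TOOLS="):].partition("|")
            body = "TOOLS=" + new_snapshot_id + sep + size
            replaced = True
        out.append(body + ending)
    if not replaced:
        raise ValueError("no TOOLS= line found")
    return "".join(out)
```

The function works on the file contents as a string, so it can be tried on a local copy of the file before anything in the bucket is changed.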

Hope this helps. Let us know if you have any more questions and we'll look into the data library upload issue.

Enis

On Mon, May 24, 2010 at 4:51 PM, <[hidden email]> wrote:
I am using the Data Library "upload from directory" function. The reason I am using this function is that I have a stack of files already on the server for other reasons. Also, this is the only way you can work on files without copying them to the database file directory.

It works with a non-cloud instance, so I expect it to work on the cloud too.

Is there any better documentation on the cloud install than the single how-to-install wiki page? For example, we have a heavily customised install (changed tools and config). Is it correct to say that we would have to reconfigure the cloud install by making new AMIs of our changed configs and instantiating them? Or will the cloud instances be able to deal with changes to tools and configs across the cluster?

Cheers
Dennis

