Installation issue on EC2

8 messages
Installation issue on EC2

Fabiano Lucchese

Hi, all.

I've been trying to deploy a CloudMan/Galaxy cluster on EC2 for a couple of days now and have run into all kinds of unexpected weirdness. I found out that the web-based form that automatically creates the EC2 instance uses an image that doesn't seem to be officially supported (or at least encouraged) by the Galaxy team, so I decided to launch the instance by hand, following the instructions here:

http://wiki.galaxyproject.org/CloudMan/AWS/GettingStarted

Up to the point before accessing Galaxy, everything looks OK, but the service never starts. Instead, it shows up on the CloudMan Admin page as "Error", and nothing happens if I try to Stop/Start or Restart it. There's no log either. The Postgres service displays an immutable "Starting" status, while its log provides the following message:

 

FATAL:  database files are incompatible with server
DETAIL:  The data directory was initialized by PostgreSQL version 9.1, which is not compatible with this version 8.4.7.

 

                It does look like a messed up installation, which is quite surprising as this is supposed to be the recommended working image.

 

                Does anybody have any clue on what might be going on?

 

                Thanks in advance.

 

                F.

 


___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

Re: Installation issue on EC2

Dannon Baker
Which web form did you use to start the cloud instance? The page at usegalaxy.org/cloudlaunch will always point to the currently supported AMI, and I'd definitely recommend using this form for launching Galaxy instances.

The problem you're running into is that the unsupported AMI already initialized your database with PostgreSQL 9, which isn't backwards compatible with the PostgreSQL 8 installed on the official AMI. At this point, if you want to continue with this specific cluster, you'll need to keep using that original AMI (or attempt to manually migrate the database between instances, which I wouldn't recommend).

If you don't have any data you care to preserve on this cluster, just delete it and start over using usegalaxy.org/cloudlaunch, and you should be good to go. If you want to try again fresh without deleting anything, just use a new cluster name when you launch the instance.
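For anyone hitting this later: the mismatch can be confirmed without starting the server, because the data directory records the major version that initialized it in a PG_VERSION file. A minimal sketch, using a simulated data directory as a stand-in for the real one (whose exact path on the CloudMan volume varies by image):

```python
import os
import tempfile

def data_dir_version(pgdata):
    """PG_VERSION records the major version that initialized the data directory."""
    with open(os.path.join(pgdata, "PG_VERSION")) as f:
        return f.read().strip()

# Simulated data directory standing in for the real one on the CloudMan volume:
pgdata = tempfile.mkdtemp()
with open(os.path.join(pgdata, "PG_VERSION"), "w") as f:
    f.write("9.1\n")

server_major = "8.4"  # major version of the installed server (from `psql --version`)
if data_dir_version(pgdata) != server_major:
    print("incompatible: data=%s server=%s" % (data_dir_version(pgdata), server_major))
```

On a real instance you'd read the actual data directory instead of the temp one; a mismatch like the 9.1-vs-8.4 above is exactly what the FATAL message reports.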

-Dannon




Re: Installation issue on EC2

Fabiano Lucchese
Hi, Dannon.

Thank you for your reply. I was using this form:

https://biocloudcentral.herokuapp.com/launch

which was in turn referenced in this tutorial:

http://onlinelibrary.wiley.com/doi/10.1002/0471250953.bi1109s38/full#bi1109-prot-0001

It looks like there is a lot of outdated material on the net regarding the Galaxy/CloudMan bundle. Anyway, I followed the flow of this new form and got my instance up and running in a few minutes (I chose a Galaxy-type cluster). However, the Galaxy, Postgres and File System services seemed to be stuck in the "Unstarted" status. I tried to start them manually and got the following error message in the Galaxy log:

/mnt/galaxyTools/galaxy-central/eggs/pysam-0.4.2_kanwei_b10f6e722e9a-py2.6-linux-x86_64-ucs4.egg/pysam/__init__.py:1: RuntimeWarning: __builtin__.file size changed, may indicate binary incompatibility
  from csamtools import *
python path is: /mnt/galaxyTools/galaxy-central/eggs/numpy-1.6.0-py2.6-linux-x86_64-ucs4.egg, /mnt/galaxyTools/galaxy-central/eggs/pysam-0.4.2_kanwei_b10f6e722e9a-py2.6-linux-x86_64-ucs4.egg, /mnt/galaxyTools/galaxy-central/eggs/boto-2.5.2-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/mercurial-2.2.3-py2.6-linux-x86_64-ucs4.egg, /mnt/galaxyTools/galaxy-central/eggs/Fabric-1.4.2-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/ssh-1.7.14-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/Whoosh-0.3.18-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/pycrypto-2.5-py2.6-linux-x86_64-ucs4.egg, /mnt/galaxyTools/galaxy-central/eggs/python_lzo-1.08_2.03_static-py2.6-linux-x86_64-ucs4.egg, /mnt/galaxyTools/galaxy-central/eggs/bx_python-0.7.1_7b95ff194725-py2.6-linux-x86_64-ucs4.egg, /mnt/galaxyTools/galaxy-central/eggs/amqplib-0.6.1-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/pexpect-2.4-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/Babel-0.9.4-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/MarkupSafe-0.12-py2.6-linux-x86_64-ucs4.egg, /mnt/galaxyTools/galaxy-central/eggs/Mako-0.4.1-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/WebHelpers-0.2-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/simplejson-2.1.1-py2.6-linux-x86_64-ucs4.egg, /mnt/galaxyTools/galaxy-central/eggs/wchartype-0.1-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/elementtree-1.2.6_20050316-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/docutils-0.7-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/WebOb-0.8.5-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/Routes-1.12.3-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/Cheetah-2.2.2-py2.6-linux-x86_64-ucs4.egg, /mnt/galaxyTools/galaxy-central/eggs/PasteDeploy-1.3.3-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/PasteScript-1.7.3-py2.6.egg, /mnt/galaxyTools/galaxy-central/eggs/Paste-1.6-py2.6.egg, /mnt/galaxyTools/galaxy-central/lib, /usr/lib/python2.6/, /usr/lib/python2.6/plat-linux2, /usr/lib/python2.6/lib-tk, /usr/lib/python2.6/lib-old, /usr/lib/python2.6/lib-dynload
Traceback (most recent call last):
  File "/mnt/galaxyTools/galaxy-central/lib/galaxy/webapps/galaxy/buildapp.py", line 36, in app_factory
    app = UniverseApplication( global_conf = global_conf, **kwargs )
  File "/mnt/galaxyTools/galaxy-central/lib/galaxy/app.py", line 29, in __init__
    self.config.check()
  File "/mnt/galaxyTools/galaxy-central/lib/galaxy/config.py", line 315, in check
    raise ConfigurationError( "Unable to create missing directory: %s\n%s" % ( path, e ) )
ConfigurationError: Unable to create missing directory: /mnt/galaxyData/files
[Errno 13] Permission denied: '/mnt/galaxyData/files'
Removing PID file paster.pid
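(Editorial note: the [Errno 13] at the bottom means the Galaxy process cannot create a directory under /mnt/galaxyData, which usually points at an ownership or permission mismatch on that mount. A rough sketch of the check Galaxy's config code is performing; the function name and return values here are invented for illustration, and a scratch directory stands in for the real volume:)

```python
import errno
import os
import tempfile

def ensure_dir(path):
    """Roughly what Galaxy's config.check() does: create a missing directory,
    surfacing EACCES distinctly (Galaxy itself raises ConfigurationError)."""
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno == errno.EEXIST:
            pass  # already there: nothing to do
        elif e.errno == errno.EACCES:
            return "permission denied -- fix ownership of the parent directory"
        else:
            raise
    return "ok"

# Scratch directory standing in for /mnt/galaxyData:
root = tempfile.mkdtemp()
print(ensure_dir(os.path.join(root, "files")))
```

On the real instance, the corresponding manual check would be inspecting `ls -ld /mnt/galaxyData` and handing the volume to the user Galaxy runs as.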

I'm still wondering if I might be doing something wrong, which would surprise me, as this time the process required much less interaction and input than the previous ones I was following.

Any thoughts?

Thanks,

F.




Re: Installation issue on EC2

Brad Chapman

Fabiano;
I help with the BioCloudCentral site, which is a community-maintained way to launch CloudMan and CloudBioLinux AMIs. Sorry for any confusion between the different methods. Dannon, if you're up for it, we should try to coordinate better, at least on the documentation side.

In terms of your problem, it sounds like you've tried to launch the same CloudMan instance with multiple AMIs. Each AMI has different software, which can create incompatibilities: your error messages indicate different versions of PostgreSQL and Python, at least.

Every time you launch a CloudMan instance with the same name, it will pull up the saved data volumes, which is what causes this incompatibility. As long as you don't have any critical data on the instance, my suggestion would be:

- Start up a CloudMan cluster with any AMI and then terminate the
  instance, removing all data. This will clear up your data volumes and
  S3 data associated with this CloudMan cluster.
 
- Pick one method and stick with that moving forward. It sounds like you
  want to run Galaxy exclusively so Dannon's launcher would be a good
  choice.

Hope this fixes it for you,
Brad



Re: Installation issue on EC2

Fabiano Lucchese
Guys,

I appreciate your effort to help me, but it looks like my AWS account has some serious hidden issues. I completely wiped the CloudMan/Galaxy instances from my EC2 environment, as well as their volumes, and waited a couple of hours for the instances to disappear from the instances list. After that, I repeated the whole process twice, trying to create a Galaxy cluster with 10 and then 15 GB of storage space, but the result was equally frustrating, with some minor differences.

In the less unsuccessful scenario, the Galaxy service was the only one down, apparently for the following reason:

Traceback (most recent call last):
  File "/mnt/galaxyTools/galaxy-central/lib/galaxy/webapps/galaxy/buildapp.py", line 36, in app_factory
    app = UniverseApplication( global_conf = global_conf, **kwargs )
  File "/mnt/galaxyTools/galaxy-central/lib/galaxy/app.py", line 85, in __init__
    from_shed_config=True )
  File "/mnt/galaxyTools/galaxy-central/lib/galaxy/tools/data/__init__.py", line 41, in load_from_config_file
    tree = util.parse_xml( config_filename )
  File "/mnt/galaxyTools/galaxy-central/lib/galaxy/util/__init__.py", line 143, in parse_xml
    tree = ElementTree.parse(fname)
  File "/mnt/galaxyTools/galaxy-central/eggs/elementtree-1.2.6_20050316-py2.6.egg/elementtree/ElementTree.py", line 859, in parse
    tree.parse(source, parser)
  File "/mnt/galaxyTools/galaxy-central/eggs/elementtree-1.2.6_20050316-py2.6.egg/elementtree/ElementTree.py", line 576, in parse
    source = open(source, "rb")
IOError: [Errno 2] No such file or directory: './shed_tool_data_table_conf.xml'
Removing PID file paster.pid
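(Editorial note: the path in the error is relative ('./shed_tool_data_table_conf.xml'), so it is resolved against whatever directory paster was started from. If the file simply never got created, a pre-flight check along the following lines could help. Seeding the config from a shipped `.sample` file is an assumption here, mirroring the `*.sample` convention Galaxy uses for its configs:)

```python
import os
import shutil
import tempfile

def ensure_config(name, galaxy_root):
    """If a config file is missing but its shipped '<name>.sample' exists,
    seed it from the sample; return whether the config now exists."""
    path = os.path.join(galaxy_root, name)
    sample = path + ".sample"
    if not os.path.exists(path) and os.path.exists(sample):
        shutil.copy(sample, path)  # seed the config from its shipped sample
    return os.path.exists(path)

# Scratch stand-in for galaxy-central, with only the sample file present:
root = tempfile.mkdtemp()
open(os.path.join(root, "shed_tool_data_table_conf.xml.sample"), "w").close()
print(ensure_config("shed_tool_data_table_conf.xml", root))
```

If neither the file nor a sample exists, the function returns False, which is the situation the traceback reports.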

I restarted it a few times, rebooted the machine, and even tried to update it, but nothing could magically fix the problem. I'm giving up on the high-level approach and starting from a fresh installation of Galaxy on one of my instances. It's going to be less productive, but at least I'll have some control over what's going on and can try to diagnose problems as they occur.

Cheers,

F.

PS: Dannon, one thing that intrigues me is how the web form manages to find out the names of the previous clusters that I tried to instantiate. Where does it get this information from, if all the respective instances have been terminated and wiped out?


Re: Installation issue on EC2

Dannon Baker
On Dec 10, 2012, at 6:17 PM, Fabiano Lucchese <[hidden email]> wrote:
> I appreciate your effort to help me, but it looks like my AWS account has some serious hidden issues going on. I completely wiped out CloudMan/Galaxy instances from my EC2 environment as well as their volumes, and waited a couple of hours for the instances to disappear from the instances list. After that, I repeated the whole process twice, trying to create a Galaxy cluster with 10 and then 15 Gb of storage space, but the result was equally frustrating with some minor differences.

> PS: Dannon, one thing that intrigues me is how the web form manages to find out the names of the previous clusters that I tried to instantiate. Where does it get this information from if all the respective instances have been terminated and wiped out?

Did you reuse the same cluster name for either of these?  This would explain conflicting settings - there's more to a cluster than just the running instances and persistent volumes.

That form retrieves those listings from the S3 buckets in your account. Each cluster has its own S3 bucket -- you can identify them by the yourCluster.clusterName files in the listing. These contain lots of information about your Galaxy clusters (references to volumes, universe settings, etc.), and if you're attempting to eliminate a cluster completely (you never want to restart it and don't want *anything* preserved), you should delete the buckets referring to them as well. When you ask CloudMan to terminate and remove a cluster permanently, it removes all of this for you, and I'd recommend always using that interface rather than doing it manually.
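(Editorial note: the lookup described above can be sketched roughly as follows. The bucket names and listing shape are made up for illustration; the real form would query S3 for your account, presumably via boto, but the logic of recognizing `<name>.clusterName` marker files is the same:)

```python
def cluster_names(bucket_listings):
    """Each CloudMan cluster bucket contains a '<name>.clusterName' marker file;
    scanning key names is enough to recover the cluster names."""
    names = []
    for keys in bucket_listings.values():
        for key in keys:
            if key.endswith(".clusterName"):
                names.append(key[:-len(".clusterName")])
    return sorted(names)

# Fabricated listing for illustration:
listings = {
    "cm-0a1b2c3d": ["myCluster.clusterName", "persistent_data.yaml"],
    "cm-4e5f6a7b": ["oldCluster.clusterName"],
}
print(cluster_names(listings))  # -> ['myCluster', 'oldCluster']
```

This also explains why terminated instances still "appear" in the form: the marker files live in S3, independent of any running EC2 instance.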

Definitely let me know what else I can do to help.  If you have a running instance you'd like for me to look at directly I'd be happy to do so -- maybe this is indeed some weird issue that we can work around better.

Re: Installation issue on EC2

Scooter Willis-2
In reply to this post by Fabiano Lucchese
Did you delete the S3 buckets/snapshots? That is where everything is stored, and I think that's where the EBS volumes for everything dynamic get created.




Re: Installation issue on EC2

Fabiano Lucchese
In reply to this post by Dannon Baker
Hi, Dannon.

Thanks for the tip. No, I was not reusing the cluster names, precisely to prevent previous data from messing up my fresh deployments. There were indeed about 8 buckets referring to clusters that don't exist anymore. Right now I can't remove my current cluster via the web interface, because it never comes up.

What would be the safe way to allow you to see my instance?

Cheers,

F.

         

-----Original Message-----
From: Dannon Baker [mailto:[hidden email]]
Sent: Monday, December 10, 2012 3:36 PM
To: Fabiano Lucchese
Cc: Brad Chapman; [hidden email]
Subject: Re: [galaxy-dev] Installation issue on EC2

On Dec 10, 2012, at 6:17 PM, Fabiano Lucchese <[hidden email]> wrote:
> I appreciate your effort to help me, but it looks like my AWS account has some serious hidden issues going on. I completely wiped out CloudMan/Galaxy instances from my EC2 environment as well as their volumes, and waited a couple of hours for the instances to disappear from the instances list. After that, I repeated the whole process twice, trying to create a Galaxy cluster with 10 and then 15 Gb of storage space, but the result was equally frustrating with some minor differences.

> PS: Dannon, one thing that intrigues me is how the web form manages to find out the names of the previous clusters that I tried to instantiate. Where does it get this information from if all the respective instances have been terminated and wiped out?

Did you reuse the same cluster name for either of these?  This would explain conflicting settings - there's more to a cluster than just the running instances and persistent volumes.

That form retrieves those listings from the S3 buckets in your account.  Each cluster has its own S3 bucket -- you can identify them with the yourCluster.clusterName files in the listing.  These contain lots of information about your galaxy clusters (references to volumes, universe settings, etc.), and if you're attempting to eliminate a cluster completely (you never want to restart it and don't want *anything* preserved), you should delete the buckets referring to them as well.  When you ask Cloudman to terminate and remove a cluster permanently, it removes all of this for you, and I'd recommend always using the interface to do this and not doing it manually.

Definitely let me know what else I can do to help.  If you have a running instance you'd like for me to look at directly I'd be happy to do so -- maybe this is indeed some weird issue that we can work around better.
