[galaxy-dev] display_application

[galaxy-dev] display_application

Davide Cittaro
Hi all, 
I've noticed that when I try to visualize a BAM file on UCSC (with display_application) the Galaxy web process drains RAM and never releases it... I easily run into an OutOfMemory error. Is it trying to load the BAM (or any other custom file defined to be visualized in the same way) into RAM?
Can anybody explain how display_application works and how to debug it? 

d
/*
Davide Cittaro

Cogentech - Consortium for Genomic Technologies
via adamello, 16
20139 Milano
Italy

tel.: +39(02)574303007
*/




_______________________________________________
galaxy-dev mailing list
[hidden email]
http://lists.bx.psu.edu/listinfo/galaxy-dev

Re: [galaxy-dev] display_application

Davide Cittaro

On May 28, 2010, at 10:22 AM, Davide Cittaro wrote:

> Hi all,
> I've noticed that when I try to visualize a BAM file on UCSC (with display_application) the galaxy web process drains RAM and never releases it... I easily go in OutOfMemory error. Is it trying to load BAM (or any other custom file defined to be visualized in the same way) in RAM?
> Can anybody explain how display_application works and how to debug it?


Mmm... apparently the Paste egg loads the file to be shown to UCSC into memory (and there's no close()). I see this egg is not developed by the Galaxy team, so I don't know whether this should be reported as a Galaxy bug. BTW, I've tried this:

$ diff -u Paste-1.6-py2.6.egg/paste/wsgilib.py.tmp Paste-1.6-py2.6.egg/paste/wsgilib.py
--- Paste-1.6-py2.6.egg/paste/wsgilib.py.tmp        2010-05-28 11:06:00.174278394 +0200
+++ Paste-1.6-py2.6.egg/paste/wsgilib.py        2010-05-20 10:44:49.354765626 +0200
@@ -15,7 +15,6 @@
 from traceback import print_exception
 import urllib
 from cStringIO import StringIO
-import tempfile
 import sys
 from urlparse import urlsplit
 import warnings
@@ -527,8 +526,7 @@
             "If you provide conditional you must also provide "
             "start_response")
     data = []
-    #output = StringIO()
-    output = tempfile.NamedTemporaryFile(dir='/data/galaxy_dist/database/tmp')
+    output = StringIO()
     def replacement_start_response(status, headers, exc_info=None):
         if conditional is not None and not conditional(status, headers):
             data.append(None)
@@ -551,9 +549,7 @@
         data.append(None)
     if len(data) < 2:
         data.append(None)
-    #data.append(output.getvalue())
-    output.seek(0)
-    data.append(output.read())
+    data.append(output.getvalue())
     return data

 

 ## Deprecation warning wrapper:

Essentially, this substitutes the cStringIO handler with a temporary file (the diff above is printed in reverse, so the '-' lines are my patched version). A temporary file is created, in multiple copies, for each Galaxy history item I would like to see on UCSC... Still, the output is read and never released, so memory quickly drains...
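
A minimal sketch of why the temporary file alone does not help, assuming Python 3 rather than the Python 2 of the Paste egg above. The names spool_then_read and iter_blocks are hypothetical and are not Paste functions. Spooling to disk and then calling read() still materialises the whole payload as one in-memory object, whereas reading fixed-size blocks keeps only one block alive at a time:

```python
import io
import tempfile


def spool_then_read(chunks):
    """Mimic the patched code path: write to a temp file, then read() it all back."""
    with tempfile.TemporaryFile() as output:
        for chunk in chunks:
            output.write(chunk)
        output.seek(0)
        # read() with no size returns the entire file as a single bytes object,
        # so peak memory is still proportional to the file size.
        return output.read()


def iter_blocks(fileobj, block_size=64 * 1024):
    """Yield the file in blocks so at most block_size bytes are held at once."""
    while True:
        block = fileobj.read(block_size)
        if not block:
            break
        yield block


if __name__ == "__main__":
    print(len(spool_then_read([b"a" * 10, b"b" * 10])))
    print([len(b) for b in iter_blocks(io.BytesIO(b"x" * 10), block_size=4)])
```

Whether streaming is possible at this point in Paste depends on what the caller does with the captured output; the sketch only illustrates the memory behaviour of the two reading styles.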

d

Re: [galaxy-dev] display_application

Davide Cittaro

On May 28, 2010, at 11:14 AM, Davide Cittaro wrote:
> Mmm... apparently the Paste egg loads the file to be shown to UCSC in memory (and there's no close()). I see this egg is not developed by GalaxyTeam, so I don't know if this should be issued as a galaxy bug. BTW, I've tried this
>
> [cut]

Needless to say, it's pretty useless... the object is kept in memory for as long as Galaxy is running....

d

Re: [galaxy-dev] display_application

Davide Cittaro
Hi guys, sorry for this mail flooding, seriously...

On May 28, 2010, at 1:59 PM, Davide Cittaro wrote:
> [cut]
>
> Needless to say, it's pretty useless... the object is kept in memory until galaxy is alive....


Setting

use_printdebug = False

solves the memory issue (at least it no longer calls paste.PrintDebugMiddleware, which calls paste.intercept_output....)
BTW, I'm still not able to see BAM files... well, actually I can see the reads at the beginning of chromosome 10, which are the reads at the beginning of my BAM file :-(

d

Re: [galaxy-dev] display_application

Nate Coraor (nate@bx.psu.edu)
Davide Cittaro wrote:

> use_printdebug = False

Ah, that was going to be my first question.  I suggest just use_debug = False.
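
To double-check which of these debug switches are active, a small script along the following lines could be used. It is only a sketch: the [app:main] section name and the idea that both flags live in one INI file are assumptions about the local setup, and enabled_debug_flags is a hypothetical helper, not part of Galaxy.

```python
import configparser

# Stand-in for the contents of the server's INI file (assumed layout).
SAMPLE_INI = """
[app:main]
use_debug = True
use_printdebug = True
"""


def enabled_debug_flags(ini_text, section="app:main",
                        flags=("use_debug", "use_printdebug")):
    """Return the names of the debug flags that are switched on in ini_text."""
    # interpolation=None so that Paste-style %(here)s values do not break parsing.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return [flag for flag in flags
            if parser.getboolean(section, flag, fallback=False)]


if __name__ == "__main__":
    print(enabled_debug_flags(SAMPLE_INI))
```

A flag that is absent from the file is reported as off here, which may not match the server's real default, so an empty result is a hint rather than proof.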

>
> solves the memory issue (at least doesn't call
> paste.PrintDebugMiddleware which calls paste.intercept_output....)
> BTW, still not able to see BAM files... well, actually I can see the
> reads at the beginning of chromosome 10, which are the reads at the
> beginning of my BAM file :-(

Re: [galaxy-dev] display_application

Davide Cittaro

On May 28, 2010, at 2:58 PM, Nate Coraor wrote:

> Davide Cittaro wrote:
>
>> use_printdebug = False
>
> Ah, that was going to be my first question.  I suggest just use_debug = False.
>
>> solves the memory issue (at least doesn't call paste.PrintDebugMiddleware which calls paste.intercept_output....)
>> BTW, still not able to see BAM files... well, actually I can see the reads at the beginning of chromosome 10, which are the reads at the beginning of my BAM file :-(

I've asked for the Galaxy test server to be opened to UCSC in California... I still get truncated BAM files, at the beginning of chr10... Interestingly, the files are truncated differently on our mirror, as if it can read a bit more data before the connection ends... Unfortunately nothing is logged about this "truncation" error... :-(

d

Re: [galaxy-dev] display_application

Davide Cittaro
Found something weird

On May 28, 2010, at 3:05 PM, Davide Cittaro wrote:


> I've asked to open the galaxy test server to the UCSC in California... Still get truncated BAM files, at the beginning of chr10... What is nice is that files are truncated in a different manner on our mirror, like it can read some more information before end of communication... Unfortunately there's no log about this "truncation" error... :-(


It looks like I'm only able to load the portion of the BAM file that falls within the range previously selected in UCSC. Suppose I have a clean UCSC session spanning chr10:1-50,000,000. When I link a BAM file from the history to UCSC, I can see reads for that same span and nothing more (and, obviously, no reads on other chromosomes).
What is strange is that the same doesn't apply to other chromosomes: it seems that Galaxy gives UCSC the content of the BAM file from the beginning (chr10 in my sorted case) up to the maximum span available (which is at most the end of chr10)... It acts as if there were some kind of Galaxy cache that is never emptied... Does this make sense to you? Besides, have you ever tried visualization of BAM files when using remoteuser?

d

Re: [galaxy-dev] display_application

Daniel Blankenberg
Hi Davide,

Galaxy doesn't do anything 'special' here. It gives UCSC access to 3 files: a BAM file, the BAI index file, and a track definition 'file'. The track definition contains 4 pieces of information: the type of track, the URL of the BAM file, the dbkey and a name for the track. Check that the track file has valid information (especially the URL) and that the BAM, BAI and track files are accessible via HTTP.
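
For reference, a track definition carrying those four pieces of information might look like the line built below. This is a sketch of a UCSC-style custom track line, not necessarily the exact text Galaxy emits; the helper names, the example URL and the track name are made up for illustration. The second helper reflects the UCSC convention of looking for the index at the BAM URL with ".bai" appended.

```python
def bam_track_line(name, bam_url, dbkey):
    """Build a UCSC-style custom track line for a remotely hosted BAM file."""
    # The name is quoted so that spaces survive parsing of the track line.
    return 'track type=bam name="{0}" bigDataUrl={1} db={2}'.format(
        name, bam_url, dbkey)


def expected_index_url(bam_url):
    """URL where the browser would look for the matching BAI index."""
    return bam_url + ".bai"


if __name__ == "__main__":
    url = "http://galaxy.example.org/display/42.bam"  # hypothetical URL
    print(bam_track_line("my reads", url, "hg18"))
    print(expected_index_url(url))
```

Pasting the generated URL and its ".bai" companion into a browser or a command-line HTTP client is a quick way to confirm that both are reachable from outside.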


IIRC the UCSC browser does some caching on its end for BAM files by 'filename', so you will likely need to use different history items than the ones that were failing (due to having debug options on) or clear this cache. Can you confirm the odd behavior occurs on history items that did not experience the memory errors? Cloning the history or copying the history items (under edit attributes) should be sufficient. 

Thanks,

Dan

Re: [galaxy-dev] display_application

Davide Cittaro

On May 28, 2010, at 4:38 PM, Daniel Blankenberg wrote:

> Hi Davide,
>
> Galaxy doesn't do anything 'special' here. It provides access to 3 files to UCSC, a BAM file, the BAI index file, and a track definition 'file'. The track definition contains 4 pieces of information, the type of track, the url of the bam file, the dbkey and a name for the track. Check that the track file has valid information (especially the URL) and that the bam, bai and track files are accessible via HTTP.


They are accessible. Indeed I can read the whole bam file with samtools from a remote machine, i.e.:


works perfectly
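
Reading a file from start to finish is not quite the same access pattern as a browser fetching slices of it. UCSC retrieves remote BAM data with HTTP byte-range requests, so one extra check (not something tried in this thread) is whether the server in front of Galaxy honours a Range header. A sketch, with hypothetical helper names and a URL supplied by the caller:

```python
import urllib.request


def range_request(url, first_byte, last_byte):
    """Build a GET request asking only for bytes first_byte..last_byte (inclusive)."""
    return urllib.request.Request(
        url, headers={"Range": "bytes={0}-{1}".format(first_byte, last_byte)})


def honours_range(status, content_range):
    """A server that honours ranges answers 206 and sends a Content-Range header."""
    return status == 206 and bool(content_range)


def check(url, first_byte=0, last_byte=1023):
    """Perform the request (needs network access) and report the outcome."""
    with urllib.request.urlopen(range_request(url, first_byte, last_byte)) as resp:
        return honours_range(resp.status, resp.headers.get("Content-Range"))
```

A plain 200 response with the full body would suggest that the Range header is being ignored somewhere between the client and Galaxy; it does not by itself explain the truncation described here.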


> IIRC the UCSC browser does some caching on its end for BAM files by 'filename', so you will likely need to use different history items than the ones that were failing (due to having debug options on) or clear this cache. Can you confirm the odd behavior occurs on history items that did not experience the memory errors? Cloning the history or copying the history items (under edit attributes) should be sufficient.


I'm going to test on a new history and new BAM files, just to be sure.... wait for it... 
Nope... doesn't work. I only get reads for chr10 up to 103 Mb (hg18)... 

d

Galaxy :failure preparing job error

Modi, Amit
In reply to this post by Nate Coraor (nate@bx.psu.edu)
Hi,

I created a metadata type (“config”) and want to run my tool using it.

I am getting this error when I try to use the uploaded file with this new extension.

I have specified the command line parameters as:

<command interpreter="bash">kelvin_wrapper $confile  ${os.path.join( confile.extra_files_path , 'pedfile.txt' )}  ${os.path.join( confile.extra_files_path , 'mapfile.txt')}  ${os.path.join( confile.extra_files_path , 'frequencyfile.txt' )} ${os.path.join( confile.extra_files_path , 'datafile.txt')} $brfile $pplfile $modfile</command>

Error:

Traceback (most recent call last):
  File "/export/home/galaxy/galaxy-dist/lib/galaxy/jobs/runners/sge.py", line 120, in queue_job
    job_wrapper.prepare()
  File "/export/home/galaxy/galaxy-dist/lib/galaxy/jobs/__init__.py", line 390, in prepare
    self.command_line = self.tool.build_command_line( param_dict )
  File "/export/home/galaxy/galaxy-dist/lib/galaxy/tools/__init__.py", line 1343, in build_command_line
    command_line = fill_template( self.command, context=param_dict )
  File "/export/home/galaxy/galaxy-dist/lib/galaxy/util/template.py", line 9, in fill_template
    return str( Template( source=template_text, searchList=[context] ) )
  File "/export/home/galaxy/galaxy-dist/eggs/Cheetah-2.2.2-py2.4-linux-x86_64-ucs4.egg/Cheetah/Template.py", line 1004, in __str__
    return getattr(self, mainMethName)()
  File "cheetah_DynamicallyCompiledCheetahTemplate_1275498667_76_83196.py", line 86, in respond
NameError: global name 'confile' is not defined
Tool execution generated the following error message:
failure preparing job


Can anybody please help?

Regards,
Amit Modi
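
The traceback says that Cheetah found no variable called confile when filling the command template, which usually means that no input or output parameter of the tool has that name. A rough way to spot such mismatches, assuming the parameter names are known, is sketched below. The helper, the regular expression and the always_available set are illustrative only and are not part of Galaxy or Cheetah.

```python
import re


def undefined_template_names(command, defined, always_available=("os",)):
    """Return the $names used in a command template that are not in `defined`."""
    # Matches $name and the first identifier inside ${name...}; identifiers that
    # appear deeper inside an expression are not inspected by this simple check.
    used = set(re.findall(r"\$\{?([A-Za-z_]\w*)", command))
    return sorted(used - set(defined) - set(always_available))


if __name__ == "__main__":
    command = ("kelvin_wrapper $confile "
               "${os.path.join( confile.extra_files_path , 'pedfile.txt' )} "
               "$brfile $pplfile $modfile")
    # Hypothetical set of parameter names declared by the tool.
    print(undefined_template_names(command, {"brfile", "pplfile", "modfile"}))
```

Any name reported by the helper would need a matching parameter in the tool definition before the command line can be built.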

