mapreduce approach in workflows

Cittaro Davide
Hi all,
We are trying to write our own GATK workflow (although I guess an "official" one will be released...), and I realized I have to parallelize execution because some steps are painfully slow.
To run GATK quickly I need to split the whole process over specific genomic intervals (-L option) and merge the results at the end (http://www.broadinstitute.org/gsa/wiki/index.php/Parallelism_and_the_GATK). Hence I have to tell Galaxy to create parallel jobs, one per interval, in a MapReduce fashion. I see from http://gmod.org/mediawiki/images/5/50/April2012GalaxyUpdate.pdf that this feature will be implemented in some form, and I also see from the Bitbucket commits that a BAM slicer has been added... Is there a release plan for this? Should I write my own wrapper/adapter?
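To make the split-and-merge idea concrete, here is a minimal sketch of what a home-grown wrapper might do outside Galaxy. It is only an illustration under stated assumptions: the only option taken from the message above is -L; the -o output flag, the part_N.vcf file names, the VCF-style output, and the helper names scatter_commands, run_all and gather_vcf_lines are all hypothetical, and the merge step simply keeps the header of the first part and concatenates the records.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def scatter_commands(base_cmd, intervals, out_pattern="part_{i}.vcf"):
    """Map step: build one command line per interval.

    base_cmd is the shared part of the command as a list of arguments.
    Each job gets its own -L interval and its own output file.
    The -o flag and the file naming are assumptions for illustration.
    """
    jobs = []
    for i, interval in enumerate(intervals):
        out = out_pattern.format(i=i)
        jobs.append((out, base_cmd + ["-L", interval, "-o", out]))
    return jobs


def run_all(jobs, workers=4):
    """Run the per-interval commands in parallel and return the output paths.

    Threads are enough here because each job is an external process.
    """
    def run_one(job):
        out, cmd = job
        subprocess.run(cmd, check=True)  # raises if the tool fails
        return out

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, jobs))  # map preserves job order


def gather_vcf_lines(parts):
    """Reduce step: merge per-interval results given as lists of lines.

    Header lines (starting with '#') are kept from the first part only;
    records are concatenated in the order the parts are given.
    """
    merged = []
    for n, lines in enumerate(parts):
        for line in lines:
            if n > 0 and line.startswith("#"):
                continue
            merged.append(line)
    return merged
```

Passing the intervals in genomic order keeps the merged records in order as well, since both run_all and gather_vcf_lines preserve the order of their inputs. Inside Galaxy the same split would have to be done by the framework or by the tool wrapper itself, which is exactly the open question here.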
Thanks

d
/*
Davide Cittaro, PhD

Coordinator of Bioinformatics Core
Center for Translational Genomics and Bioinformatics
San Raffaele Scientific Institute
Via Olgettina 58
20132 Milano
Italy

Office: +39 02 26439140
Mail: [hidden email]
Skype: daweonline
*/

___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/