[galaxy-user] Issue with saving 'manipulate fastq' in workflow; and request for advice dealing with barcoded 454 data
I'm a new user, learning how to use Galaxy while I wait for my 454 results. So I'm not actually playing with any data yet but I'm trying to set up a draft workflow as practice. Two issues:
I am having trouble with the 'manipulate fastq' command. Without this, my workflow saves quickly and seems fine, but when I include even a (seemingly simple) 'manipulate fastq' step, it tries to save for many minutes, unsuccessfully, until I get sick of it and close the window.
Well this isn't really an issue, just a request for advice! My dataset will be a barcoded amplicon library containing 8 different gene regions (which I can recognise from the amplicon-specific primer sequences) amplified in 64 different individuals (which I can recognise by an individual-specific barcode sequence). I thought I'd set up a workflow with the following steps:
1) Convert to FASTQ format.
2) Groom and filter to remove short reads etc.
3) 'Manipulate FASTQ' to match all sequences containing one of the eight reverse primer sequences, and reverse-complement them.
4) FASTQ-to-tabular format conversion.
5) Eight separate 'select' steps, one per gene region, each selecting sequences with a match to either the forward primer or the reverse-complemented reverse primer of that region.
My question is: does this seem sensible? Is there a more efficient way to do this that I haven't discovered yet? I was thinking I'd then set up another workflow to label barcoded individuals, for which I could use each of the eight per-gene output files in turn as input.
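
To make the logic of steps 3 and 5 concrete, here is a rough Python sketch of what I have in mind. It is only an illustration of the intended matching, not how the Galaxy tools work internally. The gene names and primer sequences are made-up placeholders, and the function names (`orient`, `assign_gene`) are hypothetical. It assumes exact primer matches, with no allowance for sequencing errors.

```python
# Placeholder primers (hypothetical): gene -> (forward primer, reverse primer).
PRIMERS = {
    "geneA": ("AAACCC", "GGGTTA"),
    "geneB": ("CCTTGA", "AGGCTA"),
}

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient(seq, qual, primers=PRIMERS):
    """Step 3: if the read contains any reverse primer, flip it.

    The quality string is reversed as well so that each score stays
    attached to its base.
    """
    if any(rev in seq for _fwd, rev in primers.values()):
        return reverse_complement(seq), qual[::-1]
    return seq, qual


def assign_gene(seq, primers=PRIMERS):
    """Step 5: name the gene region whose forward primer, or
    reverse-complemented reverse primer, occurs in the read."""
    for gene, (fwd, rev) in primers.items():
        if fwd in seq or reverse_complement(rev) in seq:
            return gene
    return None  # no primer matched
```

One thing this makes visible: step 3 flips any read that contains a reverse primer anywhere, so a chance occurrence of a short primer inside a forward-oriented read would flip it by mistake. Restricting the match to the start of the read (after the barcode) might be safer.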
Thanks so much for this service! The screencasts are especially great.