file formats in workflows

Paul Webster
Hi,

I was impressed by the Workflow screencasts, but when I converted my history
to a workflow it failed. This is because some of the tools output
tabular data, but some downstream tools require interval format input
and therefore fail. When I made the history I manually changed the file
format and nominated the fields for chr, start, end, strand, but this is
lost in the workflow. Obvious ideas:

* Use a tabular-to-interval tool - I looked but can't find one
* Only use tools which output interval format - a less desirable
solution. E.g. I can't figure out how to filter out duplicate codons in
my gene-BED-to-codon-BED file except by using the statistics->count
function, which outputs tabular.
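
The first idea above (a tabular-to-interval step) can be sketched outside Galaxy as a small script. This is only an illustration under stated assumptions, not a Galaxy tool or API: the function name tabular_to_interval, the 1-based column arguments, and the example rows (a leading count column followed by chr, start, end, strand) are all hypothetical. It nominates the interval columns explicitly and drops duplicate records in the same pass, so the output stays in interval layout.

```python
from typing import Iterable, Iterator, Optional


def tabular_to_interval(
    rows: Iterable[str],
    chrom_col: int,
    start_col: int,
    end_col: int,
    strand_col: Optional[int] = None,
) -> Iterator[str]:
    """Yield unique tab-separated interval lines (chrom, start, end[, strand]).

    Column numbers are 1-based, mirroring how the fields are nominated by hand.
    This is a hypothetical helper, not part of Galaxy.
    """
    seen = set()
    for row in rows:
        row = row.rstrip("\n")
        if not row or row.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = row.split("\t")
        record = [fields[chrom_col - 1], fields[start_col - 1], fields[end_col - 1]]
        if strand_col is not None:
            record.append(fields[strand_col - 1])
        # Fail early if the nominated start/end columns are not integers.
        int(record[1])
        int(record[2])
        key = tuple(record)
        if key in seen:
            continue  # duplicate interval, already emitted
        seen.add(key)
        yield "\t".join(record)


# Assumed input layout: count, chrom, start, end, strand (illustrative data only).
example = [
    "2\tchr21\t100\t103\t+",
    "1\tchr21\t103\t106\t+",
    "1\tchr21\t100\t103\t+",
]
result = list(
    tabular_to_interval(example, chrom_col=2, start_col=3, end_col=4, strand_col=5)
)
```

Run on the example rows, result holds two interval lines, because the third row repeats the first interval. Inside a workflow the same effect would still need a step that records which columns are chr, start, end and strand, which is the part that gets lost here.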

Has anyone worked out how to change formats within a workflow?

regards,
Paul
_______________________________________________
galaxy-user mailing list
[hidden email]
http://lists.bx.psu.edu/listinfo/galaxy-user

Re: file formats in workflows

Dannon Baker
Paul,

We're working on a new feature that'll allow extra actions within workflows that should cover this case, and it will be available soon. Regarding a workaround for this specific scenario, would you mind sharing the workflow with '[hidden email]'? That way I can take a look and see if I can come up with something that'll work for you in the short term, as well as verifying that the new workflow actions will handle the task.

Thanks for using Galaxy,
-Dannon


On Jun 3, 2010, at 4:11 AM, Paul Webster wrote:

> Has anyone worked out how to change formats within a workflow?

Re: file formats in workflows

Paul Webster
Dannon,

Thanks for your interest. It would be a handy feature to be able to attach metadata to tabular files in the workflow. I have shared the workflow. It is called 'chr21 analysis'. One input is a gene BED file for a chromosome region from UCSC data (which includes gene variants). The other input is a list of SNPs in the same region. The workflow fails after I use "count" to get only unique records from my codon file.

regards,
Paul
