Re: [galaxy-bugs] Help with mRNA sequences

Anton Nekrutenko
Lee-Ann:

I shared a history (using chr22 data for simplicity):


and a workflow:


with you that do the trick. Basically, you:

1. Download the mRNA data both as BED and as mRNA sequences (history items 1 & 2).
2. Collapse the sequences to tab-delimited format (history item 3).
3. Remove the dots and version numbers by replacing dots with tabs and cutting out the accession and sequence columns (history items 4 & 5).
4. Join the sequences with the BED file (history item 6).
5. Download the SNPs (history item 7).
6. Join with the SNPs (history item 8).
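
For anyone who would rather script steps 3 and 4 outside Galaxy, here is a minimal Python sketch of the same idea: strip the version suffix from each accession, then join the sequences to the BED rows on the bare accession. The function names, the in-memory row layout, and the assumption that the accession sits in the fourth BED column are illustrative only; this is not what the Galaxy tools do internally.

```python
def strip_version(accession):
    """Drop a trailing '.N' version suffix, e.g. 'NM1234.2' -> 'NM1234'."""
    base, dot, version = accession.rpartition(".")
    # Only strip when there really is a dot followed by digits.
    return base if dot and version.isdigit() else accession


def join_sequences_with_bed(seq_rows, bed_rows, name_col=3):
    """Append the matching sequence to every BED row.

    seq_rows: iterable of (accession, sequence) pairs (hypothetical layout).
    bed_rows: iterable of lists of BED fields; the accession is assumed
              to be in column `name_col` (0-based).
    """
    seqs = {strip_version(acc): seq for acc, seq in seq_rows}
    joined = []
    for row in bed_rows:
        key = strip_version(row[name_col])
        if key in seqs:
            joined.append(list(row) + [seqs[key]])
    return joined


# Made-up example rows, not real data.
sequences = [("NM1234.2", "ACGTACGT")]
bed = [["chr22", "100", "200", "NM1234"]]
result = join_sequences_with_bed(sequences, bed)
```

BED rows with no matching sequence are dropped here; keep them instead if you need an outer join.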

You can use the workflow to run this analysis genome-wide.

Let me know if you have issues.

Tx,

anton
galaxy team



On Jun 8, 2010, at 6:56 AM, Lee Wood wrote:

Hi Anton
 
Sorry to reply directly to you, but I'm a bit desperate :). The problem I'm having is not with joining the files. The problem is that I need to retrieve the mRNA co-ordinates and mRNA sequences from Galaxy. So basically what I want is a file that contains the mRNA co-ordinates and sequences. This file will then be joined to the file containing SNPs to identify which SNPs are within which mRNAs.

The problem is that I can't retrieve the mRNA co-ordinates. When I go through the mRNA group and the track for RefSeq genes, it gives me transcript co-ordinates and not mRNA co-ordinates. I then thought that if I used the refSeqAli table instead of RefSeq genes I could get the co-ordinates and join that file to the file containing the mRNA sequences (which I retrieved by choosing sequence as the output format and mRNA as the sequence type). The problem here is that, because the sequence file contains only the mRNA accession number and sequence, I have to join the two files on the NM numbers. But in the refSeqAli file (the co-ordinates file) the NM number looks something like NM1234, whereas in the mRNA sequence file it looks something like NM1234.2, so they won't join.

Is there a way to retrieve the mRNA co-ordinates and sequences through Galaxy, or will I have to create a script to do it myself?

Sorry for the super confusing email, but I'd really really appreciate any help.

Thank-you
Lee-Ann
 
On Tue, Jun 1, 2010 at 9:10 PM, Anton Nekrutenko <[hidden email]> wrote:
Lee-Ann:

If you upload coordinates of mRNA mapping to the genome in BED format and join it with coordinates of SNPs as shown in this movie:

http://screencast.g2.bx.psu.edu/galaxy/quickie5_join/flow.html

you will be able to identify the mRNAs that contain SNPs.
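
As a rough illustration of what such a coordinate join does, the Python sketch below reports every SNP whose interval overlaps an mRNA interval on the same chromosome. The tuple layout, the function name, and the example rows are assumptions made for illustration, and the coordinates are taken to be BED-style (0-based start, exclusive end). It is not the Galaxy join tool itself.

```python
def snps_in_mrnas(mrnas, snps):
    """Return (mrna_name, snp_name) pairs for overlapping intervals.

    Each record is a (chrom, start, end, name) tuple with BED-style
    coordinates: 0-based start, exclusive end.
    """
    hits = []
    for m_chrom, m_start, m_end, m_name in mrnas:
        for s_chrom, s_start, s_end, s_name in snps:
            # Half-open intervals overlap when each starts before the other ends.
            if s_chrom == m_chrom and s_start < m_end and s_end > m_start:
                hits.append((m_name, s_name))
    return hits


# Made-up example rows, not real data.
mrnas = [("chr22", 100, 200, "NM1234")]
snps = [
    ("chr22", 150, 151, "rs1"),  # inside the mRNA interval
    ("chr22", 200, 201, "rs2"),  # starts exactly at the exclusive end
    ("chr21", 150, 151, "rs3"),  # different chromosome
]
hits = snps_in_mrnas(mrnas, snps)
```

The nested loop is fine for a single chromosome's worth of test data; for genome-wide input a sorted sweep or an interval index would scale better. Note also that the whole BED interval is compared, so SNPs falling in introns of the aligned mRNA are reported as well.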

Let me know if you still have issues.

anton
galaxy team



On May 31, 2010, at 7:16 AM, Lee Wood wrote:

Hi

Could you please help? I need to identify SNPs within mRNA sequences. I have managed to retrieve all mRNA sequences using the mRNA and EST group, Human mRNAs track and RefSeqGenes table, and outputting in sequence format. The problem is that, to identify which SNPs are within which mRNA sequences, I need the mRNA co-ordinates. If I leave everything the same in the table browser and output as BED, for example, it gives me gene and transcript information and not mRNA information.

Your help will be greatly appreciated
Lee-Ann
_______________________________________________
galaxy-bugs mailing list
[hidden email]
http://lists.bx.psu.edu/listinfo/galaxy-bugs

Anton Nekrutenko
http://nekrut.bx.psu.edu
http://usegalaxy.org

_______________________________________________
galaxy-user mailing list
[hidden email]
http://lists.bx.psu.edu/listinfo/galaxy-user