New 5' start site, 3' end site and long internal coding exons

New 5' start site, 3' end site and long internal coding exons

Yanwei Tan
Dear all,

I was wondering whether Galaxy can analyze new 5' start sites, 3' end
sites, and long internal coding exons. I have just run Scripture
(http://www.broadinstitute.org/software/scripture/Graph%20building) and
obtained a BED file. Can someone give me a hint?

Many thanks in advance!
Wei
_______________________________________________
galaxy-user mailing list
[hidden email]
http://lists.bx.psu.edu/listinfo/galaxy-user

Re: New 5' start site, 3' end site and long internal coding exons

Jen Hillman-Jackson
Hi Wei,

Could you explain in more detail the type of analysis that you are
interested in doing?
Including a few lines of the BED file (5-10) pasted into the reply would
be helpful.

Thanks,

Jen
Galaxy Team

On 6/10/10 11:14 AM, Yanwei Tan wrote:

> Dear all,
>
> I was wondering whether Galaxy can analyze new 5' start sites, 3' end
> sites, and long internal coding exons. I have just run Scripture
> (http://www.broadinstitute.org/software/scripture/Graph%20building)
> and obtained a BED file. Can someone give me a hint?
>
> Many thanks in advance!
> Wei

--
Jennifer Jackson
http://usegalaxy.org

Jennifer Hillman-Jackson
http://galaxyproject.org

Re: New 5' start site, 3' end site and long internal coding exons

Yanwei Tan
Hi Jen,

Thanks a lot for your reply! Please see the following BED file.

track name=junctions description="TopHat junctions"
chr19    3262083    3264847    JUNC00000001    2    -    3262083    3264847    255,0,0    2    61,37    0,2727
chr19    3266410    3267296    JUNC00000002    2    -    3266410    3267296    255,0,0    2    49,55    0,831
chr19    3267271    3268345    JUNC00000003    3    -    3267271    3268345    255,0,0    2    65,48    0,1026
chr19    3268383    3268720    JUNC00000004    2    -    3268383    3268720    255,0,0    2    33,61    0,276
chr19    3272970    3274417    JUNC00000005    5    -    3272970    3274417    255,0,0    2    65,62    0,1385
chr19    3274528    3276802    JUNC00000006    1    -    3274528    3276802    255,0,0    2    28,47    0,2227
chr19    3276860    3279592    JUNC00000007    4    -    3276860    3279592    255,0,0    2    59,60    0,2672
chr19    3279962    3281488    JUNC00000008    2    -    3279962    3281488    255,0,0    2    34,70    0,1456
chr19    3281520    3282759    JUNC00000009    2    -    3281520    3282759    255,0,0    2    68,59    0,1180
chr19    3284771    3285362    JUNC00000010    2    +    3284771    3285362    255,0,0    2    46,53    0,538
chr19    3284779    3285344    JUNC00000011    2    +    3284779    3285344    255,0,0    2    54,35    0,530
chr19    3285352    3286924    JUNC00000012    2    +    3285352    3286924    255,0,0    2    43,50    0,1522
chr19    3289043    3291029    JUNC00000013    5    +    3289043    3291029    255,0,0    2    44,53    0,1933
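
A minimal sketch for reading records like the ones above, assuming they
follow the TopHat junctions.bed convention that the track line suggests:
two blocks flanking a single intron, with the score column holding the
number of reads spanning the junction. The names Junction and
parse_junction are illustrative only and are not part of Galaxy, TopHat,
or Scripture.

```python
from typing import NamedTuple


class Junction(NamedTuple):
    chrom: str
    name: str
    strand: str
    reads: int          # BED score column (assumed: reads spanning the junction)
    intron_start: int   # 0-based, inclusive
    intron_end: int     # 0-based, exclusive


def parse_junction(line: str) -> Junction:
    """Parse one 12-column BED record describing a two-block junction."""
    fields = line.split()
    if len(fields) != 12:
        raise ValueError(f"expected 12 BED columns, got {len(fields)}")
    chrom, name, strand = fields[0], fields[3], fields[5]
    chrom_start, chrom_end = int(fields[1]), int(fields[2])
    sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
    starts = [int(x) for x in fields[11].rstrip(",").split(",")]
    if len(sizes) != 2 or len(starts) != 2:
        raise ValueError("expected exactly two blocks per junction")
    # The intron lies between the end of block 1 and the start of block 2.
    intron_start = chrom_start + sizes[0]
    intron_end = chrom_start + starts[1]
    if intron_end + sizes[1] != chrom_end:
        raise ValueError("block layout does not match chromEnd")
    return Junction(chrom, name, strand, int(fields[4]), intron_start, intron_end)


record = ("chr19 3262083 3264847 JUNC00000001 2 - "
          "3262083 3264847 255,0,0 2 61,37 0,2727")
print(parse_junction(record))
```

Splitting on any whitespace lets the same function read tab-separated
files and the space-separated copy pasted into this message.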

I have a stimulated sample and a control sample, and I would like to
know whether the 3' UTR or the 5' start site varies in the stimulated
sample. There is always some interesting information in the 3' UTR. If
the length of the 3' UTR changes under a specific biological condition,
that could explain some mechanism, for example one involving microRNAs.

So far, I have uploaded the BED file to Galaxy and selected the region
from cdsEnd to txEnd (that should be the transcript end, right?), which
should be the 3' UTR. If this is right, I then check by hand the 3' UTR
of every gene against a public annotation such as UCSC. But this is not
a proper approach and it is time-consuming. I have no bioinformatics
experience, so I would appreciate any advice on how to do this properly.
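
One caveat on the cdsEnd-to-txEnd selection, sketched here under the
assumption that the fields follow the UCSC gene-prediction convention
(txStart, txEnd, cdsStart, cdsEnd as 0-based, half-open genomic
coordinates): that span is the 3' UTR only for plus-strand genes. On the
minus strand the 3' UTR lies between txStart and cdsStart. The function
name is hypothetical, and the span it returns still includes any introns
that fall inside the UTR.

```python
from typing import Optional, Tuple


def three_prime_utr_span(tx_start: int, tx_end: int,
                         cds_start: int, cds_end: int,
                         strand: str) -> Optional[Tuple[int, int]]:
    """Return the genomic span (start, end) of the 3' UTR, or None if absent."""
    if not (tx_start <= cds_start <= cds_end <= tx_end):
        raise ValueError("coordinates are not nested as tx >= cds")
    if cds_start == cds_end:
        # Assumed convention: an empty CDS marks a non-coding transcript.
        return None
    if strand == "+":
        span = (cds_end, tx_end)
    elif strand == "-":
        span = (tx_start, cds_start)
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    # A zero-length span means the transcript has no annotated 3' UTR.
    return span if span[0] < span[1] else None


# Made-up coordinates, for illustration only.
print(three_prime_utr_span(100, 1000, 200, 800, "+"))
print(three_prime_utr_span(100, 1000, 200, 800, "-"))
```

Most of the junctions in the sample above are on the minus strand, so
the strand check matters for this dataset.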

Many thanks for your help!

Best,
Wei




On 6/22/10 6:35 PM, Jennifer Jackson wrote:

> Hi Wei,
>
> Could you explain in more detail the type of analysis that you are
> interested in doing?
> Including a few lines of the BED file (5-10) pasted into the reply
> would be helpful.
>
> Thanks,
>
> Jen
> Galaxy Team

Re: New 5' start site, 3' end site and long internal coding exons

Jen Hillman-Jackson
Hello Wei,

Thank you for sending more information and sample data.

The general query path would be to:

1) bring in the UCSC Genes 3' UTR annotation as a dataset
(knownGene -> BED -> 3' UTR Exons -> Galaxy)

2) analyze your data versus these target transcripts by overlap using
tools in the group "Operate on Genomic Intervals"

You may find it helpful to group transcripts and overlapping samples
into gene clusters using the UCSC table knownIsoforms.
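
Outside Galaxy, the overlap step in 2) can be sketched as follows. This
is an illustration of the idea only, not how the "Operate on Genomic
Intervals" tools are implemented; the function names and the 3' UTR
coordinates in the example are made up, and intervals are assumed to be
0-based and half-open as in BED.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

Interval = Tuple[str, int, int, str]  # (chrom, start, end, name)


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def features_overlapping(queries: Iterable[Interval],
                         targets: Iterable[Interval]) -> List[Tuple[str, str]]:
    """Return (query name, target name) for every overlapping pair."""
    by_chrom = defaultdict(list)
    for chrom, start, end, name in targets:
        by_chrom[chrom].append((start, end, name))
    hits = []
    for chrom, start, end, name in queries:
        # Simple pairwise scan per chromosome; fine for small inputs.
        for t_start, t_end, t_name in by_chrom.get(chrom, []):
            if overlaps(start, end, t_start, t_end):
                hits.append((name, t_name))
    return hits


junctions = [("chr19", 3262083, 3264847, "JUNC00000001"),
             ("chr19", 3284771, 3285362, "JUNC00000010")]
utr_exons = [("chr19", 3264000, 3265000, "hypothetical_utr_A"),
             ("chr1", 3284771, 3285362, "hypothetical_utr_B")]
print(features_overlapping(junctions, utr_exons))
```

For genome-scale data an interval tree or a sorted sweep would scale
better than this pairwise scan.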

Hopefully this helps to get you started,
Thanks,
Jen

Galaxy Team

On 6/22/10 10:07 AM, Yanwei Tan wrote:

> Hi Jen,
>
> Thanks a lot for your reply! Please see the following BED file.
>
> track name=junctions description="TopHat junctions"
> chr19    3262083    3264847    JUNC00000001    2    -    3262083    3264847    255,0,0    2    61,37    0,2727
> chr19    3266410    3267296    JUNC00000002    2    -    3266410    3267296    255,0,0    2    49,55    0,831
> chr19    3267271    3268345    JUNC00000003    3    -    3267271    3268345    255,0,0    2    65,48    0,1026
> chr19    3268383    3268720    JUNC00000004    2    -    3268383    3268720    255,0,0    2    33,61    0,276
> chr19    3272970    3274417    JUNC00000005    5    -    3272970    3274417    255,0,0    2    65,62    0,1385
> chr19    3274528    3276802    JUNC00000006    1    -    3274528    3276802    255,0,0    2    28,47    0,2227
> chr19    3276860    3279592    JUNC00000007    4    -    3276860    3279592    255,0,0    2    59,60    0,2672
> chr19    3279962    3281488    JUNC00000008    2    -    3279962    3281488    255,0,0    2    34,70    0,1456
> chr19    3281520    3282759    JUNC00000009    2    -    3281520    3282759    255,0,0    2    68,59    0,1180
> chr19    3284771    3285362    JUNC00000010    2    +    3284771    3285362    255,0,0    2    46,53    0,538
> chr19    3284779    3285344    JUNC00000011    2    +    3284779    3285344    255,0,0    2    54,35    0,530
> chr19    3285352    3286924    JUNC00000012    2    +    3285352    3286924    255,0,0    2    43,50    0,1522
> chr19    3289043    3291029    JUNC00000013    5    +    3289043    3291029    255,0,0    2    44,53    0,1933
>
> I have a stimulated sample and a control sample, and I would like to
> know whether the 3' UTR or the 5' start site varies in the stimulated
> sample. There is always some interesting information in the 3' UTR.
> If the length of the 3' UTR changes under a specific biological
> condition, that could explain some mechanism, for example one
> involving microRNAs.
>
> So far, I have uploaded the BED file to Galaxy and selected the region
> from cdsEnd to txEnd (that should be the transcript end, right?),
> which should be the 3' UTR. If this is right, I then check by hand the
> 3' UTR of every gene against a public annotation such as UCSC. But
> this is not a proper approach and it is time-consuming. I have no
> bioinformatics experience, so I would appreciate any advice on how to
> do this properly.
>
> Many thanks for your help!
>
> Best,
> Wei
