How to decide "Mean Inner Distance between Mate Pairs"?

Du, Jianguang

Dear All,

I am analyzing downloaded RNA-seq datasets, but I am not sure what value to use for Mean Inner Distance between Mate Pairs for these paired-end datasets.

Take one paired-end RNA-seq dataset as an example. Its description in the NCBI SRA database reads: "Layout: PAIRED, Orientation: 5'-3'-3'-5', Nominal length: 400, Nominal Std Dev: 20".

At first I thought the Mean Inner Distance between Mate Pairs should be 325 bp, because the reads on both ends are 36 bp long. However, when I aligned the paired reads to transcripts and to the genome using BLASTn, the distance between the paired reads was about 200 bp. How should I decide the Mean Inner Distance between Mate Pairs in my case?

Thanks.

Jianguang Du


___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/

Re: How to decide "Mean Inner Distance between Mate Pairs"?

Sean Davis


On Wed, Aug 15, 2012 at 11:13 AM, Du, Jianguang <[hidden email]> wrote:

> I am analyzing downloaded RNA-seq datasets, but I am not sure what
> value to use for Mean Inner Distance between Mate Pairs for these
> paired-end datasets.
>
> Take one paired-end RNA-seq dataset as an example. Its description in
> the NCBI SRA database reads: "Layout: PAIRED, Orientation:
> 5'-3'-3'-5', Nominal length: 400, Nominal Std Dev: 20".
>
> At first I thought the Mean Inner Distance between Mate Pairs should
> be 325 bp, because the reads on both ends are 36 bp long. However,
> when I aligned the paired reads to transcripts and to the genome using
> BLASTn, the distance between the paired reads was about 200 bp. How
> should I decide the Mean Inner Distance between Mate Pairs in my case?


The information from SRA is likely only an approximation. I do not think SRA validates these details.

The distribution observed in your own data is probably the best estimate.
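
As a rough illustration of estimating the value from the data, here is a minimal Python sketch. It assumes the inner distance is the fragment length minus both read lengths, and the observed fragment lengths below are made-up numbers chosen only to resemble the roughly 200 bp distance reported in the question; the function names are hypothetical. With the SRA nominal length of 400 and 36 bp reads, the same subtraction gives 328, close to the 325 quoted in the question.

```python
from statistics import mean, stdev


def inner_distance(fragment_length, read_length):
    """Inner distance = fragment length minus the two mate reads."""
    return fragment_length - 2 * read_length


def inner_distance_stats(fragment_lengths, read_length):
    """Return (mean, sample standard deviation) of the inner distance."""
    distances = [inner_distance(f, read_length) for f in fragment_lengths]
    return mean(distances), stdev(distances)


# Nominal value implied by the SRA record (length 400, 36 bp reads).
nominal = inner_distance(400, 36)

# Hypothetical fragment lengths measured from aligned pairs (not real data).
observed_fragments = [268, 272, 275, 270, 265]
observed_mean, observed_sd = inner_distance_stats(observed_fragments, 36)

print(f"nominal inner distance:  {nominal}")
print(f"observed inner distance: {observed_mean} (sd {observed_sd:.1f})")
```

The mean and standard deviation computed this way are the kind of values TopHat accepts through its `-r/--mate-inner-dist` and `--mate-std-dev` options.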

Sean 


Re: How to decide "Mean Inner Distance between Mate Pairs"?

Jen Hillman-Jackson

Great advice, Sean!

Jianguang, this is the correct analysis: mapping the data to test the
actual insert size of the library as sequenced. The experimental notes
at SRA are just a starting place; the data is the truth. Running a
sample of the reads through TopHat itself might produce more precise
results. I suspect the coverage on your top BLASTn HSP is not complete
and breaks off where it hits a splice, and that you have some bias
toward sequences/hits that cross junctions near their ends. But overall,
none of this is likely to make much of a difference to the analysis as
a whole.
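
One way to act on this is to align a sample of the pairs and read the observed template length from the alignments. The sketch below is only an illustration, not a Galaxy or TopHat feature: it assumes SAM-format records aligned to transcript sequences (against the genome, the template length of a spliced pair would include introns and overestimate the distance), and the function name, the `max_length` cutoff, and the example records are all hypothetical.

```python
from statistics import median


def template_lengths(sam_lines, max_length=1000):
    """Collect TLEN once per properly paired read pair from SAM text lines."""
    lengths = []
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        tlen = int(fields[8])
        # Keep properly paired reads (0x2); skip secondary (0x100)
        # and supplementary (0x800) alignments.
        if not flag & 0x2 or flag & 0x900:
            continue
        # TLEN is positive for the leftmost mate only, so this
        # counts each pair once. max_length drops extreme outliers.
        if 0 < tlen <= max_length:
            lengths.append(tlen)
    return lengths


# Made-up records for illustration: 36 bp reads on a transcript "tx1".
seq, qual = "A" * 36, "I" * 36
example_sam = [
    "@SQ\tSN:tx1\tLN:6000",
    "\t".join(["r1", "99", "tx1", "100", "60", "36M", "=", "336", "272", seq, qual]),
    "\t".join(["r1", "147", "tx1", "336", "60", "36M", "=", "100", "-272", seq, qual]),
    "\t".join(["r2", "99", "tx1", "50", "60", "36M", "=", "284", "270", seq, qual]),
    "\t".join(["r2", "147", "tx1", "284", "60", "36M", "=", "50", "-270", seq, qual]),
    "\t".join(["r3", "65", "tx1", "10", "60", "36M", "=", "4975", "5001", seq, qual]),
]

lengths = template_lengths(example_sam)
read_length = 36
estimate = median(lengths) - 2 * read_length
print(f"pairs used: {len(lengths)}, median inner distance: {estimate}")
```

The median is used here rather than the mean because a few mis-paired or chimeric alignments can otherwise pull the estimate upward.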

Good luck!

Jen
Galaxy team

On 8/15/12 8:39 AM, Sean Davis wrote:

> The information from SRA is likely only an approximation.  SRA does not
> validate these details, I do not think.
>
> You can probably use the distribution from your data as the best estimate.
>
> Sean

--
Jennifer Jackson
http://galaxyproject.org