[Gmod-ajax] Question on Read and coverage Display

[Gmod-ajax] Question on Read and coverage Display

Nadine Elpida Tatto
Hey there,

I have a question about how reads are displayed in an Alignments2 track. In my case these are RNA-Seq reads, and the track shows fewer reads than I expected.
My assumption is that identical reads are combined into one read to reduce the overall number of displayed reads. Is that correct? And if so, is there a way to see how many reads have been combined into one?

Then I have another question about the XYPlot with a bigWig file as its source. I converted a BAM file (the same RNA-Seq alignment as above) into a bed file using samtools' genomeCoverageBed, and then converted that into bigWig format using UCSC's bedGraphToBigWig.
The problem is that the XYPlot displays rather high coverage at positions that are supposed to be an intron (the green track). I think the problem lies in the conversion, because when I look at the BAM file in the IGV browser, the automatically generated coverage histogram doesn't show the same pattern. Does anyone have experience with such conversions?
If I use the Alignment track for the coverage, the histogram looks much better, but I wonder what the darker grey area at the bottom of the histogram stands for.
 


Thank you very much in advance!

cheers
Nadine

----------------------------------------
DI(FH) Nadine Tatto
Bioinformatics

ACIB - Austrian Centre of Industrial Biotechnology
----------------------------------------
Tel: +43 1 47654 6838
Fax: +43 1 36006 6847
Email: [hidden email]
Web: www.acib.at
Office: Muthgasse 18, 1190 Wien
----------------------------------------
ACIB GmbH, Petersgasse 14, 8010 Graz, Austria
FB: 224687y FBG: HG Graz UID: ATU 54545504
---------------------------------------



------------------------------------------------------------------------------

_______________________________________________
Gmod-ajax mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-ajax

Re: Question on Read and coverage Display

Keiran Raine
Hi Nadine,

Please see inline comments below:

Regards,

Keiran Raine
Principal Bioinformatician
Cancer Genome Project
Wellcome Trust Sanger Institute

Tel:+44 (0)1223 834244 Ext: 7703
Office: H104

On 23 Oct 2015, at 15:54, Nadine Elpida Tatto <[hidden email]> wrote:

> Hey there,
>
> I have a question about how reads are displayed in an Alignments2 track. In my case these are RNA-Seq reads, and the track shows fewer reads than I expected.
> My assumption is that identical reads are combined into one read to reduce the overall number of displayed reads. Is that correct? And if so, is there a way to see how many reads have been combined into one?

Take a look at the track menu; some read classes are turned off by default. This may be where your missing reads are hiding.


> Then I have another question about the XYPlot with a bigWig file as its source. I converted a BAM file (the same RNA-Seq alignment as above) into a bed file using samtools' genomeCoverageBed, and then converted that into bigWig format using UCSC's bedGraphToBigWig.
> The problem is that the XYPlot displays rather high coverage at positions that are supposed to be an intron (the green track). I think the problem lies in the conversion, because when I look at the BAM file in the IGV browser, the automatically generated coverage histogram doesn't show the same pattern. Does anyone have experience with such conversions?
> If I use the Alignment track for the coverage, the histogram looks much better, but I wonder what the darker grey area at the bottom of the histogram stands for.

The genomeCoverageBed code (I'm guessing you mean bedtools rather than samtools) doesn't filter the reads in any way, so it will include duplicates, supplementary reads, etc. My understanding is that the IGV coverage (generated on the fly) honours any read selection criteria you may have set. If you want genomeCoverageBed to be more representative, you would have to filter the input:

% samtools view -ub -F 3840 in.bam | genomeCoverageBed -ibam - -g genome.lst

Try generating the resulting file with and without the '-F' option to see the difference (limit the samtools view to a small chromosome for a quick test.)
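
For reference, the value 3840 passed to '-F' is a bitmask of SAM flags, and samtools drops every read that has any of those bits set. The following is a minimal Python sketch, not part of samtools or bedtools, that decodes the mask using the flag values from the SAM specification. The helper names decode_flags and is_excluded are made up for this illustration.

```python
# Bit values of the SAM FLAG field, as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flags(mask):
    """Return the names of all flag bits that are set in `mask`."""
    return [name for bit, name in SAM_FLAGS.items() if mask & bit]


def is_excluded(read_flag, exclude_mask=3840):
    """Mimic `samtools view -F`: a read is dropped if ANY excluded bit is set.

    This is a hypothetical helper for illustration, not a samtools API.
    """
    return (read_flag & exclude_mask) != 0


print(decode_flags(3840))
# ['secondary', 'qc_fail', 'duplicate', 'supplementary']
```

In other words, the filter removes secondary, QC-failed, duplicate and supplementary alignments before the coverage is computed, while a normal properly paired read (for example flag 99) passes through.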

 




-- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.


Re: Question on Read and coverage Display

Colin
Hi Nadine,

If I remember correctly, you can also use genomeCoverageBed (or "bedtools genomecov", see http://bedtools.readthedocs.org/en/latest/content/tools/genomecov.html) with the -split option so that the introns don't get counted in the coverage.
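
To see why an intron can show coverage without -split, here is a small Python sketch of the idea (an illustration only, not the bedtools implementation). It assumes, following the bedtools documentation, that -split breaks a BAM alignment into blocks at CIGAR N and D operations, whereas the default counts the whole span from the first to the last aligned base. The function names and the two example reads are made up.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def blocks(start, cigar, split):
    """Yield half-open (start, end) reference intervals covered by one read.

    split=False: one interval spanning the whole alignment (introns included).
    split=True:  N and D operations break the alignment into separate blocks.
    """
    pos = start
    block_start = start
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":          # consumes reference and counts as covered
            pos += length
        elif op in "ND":         # skipped region (intron) or deletion
            if split and pos > block_start:
                yield (block_start, pos)
            pos += length
            if split:
                block_start = pos
        # I, S, H and P do not consume reference positions
    if pos > block_start:
        yield (block_start, pos)


def coverage(reads, split):
    """Per-base depth for a list of (start, cigar) reads."""
    depth = Counter()
    for start, cigar in reads:
        for s, e in blocks(start, cigar, split):
            for p in range(s, e):
                depth[p] += 1
    return depth


# Two hypothetical spliced reads crossing the same 200 bp intron (150..350).
reads = [(100, "50M200N50M"), (120, "30M200N70M")]
unsplit = coverage(reads, split=False)
split_cov = coverage(reads, split=True)
print(unsplit[200], split_cov[200])  # 2 0
```

Position 200 lies inside the intron: without splitting, both reads contribute to its depth, while with splitting the depth there is zero and the exon positions are unchanged.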


-Colin

On Wed, Oct 28, 2015 at 8:56 AM, Keiran Raine <[hidden email]> wrote:
[...]

[Gmod-ajax] Antw: Re: Question on Read and coverage Display

Nadine Elpida Tatto
Hi Colin,

Thank you, I will try this! It looks like this is the solution to my problem.

cheers
Nadine




>>> Colin <[hidden email]> 28.10.15 17:53 >>>
[...]

[Gmod-ajax] Antw: Re: Question on Read and coverage Display

Nadine Elpida Tatto
In reply to this post by Keiran Raine
Hi Keiran,

> Take a look at the track menu; some read classes are turned off by default. This may be where your missing reads are hiding.

I know this option, but this doesn't lead to the expected number of displayed reads.

> Try generating the resulting file with and without the '-F' option to see the difference (limit the samtools view to a small chromosome for a quick test.)

I will try this too.

Thank you

Best,
Nadine


Re: Question on Read and coverage Display

Raymond Wan-2
In reply to this post by Nadine Elpida Tatto

Dear all,

Sorry (to Nadine) for hijacking her question, but I actually have a similar query.


On Fri, Oct 23, 2015 at 10:54 PM, Nadine Elpida Tatto <[hidden email]> wrote:
> I have a question about how reads are displayed in an Alignments2 track. In my case these are RNA-Seq reads, and the track shows fewer reads than I expected.
> My assumption is that identical reads are combined into one read to reduce the overall number of displayed reads. Is that correct? And if so, is there a way to see how many reads have been combined into one?


In the attached screen capture of a BAM track, the small rectangles on the left are reads and about 100 bp in length -- seems fine to me.  But, I'm wondering what the large rectangles on the right are -- are they reads that have been combined?  Does that mean these reads happen to match up end-to-end perfectly or is there some magic going on behind the scenes if there is a small gap or overlap between reads?

I've clicked on one of the rectangles and the coordinates do indicate that reads have been merged.

Is it possible to make JBrowse not merge reads together and/or stop merging at a high enough zoom level? Even if I zoom in so that a single read covers the width of the window, the reads are still merged -- this is causing some confusion for the people I'm showing the track to.

Thank you!

Ray


 


Attachment: jbrowse.png (68K)

Re: Antw: Re: Question on Read and coverage Display

Colin
In reply to this post by Nadine Elpida Tatto
Hi Nadine

I saw that you said this

>>> I know this option, but this doesn't lead to the expected amount of displayed reads. 

There was at least one similar report in the past; see https://github.com/GMOD/jbrowse/issues/560

This particular bug has been fixed in 1.11.6. Are you using this version?


Also, Raymond, can you make a new thread for that? Very weird problem!



Thanks
-Colin

On Thu, Oct 29, 2015 at 3:25 AM, Nadine Elpida Tatto <[hidden email]> wrote:
[...]