[Gmod-ajax] How do I exclude duplicates from Alignments2 and SNPCoverage tracks

[Gmod-ajax] How do I exclude duplicates from Alignments2 and SNPCoverage tracks

Keiran Raine
Hi,

Is there a setting to handle this?

Thanks,

Keiran Raine
Principal Bioinformatician
Cancer Genome Project
Wellcome Trust Sanger Institute

Tel:+44 (0)1223 834244 Ext: 7703
Office: H104


-- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

------------------------------------------------------------------------------
Learn the latest--Visual Studio 2012, SharePoint 2013, SQL 2012, more!
Discover the easy way to master current and previous Microsoft technologies
and advance your career. Get an incredible 1,500+ hours of step-by-step
tutorial videos with LearnDevNow. Subscribe today and save!
http://pubads.g.doubleclick.net/gampad/clk?id=58040911&iu=/4140/ostg.clktrk
_______________________________________________
Gmod-ajax mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-ajax
Reply | Threaded
Open this post in threaded view
|

Re: How do I exclude duplicates from Alignments2 and SNPCoverage tracks

Robert Buels-2
No, there isn't currently a setting for this.

What are some of the use cases for it?  If you can propose a way to
implement efficiently, I might consider doing it, but I can't think of
one off the top of my head.


Robert Buels
Lead Developer
JBrowse - http://jbrowse.org

On 08/29/2013 09:13 AM, Keiran Raine wrote:

> Hi,
>
> Is there a setting to handle this?

Re: How do I exclude duplicates from Alignments2 and SNPCoverage tracks

Keiran Raine
> What are some of the use cases for it?  If you can propose a way to implement efficiently, I might consider doing it, but I can't think of one off the top of my head.

Generally you don't want to see duplicates: most callers ignore duplicate reads, because PCR and optical duplicates skew allele frequencies.  The attached image shows the same data twice: in the first track the duplicates have only been 'marked'; in the second the duplicate reads have been removed.

Although removing duplicate reads saves space in BAM files, it means that you can no longer extract the original reads and remap them with a different algorithm.

Hiding duplicates is the default in IGV. Sadly, we've only just switched to marking duplicates rather than removing them, and I've noticed that GBrowse shows the duplicate reads too.



Re: How do I exclude duplicates from Alignments2 and SNPCoverage tracks

Robert Buels-2
Can you email me a link to one of your BAM files with duplicate reads?
I'll see if I can figure out a good way to do it.

Until then, there's always the option of filtering out duplicates by
post-processing the BAM before displaying it.
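
For anyone needing that interim workaround, here is a minimal sketch of the post-processing step. It assumes the alignments are available as SAM text (BAM itself is binary), and `drop_duplicates` is a hypothetical helper name, not part of JBrowse or samtools. The only fact it relies on is that bit 0x400 (1024) of the SAM FLAG field marks a PCR or optical duplicate.

```python
DUPLICATE = 0x400  # SAM FLAG bit 1024: PCR or optical duplicate


def drop_duplicates(sam_lines):
    """Yield header lines and non-duplicate alignment records from SAM text.

    Hypothetical helper for illustration only.
    """
    for line in sam_lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        # FLAG is the second tab-separated column of an alignment record.
        flag = int(line.split("\t", 2)[1])
        if not flag & DUPLICATE:
            yield line


# Made-up records: r2 has FLAG 1123 = 99 + 1024, so it is marked duplicate.
sample = [
    "@HD\tVN:1.4\tSO:coordinate",
    "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",
    "r2\t1123\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",
]
kept = list(drop_duplicates(sample))
```

With samtools installed, something like `samtools view -b -F 1024 in.bam > out.bam` (file names are placeholders) should achieve the same directly on a BAM file; the output then needs re-indexing before display.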


Robert Buels
Lead Developer
JBrowse - http://jbrowse.org

On 08/29/2013 10:04 AM, Keiran Raine wrote:

>> What are some of the use cases for it?  If you can propose a way to
>> implement efficiently, I might consider doing it, but I can't think of
>> one off the top of my head.
>
> Generally you don't want to see duplicates, most callers don't use
> information from duplicate reads as PCR and Optical duplicates skew
> allele frequency.  The attached image shows the same data, the first
> track had duplicates 'marked', the second the reads have been removed.
>
> Although you can save space in BAM files by removing duplicate reads it
> means that you can't extract back to the starting point and remap with a
> different algorithm.
>
> It's the default in IGV… sadly we've only just switched to doing this
> and I've noticed that GBrowse is showing the duplicate reads too.

Re: How do I exclude duplicates from Alignments2 and SNPCoverage tracks

Robert Buels-2
Ah good.  I thought I remembered there being a flag for that in the BAM
spec, but I hadn't looked yet.

Would you mind opening a GitHub issue for this?


Robert Buels
Lead Developer
JBrowse - http://jbrowse.org

On 08/29/2013 10:28 AM, Keiran Raine wrote:

> Hi Robert,
>
> It should be as simple as:
>
> next if ($flag & 1024);  # 1024 (0x400) = PCR or optical duplicate

Re: How do I exclude duplicates from Alignments2 and SNPCoverage tracks

Keiran Raine
Added #332

Keiran Raine
Principal Bioinformatician
Cancer Genome Project
Wellcome Trust Sanger Institute

Tel:+44 (0)1223 834244 Ext: 7703
Office: H104

On 29 Aug 2013, at 15:38, Robert Buels <[hidden email]> wrote:

> Ah good.  I thought I remembered there being a flag for that in the BAM spec, but I hadn't looked yet.
>
> Would you mind opening a GitHub issue for this?



Re: How do I exclude duplicates from Alignments2 and SNPCoverage tracks

Andrew V. Uzilov
I heartily second this feature request, but I would like to propose a more general solution. IGV also implements checkbox filtering on some other BAM/SAM flag bits (but not all), so why not cover all flags and have a panel that's something like this:


I added elaboration on this in GitHub issue #332 that Keiran opened.
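
To make the general idea concrete, here is a rough sketch of how a per-flag filter could work, assuming one checkbox per SAM flag bit. The bit values are the ones defined in the SAM specification; the label names, `build_hide_mask` and `is_visible` are illustrative assumptions and do not come from JBrowse or IGV.

```python
# SAM FLAG bits (values from the SAM spec; label names are illustrative).
SAM_FLAGS = {
    "paired": 0x1,
    "proper_pair": 0x2,
    "unmapped": 0x4,
    "mate_unmapped": 0x8,
    "reverse": 0x10,
    "mate_reverse": 0x20,
    "first_in_pair": 0x40,
    "second_in_pair": 0x80,
    "secondary": 0x100,
    "qc_fail": 0x200,
    "duplicate": 0x400,
    "supplementary": 0x800,
}


def build_hide_mask(hidden_names):
    """Combine the ticked 'hide' checkboxes into a single bit mask."""
    mask = 0
    for name in hidden_names:
        mask |= SAM_FLAGS[name]
    return mask


def is_visible(flag, hide_mask):
    """A read is shown only if none of its flag bits are in the hide mask."""
    return (flag & hide_mask) == 0


# Example: hide duplicates and QC failures, show everything else.
mask = build_hide_mask(["duplicate", "qc_fail"])
visible = [flag for flag in (99, 147, 1123, 611) if is_visible(flag, mask)]
```

Because the check is a single bitwise AND per read, hiding one flag or several costs about the same, which is one reason a general panel need not be slower than a duplicates-only setting.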



On Thu, Aug 29, 2013 at 10:44 AM, Keiran Raine <[hidden email]> wrote:
> Added #332
