Re: [Bioperl-l] bio::db::sam coverage problem

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: [Bioperl-l] bio::db::sam coverage problem

Fields, Christopher J
Peter,

I’m going to cc the gmod-gbrowse list with this one, as Scott and Lincoln might be able to better answer you.

chris

On Dec 6, 2013, at 3:32 PM, Peter Huang <[hidden email]> wrote:

> Hi there,
>
> I am trying to use Bio::DB::SAM to process my sam file generated from samtools. However, I encountered one problem.
>
> Let's say I want to extract a base from a specific position that contains a SNP. I'd like to obtain count information regarding the individual bases at this position for later analysis.
> If I use samtools view
> /share/apps/software/samtools/samtools   view  LibG_sort.bam   ARP6:5134-5134   -o filteredresults.sam
> less filteredresults.sam | wc
>  32333  614327 17194532
>
> However, when I tried to use the Bio::DB::SAM, the coverage is only 8088.
>
> Here is what I have in my perl script:
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> my $sam = Bio::DB::Sam->new( -bam => $bamFile, -fasta => $reference );
> my $total;
>
> my $cb = sub {
>    my ($seqid, $site, $pileups) = @_;
> if( $site == $position ) {
>    $total = scalar @$pileups;
>        my $refBase = $sam->segment($seqid, $site, $site)->dna;
>
>        for my $pileup (@$pileups){
>            my $al = $pileup->alignment;
>            my $qSeq = $al->qseq;
>
>            my $qStr = substr($qSeq, $pileup->qpos, 1);
>        }
>    }
> }
> $sam->pileup("ARP6:5134-5134", $cb);
> print "total is $total\n";
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> And the output for the total is 8088
>
> I am curious if there is anything I did wrong or I missed anything. Your help is greatly appreciated! Thanks!
>
> Best,
>
> Peter
>
> _______________________________________________
> Bioperl-l mailing list
> [hidden email]
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


------------------------------------------------------------------------------
Sponsored by Intel(R) XDK
Develop, test and display web and hybrid apps with a single code base.
Download it for free now!
http://pubads.g.doubleclick.net/gampad/clk?id=111408631&iu=/4140/ostg.clktrk
_______________________________________________
Gmod-gbrowse mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
Reply | Threaded
Open this post in threaded view
|

Re: [Bioperl-l] bio::db::sam coverage problem

Keiran Raine
Hi Peter,

Please see max_pileup_cnt:

$sam->max_pileup_cnt([$new_cnt])

The Samtools library caps pileups at a set level, defaulting to 8000. The callback will not be invoked on a single position more than the level set by the cap, even if there are more reads. Called with no arguments, this method returns the current cap value. Called with a numeric argument, it changes the cap. There is currently no way to specify an unlimited cap.

This method can be called as an instance method or a class method.



Keiran Raine
Principal Bioinformatician
Cancer Genome Project
Wellcome Trust Sanger Institute

Tel:+44 (0)1223 834244 Ext: 7703
Office: H104

On 6 Dec 2013, at 22:31, "Fields, Christopher J" <[hidden email]> wrote:

Peter,

I’m going to cc the gmod-gbrowse list with this one, as Scott and Lincoln might be able to better answer you.

chris

On Dec 6, 2013, at 3:32 PM, Peter Huang <[hidden email]> wrote:

Hi there,

I am trying to use Bio::DB::SAM to process my sam file generated from samtools. However, I encountered one problem.

Let's say I want to extract a base from a specific position that contains a SNP. I'd like to obtain count information regarding the individual bases at this position for later analysis.
If I use samtools view
/share/apps/software/samtools/samtools   view  LibG_sort.bam   ARP6:5134-5134   -o filteredresults.sam
less filteredresults.sam | wc
32333  614327 17194532

However, when I tried to use the Bio::DB::SAM, the coverage is only 8088.

Here is what I have in my perl script:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
my $sam = Bio::DB::Sam->new( -bam => $bamFile, -fasta => $reference );
my $total;

my $cb = sub {
  my ($seqid, $site, $pileups) = @_;
if( $site == $position ) {
  $total = scalar @$pileups;
      my $refBase = $sam->segment($seqid, $site, $site)->dna;

      for my $pileup (@$pileups){
          my $al = $pileup->alignment;
          my $qSeq = $al->qseq;

          my $qStr = substr($qSeq, $pileup->qpos, 1);
      }
  }
}
$sam->pileup("ARP6:5134-5134", $cb);
print "total is $total\n";
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
And the output for the total is 8088

I am curious if there is anything I did wrong or I missed anything. Your help is greatly appreciated! Thanks!

Best,

Peter

_______________________________________________
Bioperl-l mailing list
[hidden email]
http://lists.open-bio.org/mailman/listinfo/bioperl-l


------------------------------------------------------------------------------
Sponsored by Intel(R) XDK
Develop, test and display web and hybrid apps with a single code base.
Download it for free now!
http://pubads.g.doubleclick.net/gampad/clk?id=111408631&iu=/4140/ostg.clktrk
_______________________________________________
Gmod-gbrowse mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse


-- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

------------------------------------------------------------------------------
Sponsored by Intel(R) XDK
Develop, test and display web and hybrid apps with a single code base.
Download it for free now!
http://pubads.g.doubleclick.net/gampad/clk?id=111408631&iu=/4140/ostg.clktrk
_______________________________________________
Gmod-gbrowse mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse