bp_seqfeature_delete.pl (memory issue)

bp_seqfeature_delete.pl (memory issue)

Hyunmin Kim
Hi, all.

I ran bp_seqfeature_delete.pl on a machine with about 15 GB of memory, and Linux stopped the process when it used too much memory.

Is there another way to delete my GFF features from the MySQL database?

My command:
$ bp_seqfeature_delete.pl -d database_name -t 'Transposon:RepeatMasker' -u gmod -p gmod_password

I don't know much about the MySQL database.
If you can tell me about the MySQL table structure, I can delete my GFF features with a MySQL query.

Thanks,
Hyunmin

_______________________________________________
Gmod-gbrowse mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
Re: bp_seqfeature_delete.pl (memory issue)

Timothy Parnell
Hi Hyunmin,

The bp_seqfeature_delete.pl script will take a lot of memory because it searches for all the features in advance and loads them in memory prior to deleting. A much more efficient approach would be to use a stream iterator and delete features as you find them. This simple snippet of code should get you started.

#!/usr/bin/perl

use strict;
use warnings;
use Bio::DB::SeqFeature::Store;

# Open the feature database with write access so features can be deleted.
my $db = Bio::DB::SeqFeature::Store->new(
    -adaptor => 'DBI::mysql',
    -dsn     => 'database_name',
    -user    => 'gmod',
    -pass    => 'gmod_password',
    -write   => 1,
) or die "can't connect to database\n";

# Stream matching features one at a time instead of loading them all.
my $stream = $db->get_seq_stream(
    -types => 'Transposon:RepeatMasker',
);

my $deleted = 0;
while (my $f = $stream->next_seq) {
    $deleted++ if $db->delete($f);
}
print "$deleted features were deleted\n" if $deleted;

__END__

This approach should work well for simple features. For complex features with parentage and/or normalized subfeatures, e.g. genes with multiple transcripts and shared exons, it likely will not work as is and will require additional recursive handling of the subfeatures. The simplest approach for that scenario would be to modify the source GFF3 and reload.
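
If reloading is not an option, the recursive handling could look roughly like the sketch below. It assumes the features respond to get_SeqFeatures() and primary_id(), as Bio::DB::SeqFeature objects do, and that $db is a store handle opened with -write => 1. The helper name delete_with_subfeatures and the $seen hash are made up for illustration and are not part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper: delete a feature and everything beneath it,
# children first. Returns the number of features for which
# $db->delete reported success.
sub delete_with_subfeatures {
    my ($db, $feature, $seen) = @_;
    $seen ||= {};

    # Skip anything already handled, e.g. an exon shared by two transcripts.
    my $id = $feature->primary_id;
    return 0 if defined $id && $seen->{$id}++;

    my $deleted = 0;
    for my $child ($feature->get_SeqFeatures) {
        $deleted += delete_with_subfeatures($db, $child, $seen);
    }
    $deleted++ if $db->delete($feature);
    return $deleted;
}
```

In the loop above you would call delete_with_subfeatures($db, $f) in place of $db->delete($f) and add its return value to $deleted. Be aware that a subfeature shared with a parent you are not deleting would be removed as well, so check how your data was loaded before trying this on anything but a copy.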

Hope that helps,
Tim

On May 19, 2014, at 7:39 PM, Hyunmin Kim <[hidden email]> wrote:

> [original message quoted in full above]
Re: bp_seqfeature_delete.pl (memory issue)

Hyunmin Kim
Thanks for your kind reply.

I’ll try it.

Best,
Hyunmin

On May 21, 2014, at 5:42 AM, Timothy Parnell <[hidden email]> wrote:

> [reply quoted in full above]

