gmod_bulk_load_gff3.pl delete option


Daniel Ence
Hi all, I have been using gmod_bulk_load_gff3.pl to delete old features from one scaffold and to add new features on the same scaffold. Whenever I run gmod_bulk_load_gff3.pl with the --delete option, I get a chado-delete file that is sometimes hundreds of GB in size; it fills whatever disk I'm running on and halts the process. I haven't been able to get the process to continue by deleting or clearing that chado-delete file.

Has anyone else experienced the same thing or found a way to avoid filling up the disk?

Thanks,
Daniel

Daniel Ence
Graduate Student
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330

_______________________________________________
Gmod-schema mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-schema
Re: gmod_bulk_load_gff3.pl delete option

Scott Cain
Hi Daniel,

That's interesting.  I wonder if there are elements that are getting
repeated, like when you were doling out cookies as a kid (one for you,
one for me, two for you, one-two for me, etc :-)  I realize with a
file that big it might be hard to sort out but can you tell if
anything is weird about it?  Perhaps you could try with a small GFF
file to see how that works.  I honestly haven't even tested the delete
option in a few years (though I don't think anything should have
changed either).
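
A small test file could be carved out of the full GFF3 along these lines. This is only a sketch in Python: the function name subset_gff3, the max_features cutoff, and the demo lines are illustrative assumptions and are not part of the GMOD tools.

```python
import io


def subset_gff3(lines, seqid, max_features=100):
    """Yield comment/directive lines plus up to max_features feature lines on seqid."""
    kept = 0
    for line in lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; nothing more to scan
        if line.startswith("#"):
            yield line  # keep directives such as ##gff-version
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[0] != seqid:
            continue  # skip malformed lines and other scaffolds
        if kept >= max_features:
            break
        kept += 1
        yield line


# Made-up demo lines; a real run would iterate over an open GFF3 file instead.
demo = (
    "##gff-version 3\n"
    "scaffold_1\tmaker\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=gene1\n"
    "scaffold_2\tmaker\tgene\t50\t400\t.\t-\t.\tID=gene2;Name=gene2\n"
    "scaffold_1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1\n"
)
small = list(subset_gff3(io.StringIO(demo), "scaffold_1"))
print("".join(small), end="")
```

A hard cutoff like this can separate child features from their parents, so the result is only meant for a quick look at how large the chado-delete file gets on a small input.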

Scott


On Wed, Jan 9, 2013 at 1:54 PM, Daniel Ence <[hidden email]> wrote:

> Hi all, I have been using gmod_bulk_load_gff3.pl to delete old
> features from one scaffold and to add new features on the same scaffold.
> Whenever I run gmod_bulk_load_gff3.pl with the --delete option, I get a
> chado-delete file that is sometimes hundreds of GB in size; it fills
> whatever disk I'm running on and halts the process. I haven't been able to
> get the process to continue by deleting or clearing that chado-delete file.
>
> Has anyone else experienced the same thing or found a way to avoid filling
> up the disk?
>
> Thanks,
> Daniel



--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

Re: gmod_bulk_load_gff3.pl delete option

Daniel Ence
Hi Scott,

Thanks for that suggestion. I will try the debugging that you suggested and let you know what I find out.

One observation that I do have about the repeated deletions is that the output included many lines like this:

          Deleting all features with name species:rnd-1_family-242|genus:LTR, type match and organism cardio_obsc

For any given repeat-masked element, there might be anywhere from 2 to 12,000 repeated lines like that in the output. This might be a problem with how the GFF3 is parsed: repeat-masked elements of the same type have the same Name attribute, but their ID attributes are different. I don't know whether this would cause the giant file problem I described before, but I am worried that it is deleting all of the repeated elements across the whole genome.
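
One way to check how many features share a Name is to count (Name, type) pairs in the GFF3 before loading. The sketch below is in Python and makes assumptions: the helper names parse_attributes and count_shared_names are made up for illustration, the demo lines are invented (only the Name value is taken from the log line above), and it does not reproduce how gmod_bulk_load_gff3.pl itself parses or deletes anything.

```python
from collections import Counter
from urllib.parse import unquote


def parse_attributes(col9):
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for pair in col9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = unquote(value)  # GFF3 values are percent-encoded
    return attrs


def count_shared_names(lines):
    """Return {(Name, type): count} for names carried by more than one feature."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        name = parse_attributes(cols[8]).get("Name")
        if name is not None:
            counts[(name, cols[2])] += 1
    return {key: n for key, n in counts.items() if n > 1}


# Invented demo lines: two repeat matches share a Name but have distinct IDs.
demo = [
    "##gff-version 3\n",
    "scaffold_1\trepeatmasker\tmatch\t10\t200\t.\t+\t.\tID=rm1;Name=species:rnd-1_family-242|genus:LTR\n",
    "scaffold_7\trepeatmasker\tmatch\t500\t900\t.\t-\t.\tID=rm2;Name=species:rnd-1_family-242|genus:LTR\n",
    "scaffold_1\tmaker\tgene\t1000\t3000\t.\t+\t.\tID=gene1;Name=gene1\n",
]
shared = count_shared_names(demo)
for (name, ftype), n in shared.items():
    print(f"{n}\t{ftype}\t{name}")
```

If deletion really is keyed on name, type, and organism, as the log line suggests, then any pair reported here would match features well beyond the scaffold being reloaded.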

Thanks,
Daniel


________________________________________
From: Scott Cain [[hidden email]]
Sent: Wednesday, January 09, 2013 1:49 PM
To: Daniel Ence
Cc: [hidden email]
Subject: Re: [Gmod-schema] gmod_bulk_load_gff3.pl delete option

Hi Daniel,

That's interesting.  I wonder if there are elements that are getting
repeated, like when you were doling out cookies as a kid (one for you,
one for me, two for you, one-two for me, etc :-)  I realize with a
file that big it might be hard to sort out but can you tell if
anything is weird about it?  Perhaps you could try with a small GFF
file to see how that works.  I honestly haven't even tested the delete
option in a few years (though I don't think anything should have
changed either).

Scott

