Re: MAKER run error

Re: MAKER run error

Scott Cain
Hi Gulnara,

I don't know, though my guess would be that you're running out of memory. I'm cc'ing the MAKER developers' mailing list to see if anybody there knows.

Scott


On Wed, Dec 6, 2017 at 8:36 PM, Gulnara Tagirdzhanova <[hidden email]> wrote:
Hello,

I got this error running maker on mac:

STATUS: Parsing control files...
STATUS: Processing and indexing input FASTA files...
HASH: Out of overflow pages. Increase page size
HASH: Out of overflow pages. Increase page size
HASH: Out of overflow pages. Increase page size
HASH: Out of overflow pages. Increase page size
HASH: Out of overflow pages. Increase page size
HASH: Out of overflow pages. Increase page size
HASH: Out of overflow pages. Increase page size
HASH: Out of overflow pages. Increase page size
HASH: Out of overflow pages. Increase page size
HASH: Out of overflow pages. Increase page size
HASH: Out of overflow pages. Increase page size
HASH: Out of overflow pages. Increase page size
HASH: Out of overflow pages. Increase page size
Filesize limit exceeded: 25

Is there anything that could solve it?

Thank you,
Gulnara





--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Re: MAKER run error

Carson Holt-2
The FASTA file gets indexed by BioPerl using Berkeley DB. I'm guessing there is something odd about your input file and the database has run out of hash overflow pages while indexing it. You can search for a Berkeley DB setting that you can configure on Mac. But I suspect you are giving MAKER something like the raw reads from an mRNA-seq experiment or a DNA sequencing run (resulting in billions of entries to be indexed), which would be incorrect. MAKER can't handle raw data. You must first assemble it, for example with Trinity for mRNA-seq reads.

Thanks,
Carson


Re: Private message regarding: MAKER run error

Carson Holt-2
The issue is with Berkeley DB. BioPerl uses Perl's DB_File module to index the FASTA files.

1. Make sure you do not have an extremely large number of reads in the FASTA files (e.g. raw mRNA-seq data, which cannot be used directly as input to MAKER; you must first assemble it into transcriptome contigs).
2. Reinstall Perl and compile it against the newly installed Berkeley DB libraries.
3. Remove the brew-installed Berkeley DB and use Perl's precompiled DB_File module.

You can count the sequences in your FASTA input with this command (replace file.fasta with your file name):

grep -c ">" file.fasta

If your counts are really high (i.e. more than a few hundred thousand), then you have a data issue. You are giving either too much data or the wrong data as input.

—Carson
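
For anyone checking several input files at once, the counting step can be wrapped in a small shell helper. This is only a sketch: the function names and the WARN_LIMIT value of 300000 are assumptions chosen to mirror the "few hundred thousand" rule of thumb above, not anything MAKER defines. It anchors the pattern at the start of the line so that a stray ">" inside a sequence or description is not counted.

```shell
#!/bin/sh
# Hypothetical threshold (not a MAKER setting): counts above this suggest
# raw reads rather than an assembly or a curated evidence set.
WARN_LIMIT=${WARN_LIMIT:-300000}

# Print the number of FASTA header lines (lines starting with ">") in $1.
count_fasta_records() {
    # grep -c exits non-zero when the count is 0, so ignore its status.
    grep -c '^>' "$1" || true
}

# Print one line per file, flagging files whose count exceeds WARN_LIMIT.
check_fasta_inputs() {
    for f in "$@"; do
        n=$(count_fasta_records "$f")
        n=${n:-0}   # unreadable file: treat as zero records
        if [ "$n" -gt "$WARN_LIMIT" ]; then
            echo "$f: $n records (suspiciously many; raw reads?)"
        else
            echo "$f: $n records"
        fi
    done
}
```

After sourcing the file, something like `check_fasta_inputs genome.fna est.fasta protein.faa` (placeholder file names) prints one line per input.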



On Apr 11, 2018, at 11:39 AM, ohon Kin <[hidden email]> wrote:


Hello Carson,

I would really appreciate your help; I am having much the same issue.
I get this error when I run MAKER. I assumed that it required a large amount of memory.

STATUS: Processing and indexing input FASTA files...
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
Filesize limit exceeded: 25

While MAKER is running, even 1 TB of hard disk capacity does not seem to be enough for the annotation.
I think something is wrong with my input data or the dependencies.
Could you please advise on the matter and suggest possible solutions?

I installed Berkeley DB using brew.

The input given to MAKER is as follows:
Genome, EST, and protein, all in FASTA format, downloaded from NCBI and then passed directly to MAKER for annotation.

Do I have to pre-process these data before giving them to MAKER?









Re: Private message regarding: MAKER run error

ohon Kin

grep -c ">" Ca_kacst.fna
32572

The ESTs I have are assembled into contigs:

grep -c ">" Ca_EST
23602

grep -c ">" Ca__protein.faa
26729

These are my input data. I have reinstalled Perl as you instructed, but the run still stops partway through, and even 1 TB of disk space is not enough. Please have a look.

I get this error:

ad$ ./maker
STATUS: Parsing control files...
WARNING: 'max_dna_len' is set too low.  The minimum value permited is 50,000.
max_dna_len will be reset to 50,000

STATUS: Processing and indexing input FASTA files...
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
HASH: Out of overflow pages.  Increase page size
Filesize limit exceeded: 25



My maker_opt file:


#-----Genome (these are always required)
genome=/Users/mohanad/Documents/maker/data/Ca_dromedarius_kacst.fna   #genome sequence (fasta file or fasta embeded in GFF3 file)
organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic

#-----Re-annotation Using MAKER Derived GFF3
maker_gff= #MAKER derived GFF3 file
est_pass=0 #use ESTs in maker_gff: 1 = yes, 0 = no
altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no
protein_pass=0 #use protein alignments in maker_gff: 1 = yes, 0 = no
rm_pass=0 #use repeats in maker_gff: 1 = yes, 0 = no
model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no
pred_pass=0 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no
other_pass=0 #passthrough anyything else in maker_gff: 1 = yes, 0 = no

#-----EST Evidence (for best results provide a file for at least one)
est=/Users/mohanad/Documents/maker/data/Ca_dromedarius_EST #set of ESTs or assembled mRNA-seq in fasta format
altest= #EST/cDNA sequence file in fasta format from an alternate organism
est_gff= #aligned ESTs or mRNA-seq from an external GFF3 file
altest_gff= #aligned ESTs from a closly relate species in GFF3 format

#-----Protein Homology Evidence (for best results provide a file for at least one)
protein=/Users/mohanad/Documents/maker/data/Ca_dromedarius_V1.0_protein.faa    #protein sequence file in fasta format (i.e. from mutiple oransisms)
protein_gff=  #aligned protein homology evidence from an external GFF3 file

#-----Repeat Masking (leave values blank to skip repeat masking)
model_org=all #select a model organism for RepBase masking in RepeatMasker
rmlib= #provide an organism specific repeat library in fasta format for RepeatMasker
repeat_protein= #provide a fasta file of transposable element proteins for RepeatRunner
rm_gff= #pre-identified repeat elements from an external GFF3 file
prok_rm=0 #forces MAKER to repeatmask prokaryotes (no reason to change this), 1 = yes, 0 = no
softmask=1 #use soft-masking rather than hard-masking in BLAST (i.e. seg and dust filtering)

#-----Gene Prediction
snaphmm= #SNAP HMM file
gmhmm= #GeneMark HMM file
augustus_species= #Augustus gene prediction species model
fgenesh_par_file= #FGENESH parameter file
pred_gff= #ab-initio predictions from an external GFF3 file
model_gff= #annotated gene models from an external GFF3 file (annotation pass-through)
est2genome=1 #infer gene predictions directly from ESTs, 1 = yes, 0 = no
protein2genome=0 #infer predictions from protein homology, 1 = yes, 0 = no
trna=0 #find tRNAs with tRNAscan, 1 = yes, 0 = no
snoscan_rrna= #rRNA file to have Snoscan find snoRNAs
unmask=0 #also run ab-initio prediction programs on unmasked sequence, 1 = yes, 0 = no

#-----Other Annotation Feature Types (features MAKER doesn't recognize)
other_gff= #extra features to pass-through to final MAKER generated GFF3 file

#-----External Application Behavior Options
alt_peptide=C #amino acid used to replace non-standard amino acids in BLAST databases
cpus=1 #max number of cpus to use in BLAST and RepeatMasker (not for MPI, leave 1 when using MPI)

#-----MAKER Behavior Options
max_dna_len=10000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=1 #skip genome contigs below this length (under 10kb are often useless)

pred_flank=200 #flank for extending evidence clusters sent to gene predictors
pred_stats=0 #report AED and QI statistics for all predictions as well as models
AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)
min_protein=0 #require at least this many amino acids in predicted proteins
alt_splice=0 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no
always_complete=0 #extra steps to force start and stop codons, 1 = yes, 0 = no
map_forward=0 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)

split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)
single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no
single_length=250 #min length required for single exon ESTs if 'single_exon is enabled'
correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes

tries=2 #number of times to try a contig if there is a failure for some reason
clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no
clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no
TMP= #specify a directory other than the system default temporary directory for temporary files
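
A control file like the one above can be sanity-checked before a run. The sketch below is hypothetical and not part of MAKER: `ctl_value` and `check_maker_opts` are names made up for illustration, it assumes each of `genome`, `est` and `protein` holds at most one path, and the 50000 floor is taken from the WARNING line in the log above.

```shell
#!/bin/sh
# Print the value of key $1 from control file $2, with the trailing
# "#comment" and any trailing whitespace removed.
ctl_value() {
    sed -n "s/^$1=\([^#]*\).*/\1/p" "$2" | sed 's/[[:space:]]*$//' | head -n 1
}

# Report input files that are missing or empty, and a max_dna_len below the
# minimum that MAKER warns about. Returns 1 if any input file is unusable.
check_maker_opts() {
    ctl=$1
    status=0
    for key in genome est protein; do
        path=$(ctl_value "$key" "$ctl")
        # Blank values are allowed (the option is simply unused).
        if [ -n "$path" ] && [ ! -s "$path" ]; then
            echo "$key: '$path' is missing or empty"
            status=1
        fi
    done
    len=$(ctl_value max_dna_len "$ctl")
    if [ -n "$len" ] && [ "$len" -lt 50000 ]; then
        echo "max_dna_len=$len is below 50000; MAKER will reset it"
    fi
    return $status
}
```

Running `check_maker_opts` on the options file (usually named maker_opts.ctl) only catches path and range mistakes; it does not say anything about the Berkeley DB indexing failure itself.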


--

Warning: This message and its attachment, if any, are confidential and may contain information protected by law. If you are not the intended recipient, please contact the sender immediately and delete the message and its attachment, if any. You should not copy the message and its attachment, if any, or disclose its contents to any other person or use it for any purpose. Statements and opinions expressed in this e-mail and its attachment, if any, are those of the sender, and do not necessarily reflect those of KACST. KACST accepts no liability for any damage caused by this email.


Re: Private message regarding: MAKER run error

Carson Holt-2
The datasets do not look too large. The failure you are seeing is happening outside of MAKER, so there is something wrong on the system itself. You will probably have to reinstall Perl against your local libraries, especially if you reinstalled Berkeley DB. Or try downloading the latest stable release of Perl (it comes precompiled against static libraries, including Berkeley DB version 1.x, which can help avoid some issues). You will then have to reinstall MAKER to use that version of Perl (MAKER uses the perl that was used to call Build.PL during the install).

If you are running on something like FreeBSD, that may in itself break Perl's DB_File.

Also note this from the DB_File documentation on CPAN:
Although DB_File is intended to be used with Berkeley DB version 1, it can also be used with version 2, 3 or 4. In this case the interface is limited to the functionality provided by Berkeley DB 1.x.

If reinstalling the tools does not work around your issue, you may just have to run on a different system.

—Carson
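
Before and after reinstalling, it can help to see which perl is first on the PATH and which Berkeley DB version its DB_File was built against. The helper below is a sketch with a made-up name (`db_file_info`); it relies on the `$DB_File::VERSION` and `$DB_File::db_version` package variables, which DB_File sets as far as I know, and it reports only the perl found on the PATH, which may differ from the one MAKER was built with.

```shell
#!/bin/sh
# Print the perl found on PATH and, if DB_File loads, the module version
# and the Berkeley DB version it was compiled against.
db_file_info() {
    if ! command -v perl >/dev/null 2>&1; then
        echo "perl not found on PATH"
        return 0
    fi
    echo "perl: $(command -v perl)"
    # Single quotes keep the shell from expanding the Perl variables.
    perl -MDB_File -e 'print "DB_File $DB_File::VERSION, Berkeley DB $DB_File::db_version\n"' 2>/dev/null \
        || echo "DB_File could not be loaded by this perl"
}
```

If the reported perl is not the one used to call Build.PL, run the same one-liner with that interpreter's full path instead.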

