questions about generating gene model by GLEAN

questions about generating gene model by GLEAN

Zhan, Shuai
Hi,

Has anyone used GLEAN to generate a consensus gene set? Would you mind sharing your experience with me? I'd greatly appreciate any help.

I have been trying to run it for a couple of weeks. At first I found that it could not fetch the actual sequence from the database and returned an object instead, so I changed some of the code in fetchseq {} of GLEAN::Evidence::Base.
 
So far, it has begun processing my input and correctly analyzes the candidate start, stop, donor, and acceptor sites for the first contig.
But it still fails with an error that looks like "MSG: asking for tag value that does not exist Evidence".

$ glean-lca --database glean --user me --password 123 --param param.yaml > test.dat
No reference provided; attempting to analyze entire genome
Gathering evidence from 'FGENE' for scaff Contig0:1,1129007 ...
The initial exon of CDS:FGENESH(fgp17144.t1) did not begin with a valid start codon
Extending the terminal exon of CDS:FGENESH(fgp17240.t1) to next valid downstream stop codon
The initial exon of CDS:FGENESH(fgp17165.t1) did not begin with a valid start codon
Extending the initial exon of CDS:FGENESH(fgp17192.t1) to next valid upstream start codon
The initial exon of CDS:FGENESH(fgp17178.t1) did not begin with a valid start codon
Error providing evidence type: GeneModel
The error was:
------------- EXCEPTION -------------
MSG: asking for tag value that does not exist Evidence
STACK Bio::SeqFeature::Generic::get_tag_values /usr/lib/perl5/site_perl/5.8.8/Bio/SeqFeature/Generic.pm:517
STACK Glean::Site::dump /home/zhan/geneset/glean-gene/bin/../lib/Glean/Site.pm:52
STACK Glean::MLE::_add_evidence /home/zhan/geneset/glean-gene/bin/../lib/Glean/MLE.pm:167
STACK Glean::MLE::add_evidence /home/zhan/geneset/glean-gene/bin/../lib/Glean/MLE.pm:94
STACK (eval) /home/zhan/geneset/glean-gene/bin/../lib/Glean/MLE.pm:203
STACK Glean::MLE::estimate /home/zhan/geneset/glean-gene/bin/../lib/Glean/MLE.pm:202
STACK toplevel glean-gene/bin/glean-lca:172
 
I think it failed in the add_evidence method of Glean::MLE.
Following the stack trace, I also printed some key scalars with print statements.

For example:
It failed while adding the first candidate site to the evidence.
The current $site is Contig0:1011027:1011027:0:-1; its primary_tag is "start", its seq_id is "Contig0", and its strand is -1.
First, _add_evidence {} in Glean/MLE.pm was invoked.
The problem is that $site->dump needs the value of the "Evidence" tag for $site, but the "Evidence" tag does not exist for this site.
Then I found that each $site in @sites was created by list_starts {} of Glean::Evidence::GeneModel, but its initial tags are only "NextStop" and "ReadingFrame". I cannot find any hint of where the "Evidence" tag is added to $site.

The relevant GLEAN source code is listed below:
sub _add_evidence { # from Glean/MLE.pm
  my @sites = $evid->list_sites($scaff);
  for my $site (@sites) {
    ...
    $stype->{$sloc}->{_site} ||= $site->dump;
  }
  ...
}
sub dump { # from Glean/Site.pm
  ...
  return join("\t",
              $site->seq_id,
              $site->primary_tag,
              defined $site->score ? sprintf("%g", $site->score == 0 ? $site->worst : -log($site->score)) : "NA",
              ...
              join(";", $site->get_tag_values("Evidence"))
             );
  ...
}
sub list_sites { # from Evidence/GeneModel.pm
  ...
  push @sites, $self->list_starts(@cds);
  ...
}
sub list_starts { # from Evidence/GeneModel.pm
  ...
  push @sites, Glean::Site->new(-primary => "start",
                                  -start   => $pos + $str * (pos($startseq) - 3),
                                  -end     => $pos + $str * (pos($startseq) - 3),
                                  -strand  => $str,
                                  -frame   => 0,
                                  -source  => "$self",
                                  -seq_id  => $seq_id,
                                  -tag => { NextStop => $nextstop,
                                            ReadingFrame => $frame,
                                          },
                                 );
  ...
}
sub estimate { # from Glean/MLE.pm; the stack trace places this eval in Glean::MLE::estimate (MLE.pm:202-203)
  ...
  eval {
    $self->add_evidence($scaff, $evtype->new($db, $self->{_log}, $algo, $params));
  };
  croak("Error providing evidence type: $type\nThe error was:\n$@") if $@;
  ...
}
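
One possible workaround, sketched under assumptions: the exception in the trace is raised by get_tag_values when the tag is absent, and BioPerl's Bio::SeqFeature::Generic also provides has_tag, so the tag could be tested for before it is read. The MockSite class below is a hypothetical stand-in that only imitates that tag behaviour, so the snippet runs without BioPerl or GLEAN; evidence_field is a hypothetical helper, not part of GLEAN, and the tag values are made up. Whether an empty "Evidence" field is acceptable further down the GLEAN pipeline is not established here; the missing tag may instead mean that a step which normally adds it was skipped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a Glean::Site / Bio::SeqFeature::Generic object.
# It imitates only the tag methods used here: has_tag, add_tag_value and
# get_tag_values (which dies when the tag is absent, as in the trace above).
package MockSite;

sub new {
    my ($class, %tags) = @_;
    # Store each tag as an array reference of values.
    my %store = map { ($_ => [ $tags{$_} ]) } keys %tags;
    return bless { tags => \%store }, $class;
}

sub has_tag {
    my ($self, $tag) = @_;
    return exists $self->{tags}{$tag} ? 1 : 0;
}

sub add_tag_value {
    my ($self, $tag, $value) = @_;
    push @{ $self->{tags}{$tag} }, $value;
    return;
}

sub get_tag_values {
    my ($self, $tag) = @_;
    die "asking for tag value that does not exist $tag\n"
        unless $self->has_tag($tag);
    return @{ $self->{tags}{$tag} };
}

package main;

# Hypothetical helper: build the ";"-joined Evidence field without dying
# when the tag is missing.
sub evidence_field {
    my ($site) = @_;
    return $site->has_tag("Evidence")
        ? join(";", $site->get_tag_values("Evidence"))
        : "";
}

# A site created with only NextStop and ReadingFrame, as in list_starts.
# The values are made up for illustration.
my $site = MockSite->new(NextStop => 1010000, ReadingFrame => 0);

# Unguarded access reproduces the failure mode.
my $ok = eval { $site->get_tag_values("Evidence"); 1 };
print "unguarded call failed: $@" unless $ok;

# Guarded access returns an empty field instead.
printf "guarded field: '%s'\n", evidence_field($site);

# Once the tag exists, the guarded helper returns its values.
$site->add_tag_value("Evidence", "FGENE");
printf "after tagging: '%s'\n", evidence_field($site);
```

In dump, the same guard would replace the bare get_tag_values("Evidence") call; an alternative is to add the tag when the site is created in list_starts, if that matches what GLEAN expects.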

Sincerely
Shuai Zhan

_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

augustus error

Benjamin Hitz

Not sure if this is the correct forum, but I am just trying to get a basic MAKER pipeline running for yeast strains:

I get this error:
running  augustus.
#--------- command -------------#
Widget::augustus:
/usr/local/augustus.2.5/bin/augustus --species=saccharomyces --UTR=off /Users/hitz/comparativeGenomics/maker/ab972/ab972_V64.maker.output/ab972_V64_datastore/C8/8A/chr02_2010_05_21//theVoid.chr02_2010_05_21/query.masked.fasta > /Users/hitz/comparativeGenomics/maker/ab972/ab972_V64.maker.output/ab972_V64_datastore/C8/8A/chr02_2010_05_21//theVoid.chr02_2010_05_21/chr02_2010_05_21.masked.all.saccharomyces.augustus
#-------------------------------#
sh: line 1: 33099 Segmentation fault      /usr/local/augustus.2.5/bin/augustus --species=saccharomyces --UTR=off /Users/hitz/comparativeGenomics/maker/ab972/ab972_V64.maker.output/ab972_V64_datastore/C8/8A/chr02_2010_05_21//theVoid.chr02_2010_05_21/query.masked.fasta > /Users/hitz/comparativeGenomics/maker/ab972/ab972_V64.maker.output/ab972_V64_datastore/C8/8A/chr02_2010_05_21//theVoid.chr02_2010_05_21/chr02_2010_05_21.masked.all.saccharomyces.augustus
ERROR: Augustus failed

FATAL ERROR
ERROR: Failed while preparing masked sequence and ab-inits!!

ERROR: Chunk failed at level 6
!!
FAILED CONTIG:chr02_2010_05_21


Augustus doesn't always fail, though; it seems to fail only at the step where it creates the *all*.augustus files. It works on 4 of the 17 chromosomes, and the "fragmented" contigs (auto_annotator files) seem to produce results.

This is being run on a 64-bit Intel Mac running Mac OS X 10.6.5.
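
To check which inputs trigger the crash outside of MAKER, one option is to rerun the exact command that MAKER printed and inspect the wait status. The following is a minimal sketch in Perl, not a MAKER feature: describe_status is a hypothetical helper, and the child commands are small stand-ins rather than Augustus. The Augustus command line from the log could be passed to system in the same way.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn a wait status (the value of $? after system)
# into a short description. The low 7 bits hold the number of the signal
# that killed the child, if any; the high byte holds its exit code.
sub describe_status {
    my ($status) = @_;
    return "could not start command" if $status == -1;
    my $signal = $status & 127;
    return "killed by signal $signal" if $signal;
    my $exit = $status >> 8;
    return $exit == 0 ? "ok" : "exited with code $exit";
}

# Stand-in commands; $^X is the path of the running perl interpreter.
my @commands = (
    [ $^X, "-e", "exit 0" ],
    [ $^X, "-e", "exit 3" ],
    [ $^X, "-e", 'kill "SEGV", $$; sleep 1' ],   # simulates a segmentation fault
);

for my $cmd (@commands) {
    system(@$cmd);
    print join(" ", @$cmd), " => ", describe_status($?), "\n";
}
```

A child reported as killed by a signal, rather than exiting with a non-zero code, points at a crash inside the program itself and not at an error it detected and reported.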

Ben

--
Ben Hitz
Senior Scientific Programmer
Saccharomyces Genome Project
Stanford University
[hidden email]





Re: questions about generating gene model by GLEAN

Carson Hinton Holt
In reply to this post by Zhan, Shuai
I don’t use GLEAN, but you can use MAKER to build consensus gene sets. You just provide it with the existing models in GFF3 format, along with a protein or EST dataset to help guide its decisions on which model is best supported. If you also select HMM training sets for SNAP or Augustus, then MAKER can build new de novo gene models as well.

If you do not have a GFF3 file for existing genes, you can use map2assembly, which comes with MAKER, to map FASTA entries for existing transcripts to the assembly and produce a GFF3 file.
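
As a rough illustration of the setup described above, the relevant entries of a MAKER control file might look like the fragment below. This is a sketch under assumptions: the option names are those found in the maker_opts.ctl template that `maker -CTL` generates in MAKER 2 releases and should be checked against the template from your own installation, and all file names are placeholders.

```
#-----maker_opts.ctl (fragment; file names are placeholders)
genome=assembly.fasta            #genome sequence (FASTA)
est=transcripts.fasta            #EST/transcript evidence (FASTA)
protein=proteins.fasta           #protein evidence (FASTA)
pred_gff=existing_models.gff3    #existing gene predictions (GFF3)
snaphmm=                         #optional: SNAP HMM file for de novo models
augustus_species=                #optional: Augustus species model
```

The template also has a model_gff entry for annotated models that should be passed through; which of the two entries suits a given set of existing models depends on how they should be treated, so the comments in the generated template are the reference.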

Thanks,
Carson


On 1/28/11 9:48 AM, "Zhan, Shuai" <Shuai.Zhan@...> wrote:

[...]


