passing names from a gff to new predictions

passing names from a gff to new predictions

UMD Bioinformatics
Hello

I have an interesting issue with an existing MAKER GFF file. It contains human-friendly names that I would like to pass to the new predictions. However, some of the genes in that file are incorrect or have errors. If I use the GFF as model_gff or pred_gff with map_forward=1, the names move, but so do the incorrect models: MAKER simply duplicates these predictions in the new output. If I remove the GFF file from the ctl file, I get new predictions that have the necessary corrections, but they now have unfriendly names. Do you have any suggestions on how to associate the old names with the new predictions? I could simply BLAST the old proteins against the new ones and associate them in that manner, but I was wondering if there were any other options within MAKER.
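
For reference, the BLAST fallback mentioned above could be scripted along these lines. This is a minimal Perl sketch, not part of MAKER. It assumes the new proteins were used as queries against the old proteins with tabular output (-outfmt 6), where column 1 is the query ID, column 2 the subject ID, and column 12 the bit score. The sub name best_hits and all sample IDs are hypothetical.

```perl
use strict;
use warnings;

# Map each new protein ID (query) to the old, human-friendly ID (subject)
# of its highest-scoring hit in BLAST tabular output (-outfmt 6).
sub best_hits {
    my ($lines) = @_;
    my (%best, %score);
    for my $line (@$lines) {
        chomp(my $copy = $line);
        next if $copy =~ /^#/ || $copy !~ /\S/;    # skip comments and blanks
        my @f = split /\t/, $copy;
        next unless @f >= 12;                      # need all 12 standard columns
        my ($query, $subject, $bits) = @f[0, 1, 11];
        if (!exists $score{$query} || $bits > $score{$query}) {
            $score{$query} = $bits;
            $best{$query}  = $subject;
        }
    }
    return %best;
}

# Hypothetical sample rows: qseqid sseqid pident length mismatch gapopen
# qstart qend sstart send evalue bitscore
my @rows = (
    join("\t", qw(new1 GeneA-RA 98.5 200 3 0 1 200 1 200 1e-100 380)) . "\n",
    join("\t", qw(new1 GeneB-RA 45.0 180 90 5 10 190 12 185 1e-20 95)) . "\n",
    join("\t", qw(new2 GeneB-RA 99.0 210 2 0 1 210 1 210 1e-110 410)) . "\n",
);

my %name_for = best_hits(\@rows);
print "$_\t$name_for{$_}\n" for sort keys %name_for;
```

A single best hit can mis-assign names among close paralogs, so a reciprocal check (old against new as well) may be worth adding before trusting the mapping.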

Since I have the GFF files I also have the associated transcripts and proteins.
Do I need to do some iteration of est2genome and then generate a new model GFF file?

The issue we are dealing with is thousands of short introns in our GFF file. These are shorter than 20 bp and are not biologically feasible, so we are trying to correct the gene model predictions.
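
One way to list such introns, sketched in Perl under some assumptions: the file is GFF3, exons carry a Parent attribute naming their transcript, and an intron is taken to be the gap between consecutive exons of the same parent. The sub name short_introns and the sample records are hypothetical and not part of MAKER.

```perl
use strict;
use warnings;

# Return [parent_id, intron_start, intron_end, length] for every intron
# shorter than $min_len, inferred from gaps between consecutive exons.
sub short_introns {
    my ($lines, $min_len) = @_;
    my %exons;    # parent ID => list of [start, end]
    for my $line (@$lines) {
        chomp(my $copy = $line);
        next if $copy =~ /^#/ || $copy !~ /\S/;
        my @f = split /\t/, $copy;
        next unless @f == 9 && $f[2] eq 'exon';
        my ($parents) = $f[8] =~ /(?:^|;)Parent=([^;]+)/ or next;
        # An exon shared by several transcripts lists comma-separated parents.
        push @{ $exons{$_} }, [ $f[3], $f[4] ] for split /,/, $parents;
    }
    my @hits;
    for my $id (sort keys %exons) {
        my @sorted = sort { $a->[0] <=> $b->[0] } @{ $exons{$id} };
        for my $i (1 .. $#sorted) {
            my $start = $sorted[$i - 1][1] + 1;    # first intron base
            my $end   = $sorted[$i][0] - 1;        # last intron base
            my $len   = $end - $start + 1;
            push @hits, [ $id, $start, $end, $len ]
                if $len > 0 && $len < $min_len;
        }
    }
    return @hits;
}

# Hypothetical sample: one 10 bp intron (201-210) and one 199 bp intron.
my @gff = (
    "##gff-version 3\n",
    join("\t", qw(chr1 maker exon 100 200 . + .), "ID=e1;Parent=mRNA1") . "\n",
    join("\t", qw(chr1 maker exon 211 300 . + .), "ID=e2;Parent=mRNA1") . "\n",
    join("\t", qw(chr1 maker exon 500 600 . + .), "ID=e3;Parent=mRNA1") . "\n",
);

print join("\t", @$_), "\n" for short_introns(\@gff, 20);
```

Running the same check on the new output would show whether the corrected predictions still contain any introns under the chosen cutoff.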

Cheers
Ian



_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
Re: passing names from a gff to new predictions

Carson Holt-2
If you give anything to pred_gff or model_gff, it is allowed to compete as a predictor and thus can end up in the final results. You stated that the models you are passing in have errors, and you don't want them to be allowed to compete and end up in your final models, correct?

MAKER is not designed to expect erroneous input, so I don't have an easy solution for you. I do have a less easy solution, though, but you will need to do some editing of the MAKER code.

  1. Open .../maker/lib/maker/auto_annotator.pm in an editor like emacs or vi.
  2. Search for the 'best_annotations' subroutine (around line 1248 depending on which version of MAKER you have).
  3. Then edit it as follows:

This is how the top section of the subroutine should look at first -->

sub best_annotations {
     my $annotations = shift;
     my $CTL_OPT = shift;

     my @predictors = @{$CTL_OPT->{_predictor}};

...

Change it to this -->

sub best_annotations {
     my $annotations = shift;
     my $CTL_OPT = shift;

     my @predictors = grep {!/model_gff/} @{$CTL_OPT->{_predictor}};

...



Now run MAKER again with your old GFF3 file as input to model_gff, and just remember to change the MAKER code back to the way it was when you're done with everything. Basically, the change will hard-filter model_gff results from being allowed into your final annotations. So names will still move from model_gff to your final results with the map_forward=1 option, but none of the old models will make it as gene/mRNA/exon/CDS features in the final GFF3 (they will still be listed as match/match_part reference features though).
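
To see what the edited line does in isolation, here is a small standalone Perl sketch. The hash below is only a hypothetical stand-in for MAKER's $CTL_OPT, and the predictor names are illustrative; the real contents of _predictor depend on your control file and MAKER version.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the predictor list MAKER keeps in
# $CTL_OPT->{_predictor}; real values depend on the control file.
my $CTL_OPT = { _predictor => [qw(snap augustus model_gff pred_gff)] };

# Same filter as the edited line: keep every predictor whose name
# does not match the pattern /model_gff/.
my @predictors = grep { !/model_gff/ } @{ $CTL_OPT->{_predictor} };

print join(',', @predictors), "\n";    # prints "snap,augustus,pred_gff"
```

Note that the pattern only matches names containing model_gff, so in this sketch an entry named pred_gff is kept in the list.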

Thanks,
Carson



(In reply to the message posted by "UMD Bioinformatics" <[hidden email]> on 4/15/14, 11:01 AM.)


Re: passing names from a gff to new predictions

UMD Bioinformatics
Carson,

That seems to fix this issue. Thanks for the insight; it is not something I would ever have come up with.

Cheers
Ian

(In reply to the message posted by Carson Holt <[hidden email]> on Apr 15, 2014, at 1:31 PM.)