Robert Auber
Hello MAKER devs,

I have been attempting to run MAKER2 (2.31.10) on an assembly of a haptophyte genome (~160 MB). After running four rounds of MAKER2 (details of the pipeline are shown in the diagram below), I obtained a reasonable set of gene calls with great BUSCO completeness and AED scores. However, ~11,00 of the ~42,000 gene models have a CDS that overlaps another gene model's CDS. There are no overlaps on the same strand (thank goodness), and I have read in the Google Group threads that overlapping gene models on opposite strands can occur, but I have not seen it to this extent. Below are some examples of overlapping models. Interestingly, some of these models have exon structures that differ from those of their overlapping partner.

Screen Shot 2019-11-05 at 12.29.53 PM.png

Screen Shot 2019-11-05 at 12.31.17 PM.png

I have tried extensively to filter out these overlapping gene models using different thresholds, such as AED/eAED scores and the amount of EST/protein/etc. support, but the majority of these genes show only minimal differences in AED score and have multiple lines of support. The only trend I have pulled from the output is that 40% of the overlapping genes are labeled as 'snap' or 'snap processed' genes, though again most of these do have support from other sources according to the GFF3. I would be very appreciative of any thoughts or input you may have on these overlapping gene models. If you need more metrics or information on what I did when running MAKER2 to help clarify anything, please let me know!
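
For anyone who wants to reproduce this kind of count, here is a minimal Python sketch of one way to find gene pairs whose CDS overlap on opposite strands in a GFF3 file. It is not the script used for the numbers above. It assumes the usual gene > mRNA > CDS hierarchy linked by ID/Parent attributes, and the function names are hypothetical, not part of MAKER.

```python
from collections import defaultdict


def parse_attributes(field):
    """Turn a GFF3 attribute column ('ID=x;Parent=y') into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key] = value
    return attrs


def opposite_strand_cds_overlaps(gff_lines):
    """Return sorted (gene_a, gene_b) pairs whose CDS overlap on opposite strands.

    Assumption: CDS features point to an mRNA via Parent, and mRNA features
    point to a gene via Parent. This helper is illustrative only.
    """
    mrna_to_gene = {}
    cds = defaultdict(list)  # seqid -> [(start, end, strand, mrna_id)]
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section, no more features
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue
        seqid, _src, ftype, start, end, _score, strand, _phase, attr = cols
        attrs = parse_attributes(attr)
        if ftype == "mRNA" and "ID" in attrs and "Parent" in attrs:
            mrna_to_gene[attrs["ID"]] = attrs["Parent"]
        elif ftype == "CDS" and "Parent" in attrs:
            for parent in attrs["Parent"].split(","):
                cds[seqid].append((int(start), int(end), strand, parent))

    pairs = set()
    for intervals in cds.values():
        intervals.sort()  # sweep along the sequence by start coordinate
        active = []       # CDS segments that may still overlap later ones
        for start, end, strand, mrna in intervals:
            # GFF3 coordinates are 1-based and inclusive
            active = [a for a in active if a[1] >= start]
            for _s, _e, other_strand, other_mrna in active:
                if other_strand != strand:
                    gene_a = mrna_to_gene.get(mrna, mrna)
                    gene_b = mrna_to_gene.get(other_mrna, other_mrna)
                    if gene_a != gene_b:
                        pairs.add(tuple(sorted((gene_a, gene_b))))
            active.append((start, end, strand, mrna))
    return sorted(pairs)
```

The function takes any iterable of GFF3 lines (for example an open file handle) and compares CDS segments rather than whole gene spans, so a gene sitting entirely inside another gene's intron is not reported. Per-pair AED comparisons could be added by also collecting the AED attribute from the mRNA lines, if the GFF3 carries one.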


