I have been attempting to run MAKER2 (2.31.10) on an assembly of a haptophyte genome (~160 Mb). After four rounds of MAKER2 (pipeline details are shown in the diagram below), I obtained a reasonable set of gene calls with great BUSCO completeness and AED scores. However, ~11,00 of the ~42,000 gene models have a CDS that overlaps the CDS of another gene model. None of the overlaps are on the same strand (thank goodness), and I have read in the Google Group threads that overlapping gene models on opposite strands can occur, but I have not seen it to this extent. Below are some examples of overlapping models. Interestingly, some of these models have exon structures that differ from those of their overlapping partner.
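For anyone who wants to reproduce this kind of count, here is a minimal sketch of how opposite-strand CDS overlaps could be tallied from a GFF3 file. It is not what MAKER does internally. It assumes standard nine-column GFF3 with `CDS` features that carry a `Parent` attribute, and the function names `parse_cds` and `opposite_strand_cds_overlaps` are made up for illustration. The pairwise comparison is quadratic per scaffold, so a sorted sweep or interval tree would be a better fit for a very fragmented assembly.

```python
from collections import defaultdict


def parse_cds(gff3_lines):
    """Collect CDS intervals per transcript.

    Returns {parent_id: (seqid, strand, [(start, end), ...])}.
    Assumes each CDS line has a Parent attribute (hypothetical helper).
    """
    models = {}
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section, no more features
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if "Parent" not in attrs:
            continue
        # A CDS may list several parents separated by commas.
        for parent in attrs["Parent"].split(","):
            entry = models.setdefault(parent, (cols[0], cols[6], []))
            entry[2].append((int(cols[3]), int(cols[4])))
    return models


def opposite_strand_cds_overlaps(models):
    """Return pairs of transcript IDs whose CDS overlap on opposite strands."""
    by_seq = defaultdict(list)
    for tid, (seqid, strand, cds) in models.items():
        by_seq[seqid].append((tid, strand, cds))

    pairs = []
    for entries in by_seq.values():
        for i, (tid_a, strand_a, cds_a) in enumerate(entries):
            for tid_b, strand_b, cds_b in entries[i + 1:]:
                if strand_a == strand_b:
                    continue
                # GFF3 coordinates are 1-based and inclusive at both ends.
                if any(s1 <= e2 and s2 <= e1
                       for s1, e1 in cds_a for s2, e2 in cds_b):
                    pairs.append((tid_a, tid_b))
    return pairs
```

Usage would look like `opposite_strand_cds_overlaps(parse_cds(open("genome.all.gff")))`, where the file name is only a placeholder.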
I have tried extensively to filter out these overlapping gene models using different thresholds, such as AED/eAED scores and the amount of EST/protein/etc. support, but the majority of these genes seem to differ only minimally in AED score and have multiple lines of support. The only trend I have pulled from the output is that 40% of the overlapping genes are labeled as 'snap' or 'snap processed' genes, though again most of these do have support from other sources according to the GFF3. I would be very appreciative of any thoughts or input you may have on these overlapping gene models. If you need more metrics or information on how I ran MAKER2 to help clarify anything, please let me know!
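To make the AED-based filtering concrete, here is a rough sketch of the comparison described above. It assumes the `mRNA` lines in the MAKER GFF3 carry an `_AED` attribute in column 9 and that models derived from SNAP have "snap" somewhere in their ID; both should be checked against the actual output. The function names and the `min_delta` threshold of 0.1 are hypothetical, and the list of overlapping ID pairs is taken as given, however it was obtained.

```python
def mrna_summary(gff3_lines):
    """Return {mrna_id: (aed, label)} for mRNA features.

    Assumes an `_AED` attribute on mRNA lines and labels a model 'snap'
    when its ID contains that string (both are assumptions to verify).
    """
    summary = {}
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if "ID" not in attrs or "_AED" not in attrs:
            continue
        label = "snap" if "snap" in attrs["ID"].lower() else "other"
        summary[attrs["ID"]] = (float(attrs["_AED"]), label)
    return summary


def resolve_by_aed(pairs, summary, min_delta=0.1):
    """Split overlapping pairs into resolved (keep, drop) and unresolved.

    A pair is resolved only when the AED gap is at least `min_delta`
    (a hypothetical threshold); the lower-AED model is kept.
    """
    resolved, unresolved = [], []
    for a, b in pairs:
        if a not in summary or b not in summary:
            unresolved.append((a, b))
            continue
        aed_a, aed_b = summary[a][0], summary[b][0]
        if abs(aed_a - aed_b) >= min_delta:
            resolved.append((a, b) if aed_a < aed_b else (b, a))
        else:
            unresolved.append((a, b))
    return resolved, unresolved
```

With pairs whose AED scores are nearly identical, most of them end up in the unresolved list, which is the situation described above.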