Mapping transcripts back onto the genome using "est2genome=1"
I am trying to annotate a new NMR genome assembly. Since a gene annotation for the old NMR assembly is available from NCBI, I tried to map the published RefSeq transcripts onto the new genome using "est2genome=1". However, I found that quite a few genes were lost during mapping.
I then ran another test to check how well mapping with "est2genome=1" works: I mapped the published RefSeq transcripts onto the old genome (the same assembly version as the published gene annotation) using MAKER with "est2genome=1". Quite a few genes were still lost during the mapping. Below are the BUSCO results, which assess annotation completeness using single-copy orthologs. As you can see, even when we consider only the single-copy orthologs, 4% still did not map back to the genome.
Do you have any comments on this? Also, could you please suggest how to get more of the published gene annotation to map back onto the same genome assembly with "est2genome=1"? Attached is the maker_opts.ctl file I used for the mapping. Many thanks.
# BUSCO results for the published gene annotation
4077 Complete BUSCOs (C)
1367 Complete and single-copy BUSCOs (S)
2710 Complete and duplicated BUSCOs (D)
14 Fragmented BUSCOs (F)
13 Missing BUSCOs (M)
4104 Total BUSCO groups searched
# BUSCO results for the gene models after mapping with MAKER2
C:93.4%[S:36.5%,D:56.9%],F:2.6%,M:4.0%,n:4104
3830 Complete BUSCOs (C)
1496 Complete and single-copy BUSCOs (S)
2334 Complete and duplicated BUSCOs (D)
105 Fragmented BUSCOs (F)
169 Missing BUSCOs (M)
4104 Total BUSCO groups searched
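
To make the two result sets easier to compare, here is a small Python sketch that recomputes each category as a percentage of the 4104 groups searched and shows the change between the published annotation and the remapped gene models. The counts are copied from the BUSCO results in this post. The dictionary and function names are my own and purely illustrative, and the percentages may differ by about a tenth of a percent from BUSCO's own summary line because of rounding.

```python
# Counts copied from the two BUSCO results in this post.
TOTAL = 4104  # total BUSCO groups searched

published = {  # published gene annotation
    "complete": 4077,
    "single_copy": 1367,
    "duplicated": 2710,
    "fragmented": 14,
    "missing": 13,
}
remapped = {  # gene models after mapping with est2genome=1
    "complete": 3830,
    "single_copy": 1496,
    "duplicated": 2334,
    "fragmented": 105,
    "missing": 169,
}


def percentages(counts, total=TOTAL):
    """Return each category as a percentage of the groups searched."""
    return {name: 100.0 * n / total for name, n in counts.items()}


def delta(before, after):
    """Return the change in raw counts per category (after - before)."""
    return {name: after[name] - before[name] for name in before}


pub_pct = percentages(published)
map_pct = percentages(remapped)
change = delta(published, remapped)

# Print one line per category: count and percentage before and after.
for name in published:
    print(
        f"{name:12s} "
        f"{published[name]:5d} ({pub_pct[name]:5.1f}%) -> "
        f"{remapped[name]:5d} ({map_pct[name]:5.1f}%)  "
        f"change {change[name]:+d}"
    )
```

The "missing" and "fragmented" rows are the ones I am most concerned about, since both categories grow after remapping to the same assembly.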