map_forward and temporary storage questions

map_forward and temporary storage questions

Zoe Clarke-2
Hello!

I am currently running Maker on a 2.5GB genome that has already had a list of ~8000 genes very thoroughly annotated. My hope is to find and annotate the rest of the genes using ESTs and protein homology. However, I tested Maker on a single contig of my genome (there are ~20,000 contigs) and I can't find any of the genes from my original gtf file even though I followed all of the instructions in this wiki: http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Updating_annotations_in_light_of_new_data (I entered the original gff under model_gff, and used map_forward=1). I am worried this is because my gff3 file isn't formatted properly. Here are a few lines in my gff file as an example:
--------------------------------------------
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  transcript      1094446 1105585 .       +       .       ID=DIMT1.1;geneID=DIMT1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  exon    1094446 1094521 97.75   +       .       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  exon    1094874 1094947 97.75   +       .       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  exon    1095459 1095545 97.75   +       .       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  exon    1097351 1097412 97.75   +       .       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  exon    1097492 1097585 97.75   +       .       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  exon    1097670 1097719 97.75   +       .       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  exon    1098957 1099080 97.75   +       .       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  exon    1099217 1099309 97.75   +       .       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  exon    1100870 1100934 97.75   +       .       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  exon    1101967 1102030 97.75   +       .       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  exon    1103784 1103890 97.75   +       .       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  exon    1105543 1105585 97.75   +       .       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  CDS     1094446 1094521 .       +       0       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  CDS     1094874 1094947 .       +       2       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  CDS     1095459 1095545 .       +       0       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  CDS     1097351 1097412 .       +       0       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  CDS     1097492 1097585 .       +       1       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  CDS     1097670 1097719 .       +       0       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  CDS     1098957 1099080 .       +       1       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  CDS     1099217 1099309 .       +       0       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  CDS     1100870 1100934 .       +       0       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  CDS     1101967 1102030 .       +       1       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  CDS     1103784 1103890 .       +       0       Parent=DIMT1.1
WCK01_AAF20200214_F8-ctg36      ovaltine_v0.13  CDS     1105543 1105582 .       +       1       Parent=DIMT1.1
--------------------------------------
This is from the contig I used as a test for Maker, and I can't find DIMT1.1 in the final gff file. At first I thought it might be because "geneID" is a listed attribute, but changing this to "Name" didn't help. Do you have any ideas why these genes might not be mapping forward? If it's something I can fix in the gff file, I am hoping I can fix it and use it for the second round of Maker after I have trained Snap.

Also, do you think a better quality annotation would result from Snap trained on this curated list of ~8000 genes (that has been expertly done) or on the round 1 output of Maker?

A final question: I am having memory storage issues with Maker, as it is currently taking up ~15TB of storage with temporary files. I am running Maker on a cluster and whenever my submitted Maker job runs out of memory it fails, so I have to resubmit it about every hour, which leaves a lot of temporary folders (e.g. maker_x6V2y4) in my directory. I notice that some of these temporary files haven't been updated in days - is it okay to delete them?

Thank you so much for your help!
Zoe
______________________________________
Zoe Clarke
PhD candidate in Computational Biology at U of T
Personal website: https://zoe-clarke.weebly.com/

_______________________________________________
maker-devel mailing list
[hidden email]
http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org
Re: map_forward and temporary storage questions

Carson Holt-2
MAKER uses GFF3 format, which is not the same as GTF. You will need to convert your file to GFF3 format.

You can try this online tool (I haven't used it, so I can't tell you how well it works): http://www.sequenceontology.org/cgi-bin/converter.cgi

There are also a number of other resources available if you google "how to convert GTF to GFF3".

—Carson
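
The thread does not record which change eventually made the file acceptable to MAKER. One visible difference between the sample rows and typical GFF3 gene models is that the sample starts at a `transcript` row carrying a custom `geneID` attribute, with no `gene` row above it, whereas GFF3 annotations usually link gene, mRNA and exon/CDS rows through `ID` and `Parent`. The sketch below shows one possible way to add that gene level. It assumes tab-separated columns, assumes every transcript row has a `geneID` attribute, and uses a hypothetical helper name (`add_gene_level`); it is an illustration, not the conversion Carson or the linked tool performs.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column ("key=value;key=value") into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if field:
            key, _, value = field.partition("=")
            attrs[key] = value
    return attrs


def format_attributes(attrs):
    """Join a dict back into a GFF3 attribute column."""
    return ";".join(f"{key}={value}" for key, value in attrs.items())


def add_gene_level(lines):
    """Return GFF3 lines with a gene row added above each transcript.

    Assumption: transcript rows carry a custom geneID attribute, as in the
    sample from the thread. A transcript without geneID raises KeyError.
    """
    records = [
        line.rstrip("\n").split("\t")
        for line in lines
        if line.strip() and not line.startswith("#")
    ]

    # First pass: a gene spans the union of all its transcripts.
    genes = {}
    for rec in records:
        if rec[2] == "transcript":
            gene_id = parse_attributes(rec[8])["geneID"]
            start, end = int(rec[3]), int(rec[4])
            if gene_id in genes:
                genes[gene_id]["start"] = min(genes[gene_id]["start"], start)
                genes[gene_id]["end"] = max(genes[gene_id]["end"], end)
            else:
                genes[gene_id] = {"seqid": rec[0], "source": rec[1],
                                  "start": start, "end": end, "strand": rec[6]}

    # Second pass: emit the gene row once, then retype transcript as mRNA.
    out = ["##gff-version 3"]
    emitted = set()
    for rec in records:
        if rec[2] == "transcript":
            attrs = parse_attributes(rec[8])
            gene_id = attrs.pop("geneID")
            if gene_id not in emitted:
                gene = genes[gene_id]
                out.append("\t".join([
                    gene["seqid"], gene["source"], "gene",
                    str(gene["start"]), str(gene["end"]),
                    ".", gene["strand"], ".",
                    f"ID={gene_id};Name={gene_id}",
                ]))
                emitted.add(gene_id)
            attrs["Parent"] = gene_id
            rec = rec[:2] + ["mRNA"] + rec[3:8] + [format_attributes(attrs)]
        out.append("\t".join(rec))
    return out


if __name__ == "__main__":
    sample = [
        "WCK01_AAF20200214_F8-ctg36\tovaltine_v0.13\ttranscript\t1094446\t1105585\t.\t+\t.\tID=DIMT1.1;geneID=DIMT1",
        "WCK01_AAF20200214_F8-ctg36\tovaltine_v0.13\texon\t1094446\t1094521\t97.75\t+\t.\tParent=DIMT1.1",
        "WCK01_AAF20200214_F8-ctg36\tovaltine_v0.13\tCDS\t1094446\t1094521\t.\t+\t0\tParent=DIMT1.1",
    ]
    print("\n".join(add_gene_level(sample)))
```

Exon and CDS rows pass through unchanged because their `Parent` already points at the transcript ID. Whatever conversion is used, it is worth checking the result against a single contig first, as Zoe did in her test run.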


In reply to Zoe Clarke's message of Sep 25, 2020, 3:17 AM.

Re: map_forward and temporary storage questions

Zoe Clarke-2
Thanks Carson!

I managed to get the GFF3 file fixed and Maker is now recognizing it the way it should.

I am hoping I can also ask for your advice on functional annotation with Maker. The result of my current run will be gff and fasta files that have the original gff3 genes plus Maker's. The original gff3 file was annotated with HGNC gene symbols, so these new files are a mix of gene symbols and Maker-derived gene names. I would ideally like my final annotation to consist almost entirely of gene symbols. Is the only way to do this (after using Blastp and InterProScan to get putative homology) to undergo an additional manual curation process through something like Apollo?

Thanks again,
Zoe

In reply to Carson Holt's message of October 2, 2020, 4:50 PM.

Re: map_forward and temporary storage questions

Carson Holt-2
If you can create a two-column file (column 1 being the old ID and column 2 being the new ID), you can use the map_fasta_ids and map_gff_ids scripts to rename every feature that has a corresponding line in the two-column file. Make sure no ID occurs twice (in either the old or the new column), because duplicates will cause problems. The MAKER ID will not be lost; it moves to the Alias= field in the GFF3.

—Carson
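
Since duplicate IDs in the map are the failure mode Carson warns about, it may help to check the two-column file before renaming anything. The following is a minimal sketch of such a check. The function name `check_id_map` and the MAKER-style IDs in the demo are hypothetical, and the thread does not say whether the columns must be tab-separated, so the sketch accepts any whitespace between them.

```python
from collections import Counter


def check_id_map(lines):
    """Validate a two-column old-ID/new-ID map.

    Returns (pairs, problems): the parsed (old, new) pairs and a list of
    human-readable problem descriptions. An empty problems list means every
    line has two columns and no ID repeats within either column.
    """
    pairs = []
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        columns = line.split()  # assumption: IDs contain no whitespace
        if len(columns) != 2:
            problems.append(f"line {number}: expected 2 columns, found {len(columns)}")
            continue
        pairs.append((columns[0], columns[1]))

    # An ID may appear only once in the old column and once in the new column.
    for label, index in (("old", 0), ("new", 1)):
        counts = Counter(pair[index] for pair in pairs)
        for identifier, count in counts.items():
            if count > 1:
                problems.append(f"{label} ID {identifier!r} occurs {count} times")
    return pairs, problems


if __name__ == "__main__":
    # Hypothetical map: two MAKER-style IDs are both assigned the symbol DIMT1.
    demo = [
        "maker-ctg36-gene-0.1\tDIMT1\n",
        "maker-ctg36-gene-0.2\tDIMT1\n",
    ]
    _, found = check_id_map(demo)
    for problem in found:
        print(problem)
```

A map that passes this check can then be given to map_gff_ids and map_fasta_ids as described above. Genes for which two MAKER models hit the same symbol still need a decision by hand, for example by leaving one of them out of the map.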



In reply to Zoe Clarke's message of Oct 5, 2020, 7:47 AM.