gff export with wrong format


Pedrolance
Hi,

I am using the GFF file generated by exporting annotations from WebApollo 1.0.

Why does each feature span three lines? Is this normal? Also, there is no value for the score (which, as far as I have read, should be written as a "." when it is null). Is that expected? The online GFF validator says the field is mandatory.

GenomeTools error: could not parse score '' on line 3 in file '/var/www/servers/genometools.org/htdocs/cgi-bin/gff3/Bcinerea_V1.0.gff'


Many thanks for any help.

Cheers

Helder Pedro

The file looks like this:

##gff-version 3
##sequence-region 17 1 247158
17
     WebApollo
  gene 187239 188341 - Name=Bcin17g00180;owner=jvk;ID=0C419286F8B24F2EC4B8F44CE79C863F
17
     WebApollo
  mRNA 187239 188341 - Name=Bcin17g00180.1;Parent=0C419286F8B24F2EC4B8F44CE79C863F;owner=jvk;ID=0B70164208E3C9EA088F077A9E45DCAB
17
     WebApollo
  CDS 188209 188232 - 0 Name=9476153C9706DC42D61F0D29E5AFCB3C;Parent=0B70164208E3C9EA088F077A9E45DCAB;owner=jvk;ID=9476153C9706DC42D61F0D29E5AFCB3C
17
     WebApollo
  CDS 187300 188103 - 0 Name=9476153C9706DC42D61F0D29E5AFCB3C;Parent=0B70164208E3C9EA088F077A9E45DCAB;owner=jvk;ID=9476153C9706DC42D61F0D29E5AFCB3C
17
     WebApollo
  exon 187239 188103 - Name=D5B4AB70DFFD424DEC402125C93752D4;Parent=0B70164208E3C9EA088F077A9E45DCAB;owner=jvk;ID=D5B4AB70DFFD424DEC402125C93752D4
17
     WebApollo
  exon 188209 188341 - Name=460FD97D6C764D23003F0D375AE74D8B;Parent=0B70164208E3C9EA088F077A9E45DCAB;owner=jvk;ID=460FD97D6C764D23003F0D375AE74D8B
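For reference, a GFF3 feature line has nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes), and an undefined value is written as "." rather than left empty. The following is only a minimal sketch of that rule, not the validator's actual logic; the function name check_gff3_line is hypothetical and the example attributes are shortened from the export above.

```python
GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def check_gff3_line(line):
    """Return a list of problems found in one GFF3 feature line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        return [f"expected 9 tab-separated columns, found {len(fields)}"]
    problems = []
    for name, value in zip(GFF3_COLUMNS, fields):
        # An empty column is not allowed; "." marks an undefined value.
        if value == "":
            problems.append(f"{name} is empty; use '.' for an undefined value")
    return problems


# Shortened, hypothetical versions of the gene line from the export.
good = "17\tWebApollo\tgene\t187239\t188341\t.\t-\t.\tName=Bcin17g00180"
bad = "17\tWebApollo\tgene\t187239\t188341\t\t-\t.\tName=Bcin17g00180"

print(check_gff3_line(good))
print(check_gff3_line(bad))
```

The first call reports no problems, while the second flags the empty score column, which is the same kind of complaint as the GenomeTools error quoted above.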



*******************************************
               Hélder Pedro
     PhytoPath Bioinformatician
            EnsemblGenomes

European Bioinformatics Institute     
Wellcome Trust Genome Campus 
        Hinxton  CB10 1SD - UK 

Phone: +44.1223.49.25.92
********************************************

This list is for the Apollo Annotation Editing Tool. Info at http://genomearchitect.org/
If you wish to unsubscribe from the Apollo List: 1. From the address with which you subscribed to the list, send a message to [hidden email] | 2. In the subject line of your email type: unsubscribe apollo | 3. Leave the message body blank.


Re: gff export with wrong format

nathandunn

In your gff3_config.xml, what do you have specified as the source?

This is the default:

    <!-- value to use in the source column (column 2) of the generated
    GFF3 file. -->
    <source>.</source>

If you changed it to:

 <source>
WebApollo
</source>

you would probably get that result.  If not, we can keep looking.
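To illustrate why the layout of the source element matters: an XML parser returns the element text exactly as written, including any newlines and indentation around the value. The sketch below uses Python's xml.etree.ElementTree purely as a demonstration. WebApollo itself is not written in Python, and the assumption here is that it copies the element text into column 2 without trimming it.

```python
import xml.etree.ElementTree as ET

# The same value written two ways in the config file.
padded = ET.fromstring("<source>\n     WebApollo\n  </source>")
compact = ET.fromstring("<source>WebApollo</source>")

print(repr(padded.text))   # whitespace and newlines are part of the value
print(repr(compact.text))  # just the name

# If the untrimmed text is placed in column 2, one record spans three lines.
row = "\t".join(["17", padded.text, "gene", "187239", "188341"])
print(len(row.splitlines()))

# Trimming the value restores a single-line record.
fixed = "\t".join(["17", padded.text.strip(), "gene", "187239", "188341"])
print(len(fixed.splitlines()))
```

Under that assumption, the padded form produces exactly the three-line pattern shown in the exported file.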

Nathan


> On Nov 24, 2015, at 1:57 AM, Helder Pedro <[hidden email]> wrote:

Re: gff export with wrong format

Pedrolance
Hello Nathan,

I do want the source to be WebApollo.

The problem is that the exported GFF has extra newlines and extra tabs that break the format. I had to remove them so that the fields on each line were recognised properly. I also had to replace the empty score field with a "." because the online validator says that "." represents a null value: http://genometools.org/cgi-bin/gff3validator.cgi

My main question is: why are there two extra newlines in each feature? That is not how GFF works at all.
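The manual clean-up described above could be scripted along these lines. This is only a sketch under stated assumptions: the raw file is assumed to be tab-delimited, with the stray newlines and spaces sitting inside the source column (the listing in this thread shows whitespace collapsed, so the exact raw layout is a guess). The function name repair_gff3 and the shortened sample record are hypothetical.

```python
def repair_gff3(lines):
    """Yield repaired lines from a GFF3 export whose records are split up."""
    pending = ""       # text of the feature record collected so far
    in_fasta = False   # everything after ##FASTA is sequence data
    for raw in lines:
        line = raw.rstrip("\r\n")
        if in_fasta:
            yield line
        elif not pending and line.startswith("#"):
            in_fasta = line.startswith("##FASTA")
            yield line
        elif not pending and not line.strip():
            continue   # drop blank lines between records
        else:
            pending += line
            fields = pending.split("\t")
            if len(fields) >= 9:
                # Trim stray whitespace and write "." for empty columns.
                yield "\t".join(field.strip() or "." for field in fields)
                pending = ""


# Hypothetical raw layout of the first gene record, attributes shortened.
broken = [
    "##gff-version 3\n",
    "17\t\n",
    "     WebApollo\n",
    "  \tgene\t187239\t188341\t\t-\t\tName=Bcin17g00180\n",
]
repaired = list(repair_gff3(broken))
for line in repaired:
    print(line)
```

Fixing the configuration is still the cleaner route; a script like this only helps with files that have already been exported.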

Here is my configuration file:

<?xml version="1.0" encoding="UTF-8"?>

<!-- configuration file for GFF3 data adapter -->

<gff3_config>

  <!-- path to where to put generated GFF3 file.  This path is a
  relative path that will be where you deployed your WebApollo
  instance (so that it's accessible from HTTP download requests) -->
  <tmp_dir>tmp</tmp_dir>

  <!-- value to use in the source column (column 2) of the generated
  GFF3 file. -->
  <source>
     WebApollo
  </source>

  <!-- which metadata to export as an attribute - optional.
  Default is to export everything except owner, date_creation, and date_last_modified -->

  <metadata_to_export>
    <metadata type="name" />
    <metadata type="symbol" />
    <metadata type="owner" />
  </metadata_to_export>

  <!-- whether to export underlying genomic sequence - optional.
  Defaults to true -->
  <export_source_genomic_sequence>true</export_source_genomic_sequence>

</gff3_config>

On 24 Nov 2015, at 16:46, Nathan Dunn <[hidden email]> wrote:



Re: gff export with wrong format

nathandunn

I think if you do this:

  <source>WebApollo</source>

it should give you the desired result.  

I agree that it should automatically trim the result, so I added it here:


If you download the code and check out the 1.0 branch, you should see the change, though it's not an official release (we will be concentrating on 2.0 for the foreseeable future):

git checkout 1.0

Nathan

On Nov 24, 2015, at 8:59 AM, Helder Pedro <[hidden email]> wrote:
