GFF3 loading error

Paulo Nuin
Hi everyone,

We are having problems correctly loading the following GFF3 exon lines:

I       WormBase        exon    55293   55450   .       -       .       ID=Exon1;Parent=Y48G1C.5
I       WormBase        exon    55890   56047   .       -       .       ID=Exon2;Parent=Y48G1C.5
I       WormBase        exon    56124   56293   .       -       .       ID=Exon3;Parent=Y48G1C.5
I       WormBase        exon    56537   56687   .       -       .       ID=Exon4;Parent=Y48G1C.5
I       WormBase        exon    56872   57053   .       -       .       ID=Exon5;Parent=Y48G1C.5
I       WormBase        exon    57257   57625   .       -       .       ID=Exon6;Parent=Y48G1C.5
I       WormBase        exon    59045   60236   .       -       .       ID=Exon7;Parent=Y48G1C.5
I       WormBase        exon    61236   62051   .       -       .       ID=Exon8;Parent=Y48G1C.5
I       WormBase        exon    62774   62974   .       -       .       ID=Exon9;Parent=Y48G1C.5
I       WormBase        exon    63797   63856   .       -       .       ID=Exon10;Parent=Y48G1C.5
I       WormBase        exon    63904   63972   .       -       .       ID=Exon11;Parent=Y48G1C.5
I       WormBase        exon    64018   64066   .       -       .       ID=Exon12;Parent=Y48G1C.5
I       WormBase        exon    3747    3909    .       -       .       ID=Exon13;Parent=Y74C9A.6

The parent is not being stored correctly in the backend exon table:

 secondaryidentifier | symbol | primaryidentifier | lastupdated | length |   id    | name | scoretype | score | chromosomeid | geneid | chromosomelocationid | organismid | sequenceid | sequenceontologytermid |            class
---------------------+--------+-------------------+-------------+--------+---------+------+-----------+-------+--------------+--------+----------------------+------------+------------+------------------------+------------------------------
                     |        | Exon1             |             |    158 | 4000002 |      |           |       |      4000000 |        |              4000001 |    1000000 |            |                4000003 | org.intermine.model.bio.Exon
(1 row)

The parent transcript, however, looks fine in the transcript table:

 secondaryidentifier | intermine_method |  symbol  | primaryidentifier | lastupdated | length |   id    | name | scoretype | score | chromosomeid | chromosomelocationid | organismid | proteinid | sequenceid | sequenceontologytermid | geneid |            class
---------------------+------------------+----------+-------------------+-------------+--------+---------+------+-----------+-------+--------------+----------------------+------------+-----------+------------+------------------------+--------+------------------------------
                     |                  | Y74C9A.3 | Y74C9A.3          |             |   6115 | 4000002 |      |           |       |      4000000 |              4000001 |    3000000 |           |            |                4000003 |        | org.intermine.model.bio.MRNA
(1 row)

Any ideas why we are not capturing the exon's parent? Any help appreciated.

Thanks

Paulo


_______________________________________________
dev mailing list
[hidden email]
https://lists.intermine.org/mailman/listinfo/dev
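
For readers following along: in GFF3 the ID and Parent tags live in the ninth (attributes) column, so a quick way to see which parents the exon lines above actually name is to parse that column directly. The Python below is only a minimal sketch and not part of InterMine. The function names are made up for illustration, and it assumes the file is tab-separated as GFF3 requires (the archive shows the columns padded with spaces).

```python
from collections import defaultdict

# Three of the exon lines from the message above, written with tabs.
GFF_LINES = [
    "I\tWormBase\texon\t55293\t55450\t.\t-\t.\tID=Exon1;Parent=Y48G1C.5",
    "I\tWormBase\texon\t55890\t56047\t.\t-\t.\tID=Exon2;Parent=Y48G1C.5",
    "I\tWormBase\texon\t3747\t3909\t.\t-\t.\tID=Exon13;Parent=Y74C9A.6",
]


def parse_attributes(column9):
    """Split the GFF3 attributes column into a dict of tag -> list of values."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue
        tag, _, value = pair.partition("=")
        # GFF3 allows several comma-separated values per tag (e.g. Parent=a,b).
        attributes[tag] = value.split(",")
    return attributes


def exons_by_parent(lines):
    """Map each Parent value to the exon IDs that name it (hypothetical helper)."""
    grouped = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9 or columns[2] != "exon":
            continue  # not a well-formed exon feature line
        attributes = parse_attributes(columns[8])
        exon_id = attributes.get("ID", [None])[0]
        for parent in attributes.get("Parent", []):
            grouped[parent].append(exon_id)
    return dict(grouped)


if __name__ == "__main__":
    for parent, exon_ids in exons_by_parent(GFF_LINES).items():
        print(parent, exon_ids)
```

For the three lines used here, Exon1 and Exon2 are grouped under Y48G1C.5 and Exon13 under Y74C9A.6. In GFF3 a Parent value refers to the ID of another feature, and the transcript row shown in the message has primaryidentifier Y74C9A.3, so it may be worth confirming that transcripts with the IDs Y48G1C.5 and Y74C9A.6 are part of the same load.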
Re: GFF3 loading error

sergio contrino-2
hi paulo!
i'll run a test locally, but have you checked whether the exonstranscripts table has been populated?
i think this is the one that would be filled if the relationship is many-to-many.
thanks
sergio

On 11/12/17 16:56, Paulo Nuin wrote:

> [...]

--
sergio contrino                  InterMine, University of Cambridge
https://sergiocontrino.github.io           http://www.intermine.org
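
The check Sergio suggests can be sketched as a query against the join table. This is a sketch under stated assumptions, not the actual InterMine schema: the column names "exons" and "transcripts" in exonstranscripts are guesses, the ids in the demo data are invented, and an in-memory SQLite database stands in for the PostgreSQL backend so that the example runs on its own.

```python
import sqlite3


def count_links(connection, exon_identifier):
    """Count join-table rows for one exon, looked up by primaryidentifier."""
    query = """
        SELECT COUNT(*)
        FROM exonstranscripts AS link
        JOIN exon ON exon.id = link.exons
        WHERE exon.primaryidentifier = ?
    """
    return connection.execute(query, (exon_identifier,)).fetchone()[0]


def build_demo_db():
    """Create a tiny stand-in database; schema and ids are hypothetical."""
    connection = sqlite3.connect(":memory:")
    connection.executescript(
        """
        CREATE TABLE exon (id INTEGER PRIMARY KEY, primaryidentifier TEXT);
        CREATE TABLE transcript (id INTEGER PRIMARY KEY, primaryidentifier TEXT);
        -- Assumed layout: one row per (exon, transcript) pair.
        CREATE TABLE exonstranscripts (exons INTEGER, transcripts INTEGER);
        INSERT INTO exon VALUES (1, 'Exon1');
        INSERT INTO exon VALUES (2, 'Exon2');
        INSERT INTO transcript VALUES (10, 'Y48G1C.5');
        -- Only Exon1 is linked, so Exon2 mimics the missing-parent symptom.
        INSERT INTO exonstranscripts VALUES (1, 10);
        """
    )
    return connection


if __name__ == "__main__":
    demo = build_demo_db()
    for identifier in ("Exon1", "Exon2"):
        print(identifier, count_links(demo, identifier))
```

A count of zero for an exon that has a Parent in the GFF would match the symptom described in the thread. Against a real PostgreSQL database the same SELECT could be run in psql once the column names have been confirmed with \d exonstranscripts.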
Re: GFF3 loading error

Paulo Nuin
Ciao Sergio

I checked before and there is nothing in the exonstranscripts table. Most of our exons (at least in this test phase) are in a one-to-many relationship: one transcript to many exons. Is there another loader that would set this up after the GFF is loaded?

Cheers
Paulo
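
Whether the data itself is one-to-many can be read off the Parent values, since GFF3 lets an exon list several parents separated by commas. The sketch below is illustrative only: the function name is hypothetical, and the pairs are typed in by hand from a few of the exon lines quoted earlier in the thread. It says nothing about how InterMine chooses to store the relationship, only about what the file contains.

```python
from collections import Counter

# (exon ID, Parent values) pairs copied from column 9 of some of the
# exon lines quoted earlier in the thread.
EXON_PARENTS = [
    ("Exon1", ["Y48G1C.5"]),
    ("Exon2", ["Y48G1C.5"]),
    ("Exon12", ["Y48G1C.5"]),
    ("Exon13", ["Y74C9A.6"]),
]


def relationship_shape(exon_parents):
    """Report the largest fan-out in each direction of the exon/parent links."""
    # How many exons name each parent.
    exons_per_parent = Counter(
        parent for _, parents in exon_parents for parent in set(parents)
    )
    # How many distinct parents the most widely shared exon has.
    max_parents = max((len(set(parents)) for _, parents in exon_parents), default=0)
    max_exons = max(exons_per_parent.values(), default=0)
    return {
        "max_parents_per_exon": max_parents,
        "max_exons_per_parent": max_exons,
    }


if __name__ == "__main__":
    print(relationship_shape(EXON_PARENTS))
```

With these pairs no exon has more than one parent, while Y48G1C.5 is named by three exons, which agrees with the one-transcript-to-many-exons description. A value above 1 for max_parents_per_exon would indicate exons shared between transcripts.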



> On Dec 13, 2017, at 7:25 AM, sergio contrino <[hidden email]> wrote:
> [...]

Re: GFF3 loading error

Paulo Nuin
Hi Sergio

I will generate the files and send them to you.

Thanks a lot

Paulo



> On Dec 14, 2017, at 4:27 AM, sergio contrino <[hidden email]> wrote:
>
> hi paulo,
> could you send me the files so i can test here?
> the 2 example records you sent have the same id, so possibly they come from different test loads?
> thanks!
> sergio
>
> On 13/12/17 20:05, Paulo Nuin wrote:
>> [...]
