GTF sniffer

GTF sniffer

SHAUN WEBB
I have a tool that takes .gtf input. When I upload a .gtf file, it is
automatically recognised as a .gff file, meaning I have to manually
change the format to gtf.

I know gff and gtf are very similar but is it possible to have a gtf sniffer?

Out of interest, is there any documentation relating to writing
sniffers for different datatypes? I should probably have a go at
writing a few of my own.

Thanks.

Shaun Webb

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


_______________________________________________
galaxy-dev mailing list
[hidden email]
http://lists.bx.psu.edu/listinfo/galaxy-dev

Re: GTF sniffer

Jeremy Goecks
> I know gff and gtf are very similar but is it possible to have a gtf sniffer?

There is a GTF sniffer in Galaxy, and it should detect GTF files as such, assuming your datatypes_conf.xml file is set up appropriately. To sniff GTF files correctly, make sure the Gtf line below is in your <sniffers> section and that it appears above the GFF entry:

...
        <sniffer type="galaxy.datatypes.interval:Gtf"/>
        <sniffer type="galaxy.datatypes.interval:Gff"/>
        <sniffer type="galaxy.datatypes.interval:Gff3"/>
...
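
The order matters presumably because sniffers are tried in turn and the first one that accepts the file wins, while any GTF line also passes a looser GFF-style check. A minimal Python sketch of that idea follows. The check functions and guess_format() are hypothetical names made up for illustration; they are not Galaxy's actual code or API.

```python
def is_gff_like(line):
    """Loose check (assumption): nine tab-separated columns."""
    return len(line.rstrip("\n").split("\t")) == 9


def is_gtf_like(line):
    """Stricter check (assumption): GFF-like, and column 9 starts with gene_id."""
    fields = line.rstrip("\n").split("\t")
    return len(fields) == 9 and fields[8].lstrip().startswith("gene_id ")


def guess_format(line, sniffers):
    """Return the name of the first sniffer in the list that accepts the line."""
    for name, check in sniffers:
        if check(line):
            return name
    return None


line = 'chr1\tsrc\texon\t1\t680\t.\t+\t.\tgene_id "g1"; transcript_id "t1";'

gtf_first = [("gtf", is_gtf_like), ("gff", is_gff_like)]
gff_first = [("gff", is_gff_like), ("gtf", is_gtf_like)]

print(guess_format(line, gtf_first))  # gtf
print(guess_format(line, gff_first))  # gff
```

With the looser check listed first, the stricter one never gets a chance to run, which matches the symptom of a .gtf upload being labelled as gff.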

If you have a GTF file that's still not being recognized, please send it our way and we'll take a look.

> Out of interest, is there any documentation relating to writing sniffers for different datatypes? I should probably have a go at writing a few of my own.

I don't see anything in our wiki offhand. However, if you're going to write your own, looking at an existing sniffer should make it clear what needs to happen. For instance, see any of the sniff() functions in /lib/galaxy/datatypes/interval.py
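
As a rough illustration of what such a function tends to look like, here is a standalone Python sketch. It assumes a sniffer boils down to a function that takes a file path, inspects the first few lines, and returns True or False. The name sniff_gtf_like, the max_lines parameter, and the exact rules are assumptions for illustration; this is not copied from interval.py.

```python
import os
import tempfile


def sniff_gtf_like(filename, max_lines=50):
    """Return True if the leading data lines look like GTF (hypothetical rules)."""
    saw_data = False
    with open(filename) as handle:
        for line_number, line in enumerate(handle):
            if line_number >= max_lines:
                break
            line = line.rstrip("\r\n")
            # Skip blank lines and comment/header lines.
            if not line or line.startswith("#"):
                continue
            fields = line.split("\t")
            # GFF-family formats have exactly nine tab-separated columns.
            if len(fields) != 9:
                return False
            # Columns 4 and 5 (start, end) must be integers.
            if not (fields[3].isdigit() and fields[4].isdigit()):
                return False
            # Column 7 is the strand.
            if fields[6] not in ("+", "-", "."):
                return False
            # GTF-specific: both mandatory attribute keys must be present.
            if "gene_id" not in fields[8] or "transcript_id" not in fields[8]:
                return False
            saw_data = True
    # An empty or comment-only file is not treated as GTF.
    return saw_data


# Try it on a throwaway one-line file.
with tempfile.NamedTemporaryFile("w", suffix=".gtf", delete=False) as tmp:
    tmp.write('chr1\tsrc\texon\t1\t680\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n')
result = sniff_gtf_like(tmp.name)
os.remove(tmp.name)
print(result)
```

Only the first max_lines lines are read, so the check stays cheap even for large uploads.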

Best,
J.





Re: GTF sniffer

Jeremy Goecks
> I don't see anything in our wiki off hand. However, if you're going to write your own, looking at an existing sniffer should make it clear what needs to happen. For instance, see any of the sniff() functions in /lib/galaxy/datatypes/interval.py

I was mistaken about the lack of documentation on sniffing. Here's some useful documentation within the context of adding a datatype to Galaxy:

http://bitbucket.org/galaxy/galaxy-central/wiki/AddingDatatypes

J.


Re: GTF sniffer

SHAUN WEBB
In reply to this post by Jeremy Goecks

Thanks,

The gtf sniffer was not in datatypes_conf.xml.sample, so I assumed
it did not exist.

I have now added the gtf sniffer, but my file is still
not being recognised.

I have attached the file, any help would be great.

Thanks
Shaun






GTF attachment

SHAUN WEBB
In reply to this post by Jeremy Goecks

File attached now,
thanks.

Quoting Jeremy Goecks <[hidden email]>:

>> I don't see anything in our wiki off hand. However, if you're going  
>> to write your own, looking at an existing sniffer should make it  
>> clear what needs to happen. For instance, see  any of the sniff()  
>> functions in /lib/galaxy/datatypes/interval.py
>
> I was mistaken about the lack of documentation about sniffing.  
> Here's some useful documentation within the context of adding a  
> datatype to Galaxy:
>
> http://bitbucket.org/galaxy/galaxy-central/wiki/AddingDatatypes
>
> J.



Attachment: Saccharomyces_cerevisiae.EF2.59.1.0.gtf (6M)

Re: GTF sniffer

Jeremy Goecks
In reply to this post by SHAUN WEBB

> I have attached the file, any help would be great.

Here's the first line in the file:

--
Mito intergenic_50 exon 1 680 . + . gene_id "INT50_3749" ; transcript_name "INT50_3749"; transcript_id "INT50_3749"; gene_name "INT50_3749" ; Note "intergenic regions 50 nt from a coding sequence"
--

As it turns out, the file is not technically a GTF file. The GTF spec is here:


and the requirement broken by this file is:

--
The attribute list must begin with the two mandatory attributes:

• gene_id value - A globally unique identifier for the genomic source of the sequence.
• transcript_id value - A globally unique identifier for the predicted transcript. 
--
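
The requirement can be checked mechanically. The Python sketch below pulls the attribute keys out of column 9 in order and tests whether the first two are gene_id and transcript_id. The helper names and the regular expression are assumptions made for illustration (the pattern expects quoted or bare values with no embedded quotes), not Galaxy's implementation.

```python
import re

# key followed by a quoted value or a bare token (simplifying assumption)
ATTRIBUTE = re.compile(r'(\w+)\s+("[^"]*"|[^;\s]+)')


def attribute_keys(attribute_column):
    """Return the attribute keys in the order they appear."""
    return [key for key, _value in ATTRIBUTE.findall(attribute_column)]


def starts_with_mandatory_attributes(attribute_column):
    """True if the list begins with gene_id and then transcript_id."""
    return attribute_keys(attribute_column)[:2] == ["gene_id", "transcript_id"]


# Column 9 of the first line of the attached file, as quoted above.
attrs = ('gene_id "INT50_3749" ; transcript_name "INT50_3749"; '
         'transcript_id "INT50_3749"; gene_name "INT50_3749" ; '
         'Note "intergenic regions 50 nt from a coding sequence"')

# Same attributes with transcript_id moved into second place.
reordered = ('gene_id "INT50_3749"; transcript_id "INT50_3749"; '
             'transcript_name "INT50_3749"; gene_name "INT50_3749";')

print(attribute_keys(attrs)[:2])
print(starts_with_mandatory_attributes(attrs))      # False
print(starts_with_mandatory_attributes(reordered))  # True
```

In the attached file transcript_name sits between gene_id and transcript_id, so the check fails; with transcript_id moved to second place, this particular check passes.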

Where did you get this file? If there are prominent tools that are producing almost-but-not-quite-compliant GTF files, it may be necessary for Galaxy to relax its requirements a bit.

Thanks,
J.
