Gbrowse_syn loading format generation for OrthoCluster data


Olaf Mueller (Oct 14, 2011)
Hi,

I have completed an OrthoCluster analysis for two genomes, using the
OrthoCluster program from genome.sfu.ca, and would like to visualize
synteny in GBrowse_syn. Before I reinvent the wheel: is there a tool
available that can generate the 12-column intermediate data file
required as the loading format by gbrowse_syn_load_alignment_database.pl?

I've read the loading format documentation, and the required data
appear to be scattered over several OrthoCluster input and output files.
It is particularly unclear to me how the strand information for each
cluster block on each landmark (fields 5 and 11 in the loading file) is
obtained. The OrthoCluster xxx.cluster output file just lists the genes for
each cluster and genome, and the orientation of the individual genes
naturally varies from gene to gene. If, for example, the genes in a 3-gene
cluster have + - + orientation, which strand orientation is assigned to
the whole synteny block?

Thanks
Olaf
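
On the strand question, one possible convention (an assumption for illustration, not necessarily what the OrthoCluster conversion scripts do) is to ignore the strands of the individual genes and look at gene order instead: report the block in the first genome as `+`, and the block in the second genome as `+` if the orthologs appear in the same order, or `-` if the order is reversed. The Python sketch below shows that idea and builds one 12-field tab-delimited line. Only the positions of the strand fields (5 and 11) come from the question above; the rest of the field layout (species, landmark, start, end, then a `.` placeholder, for each genome), the function names, and the sample coordinates are hypothetical and should be checked against the loading format documentation.

```python
from typing import List, Tuple


def relative_strand(pairs: List[Tuple[int, int]]) -> str:
    """Guess the orientation of a block in genome B relative to genome A.

    pairs holds (start_in_A, start_in_B) for each orthologous gene pair.
    Assumption: the block is '+' when B coordinates mostly increase along A,
    and '-' when they mostly decrease. Individual gene strands are ignored.
    """
    b_starts = [b for _, b in sorted(pairs)]
    steps = [nxt - cur for cur, nxt in zip(b_starts, b_starts[1:])]
    ups = sum(1 for s in steps if s > 0)
    downs = sum(1 for s in steps if s < 0)
    return "+" if ups >= downs else "-"


def synteny_line(sp_a, seq_a, span_a, sp_b, seq_b, span_b, strand_b) -> str:
    """Build one 12-field line; strands land in fields 5 and 11."""
    fields = [
        sp_a, seq_a, span_a[0], span_a[1], "+", ".",       # genome A, fields 1-6
        sp_b, seq_b, span_b[0], span_b[1], strand_b, ".",  # genome B, fields 7-12
    ]
    return "\t".join(str(f) for f in fields)


# Hypothetical 3-gene cluster whose gene order is reversed in genome B.
pairs = [(1000, 9000), (2500, 7000), (4000, 5000)]
line = synteny_line("speciesA", "chr1", (1000, 5200),
                    "speciesB", "scaffold7", (5000, 10100),
                    relative_strand(pairs))
print(line)
```

A majority vote over adjacent steps tolerates a single locally rearranged gene inside an otherwise collinear block; a stricter rule could require every step to agree.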



------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense. Common sense.
http://p.sf.net/sfu/splunk-d2d-oct
_______________________________________________
Gmod-gbrowse mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse

Re: Gbrowse_syn loading format generation for OrthoCluster data

Jason Stajich-4 (Oct 14, 2011)
You can see what I wrote for processing Mercator output. There the synteny was generally defined on a per-nucleotide basis rather than a per-gene basis, but it should work either way.
This was my script, which was on the stable branch. I don't know what happened to the gbrowse_syn folder on master when things moved to GitHub.

https://github.com/GMOD/GBrowse/blob/stable/bin/gbrowse_syn/mercatoraln_to_synhits.pl

We outline some of the Mercator loading in the chapter here:
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3162311/?tool=pubmed

Re: Gbrowse_syn loading format generation for OrthoCluster data

Olaf Mueller (Oct 14, 2011)
Thanks, Jason! The linked article contains basically all the information I need. Is there an alternative download link for the supplementary orthocluster.tar.gz tarball, which contains gff32gbrowse_syn.pl?

Thanks
Olaf

Re: Gbrowse_syn loading format generation for OrthoCluster data

Jason Stajich-4 (Oct 18, 2011)
Olaf,
I'm not sure. Is the link not available on the journal's site? Sheldon, did you end up putting a version on the GMOD website? I don't have it handy.
Jason

Re: Gbrowse_syn loading format generation for OrthoCluster data

Olaf Mueller (Oct 18, 2011)
Jason,

The content of the orthocluster tarball is only described in the publication on PubMed and mentioned in the footnotes. I couldn't find a download link there, even after digging through the page's source. I then went to Current Protocols in Bioinformatics, but their link to the tarball is unfortunately broken. I contacted Wiley, and they are looking into it right now.

I found a copy of othclu2gff3.pl, a bit hidden, on the OrthoCluster site: http://genome.sfu.ca/projects/orthocluster/gbrowse_syn/

I don't know if this is the same version you used, but it worked for me.

Olaf

Re: Gbrowse_syn loading format generation for OrthoCluster data

Scott Cain (Oct 18, 2011)
Hi Olaf,

Is the broken link one that should resolve to something on gmod.org? If so, it's possible that I broke it when I migrated servers.

Scott


--
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

Re: Gbrowse_syn loading format generation for OrthoCluster data

Olaf Mueller (Oct 18, 2011)
Hi Scott,

No, the broken link to orthocluster.tar.gz is on the 'Current Protocols in Bioinformatics' site, at least when I try to access it from my Duke account. The link to orthocluster.tar.gz on gmod.org is working:

ftp://ftp.gmod.org/pub/gmod/GBrowse_syn/orthocluster.tar.gz

If I am correct, this version of the tarball was used in the GMOD summer school. However, it does not contain the processing scripts that convert OrthoCluster output to the intermediate format required by gbrowse_syn.

Olaf
 

Re: Gbrowse_syn loading format generation for OrthoCluster data

Fields, Christopher J (Oct 18, 2011)
Those scripts are linked from the GBrowse_syn wiki, though that link still seems to point at SourceForge. This is the GitHub link ('stable' branch):

   https://github.com/GMOD/GBrowse/tree/stable/bin/gbrowse_syn

However, they also appear under the master branch as 'gbrowse_syn_load_alignment_database.pl' and 'gbrowse_syn_load_alignment_msa.pl':

   https://github.com/GMOD/GBrowse/tree/master/bin

My guess is that you have them installed under the latter names if you have GBrowse v2.x installed.

chris

>>>> On 10/14/2011 01:24 PM, Jason Stajich wrote:
>>>>> you can see what I wrote for processing mercator - though generally the synteny was defined on a per-nt basis not on a per gene basis but should work either way.
>>>>> This was my script which was on the stable branch - I don't know what happened to the gbrowse_syn folder on the master when things moved to github.
>>>>>
>>>>> https://github.com/GMOD/GBrowse/blob/stable/bin/gbrowse_syn/mercatoraln_to_synhits.pl
>>>>>
>>>>> we outline some of the mercator loading in the chapter here
>>>>>  http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3162311/?tool=pubmed

Re: Gbrowse_syn loading format generation for OrthoCluster data

Olaf Mueller
Chris,

The scripts I meant work upstream of
gbrowse_syn_load_alignment_database.pl. That loader expects OrthoCluster
data in an intermediate 12-field, tab-delimited format and loads it into
MySQL. This upload format is created by othclu2gff3.pl and
gff32gbrowse_syn.pl from the raw OrthoCluster .cluster (and genome input)
files. The protocol is explained in the paper Jason linked
(http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3162311/?tool=pubmed).

To my knowledge, gbrowse_syn_load_alignment_msa.pl is needed for
alignment data but not for OrthoCluster data.

Olaf
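
Before handing a converted file to gbrowse_syn_load_alignment_database.pl, a quick sanity check of the intermediate file can catch conversion problems early. The Python sketch below is a hypothetical stand-alone checker, not part of GBrowse. It only verifies what this thread states about the format: each line has 12 tab-separated fields, and fields 5 and 11 hold a strand symbol. Treating fields 3-4 and 9-10 as integer start and end coordinates is an assumption about the layout, and the sample values are made up.

```python
from typing import List


def check_line(line: str, lineno: int) -> List[str]:
    """Return a list of problems found in one line of the loading file."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        return [f"line {lineno}: expected 12 tab-separated fields, "
                f"found {len(fields)}"]

    problems = []
    # Fields 5 and 11 (indices 4 and 10) carry the block strands.
    for idx in (4, 10):
        if fields[idx] not in ("+", "-"):
            problems.append(f"line {lineno}: field {idx + 1} should be "
                            f"'+' or '-', found {fields[idx]!r}")

    # Assumed layout: fields 3-4 and 9-10 are start/end coordinates.
    for start_idx, end_idx in ((2, 3), (8, 9)):
        try:
            start, end = int(fields[start_idx]), int(fields[end_idx])
        except ValueError:
            problems.append(f"line {lineno}: fields {start_idx + 1} and "
                            f"{end_idx + 1} should be integers")
            continue
        if start > end:
            problems.append(f"line {lineno}: start {start} is greater "
                            f"than end {end}")
    return problems


# Made-up example lines: one well-formed, one with a bad strand symbol.
good = "\t".join(["speciesA", "chr1", "1000", "5200", "+", ".",
                  "speciesB", "scaffold7", "5000", "10100", "-", "."])
bad = good.replace("\t-\t", "\t?\t")

for number, text in enumerate([good, bad], start=1):
    for problem in check_line(text, number):
        print(problem)
```

Collecting problems as strings rather than raising on the first one makes it easy to report every malformed line of a large file in a single pass.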



On 10/18/2011 03:06 PM, Fields, Christopher J wrote:

> Those scripts are linked to from the GBrowse_syn wiki; that seems to be still pointing at SourceForge, though.  This is the github link ('stable' branch):
>
>     https://github.com/GMOD/GBrowse/tree/stable/bin/gbrowse_syn
>
> However, they also appear under the master branch as 'gbrowse_syn_load_alignment_database.pl' and 'gbrowse_syn_load_alignment_msa.pl':
>
>     https://github.com/GMOD/GBrowse/tree/master/bin
>
> My guess is you should have them installed under the latter names if you have GBrowse v.2.x installed.
>
> chris
>
> On Oct 18, 2011, at 11:18 AM, Olaf Mueller wrote:
>
>> Hi Scott,
>>
>> No, the broken link to orthocluster.tar.gz is on the 'Current Protocols in Bioinformatics' site. At least when I tried to access it from my Duke account. The link to orthocluster.tar.gz on gmod.org is working:
>>
>> ftp://ftp.gmod.org/pub/gmod/GBrowse_syn/orthocluster.tar.gz
>>
>> If I am correct, this version of the tarball was used in the GMOD summer school. However, it does not contain the processing scripts that convert OrthoCluster output to the intermediate format required by gbrowse_syn.
>>
>> Olaf
>>
>>
>> On 10/18/2011 11:53 AM, Scott Cain wrote:
>>> Hi Olaf,
>>>
>>> Is the broken link one that should resolve to something on gmod.org?  If so, it's possible that I broke it when I migrated the server.
>>>
>>> Scott
>>>
>>>
>>> On Tue, Oct 18, 2011 at 9:53 AM, Olaf Mueller <[hidden email]> wrote:
>>> Jason,
>>>
>>> The content of the orthocluster tarball is only described in the publication on PubMed and mentioned in the footnotes. I, at least, couldn't find a download link, even after digging through the page's source. I then went to Current Protocols in Bioinformatics, but their link to this tarball is unfortunately broken. I contacted Wiley, and they are looking into it right now.
>>>
>>> I found a copy of othclu2gff3.pl a bit hidden on the OrthoCluster site: http://genome.sfu.ca/projects/orthocluster/gbrowse_syn/
>>>
>>> I don't know if this is the same version you used, but it worked for me.
>>>
>>> Olaf
>>>
>>>
>>> On 10/18/2011 02:04 AM, Jason Stajich wrote:
>>>> Olaf -
>>>> I'm not sure - is the link not available on the journal's site? Sheldon - did you end up putting a version on GMOD website? I don't have it handy.
>>>> Jason
>>>>
>>>> On Oct 14, 2011, at 12:39 PM, Olaf Mueller wrote:
>>>>
>>>>> Thanks Jason! The article link contains basically all the information I need. Is there an alternative download link for the supplementary orthocluster.tar.gz tarball, which contains gff32gbrowse_syn.pl?
>>>>>
>>>>> Thanks
>>>>> Olaf
>>>>>
>>>>> On 10/14/2011 01:24 PM, Jason Stajich wrote:
>>>>>> you can see what I wrote for processing mercator - though generally the synteny was defined on a per-nt basis not on a per gene basis but should work either way.
>>>>>> This was my script which was on the stable branch - I don't know what happened to the gbrowse_syn folder on the master when things moved to github.
>>>>>>
>>>>>> https://github.com/GMOD/GBrowse/blob/stable/bin/gbrowse_syn/mercatoraln_to_synhits.pl
>>>>>>
>>>>>> we outline some of the mercator loading in the chapter here
>>>>>>   http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3162311/?tool=pubmed
>>>>>>
>>>>>> On Oct 14, 2011, at 9:09 AM, Olaf Mueller wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have completed an orthocluster analysis for two genomes, with the
>>>>>>> OrthoCluster program from genome.sfu.ca, and would like to visualize
>>>>>>> synteny in Gbrowse_syn. Before I reinvent the wheel, is there a tool
>>>>>>> available, which can generate the 12 column intermediate data file
>>>>>>> required as loading format by gbrowse_syn_load_alignment_database.pl?
>>>>>>>
>>>>>>> I've read the loading format documentation, and the required data
>>>>>>> appears to be scattered over several OrthoCluster in- and output files.
>>>>>>> It is particularly unclear to me how the strand information for each
>>>>>>> cluster block on each landmark (fields 5 + 11 in the loading file) is
>>>>>>> obtained. The OrthoCluster xxx.cluster output file just lists genes for
>>>>>>> each cluster and genome, but the gene orientation is naturally random.
>>>>>>> If for example genes in a 3 gene cluster have + - + orientation, which
>>>>>>> strand orientation is defined for the whole synteny block?
>>>>>>>
>>>>>>> Thanks
>>>>>>> Olaf
>>>
>>> --
>>> ------------------------------------------------------------------------
>>> Scott Cain, Ph. D.                                   scott at scottcain dot net
>>> GMOD Coordinator (http://gmod.org/)                     216-392-3087
>>> Ontario Institute for Cancer Research