cv relationships

Re: cv relationships

Karl O. Pinc
Hi,

I don't know if the "modularization strategy"
is something worth pursuing in and of itself,
but if the point is to get around the 50MB
github limit there are other alternatives.

My first suggestion is to look into getting
support from the Open Source Lab at
Oregon State University.  Since the ontology
development is integral to GMOD and the mission
of the OSL is to support Open Source I would
think that they'd consider your proposal.

http://osuosl.org/services/hosting

You could surely get basic git support.  While
I think that github itself is closed source
there could be an Open Source substitute that
the OSL would manage for you.  If nothing else
there'd surely be a ticket tracking system
of some sort (Request Tracker, Bugzilla?)
that might serve as a substitute for pull
requests.  I bet they'd be happy to talk to you.

Or, an option cheaper than paying somebody
to do something to break the 50MB github
limit might be to pay github a small recurring
fee and avoid the limit with a paid account.

Or you could ask github for an exception.
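
For anyone wanting to check a file against the limits before trying any of these options, here is a minimal sketch. It uses the 50MB/100MB soft/hard figures that Bob quotes further down this thread. Treating "MB" as 2**20 bytes, and the names `classify_size` and `check_file`, are assumptions made for illustration and not anything github provides.

```python
import os

# Limits as quoted in this thread (50MB soft, 100MB hard). Treating
# "MB" as 2**20 bytes is an assumption; check the hosting provider's
# documentation for the exact figures.
SOFT_LIMIT_BYTES = 50 * 1024 * 1024
HARD_LIMIT_BYTES = 100 * 1024 * 1024


def classify_size(num_bytes):
    """Return "ok", "warning" or "rejected" for a file of num_bytes."""
    if num_bytes > HARD_LIMIT_BYTES:
        return "rejected"
    if num_bytes > SOFT_LIMIT_BYTES:
        return "warning"
    return "ok"


def check_file(path):
    """Classify a file on disk by its size (hypothetical helper)."""
    return classify_size(os.path.getsize(path))


if __name__ == "__main__":
    # 135MB is the size reported in this thread for the 1.51 obo file.
    print(classify_size(135 * 1024 * 1024))  # rejected
```

Under these assumptions a 135MB file is over the hard limit, so a paid account, an exception, or splitting the file would all need to bring each file under 100MB.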


On 04/04/2014 10:20:42 AM, Chris Mungall wrote:

> Hi Bob,
>
> Currently the gaz.obo is almost 4 times the size limit imposed by
> github (50MB).
>
> The solution we have now seems to work (although Michael, I think you
> may need to reshare your dropbox).
>
> I would still like to pursue the modularization strategy. The
> challenge here is to do it in such a way that works in the confines of
> oboedit, which has limited multi-ontology capability. This means
> minimizing edges between modules, which may turn out to be a
> straightforward graph clustering problem. But the end result of this
> analysis would have to be thoroughly tested and a solution proposed
> for how to edit inter-module edges.
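
The quantity Chris wants to minimize can be made concrete. The sketch below is only an illustration, not the GAZ team's method: given some assignment of terms to modules, it counts the edges that cross module boundaries. It reads only `is_a:` and `relationship:` tags from `[Term]` stanzas of an OBO file. The term IDs, names, `located_in` edges and module names in the sample are all made up, and the clustering step that would produce `module_of` is not shown.

```python
from collections import Counter


def parse_edges(obo_text):
    """Yield (source_id, target_id) pairs from [Term] stanzas.

    Only `is_a:` and `relationship:` tags are read. Trailing
    "! comment" text is dropped; escaped "!" characters are not
    handled, which is a simplification.
    """
    in_term = False
    current_id = None
    for raw in obo_text.splitlines():
        line = raw.split("!", 1)[0].strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
            current_id = None
        elif not in_term:
            continue
        elif line.startswith("id:"):
            current_id = line[len("id:"):].strip()
        elif current_id and line.startswith("is_a:"):
            parts = line[len("is_a:"):].split()
            if parts:
                yield current_id, parts[0]
        elif current_id and line.startswith("relationship:"):
            parts = line[len("relationship:"):].split()
            if len(parts) >= 2:  # relationship: <type> <target>
                yield current_id, parts[1]


def count_cross_module_edges(edges, module_of):
    """Count edges whose two ends are assigned to different modules."""
    counts = Counter()
    for source, target in edges:
        a = module_of.get(source)
        b = module_of.get(target)
        if a is not None and b is not None and a != b:
            counts[tuple(sorted((a, b)))] += 1
    return counts


# Hypothetical sample data, not taken from the real gaz.obo.
SAMPLE = """\
format-version: 1.2

[Term]
id: GAZ:0000001
name: region A

[Term]
id: GAZ:0000002
name: place B
relationship: located_in GAZ:0000001 ! region A

[Term]
id: GAZ:0000003
name: place C
relationship: located_in GAZ:0000002 ! place B

[Term]
id: GAZ:0000004
name: place D
relationship: located_in GAZ:0000001 ! region A
"""

# Hypothetical partition of the terms into two modules.
MODULE_OF = {
    "GAZ:0000001": "module_a",
    "GAZ:0000002": "module_a",
    "GAZ:0000003": "module_a",
    "GAZ:0000004": "module_b",
}

if __name__ == "__main__":
    print(count_cross_module_edges(parse_edges(SAMPLE), MODULE_OF))
```

A candidate split could be scored this way before anyone commits to it: the fewer cross-module edges, the less often an editor working in oboedit would need more than one module open at a time.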


>
> On 4 Apr 2014, at 2:19, Bob MacCallum wrote:
>
> > Hi,
> >
> > ...resurrecting an old thread...
> >
> > It's vaguely possible that we might be able to fund a few months'
> > developer time - (would have to be Imperial or UK-based telecommute
> > to Imperial I think but not sure) - to get GAZ in a community-editable
> > state in a version control system with a web-based hub (e.g.
> > github/bitbucket).
> >
> > Does that sound like a good idea to anyone?  If so I can pursue it
> > further.
> >
> > And if so, do we have the expertise to spec it fully, or does the
> > developer (there is no specific person in mind) need to figure it out
> > also?  (That's a more dangerous proposition.)
> >
> > cheers,
> > Bob
> >
> > On Sat, Dec 21, 2013 at 7:50 PM, Chris Mungall <[hidden email]>
> > wrote:
> >
> >> As Suzi says, the obofoundry site now points to the latest version
> >> http://obofoundry.org/cgi-bin/detail.cgi?id=gazetteer
> >>
> >> (I already applied patches to resolve the syntax errors)
> >>
> >> We're bypassing a version control system at the moment - our
> >> continuous integration server grabs the most recent version from
> >> Michael's dropbox and creates a release, unless errors are found
> >> (http://build.berkeleybop.org/job/build-gaz/ for those who are
> >> interested)
> >>
> >> This process also generates an OWL file, using a non-standard obo2owl
> >> translation, whereby the "classes" in GAZ are translated to
> >> instances, so the OWL version is the 'semantically correct' one
> >> (although the OBO version is more convenient for editing).
> >>
> >> We're still looking for solutions to modularize GAZ, as this would
> >> make it easier both to edit, and to manage in version control. Longer
> >> term we may be looking at web-based editing.
> >>
> >> On Sat, Dec 21, 2013 at 10:49 AM, Suzanna Lewis
> >> <[hidden email]>wrote:
> >>
> >>> Hi Bob,
> >>>
> >>> We've just had Michael put the file into dropbox and Chris M. is
> >>> updating the obofoundry.org GAZ from there (so that will be the best
> >>> site to pull it from).
> >>>
> >>> Chris also noticed some syntax errors which he's removed from the
> >>> public release. He's sent the list of corrections to Michael and
> >>> hopefully he'll get them patched up quickly.
> >>>
> >>> It would be absolutely fantastic to have you and others keeping the
> >>> ball rolling on maintenance! Really great. GAZ already has close to
> >>> 3/4 million place names, so a lot of work would be lost otherwise.
> >>>
> >>> Sidd, yes GAZ could indeed be maintained the same way as SO. The
> >>> difference is that right now SO has funding and GAZ is in need of
> >>> someone to take over from Michael. It's simply a matter of someone
> >>> minding the store and being there to respond to requests.
> >>>
> >>> GAZ and EnvO originated at the same time (place names in GAZ and
> >>> habitats/biomes in EnvO). EnvO by and large is now maintained by
> >>> folks in the meta-genomics community (Oxford and Bremen), and is
> >>> slowly maturing.
> >>>
> >>> Scott and Karl, format is OBO format
> >>> (http://www.geneontology.org/GO.format.obo-1_2.shtml) and there are
> >>> converters to OWL (not sure how lossy these are). I agree it could
> >>> be broken up, perhaps by continent like gondwanaland. :-)
> >>>
> >>> -S
> >>>
> >>> On Dec 20, 2013, at 2:32 AM, Bob MacCallum
> >>> <[hidden email]>
> >>> wrote:
> >>>
> >>> Great thanks Suzanna, yes let's get the ball rolling to open up
> >>> maintenance.
> >>>
> >>> I hadn't thought about file size limits, but 50MB/100MB is the
> >>> soft/hard limit at github
> >>> https://help.github.com/articles/working-with-large-files
> >>> and the 1.51 obo file was 135MB so we have to figure out some
> >>> workable alternative.
> >>>
> >>> The overall repository size shouldn't be a problem (1GB is the
> >>> recommended max)
> >>>
> >>> On Thu, Dec 19, 2013 at 11:37 PM, Suzanna Lewis
> >>> <[hidden email]>wrote:
> >>>
> >>>> p.s. I think the only Web presence is under obo (
> >>>> http://obofoundry.org/cgi-bin/detail.cgi?id=gazetteer) or envo (
> >>>> http://environmentontology.org/)
> >>>>
> >>>> On Dec 19, 2013, at 3:34 PM, Suzanna Lewis <[hidden email]>
> >>>> wrote:
> >>>>
> >>>> I agree as well. That's the essential first step.
> >>>>
> >>>> Michael (copied here) is the primary content person, but he is now
> >>>> officially retired. He's still interested in GAZ, but to be honest
> >>>> its care and content needs to be brought into the modern age. Moving
> >>>> it to github (or equivalent) is essential operationally to keep it
> >>>> Open. Make it so that it's easier for other people to work on it as
> >>>> well.
> >>>>
> >>>> I know it ran into problems with SVN because of the file size.
> >>>> Chris M. can speak to that.
> >>>>
> >>>> I'll try and do what I can to help out because it'd be a shame to
> >>>> throw years of effort away. Let me see what I can do.
> >>>>
> >>>> Plus you're in London? Same time zone as Michael, perhaps arrange a
> >>>> call with him?
> >>>>
> >>>> -S
> >>>>
> >>>> On Dec 19, 2013, at 3:03 PM, Bob MacCallum
> >>>> <[hidden email]>
> >>>> wrote:
> >>>>
> >>>> On Thu, Dec 19, 2013 at 10:30 PM, Suzanna Lewis
> >>>> <[hidden email]> wrote:
> >>>>
> >>>>> Hi Sanjuro,
> >>>>>
> >>>>> Actually, what would be ideal is that you add what's missing to
> >>>>> GAZ. That way the shared community would have these available and
> >>>>> it would not require everyone to repeatedly make those inferences
> >>>>> individually for themselves alone. The
> >>>>>
> >>>>
> >>>> indeed, but how?  In the past (and also just now) I've never been
> >>>> able to find an up to date web presence for GAZ.  My colleagues have
> >>>> contacts with the GAZ developers and so we've been able to request
> >>>> some changes, but shouldn't the whole thing just be on github so we
> >>>> can contribute in a more transparent way?
> >>>>
> >>>>
> >>>>> idea behind all of the ontologies is to work collaboratively to
> >>>>> incrementally improve these commonly held resources. Every time
> >>>>> someone finds a hole or error and consequently decides to roll
> >>>>> their own there can't be further progress. The ontologies are akin
> >>>>> to any open source software project - just as GMOD provides a
> >>>>> foundation to work upon, same here.
> >>>>>
> >>>>> Be nice to see convergence rather than divergence.
> >>>>> -S
> >>>>>
> >>>>> On Dec 19, 2013, at 12:53 PM, Sanjuro Jogdeo
> >>>>> <[hidden email]>
> >>>>> wrote:
> >>>>>
> >>>>> A good point but looking at the GAZ entries more closely, there
> >>>>> are many states/provinces that do not follow the ISO standard that
> >>>>> we have used to date.  We have the latitude and longitude
> >>>>> associated with each collection and if needed, the community can
> >>>>> make region inferences from those.
> >>>>>
> >>>>> s
> >>>>>
> >>>>> On Thu, Dec 19, 2013 at 10:42 AM, Suzanna Lewis
> >>>>> <[hidden email]>wrote:
> >>>>>
> >>>>>> But you will use the GAZ IDs? (even if you don't load the entire
> >>>>>> thing).
> >>>>>>
> >>>>>> It would make it so very much easier for the community if you
> >>>>>> didn't use raw strings or invent yet another ID system.
> >>>>>>
> >>>>>> -S
> >>>>>>
> >>>>>> On Dec 19, 2013, at 10:31 AM, Sanjuro Jogdeo
> >>>>>> <[hidden email]>
> >>>>>> wrote:
> >>>>>>
> >>>>>> Thanks for the suggestions!  GAZ does look like it has the ISO
> >>>>>> codes we need but also a ton we don't need.  I think I'm going to
> >>>>>> go with my original plan of just two cvterm ids and
> >>>>>> nd_geolocationprop values for the country and province names.
> >>>>>> We're already using a pre-processing script to validate country
> >>>>>> and province names so we hopefully won't get invalid names.  All
> >>>>>> of the rest of the properties will be key value pairs so it might
> >>>>>> simplify queries as well.
> >>>>>>
> >>>>>> Thanks so much for the input!
> >>>>>>
> >>>>>> Sanjuro
> >>>>>>
> >>>>>> On Wed, Dec 18, 2013 at 9:20 AM, Karl O. Pinc <[hidden email]>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> On 12/18/2013 06:24:18 AM, Bob MacCallum wrote:
> >>>>>>>
> >>>>>>>> When you ask what's the correct way of doing it, you have to
> >>>>>>>> think about end uses for the database - do you need to search
> >>>>>>>> on broader geographic areas*?
> >>>>>>>
> >>>>>>> An off-hand, un-informed comment:
> >>>>>>>
> >>>>>>> If I wanted to search geographic areas I'd try to involve
> >>>>>>> PostGIS because that's what it's for.
> >>>>>>>
> >>>>>>> Karl <[hidden email]>
> >>>>>>> Free Software:  "You don't pay back, you pay forward."
> >>>>>>>               -- Robert A. Heinlein
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> Gmod-schema mailing list
> >>>>>>> [hidden email]
> >>>>>>> https://lists.sourceforge.net/lists/listinfo/gmod-schema




Karl <[hidden email]>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein


Re: cv relationships

Nathan Weeks
In reply to this post by Sanjuro Jogdeo
Regarding the 50M file size limit: if GitHub won't grant an exception,
another option is BitBucket, which doesn't have a hard file size
limit:

https://confluence.atlassian.com/pages/viewpage.action?pageId=273877699

--
Nathan Weeks
IT Specialist
USDA-ARS Corn Insects and Crop Genetics Research Unit
Crop Genome Informatics Laboratory
Iowa State University
http://weeks.public.iastate.edu/



On Sat, Apr 5, 2014 at 8:54 AM,
<[hidden email]> wrote:

> ------------------------------
>
> Message: 3
> Date: Sat, 05 Apr 2014 08:53:36 -0500
> From: "Karl O. Pinc" <[hidden email]>
> Subject: Re: [Gmod-schema] cv relationships
> To: Chris Mungall <[hidden email]>
> Cc: Suzanna Lewis <[hidden email]>,       Michael Ashburner
>         <[hidden email]>,   gmod-schema List
>         <[hidden email]>,    Sanjuro Jogdeo
>         <[hidden email]>
> Message-ID: <1396706016.7686.2@slate>
> Content-Type: text/plain; charset=us-ascii
>

Re: cv relationships

Chris Mungall
In reply to this post by Karl O. Pinc
Github's limit is hardwired I believe.

We were using good ol' CVS on sourceforge, but it became far too slow.
Not sure to what extent this was due to (a) CVS or (b) sourceforge. I
haven't done the required tests for different combinations of VCS and
hosting solution.

On 5 Apr 2014, at 6:53, Karl O. Pinc wrote:

> Hi,
>
> I don't know if the "modularization strategy"
> is something worth pursuing in and of itself,
> but if the point is to get around the 50MB
> github limit there are other alternatives.
>
> My first suggestion is to look into getting
> support from the Open Source Lab at
> Oregon State University.  Since the ontology
> development is integral to GMOD and the mission
> of the OSL is to support Open Source I would
> think that they'd consider your proposal.
>
> http://osuosl.org/services/hosting
>
> You could surely get basic git support.  While
> I think that github itself is closed source
> there could be an Open Source substitute that
> the OSL would manage for you.  If nothing else
> there'd surely be a ticket tracking system
> of some sort (Request Tracker, Bugzilla?)
> that might serve as a substitute for pull
> requests.  I bet they'd be happy to talk to you.
>
> Or, an option cheaper than paying somebody
> to do something to break the 50MB github
> limit might be to pay github a small recurring
> fee and avoid the limit with a paid account.
>
> Or you could ask github for an exception.
>
>
> On 04/04/2014 10:20:42 AM, Chris Mungall wrote:
>> Hi Bob,
>>
>> Currently the gaz.obo is almost 4 times the size limit imposed by
>> github
>> (50m).
>>
>> The solution we have now seems to work (although Michael, I think you
>> may need to rehare your dropbox).
>>
>> I would still like to pursue the modularization strategy. The
>> challenge
>> here is to do it in such a way that works in the confines of oboedit,
>> which has limited multi-ontology capability. This means minimizing
>> edges
>> between modules, which may turn out to be a straightforward graph
>> clustering problem. But the end result of this analysis would have to
>> be
>> thoroughly tested and a solution proposed for how to edit inter-
>> module
>>
>> edges.
>
>
>>
>> On 4 Apr 2014, at 2:19, Bob MacCallum wrote:
>>
>>> Hi,
>>>
>>> ...resurrecting an old thread...
>>>
>>> It's vaguely possible that we might be able to fund a few months'
>>> developer
>>> time - (would have to be Imperial or UK-based telecommute to
>> Imperial
>>> I
>>> think but not sure) - to get GAZ in a community-editable state in a
>>> version
>>> control system with a web-based hub (e.g. github/bitbucket).
>>>
>>> Does that sound like a good idea to anyone?  If so I can pursue it
>>> further.
>>>
>>> And if so, do we have the expertise to spec it fully, or does the
>>> developer
>>> (there is no specific person in mind) need to figure it out also?
>>> (That's
>>> a more dangerous proposition.)
>>>
>>> cheers,
>>> Bob
>>>
>>>
>>>
>>>
>>>
>>> On Sat, Dec 21, 2013 at 7:50 PM, Chris Mungall <[hidden email]>
>>> wrote:
>>>
>>>> As Suzi says, the obofoundry site now points to the latest version
>>>> http://obofoundry.org/cgi-bin/detail.cgi?id=gazetteer
>>>>
>>>> (I already applied patches to resolve the syntax errors)
>>>>
>>>> We're bypassing a version control system at the moment - our
>>>> continuous
>>>> integration server grabs the most recent version from Michael's
>>>> dropbox and
>>>> creates a release, unless errors are found
>>>> (http://build.berkeleybop.org/job/build-gaz/ for those who are
>>>> interested)
>>>>
>>>> This process also generates an OWL file, using a non-standard
>> obo2owl
>>>> translation, whereby the "classes" in GAZ are translated to
>>>> instances, so
>>>> the OWL version is the 'semantically correct' one (although the
>> OBO
>>
>>>> version
>>>> is more convenient for editing).
>>>>
>>>> We're still looking for solutions to modularize GAZ, as this would
>>>> make it
>>>> easier both to edit, and to manage in version control. Longer term
>> we
>>>> may
>>>> be looking at web-based editing.
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Dec 21, 2013 at 10:49 AM, Suzanna Lewis
>>>> <[hidden email]>wrote:
>>>>
>>>>> Hi Bob,
>>>>>
>>>>> We've just had Michael put the file into dropbox and Chris M. is
>>>>> updating
>>>>> the obofoundry.org GAZ from there (so that will be the best site
>> to
>>>>> pull
>>>>> it from).
>>>>>
>>>>> Chris also noticed some syntax errors which he's removed from the
>>>>> public
>>>>> release. He's sent the list of corrections to Michael and
>> hopefully
>>>>> he'll
>>>>> get them patched up quickly.
>>>>>
>>>>> It would be absolutely fantastic to have you and others keeping
>> the
>>>>> ball
>>>>> rolling on maintenance! Really great. GAZ already has close to
>> 3/4
>>
>>>>> million
>>>>> place names, so a lot of work would be lost otherwise.
>>>>>
>>>>> Sidd, yes GAZ could indeed be maintained the same way as SO. The
>>>>> difference is that right now SO has funding and GAZ is in need of
>>>>> someone
>>>>> to take over from Michael. It's simply a matter of someone
>> minding
>>
>>>>> the
>>>>> store and being there to respond to requests.
>>>>>
>>>>> GAZ and EnvO originated at the same time (place names in GAZ and
>>>>> habitats/biomes in EnvO). EnvO by and large is now maintained by
>>>>> folks in
>>>>> the meta-genomics community (Oxford and Bremen), and is slowly
>>>>> maturing.
>>>>>
>>>>> Scott and Karl, format is OBO format. (
>>>>> http://www.geneontology.org/GO.format.obo-1_2.shtml) and there
>> are
>>>>> converters to OWL (not sure how lossy these are). I agree it
>> could
>>
>>>>> be
>>>>> broken up, perhaps by continent like gondwanaland. :-)
>>>>>
>>>>> -S
>>>>>
>>>>> On Dec 20, 2013, at 2:32 AM, Bob MacCallum
>>>>> <[hidden email]>
>>>>> wrote:
>>>>>
>>>>> Great thanks Suzanna, yes let's get the ball rolling to open up
>>>>> maintenance.
>>>>>
>>>>> I hadn't thought about file size limits, but 50MB/100MB is the
>>>>> soft/hard
>>>>> limit at github
>>>>> https://help.github.com/articles/working-with-large-files
>>>>> and the 1.51 obo file was 135MB so we have to figure out some
>>>>> workable alternative.
>>>>>
>>>>> The overall repository size shouldn't be a problem (1GB is the
>>>>> recommended max)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Dec 19, 2013 at 11:37 PM, Suzanna Lewis
>>>>> <[hidden email]>wrote:
>>>>>
>>>>>> p.s. I think the only Web presence is under obo (
>>>>>> http://obofoundry.org/cgi-bin/detail.cgi?id=gazetteer) or envo (
>>>>>> http://environmentontology.org/)
>>>>>>
>>>>>> On Dec 19, 2013, at 3:34 PM, Suzanna Lewis
>> <[hidden email]>
>>
>>>>>> wrote:
>>>>>>
>>>>>> I agree as well. That's the (essential) first step.
>>>>>>
>>>>>> Michael (copied here) is the primary content person, but he is
>> now
>>>>>> officially retired. He's still interested in GAZ, but to be
>> honest
>>>>>> its care
>>>>>> and content needs to be brought into the modern age. Moving it
>> to
>>
>>>>>> github
>>>>>> (or equivalent) is essential operationally to keep it Open. Make
>> it
>>>>>> so that
>>>>>> it's easier for other people to work on it as well.
>>>>>>
>>>>>> I know it ran into problems with SVN because of the file size.
>>>>>> Chris M.
>>>>>> can speak to that.
>>>>>>
>>>>>> I'll try and do what I can to help out because it'd be a shame
>> to
>>
>>>>>> throw
>>>>>> years of effort away. Let me see what I can do.
>>>>>>
>>>>>> Plus you're in London? Same time zone as Michael, perhaps
>> arrange
>> a
>>>>>> call
>>>>>> with him?
>>>>>>
>>>>>> -S
>>>>>>
>>>>>> On Dec 19, 2013, at 3:03 PM, Bob MacCallum
>>>>>> <[hidden email]>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Dec 19, 2013 at 10:30 PM, Suzanna Lewis
>>>>>> <[hidden email]>wrote:
>>>>>>
>>>>>>> Hi Sanjuro,
>>>>>>>
>>>>>>> Actually, what would be ideal is that you add what's missing to
>>>>>>> GAZ.
>>>>>>> That way the shared community would have these available and it
>>>>>>> would not
>>>>>>> require everyone to repeatedly make those inferences
>> individually
>>>>>>> for
>>>>>>> themselves alone. The
>>>>>>>
>>>>>>
>>>>>> indeed, but how?  In the past (and also just now) I've never
>> been
>>
>>>>>> able
>>>>>> to find an up to date web presence for GAZ.  My colleagues have
>>>>>> contacts
>>>>>> with the GAZ developers and so we've been able to request some
>>>>>> changes, but
>>>>>> shouldn't the whole thing just be on github so we can contribute
>> in
>>>>>> a more
>>>>>> transparent way?
>>>>>>
>>>>>>
>>>>>>> idea behind all of the ontologies is to work collaboratively to
>>>>>>> incrementally improve these commonly held resources. Every time
>>>>>>> someone
>>>>>>> finds a hole or error and consequently decides to roll their
>> own
>>
>>>>>>> there
>>>>>>> can't be further progress. The ontologies are akin to any open
>>>>>>> source
>>>>>>> software project - just as GMOD provides a foundation to work
>>>>>>> upon, same
>>>>>>> here.
>>>>>>>
>>>>>>> Be nice to see convergence rather than divergence.
>>>>>>> -S
>>>>>>>
>>>>>>> On Dec 19, 2013, at 12:53 PM, Sanjuro Jogdeo
>>>>>>> <[hidden email]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> A good point but looking at the GAZ entries more closely, there
>>>>>>> are
>>>>>>> many states/provinces that do not follow the ISO standard that
>> we
>>>>>>> have used
>>>>>>> to date.  We have the latitude and longitude associated with
>> each
>>>>>>> collection and if needed, the community can make region
>> inferences
>>>>>>> from
>>>>>>> those.
>>>>>>>
>>>>>>> s
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Dec 19, 2013 at 10:42 AM, Suzanna Lewis
>>>>>>> <[hidden email]>wrote:
>>>>>>>
>>>>>>>> But you will use the GAZ IDs? (even if you don't load the
>> entire
>>>>>>>> thing).
>>>>>>>>
>>>>>>>> It would make it so very much easier for the community if you
>>>>>>>> didn't
>>>>>>>> use raw strings or invent y.a. ID system.
>>>>>>>>
>>>>>>>> -S
>>>>>>>>
>>>>>>>> On Dec 19, 2013, at 10:31 AM, Sanjuro Jogdeo
>>>>>>>> <[hidden email]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Thanks for the suggestions!  GAZ does look like it has the ISO
>>>>>>>> codes
>>>>>>>> we need but also a ton we don't need.  I think I'm going to go
>>>>>>>> with my
>>>>>>>> original plan of just two cvterm ids and nd_geolocationprop
>>>>>>>> values for the
>>>>>>>> country and province names.  We're already using a
>> pre-processing
>>>>>>>> script to
>>>>>>>> validate country and province names so we hopefully won't get
>>>>>>>> invalid
>>>>>>>> names.  All of the rest of the properties will be key value
>> pairs
>>>>>>>> so it
>>>>>>>> might simplify queries as well.
>>>>>>>>
>>>>>>>> Thanks so much for the input!
>>>>>>>>
>>>>>>>> Sanjuro
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Dec 18, 2013 at 9:20 AM, Karl O. Pinc <[hidden email]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On 12/18/2013 06:24:18 AM, Bob MacCallum wrote:
>>>>>>>>>
>>>>>>>>>> When you ask what's the correct way of doing it, you have to
>>>>>>>>>> think
>>>>>>>>>> about
>>>>>>>>>> end uses for the database - do you need to search on broader
>>>>>>>>>> geographic
>>>>>>>>>> areas*?
>>>>>>>>>
>>>>>>>>> An off-hand, un-informed comment:
>>>>>>>>>
>>>>>>>>> If I wanted to search geographic areas I'd try to involve
>>>>>>>>> Postgis because that's what it's for.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Karl <[hidden email]>
>>>>>>>>> Free Software:  "You don't pay back, you pay forward."
>>>>>>>>>            -- Robert A. Heinlein
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>> ------------------------------------------------------------------------------
>>>>>>>>> Rapidly troubleshoot problems before they affect your
>> business.
>>>>>>>>> Most
>>>>>>>>> IT
>>>>>>>>> organizations don't have a clear picture of how application
>>>>>>>>> performance
>>>>>>>>> affects their revenue. With AppDynamics, you get 100%
>> visibility
>>>>>>>>> into
>>>>>>>>> your
>>>>>>>>> Java,.NET, & PHP application. Start your 15-day FREE TRIAL of
>>>>>>>>> AppDynamics Pro!
>>>>>>>>>
>>>>>>>>> http://pubads.g.doubleclick.net/gampad/clk?
>> id=84349831&iu=/4140/ostg.clktrk
>>>>>>>>> _______________________________________________
>>>>>>>>> Gmod-schema mailing list
>>>>>>>>> [hidden email]
>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/gmod-schema
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>
>>
>>
>
>
>
>
> Karl <[hidden email]>
> Free Software:  "You don't pay back, you pay forward."
>               -- Robert A. Heinlein

------------------------------------------------------------------------------
_______________________________________________
Gmod-schema mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-schema

Re: cv relationships

Bob MacCallum
Thanks for the ideas - I've been doing some further digging, but I'll start by asking: what do we want from the central repo?

In my mind I think the "pretty diffs" (e.g. when you click on a commit in github [1]), which I know are just nicely formatted text, make the change history instantly accessible to outsiders, which is good.  The other most useful thing I think is allowing forking of the whole repo so that people can start collaborating without asking for permission.  Later on, the quality of the contributions can be assessed and merged as appropriate.

So here's what I dug up... all new to me today, so please chip in if you have prior experience.

Central repo:

gitlab.org - seems to be the open source web interface very close to github - there is also gitlab.com offering support but I guess we could figure out how to install/run it ourselves...   This sounds like a good candidate to run on http://osuosl.org/services/hosting which I agree sounds like a good bet.

I also found Gitolite [2] which seems to be a central repo server without the web front end.  There's also Gitorious which I haven't checked out.

Large files:

Maybe we can get away with a custom install of gitlab.org and it will work OK for us - perhaps a bit slow?

Otherwise there is git-annex which is a fairly mature plugin for git which moves large files around and stores only symlinks and metadata in git.  github.com definitely won't do "pretty diffs" for large files in the annex - all it will store is a "broken" symlink.

It may be possible to set up gitlab to perform the git-annex operations on the central repo back end too, but I have no idea if that will produce nice diffs in the web UI.

I think git-annex would store full size copies of each version of a 'bigfile' (rather than diffs) but I am not sure.  I'm not convinced this is the right thing for us - since the git-annex documentation mostly talks about video and audio files.
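
To see which files in a working copy would actually trip the limits quoted earlier in the thread (50MB soft / 100MB hard at github), something like the following could be run before pushing. This is only a rough sketch: the function names are made up for illustration, and it assumes the limits are counted in binary megabytes, which I have not checked.

```python
import os

# Limits as quoted earlier in this thread; treating "MB" as 1024*1024 bytes
# is an assumption.
SOFT_LIMIT = 50 * 1024 * 1024
HARD_LIMIT = 100 * 1024 * 1024


def classify_size(n_bytes, soft=SOFT_LIMIT, hard=HARD_LIMIT):
    """Return 'ok', 'warn' (over the soft limit) or 'reject' (over the hard limit)."""
    if n_bytes > hard:
        return "reject"
    if n_bytes > soft:
        return "warn"
    return "ok"


def oversized_files(root, soft=SOFT_LIMIT, hard=HARD_LIMIT):
    """Yield (path, size, status) for every file under root above the soft limit."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into git's own object store.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size > soft:
                yield path, size, classify_size(size, soft, hard)


if __name__ == "__main__":
    for path, size, status in oversized_files("."):
        print(f"{status:6s} {size / (1024 * 1024):8.1f} MB  {path}")
```

With these numbers the 135MB release of the obo file mentioned earlier would come out as "reject", so it needs either splitting or a large-file mechanism whatever host we pick.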

Modularisation:

I haven't really looked into this.
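
As a starting point for the "minimise edges between modules" idea raised earlier in the thread, here is a minimal sketch that reads the is_a and relationship lines from OBO text and lists the edges that would cross a proposed module boundary. The EX: ids, the located_in relation in the sample and the module assignment are invented for illustration; this is not a clustering algorithm, just a way to score a candidate split.

```python
def parse_obo_edges(text):
    """Return (ids, edges) for [Term] stanzas; edges are (child, parent) pairs."""
    ids, edges = [], []
    current, in_term = None, False
    for raw in text.splitlines():
        line = raw.split("!")[0].strip()  # drop trailing "! comment"
        if line.startswith("["):
            in_term = line == "[Term]"
            current = None
        elif not in_term:
            continue
        elif line.startswith("id:"):
            current = line[len("id:"):].strip()
            ids.append(current)
        elif current and line.startswith("is_a:"):
            edges.append((current, line[len("is_a:"):].strip()))
        elif current and line.startswith("relationship:"):
            # "relationship: <type> <target id>"
            parts = line[len("relationship:"):].split()
            if len(parts) >= 2:
                edges.append((current, parts[1]))
    return ids, edges


def cross_module_edges(edges, module_of):
    """Return the edges whose two ends are assigned to different modules."""
    return [(c, p) for c, p in edges if module_of.get(c) != module_of.get(p)]


# Made-up sample, not real GAZ content.
SAMPLE = """\
format-version: 1.2

[Term]
id: EX:1
name: Europe

[Term]
id: EX:2
name: United Kingdom
relationship: located_in EX:1 ! Europe

[Term]
id: EX:3
name: Africa

[Term]
id: EX:4
name: Some overseas territory
is_a: EX:2 ! United Kingdom
relationship: located_in EX:3 ! Africa
"""

# Hypothetical split by continent.
MODULES = {"EX:1": "europe", "EX:2": "europe", "EX:4": "europe", "EX:3": "africa"}

if __name__ == "__main__":
    _, sample_edges = parse_obo_edges(SAMPLE)
    print(cross_module_edges(sample_edges, MODULES))
```

The fewer edges this reports for a given split, the less inter-module editing would have to be supported in oboedit.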

OWL:

Then *if* there's an option to switch to OWL entirely (I take on board the comments above/below about editing being easier for OBO) then this tool https://github.com/utapyngo/owl2vcs may make commandline diffs meaningful (I read on the web that OWL file order is non-deterministic and diffs are therefore meaningless).  I don't think owl2vcs tackles the large file problem.
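
The ordering worry can also be sidestepped for OBO files by comparing stanzas keyed on their id instead of diffing line by line. The sketch below is my own illustration of that idea and is not how owl2vcs works; the function names are hypothetical and it holds both files in memory, so it does nothing for the large file problem either.

```python
def stanzas_by_id(text):
    """Map each stanza id to the set of its tag lines, ignoring line order."""
    stanzas = {}
    lines = []  # tag lines of the stanza currently being read
    # The trailing "[" is a sentinel that flushes the last stanza.
    for raw in text.splitlines() + ["["]:
        line = raw.strip()
        if line.startswith("["):
            ident = next(
                (l[len("id:"):].strip() for l in lines if l.startswith("id:")),
                None,
            )
            if ident is not None:  # header blocks have no id and are skipped
                stanzas[ident] = frozenset(lines)
            lines = []
        elif line:
            lines.append(line)
    return stanzas


def diff_terms(old_text, new_text):
    """Return (added, removed, changed) lists of stanza ids."""
    old, new = stanzas_by_id(old_text), stanzas_by_id(new_text)
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(i for i in set(old) & set(new) if old[i] != new[i])
    return added, removed, changed
```

A stanza whose lines were merely reordered, or that moved to a different place in the file, is not reported as changed.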



Finally I'll just add another question for fun... would we want to start adding support for lat/long in GAZ (or indeed, is it possible)?



[1] https://github.com/bobular/VBPopBio/commit/ef69ea5fbf99624bb08160c51b0ee3e04b03b434
[2] https://github.com/sitaramc/gitolite

On Sat, Apr 5, 2014 at 11:27 PM, Chris Mungall <[hidden email]> wrote:
> Github's limit is hardwired I believe.
>
> We were using good ol' CVS on sourceforge, but it became far too slow. Not sure to what extent this was down to (a) CVS or (b) sourceforge. I haven't done the required tests for different combinations of VCS and hosting solution.




------------------------------------------------------------------------------
Put Bad Developers to Shame
Dominate Development with Jenkins Continuous Integration
Continuously Automate Build, Test & Deployment
Start a new project now. Try Jenkins in the cloud.
http://p.sf.net/sfu/13600_Cloudbees_APR
_______________________________________________
Gmod-schema mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-schema

Re: cv relationships

Karl O. Pinc
On 04/07/2014 09:48:04 AM, Bob MacCallum wrote:


> In my mind I think the "pretty diffs" (e.g. when you click on a
> commit in github [1]), which I know are just nicely formatted text,
> make the change history instantly accessible to outsiders, which is
> good.  The other most useful thing I think is allowing forking of the
> whole repo so that people can start collaborating without asking for
> permission.

Aren't these both properties of about every web interface
that exposes an underlying revision control system?

Or when you say "forking" are you thinking of not just
being able to get and fuss with a copy but also having
that copy exposed in the same web interface/on the
same server as the original?  And when you think of diffing
you're thinking of diffing between such forks?

This might possibly be marginally relevant
(would you want it installed on your server?):

http://papio.biology.duke.edu/babase_system_html/ch08s02.html#Wwwdiff

Karl <[hidden email]>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein


Re: cv relationships

Bob MacCallum



On Mon, Apr 7, 2014 at 4:22 PM, Karl O. Pinc <[hidden email]> wrote:
> On 04/07/2014 09:48:04 AM, Bob MacCallum wrote:
>
>> In my mind I think the "pretty diffs" (e.g. when you click on a
>> commit in github [1]), which I know are just nicely formatted text,
>> make the change history instantly accessible to outsiders, which is
>> good.  The other most useful thing I think is allowing forking of the
>> whole repo so that people can start collaborating without asking for
>> permission.
>
> Aren't these both properties of about every web interface
> that exposes an underlying revision control system?

well yes, I've seen that with git and mercurial web interfaces so far.

> Or when you say "forking" are you thinking of not just
> being able to get and fuss with a copy but also having
> that copy exposed in the same web interface/on the
> same server as the original?

yes I think having the forked copy with equal visibility as the original is important

> And when you think of diffing
> you're thinking of diffing between such forks?

no, not so much diffs between forks (except when merging), I meant mainly diffs between revisions.

> This might possibly be marginally relevant
> (would you want it installed on your server?):
>
> http://papio.biology.duke.edu/babase_system_html/ch08s02.html#Wwwdiff
>
> Karl <[hidden email]>
> Free Software:  "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein

