Dear Chado User Community,
Representatives from the Tripal (Stephen Ficklin, Lacey Sanderson) and Chado (Scott Cain) projects have combined efforts to work towards a v1.3 release of Chado. To do this, we have compiled a list of the requested changes that we knew about or that were posted to the GMOD Schema mailing list. You can find the list on the Google Doc at this link:

https://docs.google.com/document/d/1IZ3VMpIoG1hhpbHYi6rbChImLgrlmbyy7Ewms-EpaeU/edit?usp=sharing

We are requesting comments on the document. For v1.3 we are proposing a quick release that will include mostly new linking and property tables to existing Chado tables (see Google doc for complete list). If you have any additional linking tables that you would like to request for the v1.3 release please make a suggestion so we can add them to the list for consideration.

Aside from these linking tables we are considering the following changes for the v1.3 release.

1) Add a new 'infraspecific' field for the organism table to allow for storing the names of subspecies, varieties, subvarieties, forma and subforma. However, we would like to know: should the infraspecific field be used for storing names of strains and cultivars? If so, then the recommendation would be to store details about individual strains and cultivars in the Stock module tables. Alternatively, FlyBase has suggested a separate set of tables for storing strains. Please comment on the Google Doc if you have opinions on the best way to represent/store strains/cultivars in Chado.

2) The addition of an 'organism_relationship' table that allows for storing relationships (not taxonomy) between organisms. An example use case would be for storing breeding relationships (e.g. sterile_with, incompatible_with, fertile_with).

3) Move the 'db' and 'dbxref' tables into a new module called 'DB'. This will not require any SQL changes, just a name change in the documentation.

4) Change 'feature.seqlen' to a bigint to accommodate longer sequences.
The more complex issues we are reserving for a potential v1.4 release after more discussion is held.

Thanks for any input!
Stephen

------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the
conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Gmod-schema mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-schema
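A rough illustration of proposal 2 from the announcement above, assuming organism_relationship follows the subject_id / object_id / type_id pattern of Chado's existing feature_relationship table. The column names, the trimmed-down organism and cvterm tables, the placeholder organisms and the helper `related_organisms` are all assumptions made for illustration, not the v1.3 DDL, and SQLite (through Python's sqlite3 module) stands in for PostgreSQL so the sketch runs anywhere:

```python
import sqlite3

# SQLite stands in for PostgreSQL. Table layouts are simplified assumptions
# patterned on feature_relationship, not the actual Chado v1.3 schema.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE organism (
    organism_id INTEGER PRIMARY KEY,
    genus       TEXT NOT NULL,
    species     TEXT NOT NULL
);
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE organism_relationship (
    organism_relationship_id INTEGER PRIMARY KEY,
    subject_id INTEGER NOT NULL REFERENCES organism (organism_id),
    object_id  INTEGER NOT NULL REFERENCES organism (organism_id),
    type_id    INTEGER NOT NULL REFERENCES cvterm (cvterm_id),
    UNIQUE (subject_id, object_id, type_id)
);
""")

# Placeholder organisms and the relationship terms named in the proposal.
conn.executemany("INSERT INTO organism VALUES (?, ?, ?)",
                 [(1, "Examplea", "alpha"), (2, "Examplea", "beta")])
conn.executemany("INSERT INTO cvterm VALUES (?, ?)",
                 [(1, "sterile_with"), (2, "fertile_with")])
conn.execute("INSERT INTO organism_relationship (subject_id, object_id, type_id) "
             "VALUES (1, 2, 1)")


def related_organisms(conn, subject_id, relationship):
    """Return (genus, species) of organisms linked to subject_id by relationship."""
    return conn.execute(
        """
        SELECT o.genus, o.species
          FROM organism_relationship r
          JOIN cvterm t   ON t.cvterm_id = r.type_id
          JOIN organism o ON o.organism_id = r.object_id
         WHERE r.subject_id = ? AND t.name = ?
         ORDER BY o.organism_id
        """,
        (subject_id, relationship),
    ).fetchall()


print(related_organisms(conn, 1, "sterile_with"))
```

Because the relationship type is a cvterm row rather than a fixed column, any relationship name can be stored without a schema change.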
Hi Stephen,

just to add another totally unrelated request: Could we have timestamps for creation time (maybe also update) in the nd_experiment and stock (possibly project) tables? In breeding programs, people would like to know when and how many things have been added…

cheers
Lukas

> On Mar 9, 2015, at 9:20 AM, Stephen Ficklin <[hidden email]> wrote:
>
> Dear Chado User Community, [...]
Hi,

I recently also realized that it would be really useful to have timestamps in the tables that Lukas mentioned.

Thanks
Sook

On Mon, Mar 9, 2015 at 10:37 AM, Lukas A. Mueller <[hidden email]> wrote:
HI Stephen,
In reply to this post by Stephen Ficklin-2
Hi Stephen-
it's great to see this moving forward! here are a couple of additional very minor changes from our ongoing work with the phylogeny module that I think would fit the spirit of the proposed v1.3 release:

* add phylotreeprop table
* add index to phylonode.parent_phylonode_id (we had some serious performance issues with tree deletions until the omission of this index was discovered)

Also, unrelated to phylogeny but possibly worth including in the set of linker tables to be added this round:

* biomaterial_project

we found this useful when representing BioSample/BioProject info taken from NCBI; but, we also made some associated changes to existing tables that may be outside the scope of the v1.3 release:
hope that is helpful; let us know if you need more info to justify their inclusion or whatever else would make it easier for you to get the changes incorporated into v 1.3 (DDL, etc).

thanks again
Andrew Farmer

On 3/9/15 7:20 AM, Stephen Ficklin wrote:
> Dear Chado User Community, [...]

--
...all concepts in which an entire process is semiotically concentrated
elude definition; only that which has no history is definable.

Friedrich Nietzsche
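The effect of the phylonode.parent_phylonode_id index mentioned above can be sketched as follows. Deleting a tree means finding the children of each node by parent id, and PostgreSQL does not index the referencing column of a foreign key automatically, so without the index every such lookup scans the table. The snippet uses SQLite's query planner as a stand-in for PostgreSQL's, a two-column simplification of the phylonode table, and a made-up index name (`phylonode_parent_idx`); all three are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified phylonode table: only the key and the self-referencing parent.
conn.execute("""
    CREATE TABLE phylonode (
        phylonode_id        INTEGER PRIMARY KEY,
        parent_phylonode_id INTEGER REFERENCES phylonode (phylonode_id)
    )
""")


def child_lookup_plan(conn):
    """Return the planner's description of a 'children of one node' lookup."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT phylonode_id FROM phylonode WHERE parent_phylonode_id = ?",
        (1,),
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


plan_before = child_lookup_plan(conn)
# Hypothetical index name; choose one that fits Chado's naming conventions.
conn.execute(
    "CREATE INDEX phylonode_parent_idx ON phylonode (parent_phylonode_id)"
)
plan_after = child_lookup_plan(conn)

print("without index:", plan_before)
print("with index:   ", plan_after)
```

On a real PostgreSQL install the equivalent check would be running EXPLAIN on the same query before and after creating the index.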
In reply to this post by Stephen Ficklin-2
On Mon, 09 Mar 2015 09:20:49 -0400
Stephen Ficklin <[hidden email]> wrote:

> Dear Chado User Community,
>
> Representatives from the Tripal (Stephen Ficklin, Lacey Sanderson)
> and Chado (Scott Cain) projects have combined efforts to work towards
> a v1.3 release of Chado. To do this, we have compiled a list of
> the requested changes that we knew about or that were posted to the
> GMOD Schema mailing list. You can find the list on the Google Doc
> at this link:
>
> https://docs.google.com/document/d/1IZ3VMpIoG1hhpbHYi6rbChImLgrlmbyy7Ewms-EpaeU/edit?usp=sharing
>
> We are requesting comments on the document. For v1.3 we are
> proposing a quick release that will include mostly new linking and
> property tables to existing Chado tables (see Google doc for complete
> list). If you have any additional linking tables that you would like
> to request for the v1.3 release please make a suggestion so we can
> add them to the list for consideration.

You may wish to consider these indexes. I've found them essential to making the queries we run perform well.

    -- The effectiveness of this will vary based on
    -- whether you have more subjects or objects.
    create index feature_relationship_idx1b
      on feature_relationship (object_id, subject_id, type_id);

    create index featureloc_idx1b
      on featureloc (feature_id, fmin, fmax);

    create index feature_idx1b
      on feature (feature_id, dbxref_id)
      where dbxref_id is not null;

---------------------------

I would like to see the installation process change to be _way_ more friendly. The following would be a good start:

  Do _not_ delete tables before installing. (If the table already exists the transaction should roll back.)

  Do not be so chatty, display only warnings and errors, not informational messages.

--------------------------

I would like to have it be possible to install Chado into its own schema. The first step for this is to get rid of the multiple schemas that Chado currently uses.

------------------------------

I would like to see Chado be able to be installed modularly.
At present this is rather-to-very difficult. (At least it's difficult if you want only Chado. I don't know the process if Tripal is involved.)

The way to make this happen is to remove dependencies between modules as regards installation, as far as is possible. You'd do this by doing the following:

Separate each module.sql file into pieces:

  Cascading destruction of the objects in the module.

  Creation of the module's tables.

  Creation of the module's constraints and indexes.

  Creation of the module's views.

  Creation of the module's triggers and functions.

By doing this a user can then create the tables of each of the desired modules in any order and not have to know ahead of time which modules require other modules. No foreign key constraints get in the way of table creation. Afterwards create the constraints and triggers in any order desired. Afterwards create the views. Wrap the execution of each file in a transaction. If an error is raised at any point during constraint or view creation then you're missing a required table (really, a module). Roll back the transaction and install the tables (and later the constraints) of the missing modules. Then re-install until it works.

(Note that this assumes that it's possible to notice that an error is raised during the install. The present system is so chatty that it's very easy to miss error messages.)

This can be automated and much improved, if you want to get fancy. The design proposed here also solves the problem of "linking tables" (e.g., analysis_feature) as regards determining in which module such tables belong. Linking tables and views would not belong to any module but would be installed on an as-requirements-are-met basis. There is also the advantage, once set up, of very little on-going maintenance.

(In my opinion if you're going to go through the work of re-structuring the DDL statements you may as well go all the way.
I believe that restructuring the existing DDL files will be more labor intensive than the programming required, although this may be optimistic.)

The first goal is to make it easy for a program to tell what tables and views exist in each module, and what object has triggers, constraints and indexes. The idea is to reflect this information in file/directory names within a modified version of the existing one-module-per-directory structure. Likewise the proposed directory structure reveals which tables are linking tables.

Instead of having a single file for all tables in a module, separate out the linking tables from the regular tables. Put each of the CREATE TABLE statements for the regular tables into files, one per table, in a "tables" directory within the directory for the module. Throughout the design proposed here each file would have a name that is the name of the table it creates (or creates constraints and indexes for, or creates triggers for, etc.). Likewise, within each module there must be a directory containing per-table files holding each table's constraints and indexes, and an analogous structure for triggers.

Put each of the CREATE TABLE statements for each linking table into a module-level directory, shared by all modules, with one file per linking table. Do the same for the linking tables' constraints and indexes, etc.

Put all the views in all modules into a single view directory for all modules, as with the linking tables. The views would be defined one-per-file and each file name would be the name of the view. Note that views may have triggers. There would be another directory for these.

After installing all of Chado into a test db Postgres' introspection can be used to determine which tables have non-null columns containing foreign keys, including which linking tables require which other tables, and which views require which other tables or views. Note that installing all of Chado is easy.
Since all the tables will exist, the files which comprise the totality of each step, table creation, index and constraint creation, etc., can be installed in any order. The only un-addressed problem is views that depend on other views. But this is only a problem for the developers of Chado, need only be solved once and that can be on an ad-hoc basis. It is not a problem for the user who wants to install individual Chado modules.

Because it is now known what db objects require other db objects a program that installs modules can ensure that the tables in pre-requisite modules are installed. It would know that a module is a pre-requisite of another if the latter contains a table with a non-NULL column storing a foreign key of the former. Likewise, an installation program can look through all the linking tables that exist, and all the views that exist, and determine if the prerequisites are met for the installation of any given linking table or view, and install said linking table or view when and only when its pre-requisites are met.

In summary, module installation should then be as simple as giving an install program a module name. I've not thought through module deletion, but it seems like deletion could also be as straightforward for the user. (Since creation and deletion would be easy, a deletion/creation cycle provides a handy way of removing all content from a module -- working around Chado's cascading deletes that could impact data in other modules. This could provide a way to "do-over" the loading of data into an unfamiliar module.)

Note that the db introspection does not have to occur at the time of install. The results of introspection on a complete Chado install can be cached at development time and distributed as part of the Chado distribution.

Triggers (functions really) are the only difficult part.
In the case of the "simple version" and manual piecemeal module install, above, since triggers and functions don't raise errors until they are used you won't know that you're missing a dependent module. As regards a more complete revamp of the installation system there's no way to use introspection to determine what tables/views a given trigger requires. So, triggers require the same sort of hack that's used now for whole modules, a manually written list of the tables and views the trigger uses, to inform the installation system of each trigger's prerequisites. Fortunately, there don't seem to be many triggers.

Regards,

Karl <[hidden email]>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein
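The installer described in this message could be sketched roughly as below. It assumes the introspection results have already been cached as plain dictionaries, resolves prerequisite modules first, creates everything inside one transaction that is rolled back if any table already exists, and adds a linking table only once everything it references is present. The dictionaries are a deliberately tiny, hand-written stand-in for real introspection output, the CREATE TABLE statements are placeholders for the per-table files, the function names are invented for this sketch, and SQLite stands in for PostgreSQL:

```python
import sqlite3

# Hypothetical stand-in for introspection results cached at development time:
# table -> (owning module, tables referenced by its NOT NULL foreign keys).
# Deliberately incomplete; real Chado modules have many more tables.
TABLES = {
    "cvterm": ("cv", set()),
    "organism": ("organism", set()),
    "feature": ("sequence", {"organism", "cvterm"}),
    "analysis": ("companalysis", set()),
}
# Linking tables belong to no module: table -> tables that must already exist.
LINKING_TABLES = {"analysis_feature": {"analysis", "feature"}}


def install_order(module):
    """Return `module` and the modules it requires, prerequisites first."""
    owner = {table: mod for table, (mod, _) in TABLES.items()}
    order, seen = [], set()

    def visit(mod):
        if mod in seen:  # already handled; also stops circular references
            return
        seen.add(mod)
        for table_mod, refs in TABLES.values():
            if table_mod == mod:
                for prereq in sorted(owner[ref] for ref in refs):
                    visit(prereq)
        order.append(mod)

    visit(module)
    return order


def existing_tables(conn):
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return {row[0] for row in rows}


def install(conn, module):
    """Install a module's tables in one transaction; return an error or None."""
    modules = install_order(module)
    tables = [t for t, (mod, _) in TABLES.items() if mod in modules]
    present = existing_tables(conn) | set(tables)
    # Linking tables are added only when everything they reference exists.
    tables += sorted(t for t, needs in LINKING_TABLES.items()
                     if needs <= present and t not in present)
    conn.execute("BEGIN")
    try:
        for table in tables:
            # Placeholder DDL; a real installer would run the per-table file.
            conn.execute(f"CREATE TABLE {table} ({table}_id INTEGER PRIMARY KEY)")
    except sqlite3.Error as exc:
        conn.execute("ROLLBACK")  # never drop or overwrite existing tables
        return f"aborted, nothing changed: {exc}"
    conn.execute("COMMIT")
    return None  # quiet on success: only errors are reported


# isolation_level=None lets the sketch issue BEGIN/COMMIT/ROLLBACK itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
first = install(conn, "sequence")       # pulls in cv and organism
second = install(conn, "companalysis")  # analysis_feature becomes installable
third = install(conn, "sequence")       # tables already exist: rolled back
print(install_order("sequence"), sorted(existing_tables(conn)), third)
```

A fuller installer would probably skip prerequisite modules that are already present instead of aborting, and would handle constraints, views and the hand-maintained trigger prerequisites in later passes.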
On Wed, 11 Mar 2015 11:49:51 -0500
"Karl O. Pinc" <[hidden email]> wrote:

> You may wish to consider these indexes. I've found them
> essential to making the queries we run perform well.

Note that I chose index names so as not to collide with official Chado index names. You will want to change the index names from what was written.

> --------------------------
>
> I would like to have it be possible to install
> Chado into its own schema. The first step
> for this is to get rid of the multiple
> schemas that Chado currently uses.

A second step might be to have Chado install into a schema named "chado". Right now it installs, IIRC, some stuff directly into "public", other stuff into whatever happens to be at the front of the search_path, and other stuff into a couple of other schema names that are hardcoded.

Of course a recommendation (or default change) to frob the default search path to contain "chado" would be required.

Karl <[hidden email]>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein
In reply to this post by Stephen Ficklin-2
Hi,
Thanks for taking this step. Really appreciate that.

On Mon, 09 Mar 2015, Stephen Ficklin wrote:

> Dear Chado User Community,
>
> [...]
>
> Aside from these linking tables we are considering the following changes
> to the v1.3 release.
>
> 1) Add a new 'infraspecific' field for the organism table to allow for
> storing the names of subspecies, varieties, subvarieties, forma and
> subforma. However, we would like to know.... should the infraspecific
> field be used for storing names of strains and cultivars? If so, then
> the recommendation would be to store details about individual strains
> and cultivars in the Stock module tables. Alternatively, FlyBase has
> suggested a separate set of tables for storing strains. Please comment
> on the Google Doc if you have opinions on the best way to
> represent/store strains/cultivars in Chado.

We do need to store the name of our strain which actually ties it to genome as that is what got sequenced and we have all the annotations for. For example, Dictyostelium discoideum AX4, Dictyostelium discoideum AX2, Dictyostelium discoideum NC4 etc. At present I append the strain to the species. Is this change designed to take care of this limitation (having a separate column instead of stringifying)? No, using the Stock module does not address this problem.

> 2) The addition of an 'organism_relationship' table that allows for
> storing relationships (not taxonomy) between organisms. An example use
> case would be for storing breeding relationships (e.g. sterile_with,
> incompatible_with, fertile_with).

Seemed reasonable to me, as long as I could also store any arbitrary relationships.

> 3) Move the 'db' and 'dbxref' tables into a new module called 'DB'.
> This will not require any SQL changes, just a name change in the
> documentation.
>
> 4) Change 'feature.seqlen' to a bigint to accommodate longer sequences.

Great.

One more thing that comes to mind is to have a datetime column for most of the central tables, for example in the pub table. It simply allows me to have the state of a row without making changes to the core tables or adding additional linking tables and application logic. The idea is similar to what the Ruby on Rails framework adds to every table once you run the migration through it (date_created, date_updated).

thanks,
-siddhartha

> The more complex issues we are reserving for a potential v1.4 release
> after more discussion is held.
>
> Thanks for any input!
> Stephen
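The timestamp requests in this thread could look something like the sketch below, which uses the date_created / date_updated names mentioned above on a cut-down pub table. The column types, the trigger name `pub_touch`, and the decision to maintain date_updated with a trigger are assumptions made for illustration; SQLite stands in for PostgreSQL, where a timestamp column with DEFAULT now() and a BEFORE UPDATE trigger would be the closer equivalent:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Cut-down pub table; only the timestamp columns matter for this sketch.
CREATE TABLE pub (
    pub_id       INTEGER PRIMARY KEY,
    title        TEXT,
    date_created TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    date_updated TEXT
);
-- Hypothetical trigger: stamp the row whenever its title changes.
CREATE TRIGGER pub_touch AFTER UPDATE OF title ON pub
BEGIN
    UPDATE pub SET date_updated = CURRENT_TIMESTAMP WHERE pub_id = NEW.pub_id;
END;
""")

QUERY = "SELECT date_created, date_updated FROM pub WHERE pub_id = 1"

conn.execute("INSERT INTO pub (title) VALUES ('draft')")
before = conn.execute(QUERY).fetchone()   # created is set, updated is empty

conn.execute("UPDATE pub SET title = 'final' WHERE pub_id = 1")
after = conn.execute(QUERY).fetchone()    # updated is now set by the trigger

# "When and how many things have been added": count rows created since a date.
added = conn.execute(
    "SELECT COUNT(*) FROM pub WHERE date_created >= ?", ("2015-03-09",)
).fetchone()[0]

print(before, after, added)
```

Keeping the timestamps in the table itself means no extra linking tables or application logic are needed to answer the "added since" question.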
One more thing, is there any timeline for the v1.3 release? And what would the versioning scheme look like: will it use semantic versioning or something arbitrary?

-siddhartha

On Wed, 11 Mar 2015, Siddhartha Basu wrote:

> Hi,
> Thanks for taking this step. Really appreciate that. [...]
In reply to this post by Karl O. Pinc
Hi,
Some great feedback from Karl, should take a while to process everthing. By quickly looking few things i would like to agree and add on top of those(more might come later).... * We need to have a default/official way to manage version upgrade. * Allow to install in a schema of choice(as karl suggested). * Decouple the schema from data loading softwares, Its just a database schema, release the packaged ddl only. If we have a version management software that piece could be shipped together in that case, or people could easily that using a package management file. Something like this ... * To just install chado, download tarball, untar and run psql on some sql file. that's all for now. thanks, -siddhartha On Wed, 11 Mar 2015, Karl O. Pinc wrote: > On Mon, 09 Mar 2015 09:20:49 -0400 > Stephen Ficklin <[hidden email]> wrote: > > > Dear Chado User Community, > > > > Representatives from the Tripal (Stephen Ficklin, Lacey Sanderson) > > and Chado (Scott Cain) projects have combined efforts to work towards > > a v1.3 release of Chado. To do this, we have compiled a list of > > the requested changes that we knew about or that were posted to the > > GMOD Schema mailing list. You can find the list on the Google Doc > > at this link: > > > > https://docs.google.com/document/d/1IZ3VMpIoG1hhpbHYi6rbChImLgrlmbyy7Ewms-EpaeU/edit?usp=sharing > > > > We are requesting comments on the document. For v1.3 we are > > proposing a quick release that will include mostly new linking and > > property tables to existing Chado tables (see Google doc for complete > > list). If you have any additional linking tables that you would like > > to request for the v1.3 release please make a suggestion so we can > > add them to the list for consideration. > > You may wish to consider these indexes. I've found them > essential to making the queries we run perform well. > > -- The effectiveness of this will vary based on > -- whether you have more subjects or objects. 
> create index feature_relationship_idx1b
>   on feature_relationship (object_id, subject_id, type_id);
>
> create index featureloc_idx1b
>   on featureloc (feature_id, fmin, fmax);
>
> create index feature_idx1b
>   on feature (feature_id, dbxref_id)
>   where dbxref_id is not null;
>
> ---------------------------
>
> I would like to see the installation process change to
> be _way_ more friendly. The following would be a good
> start:
>
>   Do _not_ delete tables before installing. (If the table
>   already exists the transaction should roll back.)
>
>   Do not be so chatty, display only warnings
>   and errors, not informational messages.
>
> --------------------------
>
> I would like to have it be possible to install
> Chado into its own schema. The first step
> for this is to get rid of the multiple
> schemas that Chado currently uses.
>
> ------------------------------
>
> I would like to see Chado be able to be installed
> modularly. At present this is rather-to-very difficult.
> (At least it's difficult if you want only Chado. I
> don't know the process if Tripal is involved.)
>
> The way to make this happen is to remove dependencies
> between modules as regards installation, as far as is possible.
> You'd do this by doing the following:
>
> Separate each module.sql file into pieces:
>
>   Cascading destruction of the objects in the module.
>
>   Creation of the module's tables.
>
>   Creation of the module's constraints and indexes.
>
>   Creation of the module's views.
>
>   Creation of the module's triggers and functions.
>
> By doing this a user can then create the tables of each
> of the desired modules in any order and not have to know
> ahead of time which modules require other modules. No
> foreign key constraints get in the way of table
> creation. Afterwards create the constraints and
> triggers in any order desired. Afterwards create the
> views. Wrap the execution of each file in a transaction.
> If an error is raised at any point during
> constraint or view creation then you're missing a
> required table (really, a module). Roll back the
> transaction and install the tables (and later the
> constraints) of the missing modules. Then re-install
> until it works.
>
> (Note that this assumes that it's possible to notice
> that an error is raised during the install. The present
> system is so chatty that it's very easy to miss error
> messages.)
>
> This can be automated and much improved, if you want to
> get fancy. The design proposed here also solves the
> problem of "linking tables" (e.g., analysis_feature) as
> regards determining in which module such tables belong.
> Linking tables and views would not belong to any module
> but would be installed on an as-requirements-are-met
> basis. There is also the advantage, once set up, of very
> little on-going maintenance.
>
> (In my opinion if you're going to go through the work of
> re-structuring the DDL statements you may as well go all
> the way. I believe that restructuring the existing DDL
> files will be more labor intensive than the programming
> required, although this may be optimistic.)
>
> The first goal is to make it easy for a program to tell
> what tables and views exist in each module, and what
> object has triggers, constraints and indexes. The idea
> is to reflect this information in file/directory names
> within a modified version of the existing
> one-module-per-directory structure. Likewise the
> proposed directory structure reveals which tables are
> linking tables.
>
> Instead of having a single file for all tables in a
> module, separate out the linking tables from the regular
> tables. Put each of the CREATE TABLE statements for the
> regular tables into files, one per table, in a "tables"
> directory within the directory for the module.
> Throughout the design proposed here each file would have
> a name that is the name of the table it creates (or
> creates constraints and indexes for, or creates triggers
> for, etc.). Likewise, within each module there must be
> a directory containing per-table files holding each
> table's constraints and indexes, and an analogous
> structure for triggers. Put each of the CREATE TABLE
> statements for each linking table into a module-level
> directory, shared by all modules, with one file per
> linking table. Do the same for the linking tables'
> constraints and indexes, etc. Put all the views in all
> modules into a single view directory for all modules, as
> with the linking tables. The views would be defined
> one-per-file and each file name would be the name of the
> view. Note that views may have triggers. There would
> be another directory for these.
>
> After installing all of Chado into a test db Postgres'
> introspection can be used to determine which tables have
> non-null columns containing foreign keys, including
> which linking tables require which other tables, and
> which views require which other tables or views.
>
> Note that installing all of Chado is easy. Since all
> the tables will exist, the files which comprise the
> totality of each step, table creation, index and
> constraint creation, etc., can be installed in any
> order. The only un-addressed problem is views that
> depend on other views. But this is only a problem for
> the developers of Chado, need only be solved once and
> that can be on an ad-hoc basis. It is not a problem for
> the user who wants to install individual Chado modules.
>
> Because it is now known what db objects require other db
> objects a program that installs modules can ensure that
> the tables in pre-requisite modules are installed. It
> would know that a module is a pre-requisite of another
> if the latter contains a table with a non-NULL column
> storing a foreign key of the former.
> Likewise, an
> installation program can look through all the linking
> tables that exist, and all the views that exist, and
> determine if the prerequisites are met for the
> installation of any given linking table or view, and
> install said linking table or view when and only when
> its pre-requisites are met.
>
> In summary, module installation should then be as simple
> as giving an install program a module name. I've not
> thought through module deletion, but it seems like
> deletion could also be as straightforward for the user.
> (Since creation and deletion would be easy, a
> deletion/creation cycle provides a handy way of removing
> all content from a module -- working around Chado's
> cascading deletes that could impact data in other
> modules. This could provide a way to "do-over" the
> loading of data into an unfamiliar module.)
>
> Note that the db introspection does not have to occur at
> the time of install. The results of introspection on a
> complete Chado install can be cached at development time
> and distributed as part of the Chado distribution.
>
> Triggers (functions really) are the only difficult part.
> In the case of the "simple version" and manual piecemeal
> module install, above, since triggers and functions
> don't raise errors until they are used you won't know
> that you're missing a dependent module. As regards a
> more complete revamp of the installation system there's
> no way to use introspection to determine what
> tables/views a given trigger requires. So, triggers
> require the same sort of hack that's used now for whole
> modules, a manually written list of the tables and views
> the trigger uses, to inform the installation system of
> each trigger's prerequisites.
>
> Fortunately, there don't seem to be many triggers.
>
> Regards,
>
> Karl <[hidden email]>
> Free Software: "You don't pay back, you pay forward."
> -- Robert A.
> Heinlein
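
The prerequisite resolution described in the quoted proposal can be sketched roughly as follows. This is only a minimal illustration, assuming the introspection results have already been cached as plain dictionaries. Every name and value here (TABLE_MODULE, REQUIRED_FKS, LINKING_TABLES, required_modules, installable_extras, and the module names) is hypothetical and is not part of Chado or its installer.

```python
from collections import deque

# Hypothetical cached introspection results; illustrative only,
# not generated from a real Chado install.
TABLE_MODULE = {
    "db": "general",
    "dbxref": "general",
    "cv": "cv",
    "cvterm": "cv",
    "organism": "organism",
    "feature": "sequence",
    "analysis": "companalysis",
}

# table -> tables it references through NOT NULL foreign key columns
REQUIRED_FKS = {
    "dbxref": {"db"},
    "cvterm": {"cv", "dbxref"},
    "feature": {"organism", "cvterm"},
}

# Linking tables (and views) belong to no module; each lists the tables it needs.
LINKING_TABLES = {
    "analysis_feature": {"analysis", "feature"},
}


def required_modules(target, table_module, required_fks):
    """Return the target module plus every module it transitively requires."""
    needed = {target}
    queue = deque([target])
    while queue:
        module = queue.popleft()
        for table, owner in table_module.items():
            if owner != module:
                continue
            # A NOT NULL foreign key into another module makes it a prerequisite.
            for referenced in required_fks.get(table, ()):
                prerequisite = table_module[referenced]
                if prerequisite not in needed:
                    needed.add(prerequisite)
                    queue.append(prerequisite)
    return needed


def installable_extras(installed_modules, table_module, extras):
    """Return linking tables or views whose required tables are all present."""
    present = {t for t, m in table_module.items() if m in installed_modules}
    return sorted(name for name, needs in extras.items() if needs <= present)


if __name__ == "__main__":
    modules = required_modules("sequence", TABLE_MODULE, REQUIRED_FKS)
    print(sorted(modules))
    print(installable_extras(modules, TABLE_MODULE, LINKING_TABLES))
```

With this toy data, asking for the "sequence" module pulls in the modules its tables point at, while analysis_feature only becomes installable once the module owning "analysis" is installed as well. Triggers are not covered here; as the proposal notes, they would still need a hand-written list of prerequisites.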
In reply to this post by Stephen Ficklin-2
On Mon, 09 Mar 2015 09:20:49 -0400
Stephen Ficklin <[hidden email]> wrote:

> Representatives from the Tripal (Stephen Ficklin, Lacey Sanderson)
> and Chado (Scott Cain) projects have combined efforts to work towards
> a v1.3 release of Chado.

> We are requesting comments...

You could make Papio anubis (baboon) organism_id 13....

Karl <[hidden email]>
Free Software: "You don't pay back, you pay forward."
                 -- Robert A. Heinlein