This code works, but is it the best way to do it?

joe carlson
Hi,

So I’m at the part of the project in which I have code that works, but I’m still a bit mystified by some of the mechanics and whether I’m doing this in a way that fits in with the intended methods.

I’m writing a post processor for one of my datasets. My view of the operation is:

1. Make a query and retrieve a List of ResultsRows (with ObjectStore.execute(Query)).
2. March through the objects on the list, setting attributes as needed.
3. Call ObjectStoreWriter.store(Object) on each object after it is modified.

Since I’m not touching the id of the object, this executes an update and maintains the id. Collections and references do not need to be touched.

With just this basic workflow, I get a “Sequence numbers do not match…” error.

The issue, as far as I can figure out, is that you associate an integer with each retrieved object in the query operation, and check it before the object is written back. The idea is to guard against other processes changing the datastore and messing things up. But since I know this is a one-at-a-time operation, I can bypass the datastore checking by calling the ObjectStore.execute variant that accepts ObjectStore.SEQUENCE_IGNORE as the sequence parameter. That thing is just an empty map of objects to integers, so no checking is done.
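As a rough illustration of the mechanism described above, here is a toy, self-contained model in Java. It is not InterMine code: ToyStore, its methods and its IGNORE constant are hypothetical names invented for this sketch, and the real ObjectStore may track and check its sequence differently. The sketch only shows why a read made with a stale snapshot fails after a write, and why passing an empty map skips the check.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a store that versions its tables (hypothetical, not InterMine). */
class ToyStore {
    /** Hypothetical "skip the check" value: an empty map gives nothing to compare. */
    static final Map<String, Integer> IGNORE = Collections.emptyMap();

    private final Map<String, Integer> tableSequence = new HashMap<>();
    private final Map<Integer, String> rows = new HashMap<>();

    /** Returns a snapshot of the current per-table sequence numbers. */
    Map<String, Integer> getSequence() {
        return new HashMap<>(tableSequence);
    }

    /** Writes a row and bumps the sequence number of the table it belongs to. */
    void store(String table, int id, String value) {
        rows.put(id, value);
        tableSequence.merge(table, 1, Integer::sum);
    }

    /** Reads a row, failing if any table named in the snapshot has changed since. */
    String read(int id, Map<String, Integer> expected) {
        for (Map.Entry<String, Integer> e : expected.entrySet()) {
            Integer current = tableSequence.get(e.getKey());
            if (!e.getValue().equals(current)) {
                throw new IllegalStateException("Sequence numbers do not match");
            }
        }
        return rows.get(id);
    }
}

class SequenceDemo {
    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.store("Gene", 1, "before");
        Map<String, Integer> snapshot = store.getSequence();

        // Writing in the middle of a read loop makes the snapshot stale.
        store.store("Gene", 1, "after");
        try {
            store.read(1, snapshot);
        } catch (IllegalStateException e) {
            System.out.println("checked read failed: " + e.getMessage());
        }

        // With the empty map there are no entries to compare, so the read goes through.
        System.out.println("unchecked read: " + store.read(1, ToyStore.IGNORE));
    }
}
```

Skipping the check this way is only safe when nothing else writes to the store at the same time, which is the one-at-a-time assumption stated above.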

(I’ve always found that one of the more confusing things about biological databases is the different meanings of the word ‘sequence’. Here it is really a UID for the database state. Is it generated by a PostgreSQL sequence?)

This works for me, but is this the best (or intended) way to do this?

Thanks,

Joe Carlson




_______________________________________________
dev mailing list
[hidden email]
http://mail.intermine.org/cgi-bin/mailman/listinfo/dev
Re: This code works, but is it the best way to do it?

Richard Smith-2
Hi Joe,
I think the ObjectStore sequence was originally a Postgres sequence that incremented whenever changes were committed. At some point it changed to a map based on tables, for finer-grained control of what had changed.

The ObjectStore was originally intended to have very separate write and read operations, which read/write post-process operations don't adhere to. I think setting SEQUENCE_IGNORE as you've done should be fine as long as you're certain of what you're updating. In other postprocess tasks we always clone objects read from the database before altering and writing them. I don't recall the exact reasons for this, but I think it's to do with avoiding altering something the ObjectStore has in its cache.
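To make the clone-before-altering point concrete, here is a small self-contained Java sketch. It assumes, purely for illustration, a reader that hands out the same cached instance on every lookup. Gene, CachingReader and copy() are hypothetical names and are not InterMine classes; the actual cloning helper used by the postprocess tasks is not shown here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical data object standing in for something read from the database. */
class Gene {
    final int id;
    String symbol;

    Gene(int id, String symbol) {
        this.id = id;
        this.symbol = symbol;
    }

    /** Returns an independent copy that keeps the same id. */
    Gene copy() {
        return new Gene(id, symbol);
    }
}

/** Hypothetical reader that returns the same cached instance on every lookup. */
class CachingReader {
    private final Map<Integer, Gene> cache = new HashMap<>();

    CachingReader() {
        cache.put(1, new Gene(1, "old"));
    }

    Gene get(int id) {
        return cache.get(id);
    }
}

class CloneDemo {
    public static void main(String[] args) {
        CachingReader reader = new CachingReader();

        // Work on a copy so the cached instance is left as it was read.
        Gene working = reader.get(1).copy();
        working.symbol = "new";

        System.out.println("cached:  " + reader.get(1).symbol);
        System.out.println("working: " + working.symbol);
        // The copy keeps the original id, so writing it back would update the same row.
    }
}
```

Editing the cached instance directly would change what every later lookup sees, whether or not the write to the database succeeds, which is one plausible reason for the cloning habit mentioned above.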

Take a look at CreateReferences.java, e.g. line 157. It may be better to copy what this is doing.

All the best,
Richard.

On Fri, Nov 14, 2014 at 2:01 PM, Julie Sullivan <[hidden email]> wrote:



-------- Forwarded Message --------
Subject: [InterMine Dev] This code works, but is it the best way to do it?
Date: Fri, 7 Nov 2014 11:31:06 -0800
From: Joe Carlson <[hidden email]>
To: Intermine Developer List <[hidden email]>
