Follow Eagle

Get Blog Updates By Email

Your email:

Learn More about Eagle Software and Services

Pipeline GIS Softwaredescribe the image

Eagle Pipeline GIS Blog

Current Articles | RSS Feed RSS Feed

GUIDs as EventIDs in APDM / PODS ESRI Spatial Geodatabases

  
  
  

One of the key goals of APDM 5 was to simplify manual data maintenance. Because PODS ESRI Spatial 5 borrows extensively from the APDM 5 core, many of the activities critical to the maintenance of an APDM 5 geodatabase also pertain to a PODS ESRI Spatial 5 geodatabase. This is the first of a series of practical articles dealing with manually populating an APDM 5 or PODS ESRI Spatial 5 geodatabase. This particular post deals with some necessary background information on key identifiers in the APDM / PODS ESRI Spatial. Future posts will get down to the nitty-gritty of data loading.

In an APDM or PODS ESRI Spatial geodatabase the primary key (or unique identifier) of each object / feature class is a field named EventID. You might ask, why not just use the geodatabase ObjectID? The answer is that ObjectIDs are not immutable; they can change when geodatabases are exported, copied or merged. If an ObjectID changes, all relationships that reference that ObjectID are broken. EventIDs never change, so relationships based on them cannot break.

The EventID field has a data type of GUID. GUID stands for Globally Unique ID (Identifier). Needless to say, in order to guarantee global uniqueness, a GUID cannot just be a simple integer (like an ArcMap ObjectID). Although the actual storage mechanism for GUIDs varies a bit depending on the underlying relational database, a GUID is actually a 16-byte integer value. This is a huge number. Given the randomness inherent in the GUID algorithm and the size of GUIDs, you could theoretically generate a million GUIDs per second for the next 100 years and still have only a one in two chance of generating a single duplicate.

GUIDs have been around and in use for a long time. For instance, every Microsoft COM class ever created is identified by a GUID, as a quick examination of the Windows registry will reveal:

Registry Editor ArcMap Identify Tool CLSID Entry

In ArcMap GUIDs appear as complex, thirty-eight character strings, e.g. {FF87DB81-8620-40D0-80FF-139286392ACD}. The curly brackets and dashes are just for (Microsoft CLSID-style) formatting; the remaining 32 alphanumeric characters are the actual hexadecimal representation of the GUID. (For this reason you'll never see a letter in a GUID higher than 'F.' Hexadecimal digits run from 0-9,A-F.)

On the plus side, GUIDs are guaranteed to be globally unique. This makes it easy to combine or merge different APDM or PODS ESRI Spatial geodatabases without any fear of primary key collisions. A bit more subtly, you don't actually have to request new key values for new records directly from the database; you can do it on the client side. This makes it very easy to create data in disconnected applications (e.g. field data collection) that can be loaded to the geodatabase with minimal massaging.

On the minus side, GUIDs are just a tad cumbersome; some might even say unaesthetic. There is also some underlying performance overhead with GUIDs. Keep in mind that a GUID requires 16 bytes of storage, whereas a long integer requires eight bytes and a regular integer only four bytes. For this reason, database indexes built on GUIDs are more voluminous than indexes built on integers. By extension, queries with joins based on GUIDs tend to be more expensive than queries with joins based on integers. Five to ten years ago when GUIDs first came into use as primary keys, a variety of legitimate performance concerns were raised. These concerns have largely been mitigated by hardware, operating system and database advances (particularly with Microsoft SQL Server). For even the largest pipeline databases, as long as you are running on current hardware and database software, query performance with GUIDs should not be an issue.

It ought to be fairly obvious that just banging out an arbitrary thirty-eight character string in no way guarantees a valid GUID value. So generating GUID values can be a royal pain in the derriere, unless you have the appropriate trick up you sleeve. The next article in the series teaches you that trick. And with that trick, generating GUIDs is a snap!

Comments

great article on GUID type in ArcGIS. Demystified GUID and provided useful example on how to generate it.
Posted @ Monday, November 28, 2011 3:35 PM by Danny Dong
I have a question regarding PipelineID column in APDM6. HOw significant is this column? One of our vendor claims that it is the key column in apdm6 and use it instead of eventids but I don't see it replacing Eventid GUID column which is enforcing referential integrity between feature class and tables of APDM6 Datamodel. This column is used in all offline, online events and centerline objects. Also, The name is misleading as it does not store id it stores pipeline names. Please advise whether this column should be dropped from APDM model.
Posted @ Tuesday, August 05, 2014 1:50 PM by sapna
Post Comment
Name
 *
Email
 *
Website (optional)
Comment
 *

Allowed tags: <a> link, <b> bold, <i> italics