Wednesday, September 8, 2021

Managing Data Models


A functioning data modeling environment could look like this:
 
 
Too complex? Not if you consider that most organizations need to maintain legacy systems in parallel to developing new applications – and both in a flexible or even agile manner.
 
Though delimitations between model levels may vary among readers, this image suggests the principle steps and resulting artifacts to proceed from institutional knowledge in  business departments to structures of files and databases.
 
While I may revisit this approach in future discussions and explanations, I will restrict myself here to a few key comments prioritized by recent discussions with clients.
 
1. First and foremost, using a structure like this effectively and efficiently obviously necessitates a professional tool set of well connected components. (Though I am biased* regarding the choice of tools, the approach outlined here should not impose a particular technical solution.)
 
2. In overcoming departmental silos, the Logical Enterprise Model plays a central role connecting business attributes with potentially multiple physical occurrences in different application systems and databases (Data Catalog). As opposed to the (failing) approach of creating a “complete” enterprise data model first, I suggest to build the Logical Enterprise Model by integrating partial Logical Data Models from ongoing business projects. Due to its central role, this integration process and the maintenance of that model needs to be assigned to a central BUSINESS unit, e.g. the Chief Data Office. Since naming and definitions are a mandatory part of business data modeling, a Business Glossary is an implicit component of the Logical Enterprise Model
 
3. It is not uncommon that organizations operate production databases which were developed in another (incompatible) project environment 20 or more years ago, i.e. those legacy databases may be insufficiently documented and/or the last knowledgeable programmer / database administrator has meanwhile left the organization. To simply shut down those databases may not on be an option, if they still support valuable business processes, at least not without first migrating the enclosed data to a new storage system. Both maintaining or migrating a database require to know its structure (tables, columns, keys, etc.). Fortunately, professional data modeling tools support “reverse engineering”, i.e. they can recreate the physical data model usually directly from a database.
 
However, the reverse engineering process may leave expectant users with frustration as cryptic, abbreviated physical names in databases are common and do not reveal the semantics necessary to understand and document the purpose of tables and columns. Also, the sheer number of tables can be overwhelming.
 
The best strategy “forward” to identify the relevant submodel (“Physical Business Data Model”) is to find the meaning of core tables (i.e. tables that have multiple relationships) and from the subset of interest to navigate to neighboring objects.
 
The necessary business semantics to enrich the model with logical names and definitions can be obtained
  • as part of the reverse engineering process if the developers of the database used in-line documentation features (as e.g. supported by Oracle or SQL Server).
  • from database-external sources (e.g. ERP systems often hold the necessary business information in a separate dictionary).
In absence of these tool-supported options, the reverse engineering approach is limited to tap some business users' knowledge to semanticize the reversed model. While this can be a tedious process including trial and error, it is inevitable to do it, if neither shutting down the database nor continuing to operate it as-is are viable options. Legal obligations (e.g. to comply with applicable data privacy laws) may cause additional pressure to follow this inconvenient, but necessary path to reduce technical debt.
 
I welcome your questions and comments.
 
[* In the spirit of full disclosure: I represent Grandite, the supplier of the SILVERRUN Business Architecture Tools]
 

No comments:

Post a Comment