This is a question I am frequently asked, as data modeling seems to be
associated with SQL databases, while for so-called NoSQL databases such
support appears to be unavailable (or, depending on whom you ask,
impossible or unnecessary).
To put a possible short answer into perspective, I would like to offer some preliminary thoughts.
I strongly believe that (business) data modeling is the single most
important discipline for any medium-sized or large organization. To proceed
effectively and efficiently, and to honor the principles of data
governance, master data management, data quality etc., each organization
should have one standardized method to create all of
its conceptual and logical data models, and should employ a data modeling tool
that allows it to link and manage all of its models under one roof. I
think it is safe to say that, over the past decades and across all
organizations that have practiced data modeling, relational modeling
has proven to be the most relevant and viable method to represent
business data models.
On the other hand, most medium-sized and large organizations
cannot avoid dealing with multiple software suppliers, and
requirements regarding data volume, velocity, views, human-machine
interfaces etc. may each dictate (or at least strongly favor) a
particular software solution. As a result, organizations find themselves exposed to a
myriad of storage technologies, including SQL, NoSQL, NewSQL (and
whatever-type-of) databases.
Despite the attempts of non-RDBMS suppliers to differentiate
themselves from the competition by introducing new jargon,
all storage technologies apparently rest on the concepts of database, table and column, i.e.
the physical models for RDBMS- as well as non-RDBMS-based storage systems can
be obtained by denormalizing logical models. An organization's data
modeling tool of choice should make it possible to derive the physical model for
the respective application and storage technology from the existing
business data model, while maintaining the link between the objects
(tables, columns) of the business model and those of the physical storage model.
Doing so not only serves application development, but also constitutes an
important measure to ensure that the organization stays in control of
the definitions, lineage, usage etc. of all its data elements, in order to
satisfy legal obligations, achieve regulatory compliance, or, more generally,
to be able to (re)act flexibly when business and/or technical requirements
change.
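To make this concrete, here is a minimal sketch (in Python, with purely hypothetical names and structures, not taken from any particular modeling tool) of how a denormalized physical table can carry, per physical column, a reference back to the logical entity and attribute it was derived from, so that lineage questions remain answerable after denormalization:

```python
# Hypothetical sketch: a physical model that records, for each physical
# column, which logical (business-model) attribute it was derived from.
# All names and structures are illustrative only.

from dataclasses import dataclass

@dataclass(frozen=True)
class LogicalAttribute:
    entity: str      # logical entity, e.g. "Customer"
    attribute: str   # logical attribute, e.g. "customer_id"

@dataclass
class PhysicalColumn:
    name: str                  # column name in the storage system
    type_: str                 # storage-specific data type
    source: LogicalAttribute   # lineage link back to the business model

# A denormalized table "orders by customer" (typical for a wide-column
# store) that merges attributes of the logical entities Customer and Order:
orders_by_customer = [
    PhysicalColumn("customer_id", "uuid", LogicalAttribute("Customer", "customer_id")),
    PhysicalColumn("customer_name", "text", LogicalAttribute("Customer", "name")),
    PhysicalColumn("order_id", "timeuuid", LogicalAttribute("Order", "order_id")),
    PhysicalColumn("order_total", "decimal", LogicalAttribute("Order", "total")),
]

# Lineage question: "Where is the logical attribute Customer.name used?"
used_in = [c.name for c in orders_by_customer
           if c.source == LogicalAttribute("Customer", "name")]
print(used_in)  # -> ['customer_name']
```

In a real modeling tool these links live in the model repository rather than in application code, of course; the point is merely that the mapping itself is simple data that can be queried in both directions.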
As long as a storage technology can be reduced to the three-level container principle of database-table-column
(and potentially to other common constructs such as indexes), the
(denormalized) physical model already contains the primary ingredients
for the related DDL. The data modeling tool in use should be
extensible so that the CREATE / DROP statements (or their
equivalents) for the respective target storage system can be generated from a physical
model. (For example, the current version of Grandite's SILVERRUN modeling tools can also generate DDL for Cassandra and Neo4j.)
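As a rough illustration of what such an extension boils down to (a sketch under simplified assumptions, not SILVERRUN's actual mechanism): once a physical model is reduced to databases, tables and columns, generating DDL for a given target is largely a matter of per-target type mappings and statement templates.

```python
# Illustrative sketch only: generate CREATE TABLE statements for two
# targets from one database-table-column model. Type mappings and syntax
# are simplified; real generators handle far more (keys, indexes,
# storage options, quoting rules, ...).

physical_model = {
    "table": "orders_by_customer",
    "columns": [
        ("customer_id", "uuid"),
        ("order_id", "timeuuid"),
        ("customer_name", "text"),
        ("order_total", "decimal"),
    ],
    "primary_key": ["customer_id", "order_id"],
}

# Per-target type mappings (simplified):
TYPE_MAP = {
    "postgresql": {"uuid": "uuid", "timeuuid": "uuid",
                   "text": "text", "decimal": "numeric"},
    "cassandra":  {"uuid": "uuid", "timeuuid": "timeuuid",
                   "text": "text", "decimal": "decimal"},
}

def create_table_ddl(model: dict, target: str) -> str:
    types = TYPE_MAP[target]
    cols = ",\n  ".join(f"{name} {types[t]}" for name, t in model["columns"])
    pk = ", ".join(model["primary_key"])
    return (f"CREATE TABLE {model['table']} (\n"
            f"  {cols},\n  PRIMARY KEY ({pk})\n);")

for target in ("postgresql", "cassandra"):
    print(f"-- {target}\n{create_table_ddl(physical_model, target)}\n")
```

The design point is that the physical model stays target-neutral; only the mapping tables and templates differ per storage system, which is exactly what makes such generators extensible to new targets.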
If you would like to discuss this in more detail or to challenge me on the
above, please feel invited to comment here or to contact me by email (axel
. troike [at] grandite . com).
[In the spirit of full disclosure: I represent Grandite, a supplier of data modeling tools]