In his recent article "Data governance initiatives get more reliant on data lineage info", David Loshin pointed out that "data lineage management offers a compelling scenario for improving the data governance process". Loshin distinguishes two aspects of Data Lineage, one structural and the other related to data flows, which I characterize as follows:
- Structural Data Lineage - mapping and tracking semantic data objects (and their synonyms) throughout the organization from elements of conceptual and logical schemas to their physical occurrences in databases
- Dynamic Data Lineage - mapping and tracking the flow of semantic data objects (and their synonyms) from their sources, through the processes and data stores of the organization to downstream consumers.
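To make the second notion concrete, here is a minimal sketch in Python that treats Dynamic Data Lineage as a directed graph and walks it from a source to its downstream consumers. The node names (`web_signup_form`, `crm.customer`, and so on) and the `downstream` helper are invented for illustration; they do not come from Loshin's article or from any particular tool.

```python
from collections import deque

# Hypothetical data flows: each key passes a data object on to the listed nodes.
FLOWS = {
    "web_signup_form": ["crm.customer"],
    "crm.customer": ["billing.account", "marketing.mailing_list"],
    "marketing.mailing_list": ["print_vendor_export"],
    "billing.account": [],
    "print_vendor_export": [],
}


def downstream(source, flows):
    """Return every node reachable from `source`, in breadth-first order."""
    seen = []
    queue = deque(flows.get(source, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited via another path
        seen.append(node)
        queue.extend(flows.get(node, []))
    return seen


if __name__ == "__main__":
    # Lists the stores and consumers that receive data entered in the signup form.
    print(downstream("web_signup_form", FLOWS))
```

A real lineage repository would attach the semantic data object (and its synonyms) to every edge; the sketch keeps only the flow itself.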
In my post "How The GDPR Can Propel An Organization's Informational Infrastructure" I mentioned that recording Data Lineage is implicitly required by multiple regulations, most prominently the General Data Protection Regulation (GDPR).
Let's bring this to life using an example scenario from the not-so-distant future:
Thomas, an EU resident, is a client of the online retailer xyzAnywhere Corp., which usually communicates with Thomas by email but occasionally sends him promotional letters by post. Thomas receives some of xyzAnywhere's promotional mail at his current residential address (as shown in his online profile), but some letters still reach him via a mail forwarder because they are sent to his previous home. Thomas exercises his right under the GDPR to request a copy of all the personal data that xyzAnywhere Corp. holds about him.
Upon receipt of that copy, Thomas realizes that the information provided to him does not include his previous residential address at all.
Regardless of how the communication between the customer and the organization continues, and leaving aside whether and how regulatory authorities will consider the case and penalize the organization, we can conclude that the organization failed to comply with the GDPR (Art. 15), as it did not make the complete set of the customer's "personal data undergoing processing" available.
How could the organization have avoided this failure?
By employing a professional data modeling tool that, in particular:
- Features the creation of a business data dictionary where all semantic data objects can be uniquely named and well-defined for the entire organization
- Supports mapping and tracing all synonym occurrences of a data dictionary entry that may exist throughout the organization
- Serves to represent a model of Master Entities and their physical distribution.
The data modeling tool SILVERRUN fully supports the above criteria and helps you build a solid foundation for Data Model Management, Master Data Management and Data Governance.
The report below shows how SILVERRUN presents Structural Data Lineage. In the example scenario above, such a report would have helped to identify all database columns that are synonyms of a data dictionary item, e.g. "person last-name", and thus to define the data model needed to systematically extract all personal data related to a particular customer.
[Figure: SILVERRUN Structural Data Lineage report]
To be clear: Links between a data dictionary item (glossary entry) and its synonyms can only be created by "brainware", not by software (alone) since the semantics behind any data object has to be understood first. However, with human guidance, SILVERRUN can integrate the puzzle pieces that may be available through reverse engineering of databases, importing spreadsheets, reusing existing models and accessing other sources of documentation.
Once integrated, the resulting data model provides solid ground for building a future-proof Master Data Management system and for responding flexibly to regulatory requirements such as those stipulated by the GDPR.
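As a rough illustration of how such a synonym mapping could drive a data subject access request, the Python sketch below resolves data dictionary items to the physical columns that hold them and groups those columns by table. The dictionary entries, database names, column names and the `columns_for` helper are all hypothetical; this is not SILVERRUN's data model or API, only one way the idea might look in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    database: str
    table: str
    name: str


# Hypothetical structural lineage: dictionary item -> its physical synonyms.
# These links are assumed to have been curated by people, as described above.
SYNONYMS = {
    "person last-name": [
        Column("crm", "customer", "last_name"),
        Column("marketing", "mailing_list", "SURNAME"),
    ],
    "person residential-address": [
        Column("crm", "customer", "address"),
        Column("marketing", "mailing_list", "ADDR_LINE"),
    ],
}


def columns_for(items, synonyms):
    """Group the physical columns behind `items` by (database, table)."""
    plan = {}
    for item in items:
        for col in synonyms.get(item, []):
            plan.setdefault((col.database, col.table), []).append(col.name)
    return plan


if __name__ == "__main__":
    plan = columns_for(["person last-name", "person residential-address"], SYNONYMS)
    for (database, table), names in plan.items():
        print(f"{database}.{table}: {', '.join(names)}")
```

In the scenario above, a mapping like this would surface the marketing table that still holds the previous address, so that it is not missed when the copy of Thomas's personal data is assembled.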
[In the spirit of full disclosure: I represent Grandite, the supplier of the SILVERRUN tools for data and process modeling.]