Data Lineage (Dictionary Entry)

Contexts

Data Management, IAM

Term

Data Lineage

Alternative Forms

N/A

Definitions

The documentation of how data flows throughout the organization, from capture or origination to consumption via transformations. It may be used as a basis to simplify data architecture, to spot unauthorized data-access points, or to identify data inconsistencies.

Related Terms

N/A

Quotes

Of all data-management capabilities in banking, data lineage often generates the most debate. Data-lineage documents how data flow throughout the organization—from the point of capture or origination to consumption by an end user or application, often including the transformations performed along the way. Little guidance has been provided on how far upstream banks should go when providing documentation, nor how detailed the documentation should be for each “hop” or step in the data flow. As a result of the lack of regulatory clarity, banks have taken almost every feasible approach to data-lineage documentation.

In some organizations, data-lineage standards are overengineered, making them costly and time consuming to document and maintain. For instance, one global bank spent about $100 million in just a few months to document the data lineage for a handful of models. But increasingly, overspending is more the exception than the rule. Most banks are working hard to extract some business value from data lineage; for example, by using it as a basis to simplify their data architecture or to spot unauthorized data-access points, or even to identify inconsistencies among data in different reports.

Our benchmarking revealed that more than half of banks are opting for the strictest data-lineage standards possible, tracing back to the system of record at the data-element level (Exhibit 3). We also found that leading institutions do not take a one-size-fits-all approach to data. The data-lineage standards they apply are more or less rigorous depending on the data elements involved. For example, they capture the full end-to-end data lineage (including depth and granularity) for critical data elements, while data lineage for less critical data elements extends only as far as systems of record or provisioning points.

(Ho et al., 2020)

Bibliography

See Also


Follow us on LinkedIn | Discuss on Slack | Support us with Patreon | Sign-up for a free membership.


This wiki is owned by Open Measure, a non-profit association. The original content we publish is licensed under a Creative Commons Attribution 4.0 International License.