
Wikipedia defines data lineage as - well, actually it doesn't have a definition in the context of business data. But a good general definition is: the tracking and management of data, i.e. where it comes from, where it flows to, and how it is transformed as it travels through the enterprise.
Data lineage - The Challenge
There can be few more critical challenges for a company than the effective management of its operational data, wherever it resides and whatever state it is in at any given time. These internal challenges are compounded by the external demands being made on the company, namely for effective corporate regulation and data governance.
In summary, today’s corporate environment presents significant problems and risks associated with managing data in terms of operational efficiency and the security and privacy of sensitive corporate data.
To handle these data management risks effectively, it is critical for a company to have an accurate picture of all its data sources as they move, or are transformed, across and through the organisation.
But such a data lineage exercise involves the identification, recording and ongoing management of many different data sources through many different states and times across the organisation.
Clearly this is a very significant data discovery and management challenge, which requires a comprehensive data map showing the complete lineage of all of the company's data sources.
This data lineage information is of course held in many different tools and places and requires a systematic approach to its discovery, interpretation and documentation – a major metadata management challenge.
Essentially these are data modelling, or metadata management, issues, and as such data lineage raises the importance of those skills, techniques and tools.
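To make the idea of a data map more concrete, here is a minimal sketch in Python of how lineage might be recorded as a set of flows between data sources, each labelled with its transformation. It is an illustration only, not how any particular tool works: the LineageMap class, its methods, and the dataset names (crm.customers, warehouse.sales_report and so on) are all hypothetical.

```python
from collections import defaultdict


class LineageMap:
    """Hypothetical in-memory data map: records which sources feed which targets."""

    def __init__(self):
        # target dataset -> list of (source dataset, transformation description)
        self._upstream = defaultdict(list)

    def add_flow(self, source, target, transformation):
        """Record that `source` flows into `target` via `transformation`."""
        self._upstream[target].append((source, transformation))

    def trace_origins(self, dataset):
        """Return every dataset that feeds `dataset`, directly or indirectly."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            for source, _transformation in self._upstream.get(current, []):
                if source not in seen:
                    seen.add(source)
                    stack.append(source)
        return seen


# Illustrative dataset names only; none of these come from a real system.
lineage = LineageMap()
lineage.add_flow("crm.customers", "staging.customers", "nightly extract")
lineage.add_flow("erp.orders", "staging.orders", "nightly extract")
lineage.add_flow("staging.customers", "warehouse.sales_report", "join on customer_id")
lineage.add_flow("staging.orders", "warehouse.sales_report", "join on customer_id")

origins = lineage.trace_origins("warehouse.sales_report")
```

Tracing backwards from a report to its origins is the question regulators and data governance teams typically ask. In practice the hard part is not this traversal but discovering and documenting the flows in the first place.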
Enterprise Applications – adding to the Data lineage modelling challenge
There is no quick fix to this metadata challenge - it is an ongoing process of discovery, documentation and debate between the technical and business communities in the organisation.
Enterprise Applications such as SAP, Siebel, PeopleSoft and Oracle EBS present unique problems for data modellers because of the complexity and opacity of their data architectures.
Given that such packages are now probably the major delivery mechanism for corporate business processes, and major sources of data, it is critical that they can be integrated into any data modelling or metadata management strategy in support of data lineage initiatives.
Most EA vendors will have an approach to data lineage, but Saphir technology remains the only toolset dedicated to delivering such ERP metadata intelligence and interfacing directly with the leading data modelling tools.