
Part 1: Metadata Discovery and Data Intelligence

Read about metadata discovery, which covers the process of accessing, identifying, collecting, analysing, and making use of metadata in order to gain insights into the structure, relationships, and context of data assets.
Infographic showing metadata discovery project.

This series is written for data professionals working on governance, analytics, data catalog, migration, and integration projects, particularly those whose organisations run ERP applications from SAP, Microsoft, Salesforce, or Oracle.

Each part can be read independently, but together they trace a path from the fundamentals of metadata through to the practical challenges of working with large, complex enterprise systems.


Contents

Part 1: Metadata Discovery and Data Intelligence

Part 2: Using Metadata with Data Intelligence Projects

Part 3: The Challenges of Metadata Discovery

Part 4: SAP and ERP Metadata: A Suitable Case for Treatment?

Part 5: Using Safyr for SAP and ERP Metadata Discovery

Metadata Discovery and Data Intelligence

If you work with data (whether in a governance role, on an analytics project, or as part of a data catalog implementation) you will encounter metadata at every turn. Yet despite its central importance, the question of how to find, access, and make use of metadata is often treated as an afterthought. That oversight can be costly.

Metadata discovery covers the process of accessing, identifying, collecting, analysing, and making use of metadata in order to gain insights into the structure, relationships, and context of data assets. It is a critical foundation for virtually every data intelligence initiative: data governance, data cataloguing, data warehousing and analytics, data quality, data migration, data integration, and increasingly agentic AI projects. The reason is straightforward: without accurate, accessible metadata, none of these initiatives can deliver its full potential.

What exactly is metadata?

In the context of the source applications that feed into governance, catalog, and analytics systems (ERP platforms, CRM systems, databases, and more) metadata is the descriptive and structural information that defines, describes, and tracks the data those systems generate and store. It answers questions that every data team needs to be able to answer: What is this data? Where did it come from? How has it changed over time? Who owns it? How should it be used, or restricted?

There are several distinct categories of metadata, each serving a different purpose in a data intelligence project.

Technical metadata covers the structural detail: schema information such as table names, column names, data types and sizes, and indexes, together with transformation logic and pipeline configurations. This is often the easiest to find, but on its own it is rarely sufficient.
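
To make this concrete, here is a minimal sketch of collecting technical metadata from a database. It assumes a SQLite database purely for illustration, because SQLite ships with Python; the `customer` table, its columns, and the `collect_technical_metadata` helper are hypothetical names invented for the example. An ERP or CRM system would expose its own data dictionary in a different way.

```python
import sqlite3


def collect_technical_metadata(conn):
    """Return {table: {"columns": [...], "indexes": [...]}} for each user table."""
    metadata = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [
            {
                "name": row[1],
                "type": row[2],
                "nullable": not row[3],
                "primary_key": bool(row[5]),
            }
            for row in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        # PRAGMA index_list rows: (seq, name, unique, origin, partial)
        indexes = [row[1] for row in conn.execute(f'PRAGMA index_list("{table}")')]
        metadata[table] = {"columns": columns, "indexes": indexes}
    return metadata


# Hypothetical example schema, created in memory so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, name VARCHAR(80) NOT NULL, churned_on DATE)"
)
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")

technical = collect_technical_metadata(conn)
for column in technical["customer"]["columns"]:
    print(column)
print(technical["customer"]["indexes"])
```

Notice what the output does not contain: nothing here says what `churned_on` means to the business or who owns the table. That is the gap the other metadata categories have to fill.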

Operational metadata captures how data is processed: timestamps of updates, job runtimes, error logs, and system lineage. It is essential for understanding the health and history of data pipelines.

Business metadata is where the real meaning lives. It connects data to business concepts: definitions of key metrics such as customer churn rate, ownership assignments, data usage policies, and the glossary terms that make technical data comprehensible to non-technical stakeholders.

Finally, metadata relating to unstructured data assets (documents, spreadsheets, images, video, and other files) is increasingly acknowledged as important for data intelligence projects, as organisations seek to incorporate these alongside structured data.

Understanding and capturing all of these metadata types is essential if downstream users (analysts, data scientists, governance officers) are to trust, find, and interpret data effectively.

Why does this matter in practice?

Let me illustrate with two examples that will be familiar to most data professionals.

Reporting and analytics

Business managers rely on reports and dashboards to make decisions that affect company strategy, customer relationships, and investment. The data underpinning those decisions typically travels a long road: from one or more transactional systems, through some form of transformation, into a data warehouse where it may be combined and summarised, and finally onto a dashboard or into a detailed report.

If the data team cannot clearly and confidently describe the provenance of that data in terms the business can understand, and demonstrate the lineage of how it moved and changed along the way, then trust in the data erodes quickly. And once trust goes, it is very hard to rebuild.
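
As a rough illustration of what demonstrating lineage can look like once the metadata has been captured, the sketch below stores lineage as a simple mapping from each asset to its direct sources and walks it upstream. The asset names and the `trace_upstream` helper are hypothetical; real lineage metadata would come from the pipeline tools and source systems themselves.

```python
def trace_upstream(lineage, asset):
    """Return every upstream asset of `asset`, nearest first, without duplicates."""
    seen = []
    queue = list(lineage.get(asset, []))
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.append(current)
        # Follow this asset's own sources on a later pass.
        queue.extend(lineage.get(current, []))
    return seen


# Hypothetical lineage: each key is fed directly by the assets in its list.
lineage = {
    "sales_dashboard": ["warehouse.fact_sales"],
    "warehouse.fact_sales": ["staging.orders", "staging.customers"],
    "staging.orders": ["erp.orders"],
    "staging.customers": ["crm.accounts"],
}

upstream = trace_upstream(lineage, "sales_dashboard")
for source in upstream:
    print(source)
```

The walk is only as complete as the mapping it is given. If one hop is undocumented, the trail stops there, and so does the data team's ability to explain where a figure on the dashboard came from.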

Data governance

Imagine an organisation that needs to document its key data concepts so that people across different business units can find and use them consistently. Many of those people will not have the technical knowledge to locate that information themselves across the many enterprise applications and systems that hold it.

The solution is often a data catalog or governance platform that stores and organises the metadata from those applications, providing search and navigation facilities so users can find what they need. But that catalog is only as good as the metadata that feeds it, and that metadata has to come from somewhere. Without a clear process for discovering, extracting, and curating metadata from source systems, the catalog will be incomplete, inaccurate, or both.
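
One simple way to see how incomplete a catalog is would be to check each entry for missing business metadata. The sketch below is an assumption-laden illustration, not how any particular catalog product works: the `CatalogEntry` structure, the `find_gaps` helper, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    table: str
    column: str
    data_type: str        # technical metadata, usually extracted automatically
    definition: str = ""  # business metadata, usually curated by people
    owner: str = ""       # business metadata, usually curated by people


def find_gaps(entries):
    """Return (table, column, missing_fields) for entries lacking business metadata."""
    gaps = []
    for entry in entries:
        missing = [
            name for name in ("definition", "owner")
            if not getattr(entry, name).strip()
        ]
        if missing:
            gaps.append((entry.table, entry.column, missing))
    return gaps


# Hypothetical catalog contents.
entries = [
    CatalogEntry("customer", "id", "INTEGER", "Unique customer identifier", "Sales Ops"),
    CatalogEntry("customer", "churned_on", "DATE", definition="Date the customer cancelled"),
    CatalogEntry("customer", "seg_cd", "VARCHAR(4)"),
]

gaps = find_gaps(entries)
for table, column, missing in gaps:
    print(f"{table}.{column}: missing {', '.join(missing)}")
```

A column such as `seg_cd`, with a data type but no definition or owner, is technically catalogued yet of little use to someone searching for a business concept.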

Without a clear understanding of the metadata in your files, databases, ERP, and CRM applications, data projects are significantly more likely to be delayed, to deliver less than anticipated, or to fail to build the business confidence in data that makes them worthwhile.

In the next blog, Part 2: Using Metadata with Data Intelligence Projects, we look at how metadata underpins each of the major data intelligence disciplines, and what happens when it is missing.
