SAP and data catalogs
With many of your key data assets stored in core SAP applications, those applications are critical to the success of a data catalog, governance or lineage project. However, a data catalog and SAP can be difficult bedfellows. SAP ECC, S/4HANA, BW and BW/4HANA are large, complex and often highly customized applications, so they can present a significant challenge to the delivery of a truly enterprise-wide solution. If SAP data can’t be assimilated easily into the new data platform, project failure could be on the horizon.
Read on to learn more about what to ask your data catalog or data governance vendors.
15 questions to ask your data catalog or data governance solution vendors:
- How does your product access the business-friendly metadata in SAP applications that will be valuable to our business users and data stewards?
- Does your product give us access to logical (business) names and descriptions for tables, fields and other objects? Can we see how tables are related? Can we see which tables are accessed by specific programs and components?
- What facilities does your product have for analyzing the SAP metadata so that we can easily identify and classify the metadata that is especially useful for us?
- What facilities do you have in your product to provide us with the internal lineage in SAP BW or SAP BW/4HANA?
- How does your product identify which tables in the source ERP system are the basis of passing data to BW?
- Our SAP system is customized. These changes are important to us. How does your product support these customizations?
- We have more than one SAP system. How can your product help us to identify any differences in their metadata?
- Our BW application changes regularly. How can we automate the process of updating the metadata in your data catalog?
- Our SAP BASIS team are often busy and it can take them a long time to find tables and attributes we need for reporting, data lineage, governance and integration. What tools does your product have to allow our data analysts and stewards to work with SAP metadata?
- We need to be able to identify PII or Personal Data attributes in SAP for CCPA, GDPR or other compliance regimes. How does your product help us to locate and use this metadata?
- What tools does your product provide to help us to identify which tables, related tables and attributes are associated with specific business concepts or application components in SAP?
- How does your product identify when metadata has been changed in SAP?
- Is your solution for SAP certified by SAP?
- If we are planning to profile the actual data in our SAP systems, how does your product help us to determine which tables are required for this? If all the tables are to be profiled, how does your product also deliver the logical names for tables, related tables and attributes?
- If the metadata from SAP goes through an intermediate step before it is provisioned into your product, what mechanisms do you have for loading that metadata?
Why is it important for your data catalog project to ask these questions?
Firstly, if you think about your data catalog and SAP in the context of SAP’s data structures, there are a few key points to remember:
- The data model for an SAP ERP ECC or SAP S/4HANA system is very large. For either of them, it is likely to comprise over 90,000 base tables and at least 1 million fields (attributes). Navigating this model is difficult unless you are an SAP technical specialist and can use SAP’s tools, or you have access to dedicated metadata discovery products. Templates or reference models are of limited use because most SAP implementations have been customized to a greater or lesser degree.
- The RDBMS System Catalog only contains technical names for tables, fields and other objects. None of the useful metadata, by which I mean logical names and descriptions for tables, attributes and other objects plus the relationships between tables, is available in the System Catalog. This means that scanning the underlying database is of no value. This is the same for SAP HANA as it is for Oracle, SQL Server etc.
- SAP BW and BW/4HANA are smaller; however, their structures are even more complex due to their multi-dimensional ROLAP model. The logical, useful metadata is not in the System Catalog. In addition, there is internal lineage between the various objects in BW, which is often required for a data catalog or lineage solution.
- Traditional “data profiling” tools, “scanners” and associated techniques were built for yesterday’s data lakes, not for modern-day governance and catalog requirements.
Then, it is important to consider whether your project requires all the metadata to be imported into the data catalog, governance or lineage solution from your SAP systems.
For example, do you need to include tables which have no data and are therefore likely to be unused?
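As a rough illustration of that filtering step, the sketch below splits a list of tables into populated and empty ones based on row counts. It assumes the row counts have already been obtained by some means. The function name `split_by_usage`, the custom table name `ZOLD_ARCHIVE` and all of the counts are hypothetical and only serve the example.

```python
from typing import Dict, List, Tuple


def split_by_usage(row_counts: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Return (populated, empty) table names, each sorted alphabetically."""
    populated = sorted(name for name, count in row_counts.items() if count > 0)
    empty = sorted(name for name, count in row_counts.items() if count == 0)
    return populated, empty


# Made-up row counts; ZOLD_ARCHIVE stands in for an unused custom table.
row_counts = {"MARA": 125000, "KNA1": 48000, "ZOLD_ARCHIVE": 0}

populated, empty = split_by_usage(row_counts)
print("Candidates for the catalog:", populated)
print("Probably unused:", empty)
```

An empty table is only a hint that it is unused, so a list like this is better treated as input for review than as an automatic exclusion rule.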
Also, are there certain business areas which are more important for early inclusion in your project? If so, then it may be that you need to create subsets of the metadata which represent those first and then work on other areas later.
Another question to answer is how you can keep the metadata in your solution up to date as it changes. This is unlikely to be a major issue with SAP ECC and S/4HANA. However, it’s common for SAP BW metadata to change frequently.
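One simple way to think about keeping up with change is to compare two snapshots of the metadata taken at different times. The sketch below is a minimal, assumed approach rather than how any particular product works. The snapshot layout (object name mapped to field names and types), the function `diff_snapshots` and the object names are all hypothetical.

```python
from typing import Dict, List

# Hypothetical snapshot layout: object name -> {field name: data type}
Snapshot = Dict[str, Dict[str, str]]


def diff_snapshots(old: Snapshot, new: Snapshot) -> Dict[str, List[str]]:
    """List objects that were added, removed or changed between two snapshots."""
    old_names, new_names = set(old), set(new)
    changed = sorted(n for n in old_names & new_names if old[n] != new[n])
    return {
        "added": sorted(new_names - old_names),
        "removed": sorted(old_names - new_names),
        "changed": changed,
    }


# Made-up BW-style objects for illustration only.
last_week: Snapshot = {
    "ZSALES_CUBE": {"CUSTOMER": "CHAR", "AMOUNT": "CURR"},
    "ZSTOCK_DSO": {"MATERIAL": "CHAR"},
}
today: Snapshot = {
    "ZSALES_CUBE": {"CUSTOMER": "CHAR", "AMOUNT": "CURR", "REGION": "CHAR"},
    "ZFORECAST_DSO": {"MATERIAL": "CHAR", "QUANTITY": "QUAN"},
}

changes = diff_snapshots(last_week, today)
print(changes)
```

The same comparison could in principle be pointed at snapshots from two different SAP systems to highlight where their metadata differs.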
If you are also planning to profile the actual data in SAP or SAP BW for lineage, governance or data quality purposes, having some way to identify what the data in the tables and attributes actually relates to will be important.
Finally, it is important to consider how the useful metadata from your SAP systems will actually be ingested into your software solution. Remember this metadata is not available from the System Catalog so it is not worth scanning the metadata from there and importing it.
The valuable metadata is stored in a series of data dictionary tables so it is important to understand how this information will be accessed and rationalized so that you can make sense of it for your project.
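To make the idea of rationalizing that metadata concrete, here is a minimal sketch that pairs technical table and field names with their logical descriptions. It assumes the relevant rows have already been extracted from the data dictionary (for example, table texts, field definitions and data element texts) and are available in memory. The sample rows are hand-written stand-ins rather than a real extract, and the structure and function name `build_catalog_entries` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hand-written stand-ins for extracted data dictionary rows (illustrative only).
table_texts: Dict[str, str] = {
    "MARA": "General Material Data",
    "KNA1": "General Data in Customer Master",
}
# (table, field, data element) triples
field_rows: List[Tuple[str, str, str]] = [
    ("MARA", "MATNR", "MATNR"),
    ("KNA1", "KUNNR", "KUNNR"),
    ("KNA1", "NAME1", "NAME1_GP"),
]
element_texts: Dict[str, str] = {
    "MATNR": "Material Number",
    "KUNNR": "Customer Number",
    "NAME1_GP": "Name 1",
}


def build_catalog_entries(
    table_texts: Dict[str, str],
    field_rows: List[Tuple[str, str, str]],
    element_texts: Dict[str, str],
) -> List[Dict[str, str]]:
    """Attach logical descriptions to each technical table/field pair."""
    entries = []
    for table, field, element in field_rows:
        entries.append({
            "table": table,
            "table_description": table_texts.get(table, ""),
            "field": field,
            # A missing text is left blank rather than guessed.
            "field_description": element_texts.get(element, ""),
        })
    return entries


entries = build_catalog_entries(table_texts, field_rows, element_texts)
for entry in entries:
    print(entry)
```

In a real system the lookups would also need to handle language keys, custom objects and table relationships, which this sketch leaves out.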
Many vendors say they have a “connector”, but skipping over the complexities of SAP systems could increase risk, especially when the business or the regulator comes calling.
These are the considerations which make working with SAP so different in the context of data transformation projects.