Satnav and Data Discovery
I’m always looking for new ways to explain what I do for a living. It’s made hard by the fact that it centres on a product that doesn’t do anything quite like any other tool in the ‘data’ space – which is to make it fast and easy to navigate the data models of big and complex systems early in a project to ensure accuracy and reduce risk of overrun or under-delivery.
Driving to work the other day I started to think about the similarities between Navigating in a car and Data Discovery – not as unrelated as you might first think.
My drive to work is about 20 miles. The first time I did it was in a pre-satnav era, but it’s pretty simple and I ‘know’ which way to go – the distances and locations of turning points are burned into my synapses and I could just about do the trip in my sleep.
Thinking of data discovery now, on systems that have a database with a few tens of tables, it’s usually quite easy to work out which tables store the required data and how to make appropriate joins. Often a project team will have a data model stuck on the wall showing relevant tables and relationships. You can trace the required tables and joins with your finger, maybe jotting down the relevant table names, columns and join conditions. Once someone has used a set of tables a few times they ‘know’ what the names are by heart – they can always look in the system catalog to get the exact details, but they have a pretty good idea already.
If I have to drive to somewhere new, I normally can’t rely on my mental map. In the pre-satnav era, I would pull out a road map, look at the start and end points for the journey and then work out a route.
For a project that needs access to a broader range of data, the data model printed on the wall might not be sufficient, especially if it’s not complete. The project team need to see what’s available in the database and then decide which tables are relevant. They could ‘reverse engineer’ the database with a tool like Erwin or PowerDesigner, giving a PC-based data model of the tables. Then they can start to make a sub-model from this of the tables they need access to.
If I have to undertake a major drive, maybe several hundred miles, the planning will be more demanding – but since I had a satnav, I tend to rely on it for more complex journeys. I normally ‘sanity check’ the route it chooses, just to be sure it’s taking me to the right place. I like to have a general idea which way the satnav is taking me and how long it will take, but leave it to its own devices at deciding the detail.
Now to the nub of my analogy – what happens on the data discovery side when the underlying database has thousands of tables? In theory a reverse engineered model can be made, but Data Modelling tools don’t lend themselves to large numbers of tables – they use a graphical paradigm to represent the model – which is great for 10s of tables but is not really practical for very large systems.
What’s needed is a ‘satnav’ for large systems, and in particular I mean ERP systems like those from SAP and Oracle. This ‘satnav’ would have a full set of table and relationship definitions for the ERP. It would have features to enable the relevant set of tables and joins to be ‘zoomed in’ on, so the user can quickly determine the right tables and joins to use – just like the real satnav has details of all roads and locations, but only shows those relevant to the journey.
“But”, I hear you ask, “how do people do it now?” Normally they:
a) Ask someone who knows – this will normally be someone with intimate knowledge of the ERP data structures – this is the equivalent of winding down the car window and asking “which way is it to …..?” b) Trial and error – most ‘data’ tools (like BI and ETL tools) have some kind of interface for ERPs, but they all assume you know pretty much up front which tables you require. Not so easy to draw a driving comparison here – I guess it would be like having a series of local maps and trying to use these to go on an interstate journey – not impossible but error-prone and time consuming, with a number of wrong turns and dead ends.
Of course the data discovery satnav is our product Safyr®. Explaining what it does is sometimes like going back in a time machine 30 years and trying to explain to someone what a satnav does:
“It tells you which route to take when you’re driving”
“But how can it do that – how does it know where you are and what all the roads are? Anyway, I don’t need one of those thanks; I’ve got a map book.”
How many of you have satnavs now?
See more about how Safyr could help you navigate the data models in your SAP or Oracle application here.
I’d never really thought about this area…interesting idea