So although I am sure very few of you will get the Ernest/Vern reference, I still find it funny (and isn’t that what really matters?). This old post, originally published almost a year ago, actually drove some of the content for this more in-depth article about the evolution of both data (sort of) and the role of the data analyst.

The importance of Being Ernest

The rise in popularity of CDPs (Customer Data Platforms) is no surprise. Marketers have been wanting this omnichannel view for decades. CDPs seem like the silver bullet that will provide this view. But the crux of this view relies on the underlying data. Ugh. This doesn’t seem like it should be a big deal, but it can be. It can be shocking to see what you find once you get into the weeds. Most companies feel that their data is pristine and ready for harmonization.

Harmonization? What is that? If you want an omnichannel view of your customers, you need to marry multiple data sources. To simplify, for B2C companies those data sources may include online (ecommerce), offline (Point of Sale), Lifetime Value (CRM) and marketing sources. Hmmm… how do you marry all of this? Well, you need to find a common ‘key’ to join all of this data. So maybe it exists. Maybe you have a common naming convention. But no, most don’t.

Ideally a common key would be:
CRM – Lead Name ‘Joy Brazelle – Current Active Customer’
Point of Sale – ‘Customer Name – Joy Brazelle’
Ecommerce Site – ‘Joy Brazelle – Conversion Customer Name’
Marketing Campaign – ‘Retargeting/Abandoned Cart – Joy Brazelle’

You can join the four data sets with a ‘contains’ on ‘Joy Brazelle’.
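To make the ‘contains’ idea concrete, here is a minimal sketch in Python. The dictionary and the function name `sources_containing` are made up for illustration, and the labels are simply the sample values from the list above, so treat it as a toy rather than how any particular CDP does it.

```python
# Hypothetical labels, based on the ideal-case list above.
records = {
    "CRM": "Lead Name Joy Brazelle - Current Active Customer",
    "Point of Sale": "Customer Name - Joy Brazelle",
    "Ecommerce Site": "Joy Brazelle - Conversion Customer Name",
    "Marketing Campaign": "Retargeting/Abandoned Cart - Joy Brazelle",
}


def sources_containing(key, labelled_records):
    """Return the sources whose label contains the key (case-insensitive)."""
    needle = key.lower()
    return [
        source
        for source, label in labelled_records.items()
        if needle in label.lower()
    ]


matched = sources_containing("Joy Brazelle", records)
print(matched)
```

When every source carries the same name, all of them come back and the join is trivial.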

But do you know how rare this actually is? Normally what I see is:

CRM – Lead Name ‘Jo Brazelle’
Point of Sale – Customer Name ‘Brazele, J’
Ecommerce Site – ‘Conversions’
Marketing Campaign – ‘Leads – Retargeting/Abandoned Cart ID 12123’

Hmmm… nothing in common. So combining these data sources requires understanding what currently exists and then a transformation, which is not always a straightforward task.
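As a rough illustration of what such a transformation might look like, the sketch below normalizes the two name formats from the messy example and scores how similar they are using Python’s standard `difflib`. The helper names and the 0.8 cutoff are assumptions picked for this example, not a recommended setting.

```python
from difflib import SequenceMatcher


def normalize_name(raw):
    """Lowercase a name and turn 'Last, First' into 'first last'."""
    raw = raw.strip().lower()
    if "," in raw:
        last, first = (part.strip() for part in raw.split(",", 1))
        raw = f"{first} {last}"
    # Collapse any repeated whitespace.
    return " ".join(raw.split())


def similarity(a, b):
    """Score two raw values between 0.0 and 1.0 after normalizing them."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


crm_name = "Jo Brazelle"         # CRM lead name from the example
pos_name = "Brazele, J"          # Point of Sale customer name
ecommerce_label = "Conversions"  # Ecommerce label, no name in it at all

THRESHOLD = 0.8  # hypothetical cutoff, tune against your own data

print(similarity(crm_name, pos_name) >= THRESHOLD)
print(similarity(crm_name, ecommerce_label) >= THRESHOLD)
```

Fuzzy matching can rescue the CRM and Point of Sale names, but the ecommerce and campaign labels carry no name at all, so no amount of string matching will join them. That is exactly why you need to understand what exists first.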

And here’s the bigger problem… if you spend all this time combining the data sources without first auditing the actual data, nothing works out as you expect. Ughh, ughh, uggh.

So my lessons learned…

Long before CDPs were on the radar, there were ETL projects (extract, transform & load) to join disparate data sets. Several years ago I was working on a project to merge 20-some CRM data sources into Salesforce. The project was over budget and past due. The plan was to load the files into SPSS and join them. Yet it was fraught with errors. After a few frustrating days of failing to load the files, a lightbulb went on. Open the files! When I did, it was crazily obvious. The data was garbage, full of bad characters and many other unexpected entries. Once we realized the data was crap, we went back to the businesses, had them clean up the data, and the ETL project was a piece of cake.
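If opening every file by hand isn’t practical, a quick scripted audit can do the first pass. The sketch below is not what we ran on that project. It is a minimal Python example with a made-up `audit_csv` helper and made-up sample data, and it assumes the files are CSV with a known number of columns.

```python
import csv
import io


def audit_csv(text, expected_columns):
    """Return (row_number, problem) tuples for suspicious rows in CSV text."""
    problems = []
    reader = csv.reader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=1):
        if len(row) != expected_columns:
            problems.append(
                (row_number, f"expected {expected_columns} fields, found {len(row)}")
            )
        for value in row:
            if any(not ch.isprintable() for ch in value):
                # Control characters and similar junk show up here.
                problems.append((row_number, f"non-printable character in {value!r}"))
            elif value.strip() == "":
                problems.append((row_number, "empty field"))
    return problems


# Hypothetical sample: row 3 has a control character and an empty field,
# row 4 is missing a column.
sample = "name,email\nJo Brazelle,jo@example.com\nBrazele\x07 J,\nonly-one-field\n"

for row_number, problem in audit_csv(sample, expected_columns=2):
    print(row_number, problem)
```

A report like this is something you can hand straight back to the data owner before anyone tries to load or join anything.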

Net net – Don’t assume the data is pristine, no matter how confident the data owner seems.  Open the file, Vern!

