Brent Dykes has spent more than 15 years in the analytics industry, consulting with some of the world’s most recognized brands, such as Microsoft, Sony, Nike, Amazon, and Comcast. Currently, he is the Senior Director of Insights and Storytelling at Blast Analytics. As an analyst, manager, and technology evangelist at companies such as Omniture, Adobe, and Domo, Brent witnessed first-hand the challenges of communicating data effectively.
Brent is a regular Forbes contributor with more than 30 published articles on various data topics. He is also the author of Effective Data Storytelling: How to Drive Change with Data, Narrative, and Visuals.
You have been in the digital analytics industry for more than 15 years. How often do you encounter data quality issues in your work with different customers?
BD: Data quality issues are everywhere. Making sure that the tagging is correct and the data is consistent is a constant concern. As a consultant, I would go in and try to perform analysis and run into basic data quality issues. For example, there may be five different versions of the home page in the top pages report. I hope most companies these days have figured out page naming and data aggregation, but campaign tracking continues to be a pain point for many companies, especially in the context of large organizations working with many different agencies who have their own methodology and their own processes around tagging.
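The "five versions of the home page" problem can be illustrated with a small sketch. The page names and view counts below are invented for the example; the point is how inconsistent naming fragments a single page's traffic until the variants are rolled up to one canonical name.

```python
# Hypothetical top-pages report: the same home page recorded under
# several inconsistent names, splitting its traffic across rows.
raw_report = {
    "Home": 1200,
    "home": 450,
    "Homepage": 300,
    "index.html": 275,
    "/ ": 80,
}

# Illustrative alias map an analyst might build by hand; the
# variant names here are made up for this example.
ALIASES = {
    "home": "Home",
    "homepage": "Home",
    "index.html": "Home",
    "/": "Home",
}

def normalize(page_name: str) -> str:
    """Map a raw page name to its canonical form if known."""
    key = page_name.strip().lower()
    return ALIASES.get(key, page_name.strip())

cleaned = {}
for name, views in raw_report.items():
    canonical = normalize(name)
    cleaned[canonical] = cleaned.get(canonical, 0) + views

# All five variants now roll up into a single "Home" row.
```

The cleanup is easy after the fact for reporting, but as the answer above notes, the better fix is consistent page naming at collection time.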
The sad thing is that if you don't tag things correctly from the outset and if the data is not correct, in most cases you can't go back in time and correct it. You can't turn back time and retag or recollect data points that were missed or fix issues in the past. Once issues are discovered there is usually a path to clean things up from that point forward, but that might be too late. You might need the data from last month or last week, and you don't have it anymore. Not only are you losing the ability to make decisions, but you also lose comparables—tracking of trends week over week, month over month, or year over year. The data for all of these ranges needs to be reset and readjusted, which can be really frustrating for leaders. It can erode their confidence in the data and the analytics team.
I have worked with organizations where they don't trust the data at all, so they start making gut decisions and the data is no longer a factor. "Flying blind" is a horrible term I hate to hear as an analytics professional.

If accurate data is the foundation of good data storytelling, can you share some practices that organizations employ to ensure the quality of their clickstream data (Adobe or Google Analytics), especially in situations where data is collected continuously and laying the data foundation is a process that never stops?
BD: We hear that scary word: "Governance." It comes up often, but the desire behind governance is to make sure that the quality of the data remains reliable and credible. It is not a case of trying to create good quality data one time, it is an ongoing process. Standards and policies need to be set in place to make sure that the data is high quality.
Data, as we all know, is never going to be perfect. There will always be data issues that occur, but it comes down to how quickly you can spot and resolve these problems to maintain your data quality.
Data quality is a shared responsibility and sometimes in organizations it may be portrayed as the responsibility of the analytics team or the data team. However, in many cases, business teams can play an important role in detecting anomalies. For example, there may be situations where the data may not make sense such as the number of orders being inflated because they’re being double counted. Data is a valuable asset and responsibility for the quality of the data should be shared. It is in everyone's best interest to make sure that the data is kept clean and as free as possible from errors and bugs that can erode the quality of the data.
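The double-counted-orders example lends itself to a simple sanity check that a business team could run without deep analytics expertise. This is a hypothetical sketch, assuming the backend order system is available as a source of truth; a ratio near 2.0 between reported and actual orders is the classic signature of an order event firing twice.

```python
# Hypothetical sanity check: compare the order count reported by the
# analytics tool against the backend system of record and flag
# suspicious ratios such as double counting.
def check_order_counts(analytics_orders: int, backend_orders: int,
                       tolerance: float = 0.05) -> str:
    """Return a short verdict on how well the two counts agree."""
    if backend_orders == 0:
        return "no backend orders to compare against"
    ratio = analytics_orders / backend_orders
    if abs(ratio - 1.0) <= tolerance:
        return "ok"
    if abs(ratio - 2.0) <= tolerance:
        return "suspected double counting"
    return f"mismatch (ratio {ratio:.2f})"
```

Running this kind of comparison on a regular cadence is one concrete way the shared responsibility described above can be put into practice.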
I think what it comes down to ultimately is having a culture that recognizes the importance of data quality and then having these standards, processes, and tools in place to make sure that the data is kept clean and usable for gathering insights.
In the context of data storytelling, what should come first, the data or the storytelling?
BD: In the case of good data storytelling, you always start with the data. I talk about it in my book—data is the foundation of every data story. I can build a very pretty and compelling data story without reliable data, but in that case, I could be misleading the audience. It would be more of a lie, a data forgery, and not a data story. When you don't have reliable data underlying your insights, that's where you could be misleading people and leading them to make bad decisions about the business. The decisions may align with your agenda, but that does not mean that they are the right ones for the team or the overall business.
In the vein of data forgeries, another common mistake that a lot of analytics professionals make is assuming that because the data speaks to them, it should speak to other people equally well. The difference is that the audience has not spent hours or days in the data like the analyst has. They may be lacking context and may not understand the charts the same way the analyst does. I call this a "data cut" forgery, meaning it is like a director's cut of your data story where you really have not edited it for the end audience. It is going to be harder for the audience to interpret the data the right way. An analyst may be able to see through the noise and spot the insight, but it will be challenging for other people to do so.
What do you see as the role of the analyst when data is found to be flawed in some serious way, yet there are compelling business requirements to perform analysis?
BD: As a consultant I have often come into situations where, in trying to answer certain business questions, I will uncover errors in the data. It is important to recognize that there are different degrees of errors. There are errors that can be looked past or acknowledged as errors, but at the same time, there is sufficient data for you to still extract insights to inform a business decision. We can’t just throw our hands in the air and decree "This data is not perfect" so we can’t use it at all.
I think a practical approach is necessary where analysts should try and use the data as best as they can. There may be limiting factors where the analyst won't be able to make as bold of an assertion as they could if the data were clean, but they can still get a general idea or uncover directional insights, trends, or patterns despite issues with the underlying data.
Unless you are in a university setting where the data has to be perfect and pristine, most of us realize that in the real world there are always going to be data issues. If an analyst is not comfortable in making do with what data is available, I think they are limiting their career because we need to be pragmatic, resourceful, and flexible.
Having said that, sometimes there are situations where the data is completely unusable, and we have to acknowledge that as well. In my experience, it would be very rare that the data is completely useless, but that has certainly happened. In such cases, we need to be transparent with our audience and let them know what we could and couldn’t ascertain with the available data.
With erroneous data, you can sometimes smooth over errors such as gaps in the collected data. However, if the data is completely botched, you'll want to look at the process and fix the underlying reasons that led to such a mistake. Perhaps it was caused by poor planning, incomplete requirements gathering, communication issues with a vendor, a bad handoff between teams, or skipping QA at the end. Whatever the underlying reason that contributed to the error, you'll want to address the process to avoid future issues.
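Smoothing over a gap in collected data can be as simple as linear interpolation between the known values on either side. The daily numbers below are invented for illustration; `None` stands in for days where collection failed.

```python
# Hypothetical gap smoothing: missing days (None) in a daily metric
# are filled by linear interpolation between the nearest known values.
def fill_gaps(series):
    """Return a copy of series with interior None runs interpolated."""
    filled = list(series)
    i = 0
    while i < len(filled):
        if filled[i] is None:
            start = i - 1  # index of last known value before the gap
            j = i
            while j < len(filled) and filled[j] is None:
                j += 1     # j is the first known value after the gap
            if start >= 0 and j < len(filled):
                step = (filled[j] - filled[start]) / (j - start)
                for k in range(i, j):
                    filled[k] = filled[start] + step * (k - start)
            i = j
        else:
            i += 1
    return filled

daily_orders = [100, 110, None, None, 140, 150]
smoothed = fill_gaps(daily_orders)
```

Interpolated values are estimates, not measurements, so any chart or report built on them should say so—which ties back to the point about being transparent with the audience.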
Sometimes there are cases where the data itself is clean but analysts are under pressure to "uncover" insights that fit a certain narrative. For example, an organization and its executives may have invested heavily in a site redesign or a particular campaign which "has to pay off." So the data is cherry-picked to support a predetermined narrative. Have you run into such cases and how do you deal with them?
BD: I have seen this with ad agencies that want to demonstrate that their campaigns are successful. They will look for a metric, any metric, that improved as opposed to the original intended goal and established campaign objectives. Perhaps the original goal of the campaign was to drive orders, but it gets shifted to click-throughs or other metrics related to brand awareness. It is almost like setting up an alternative reality where the numbers and performance don't matter. The narrative supersedes everything else, including the facts.
It really comes down to what type of data culture exists in your organization. Nowadays all organizations have data, but the companies that continue to struggle are ones without a strong data culture. In such organizations, when push comes to shove, the narrative is more important than the data. In such cases, the data is accepted only when it supports the narrative and if that is not the case, the stakeholders will resort to all sorts of tricks—ignoring certain data points and even inventing data that supports the narrative.
As part of a healthy data culture, an organization should establish upfront the business goals and KPIs that will be used to measure the success of a given initiative. If the initiative doesn’t perform as expected, the organization should examine the reasons why a particular initiative failed. If an organization isn’t able to hold itself accountable, how can it learn and improve?
As an analyst, you can't necessarily change the culture of your organization. If you find yourself constantly bending the numbers to different narratives, you need to step back and evaluate whether this is an isolated or pervasive problem at your company. If you find you’re constantly cherry-picking numbers to support certain narratives and it is indeed a company-wide issue, you may want to reconsider your career path. There are many other companies out there that are looking to become more data-driven and respect the numbers—not just the narrative.
If you've ever witnessed (or imagined) a well-oiled and healthy dynamic between data collection and data storytelling, can you describe what that looked like?
BD: I think a lot of organizations are working towards this. I don't know if anyone has necessarily perfected it. I think it starts with laying out the plan, establishing the objectives, and making clear what KPIs are to be measured. Data storytelling comes into play when we find insights, and we want to communicate them to other people and get their buy-in or approval to act on a particular insight.
Campaign management is a good example where you'd want to continuously evaluate the numbers to figure out what works well and constantly adapt or modify the campaign to optimize for the end result, be it orders, revenue, leads, or a different KPI. Once the campaign ends, you'd want to perform a postmortem to identify any learnings so you can apply these insights to the next campaign.
It is a constant process of learning and optimizing. In some organizations, the data collection can end up being an afterthought. Most of the time and effort is spent on the strategy, creative, or execution aspects. Measurement considerations are left to the very end, which can lead to problems. Measurement must be one of the key pillars of such initiatives. It informs many aspects of any such campaign—from determining what success looks like, through the cadence and method used to communicate the data to different teams, to what actions the organization takes in different scenarios to mitigate or amplify campaign performance.
Terms used in the digital analytics space often become buzzwords and then quickly lose their meaning. Have you seen this happen to the term "data storytelling" and what are some of the factors that come into play?
BD: When I first started writing my book, I was very worried that "data storytelling" would be just another buzzword. We would lose it to the scrap pile of buzzwords that marketers latch on to. I feel "data storytelling" is a huge opportunity and something really powerful.
Some of the developments that I have seen erode its meaning include the perception that data storytelling is just about data visualization. You will see a lot of people focusing on the data visualization aspect, missing the importance of the data and narrative elements. This was one of the reasons that compelled me to write my book. If you think about us human beings, we have been telling stories for millennia—that's how we share important information, how we learn, and how our brains process information.
Another concern that I have is about technology and how technology can be perceived to hold all the answers. The role of storytelling is diminished to "just adding some text" as an annotation. And that's so wrong, storytelling provides meaning—we are not just describing the numbers, we are explaining them.
Artificial intelligence and natural language generation (NLG) can be used to provide a complementary description of the visualized data. However, that's a lower-order task compared to actually understanding what is causing certain developments in the data and providing an explanation. Just having some text that says, "there was an 83 percent increase in a metric on a given date" does not provide the context or answer the question of "what led to the increase?" Maybe down the road technology will be able to offer more explanatory assistance, but I still think that human beings will be critical as storytellers. Humans possess both the context as well as the ability to make connections across different data sets and take into account external factors.
There is definitely a danger that "data storytelling" will be relegated to just being a buzzword, but this is exactly one of the reasons I wrote my book. As analytics professionals, we have an extraordinary opportunity to drive positive outcomes with our insights and communicating them effectively will rely on our data storytelling skills.
QA2L is a data governance platform specializing in the automated validation of tracking tags/pixels. We focus on making it easy to automate even the most complicated user journeys / flows and to QA all your KPIs in a robust set of tests that is a breeze to maintain.