Demystifying Data Quality
Three myths in realizing data quality
Thursday, April 20, 2023
By Prashanth Southekal, PhD, MBA, ICD.D
Director at Kral Ussery LLC
A growing number of corporate boards and executives understand the importance of data and information flows in accounting, financial reporting, and ultimately improved business performance. It is not only the structured data in ERP systems that matters; unstructured data such as text, audio, video, and images in customer sales contracts, inspection reports, customer complaints, and more is also very important in accounting and finance. However, the majority of the data in enterprises is of poor quality. According to a report in HBR (Harvard Business Review), just 3% of the data in a business enterprise meets quality standards, and research by Carnegie Mellon reveals that an estimated 90% of data in an organization is never successfully used for any strategic purpose.
Improving data quality in business has a profound impact on meeting regulations and enhancing performance. Business data that is accessible, accurate, timely, protected, valid, and verifiable is part of Principle 13 of COSO’s Internal Control – Integrated Framework (Framework), which states: “The organization obtains or generates and uses relevant, quality information to support the functioning of internal control.” While COSO’s Framework is used by the vast majority of public companies in satisfying the SEC’s requirements for Management's Annual Report on Internal Control Over Financial Reporting, the impact of improved data quality is not just a regulatory matter. Quality data also results in improved business performance. Research by MIT found that digitally mature firms are 26% more profitable than their peers. McKinsey found that companies that are insight-driven report above-market growth and EBITDA (Earnings Before Interest, Taxes, Depreciation, and Amortization) increases of up to 25%.
But what exactly is data quality? Currently, there is no single universal definition of “data quality” or of which attributes constitute the key data quality dimensions. Even though data quality is contextual and determined by the location, time, purpose, and other circumstances, it can be defined and measured on different dimensions. In this regard, data is considered to be of high quality if it is fit for use in operations, compliance, and decision-making. This definition of data quality aligns well with the three objectives of operations, reporting, and compliance listed in COSO’s Framework. However, there are many myths associated with realizing data quality. These myths create misconceptions that can prevent organizations from achieving improved data quality. This article explores three important data quality myths and their corresponding realities.
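To make the idea of measuring data quality on dimensions concrete, here is a minimal sketch in Python. It scores a handful of hypothetical invoice records on two dimensions, completeness and validity. The records, the field names, and the rules themselves (all required fields populated; amount is a non-negative number) are assumptions made purely for illustration; real rules would depend on the context and purpose of the data.

```python
from datetime import date

# Hypothetical invoice records; field names and values are illustrative only.
records = [
    {"invoice_id": "INV-001", "amount": 1200.0, "posted": date(2023, 4, 1)},
    {"invoice_id": "INV-002", "amount": None, "posted": date(2023, 4, 3)},
    {"invoice_id": "", "amount": -50.0, "posted": date(2023, 4, 5)},
    {"invoice_id": "INV-004", "amount": 310.5, "posted": None},
]

# Assumed rule: these fields must all be populated for a record to be complete.
REQUIRED_FIELDS = ("invoice_id", "amount", "posted")


def is_complete(record):
    """Completeness: every required field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def is_valid(record):
    """Validity (assumed rule): amount is a number and is not negative."""
    amount = record.get("amount")
    return isinstance(amount, (int, float)) and amount >= 0


def dimension_score(rows, check):
    """Share of rows (0.0 to 1.0) that pass the given quality check."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if check(row)) / len(rows)


completeness = dimension_score(records, is_complete)
validity = dimension_score(records, is_valid)

print(f"Completeness: {completeness:.0%}")
print(f"Validity: {validity:.0%}")
```

Whether a given score is acceptable is a fitness-for-use judgment: the same dataset might be adequate for an exploratory analysis and inadequate for financial reporting.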
Myth 1: Data is a Business Asset
The reality is that data is a business asset only if it is managed well; if not, data is a liability. Data has the potential to improve the company's revenue, reduce expenses, mitigate risk, and become a valuable business asset. But it has some serious limitations and can become a huge liability if not managed well. Here are four common scenarios where data can become a liability for the business:
1. Collecting and storing data without a defined business purpose leads to huge data volumes, which ultimately increase data management complexity and cost.
2. Storing, securing, and processing data consumes vast amounts of energy, which increases the carbon footprint of the business. This makes the business less attractive to investors, given their growing interest in ESG (environmental, social, and governance) commitments.
3. Cybercriminals are drawn to organizations that have large volumes of data. In other words, having more data makes your organization a more attractive target for cybercriminals.
4. Managing data also entails privacy compliance. Facebook lost $35 billion in market value following the Cambridge Analytica data scandal. In addition, the scandal resulted in the permanent closure of Cambridge Analytica. While it was data that was responsible for the success and growth of Cambridge Analytica, it was the same data that resulted in the collapse and ultimate closure of the organization.
Overall, data is a business asset only if it is managed well; if not, data is a liability.
Myth 2: 100% Data Quality is Essential for Analytics
The reality is that, in analytics, perfection is the enemy of progress. A 100% threshold of data quality simply doesn't exist for analytics. As mentioned earlier, business data is used for three main purposes: operations, compliance, and decision-making. Data is often originated and captured for operations and compliance in a defined and deterministic manner. But when data is used in analytics to derive insights for decision-making, the focus shifts from operations and compliance to improvement, innovation, experimentation, productivity, and more. All these initiatives are based on hypotheses, and the data they need is often not available. In other words, the more powerful, futuristic (predictive), or prescriptive (what-if) your questions are, the greater the likelihood that the data is not available, given that data is always a record or evidence of a historical event. For this reason, it is often said that analytics is a compass and not a GPS.
Myth 3: All Data is Valuable All the Time
The reality is that data value depends on the data type and the current DLC (data lifecycle) stage. First, data value depends on the data type. While data can be classified from various perspectives, one useful classification divides it into reference data (on business categories such as production plants and chart of accounts), master data (on business entities such as customers, products, and GL accounts), and transactional or GL entry data (on business events such as orders and invoices). While master data and reference data are relatively static, transactional data is contextual and has more business value. Basically, transactional data represents the consumption of business assets (such as equipment, customers, products, and vendors) and can provide insight into how these business assets are managed. Also, transactional data, unlike reference data and master data, represents the monetary value that could impact the firm’s profitability. This means compliance with accounting standards like US GAAP and IFRS is based largely on transactional data.
Second, data value also depends on the DLC stage. The DLC in a business enterprise typically involves four stages: data capture, data integration, data science, and decision science. When data is captured, it has little value, as the purpose is mainly operations and compliance. As the data moves across the DLC, the purpose expands to analytics and decision-making. Forrester found that organizations that use data to derive insights for decision-making are almost three times more likely to achieve double-digit growth.
Overall, the practices for improving data quality vary from one company to another, as the data quality factors depend on a host of diverse variables such as the industry type, size, operating characteristics, competitive landscape, associated risks, stakeholder groups, and more. However, the following practices will go a long way towards improving business performance:
• creating and managing a data catalog,
• maintaining critical data in the system of record (SoR) for standard business processes,
• implementing robust controls over spreadsheets and other unstructured data,
• maintaining sound data integration solutions,
• carrying out regular data literacy training programs, and
• instituting a data governance program including identifying the right roles and responsibilities.