The importance of Data Quality and Data Cleaning - what is good data - and why is it so important?
WHAT IS GOOD DATA – AND WHY IS IT SO IMPORTANT?
These days we hear a lot about big data, but in reality, what we need isn’t more data… it’s better data. Data drives our decisions. But if we don’t have good data, we won’t make good decisions. Data guides our goals, but if we don’t have the right data, we won’t achieve them. Data informs our ideas, but if we don’t have current data, our vision can be impaired. Having good quality data is important for success in so many areas of your organization including:
- Communicating effectively with core constituencies
- Successfully planning and executing events
- Segmenting your target markets, clients or customers
- Providing superior customer service
- Understanding the needs of clients or customers
- Effectively developing new business
- Improving delivery and reducing costs of postal mailings
ELEMENTS OF GOOD DATA
There is a lot more to good data than just making sure that the data you do have is correct. You also need to make sure that you have enough of the right kinds of data and that the information is current, correct and complete. You also need to be able to access that data when you really need it. For instance, if you are having a local or regional event, you need good, complete address information to figure out who to invite. Additionally, if you want to send information to key executives who would find it important, you have to have current job titles to determine the right recipients.
Several elements of good data include:
- Correct: The data you are using must be accurate to help you accomplish key goals.
- Complete: You have to have all the data that you need without any missing elements or data points
- Relevant: You need to collect the data that your goals actually require
- Timely: Having the most up-to-date information is essential
- Consistent: Data should be stored in a consistent format across all your systems
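Several of these elements can be checked automatically. The following is a minimal Python sketch, not a definitive implementation. The field names, the one-year cutoff for “timely” and the two-letter state rule are all assumptions made for illustration, so substitute whatever your own systems and standards use. Correctness and relevance are left out because they generally need a trusted reference source or human judgment.

```python
from datetime import date

# Hypothetical field names and thresholds; adjust to your own systems.
REQUIRED_FIELDS = ("name", "email", "company", "job_title")
MAX_AGE_DAYS = 365  # assumed cutoff for "timely"


def check_record(record, today):
    """Return a list of quality problems found in one contact record."""
    problems = []
    # Complete: every required field is present and non-empty.
    for field in REQUIRED_FIELDS:
        if not str(record.get(field) or "").strip():
            problems.append(f"missing {field}")
    # Timely: the record was verified recently enough.
    last_verified = record.get("last_verified")
    if last_verified is None or (today - last_verified).days > MAX_AGE_DAYS:
        problems.append("stale")
    # Consistent: state is stored as a two-letter uppercase code.
    state = record.get("state", "")
    if state and not (len(state) == 2 and state.isalpha() and state.isupper()):
        problems.append("inconsistent state format")
    return problems


contact = {
    "name": "Pat Example",
    "email": "",
    "company": "Example Co",
    "job_title": "Director",
    "state": "Georgia",
    "last_verified": date(2020, 1, 15),
}
issues = check_record(contact, today=date(2024, 1, 15))
print(issues)
```

For this made-up contact, the check flags the empty email, the old verification date and the spelled-out state name.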
BIG, BAD DATA
Research indicates that up to 30% of an organization’s key contact data degrades each year. There are a number of reasons that information on people and companies may degrade.
People:
- Are hired, fired, promoted and change jobs
- Move, relocate and change addresses
- Get married and divorced
- Retire and even die
Companies:
- Open and close locations
- Are bought, sold or acquired
- Relocate, move or add offices or locations
- Change their names
Without constant attention to maintaining contact information, all these changes can add up to a huge pile of data that can be incorrect or incomplete, duplicative or dated, and that contains information that is missing or mistaken.
Additionally, each piece of data can be connected to many more pieces of data in other systems throughout the organization. This means that each piece of flawed data can impact a number of other systems, operations or initiatives.
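To get a feel for how quickly this adds up, here is a rough back-of-the-envelope sketch in Python. It rests on two assumptions that the research figure itself does not state: that the worst-case 30% rate applies every year, and that it compounds on whatever data is still accurate with no maintenance in between. The function name is hypothetical.

```python
def fraction_still_accurate(annual_decay, years):
    """Share of records still accurate after `years` with no maintenance."""
    return (1 - annual_decay) ** years


# Assumed worst case: 30% of the remaining good data degrades each year.
for years in (1, 2, 3):
    share = fraction_still_accurate(0.30, years)
    print(f"after {years} year(s): {share:.0%} still accurate")
```

Under those assumptions, only about a third of your contact records would still be accurate after three years of neglect.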
COSTS OF BAD DATA
The costs of bad data should not be underestimated. In fact, recent research from IBM shows that poor data quality may be costing organizations more than $3 trillion yearly in the U.S. alone. This includes not only the costs of the data errors but also the costs of mistakes and bad decisions made by relying on the flawed data as well as the significant time that is spent correcting the errors.
Bad data can also sabotage your marketing and sales efforts and make it much harder to accomplish important goals and objectives:
- Communication becomes challenging: If you can’t reach your target audiences and communicate with customers, clients and prospects, your marketing efforts will be less effective.
- Events are ineffective: When invitations don’t reach the right audiences, you can’t connect with the right contacts and end up wasting time and money.
- Technology adoption plummets: If your systems are littered with duplicative, dated data, users won’t participate, and your technology investments will be wasted.
- Coordination is difficult: Unless everyone in your organization is working from the same data, it’s difficult to coordinate sales or business development efforts and activities. When the left hand (or department) doesn’t know what the right is doing (so to speak), it can affect your organization’s reputation.
- Client service suffers: Data is essential to enable you to understand what your customers or clients want and anticipate their needs.
- Costs of compliance increase: Global and regional data privacy rules and regulations such as CASL, GDPR and CCPA are increasing, and the costs for failure to comply can be significant. These types of laws require organizations to properly handle and maintain current data on contacts in order to honor their requests for information and prevent inappropriate communications.
- Competitive advantage is lost: If your organization is not concerned about data, you risk falling behind your competition, because they are definitely focused on it.
- Opportunities for new business are missed: Only by continually connecting with existing customers, clients and prospects can you find opportunities for new business. Without good data, these connections can be elusive.
WHERE TO BEGIN
It’s easy to talk about data problems, but what’s the solution? Actually, there are a number of steps and multiple considerations in any data quality project.
Step 1 – Assess the Mess. Start with a Data Quality Assessment
Before attempting to begin any data quality project, it’s important to evaluate your situation. Failure to fully evaluate the extent of data quality issues can result in a project that takes longer and costs more than you expected. This means performing a Data Quality Analysis to help you to understand how much bad data you have and determine the best way to tackle it. This will also allow you to estimate how long the cleanup could take and how much it will cost. A data quality assessment will help you answer some key questions such as:
- How much bad data do we have? Is this a manageable project that can be handled with internal effort, or will it require additional internal or external resources?
- Where is all the bad data located? Often bad data isn’t limited to one system. Frequently it’s throughout the organization in disparate systems. Often these systems may be connected, which means that bad data may be flowing from one to another, exacerbating the problem. This also means that the data may be under the control of different departments or teams. To effectively clean this data, multiple departments may need to be involved and may need to work together and coordinate cleanup efforts.
- How did it get there? It’s important to identify all the ways that bad data enters your systems as well as any people and process issues that need to be addressed to prevent or minimize the reintroduction of bad data into your systems after your cleanup efforts.
- How much is it costing us? It can be extremely helpful to evaluate the amount of time, money and other limited resources that your bad data is costing your organization. This data can then be presented to leaders to help them appreciate the need for data quality resources and the savings and other benefits they will receive for their investments in data quality.
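As a starting point for the “how much bad data do we have?” question, a simple script can count a few common problems in an exported contact list. The sketch below is illustrative only. It assumes each contact is a dictionary with an `email` key (a hypothetical layout), and its email pattern is deliberately loose, so it catches obviously malformed addresses but cannot tell you whether a mailbox actually exists.

```python
import re
from collections import Counter

# Deliberately loose pattern: it only catches obviously malformed addresses.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def assess(records):
    """Count common problems in a list of contact dicts."""
    emails = [(r.get("email") or "").strip().lower() for r in records]
    counts = Counter(e for e in emails if e)
    return {
        "total": len(records),
        "missing_email": sum(1 for e in emails if not e),
        "malformed_email": sum(1 for e in emails if e and not EMAIL_PATTERN.match(e)),
        # Records beyond the first that share an email address.
        "duplicate_email": sum(n - 1 for n in counts.values() if n > 1),
    }


sample = [
    {"name": "Pat Example", "email": "pat@example.com"},
    {"name": "Patricia Example", "email": "PAT@example.com "},
    {"name": "Lee Sample", "email": "lee.sample(at)example.com"},
    {"name": "Sam Placeholder", "email": ""},
]
report = assess(sample)
print(report)
```

Running the same counts against an export from each system gives you a rough picture of where the bad data is concentrated.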
Step 2 – Put Together a Plan
Effective data cleansing starts with a plan. Once you know what your data problems are, it’s important to determine the best ways to achieve the desired results. Here are a few planning elements and recommendations to consider:
- What is the best way to clean up the data? Should you use a manual data quality process that will have a high level of accuracy but may also be more costly, or could you start with an automated data quality process that is quicker and less expensive but will provide a lower level of accuracy?
- Who will perform the cleanup? Does your organization have internal resources with the right skill sets who can be dedicated to cleaning that data as a part of their role? If it’s a large project, do you want to pay for full-time or part-time resources to assist (along with associated costs such as training, benefits and supervision)? If not, it may make more sense to outsource the project to data quality professionals who are experienced and can perform the cleanup more efficiently and effectively.
- How long will the cleanup take? Is the cleanup time-sensitive? Are there impending implementations or projects that rely on the data requiring immediate attention, or could it be handled over a longer period?
- How much will the cleanup cost? One of the best ways to determine the cost of the data cleaning project is to have a professional provide you with a report of the total number of errors, such as bad email addresses and duplicate contacts, as well as information that is missing, outdated or incorrectly formatted. This will also provide a precise estimate of the time and cost required to research, clean and enhance the information. This process could also be done manually by running a number of searches in your system for the errors, compiling the data and determining how fast your people might be able to resolve the issues.
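If you go the manual route, the arithmetic behind a time and cost estimate is simple enough to sketch in Python. Every number below is a placeholder, not a benchmark: the error count, the minutes per record, the hourly rate and the assumption of eight productive hours per person per day should all be replaced with figures from your own assessment.

```python
def estimate_cleanup(error_count, minutes_per_record, hourly_rate, staff=1):
    """Rough effort estimate for a manual cleanup."""
    hours = error_count * minutes_per_record / 60
    return {
        "hours": hours,
        "cost": hours * hourly_rate,
        # Assumes 8 productive hours per person per working day.
        "working_days": hours / (8 * staff),
    }


# Placeholder inputs for illustration only.
estimate = estimate_cleanup(error_count=6000, minutes_per_record=5, hourly_rate=40, staff=2)
print(estimate)
```

With these placeholder inputs, the estimate works out to 500 hours of effort, which is a useful way to frame the time-sensitivity question above.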
Step 3 – Formulate Your Data Strategy
Once you have your plan, you need to determine how you will approach your project to ensure success. Some things to consider include:
- Start with your most important data. All data is not created equal, so focus on the data that will yield the best results and return on your investments of time and effort. For instance, you may want to begin with contact records for current customers or clients. This is the data that is most important to your organization. Start with a manageable group of contacts such as your top 100, 500 or 1000 key companies, along with their associated contacts.
- Review your lists: Review the contacts on your most frequently used lists to ensure that your communications and invitations are reaching the right people. If you have an upcoming email to get out the door, people may be more motivated to help you vet the lists for accuracy.
- Break your project into manageable parts. Striving for 100 percent clean data is not a realistic goal. In fact, the costs of cleaning the entirety of your data can often exceed the benefits. Instead, break the project into pieces. For instance, if you are rolling out a new system or trying to enhance technology adoption through a concerted training effort, you may want to clean the contacts of groups of users as they are about to be trained because that is the data they are going to look at first.
- Tackle relevant time-sensitive projects. Frequently there are opportunities to engage your users in the cleanup project. For instance, when there is an upcoming event, people are often focused on making sure the right contacts are invited and will help with the cleanup efforts for the invitations.
Step 4 – Get to Work
Once you have a strategy and plan in place, it’s time to focus on execution. To achieve the right level of data quality for your organization’s unique needs, there are several data cleaning methods to consider:
- Automated Data Cleaning and Appending: One of the quickest and most cost-effective ways to improve your data is to reach out to a company that can run your data through a tool that can clean, correct, update and deduplicate the records by comparing them against a database of valid, correct and complete contact information. Often these systems can also append missing data and enhance the data with additional information such as company information and industries, which can be valuable for data segmentation and targeting. You should consider automated data cleaning when you want to improve your data quickly and the information doesn’t have to be perfect, because even the best automated services are only able to accurately update 50 to 75 percent of your data. For some organizations’ data requirements, this is enough. But if your organization requires near-pristine information, you may want to consider other options such as…
- Manual Data Cleansing: Manual cleanup, or “data stewarding” as it is often called, involves having a person or team go through your data to research, validate, deduplicate, clean and enhance the records or information. While this task can be performed by internal employees, the process can be expensive and time-consuming. The average annual cost of an experienced and well-trained in-house data steward can range from $70,000 to more than $100,000 (including hiring costs, salary, benefits, training, supervision, office space and other expenses). The work is also rote and repetitive, which can lead to a revolving door in staff turnover. Many organizations have learned this lesson the hard way and now instead choose to have the process performed by experienced outsourced data quality professionals. These experienced data quality specialists are trained to apply research, insight and expertise to quickly and cost-effectively resolve data problems. They also have data quality consultants who can help you to establish processes and procedures to improve your organization’s ongoing data quality.
- Combination Method: One of the best ways to clean your data in the shortest time is to have both processes performed together. First, an automated clean and append process is performed as a quick and cost-effective first step to improve as much data as possible, reducing the number of records that need manual attention. Then manual data stewarding can be performed at a reduced cost on the remaining records to quality check and enhance the accuracy of the automated process.
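The combination method can be pictured as a two-stage pipeline. The Python sketch below is a simplified illustration under stated assumptions: the `reference` dictionary is a hypothetical stand-in for a vendor’s database of verified contacts, and matching on a lowercased email address is far cruder than the matching a real service would use.

```python
def combined_cleanup(records, reference):
    """Automated pass first; whatever cannot be matched goes to manual review."""
    updated, manual_queue = [], []
    for record in records:
        key = (record.get("email") or "").strip().lower()
        match = reference.get(key)
        if match:
            # Reference values overwrite stale fields and append missing ones.
            updated.append({**record, **match})
        else:
            manual_queue.append(record)
    return updated, manual_queue


# Hypothetical reference data standing in for a vendor database.
reference = {
    "pat@example.com": {"job_title": "Vice President", "industry": "Legal"},
}
records = [
    {"name": "Pat Example", "email": "Pat@Example.com", "job_title": "Director"},
    {"name": "Lee Sample", "email": "lee@example.org", "job_title": ""},
]
updated, manual_queue = combined_cleanup(records, reference)
print(len(updated), "updated automatically;", len(manual_queue), "left for data stewards")
```

Only the records the automated pass could not match land in the manual queue, which is where the cost savings of the combined approach come from.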
Step 5 – Start with Style
To enhance your cleanup efforts and prevent future data quality problems, it’s essential to establish and document your organization’s data standards in a Data Quality Standards Manual. This document outlines your organization’s specific data styles and processes for consistent contact data entry, formatting and maintenance. This document is important for training anyone in your organization on how to input data into key systems correctly and consistently. It is an invaluable resource that guides your data stewards and makes cleanup efforts faster and more efficient. A data style and standards manual should include instructions for entering elements such as:
- Address standardization and formatting
- Formal names and nicknames
- Phone number guidelines
- Company names and business designations
- Job roles and titles
- Honorary and professional titles
- Government entities and politicians
- Names of universities and academic institutions
- Prefixes, salutations and suffixes
- Websites and e-mail addresses
- Records for couples
- International address and phone numbers
- Retired and deceased contacts
- Merged, acquired and out of business companies
- Social media profile information
- Abbreviations and punctuation
Additionally, as a data project progresses, new situations sometimes arise that require data quality additions or modifications. Because the manual is a working document, these ongoing changes should be reflected in updates and shared with anyone who inputs data to ensure that your style guide stays current and relevant.
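Once standards are written down, some of them can be enforced in code at the point of data entry or import. The Python sketch below shows the idea for two of the elements listed above. The house style it applies (business designations without periods, U.S. phone numbers as (XXX) XXX-XXXX) is purely an assumption for illustration, as are the function names; your own manual should dictate the actual rules.

```python
import re

# Hypothetical house style: designations written without periods.
DESIGNATIONS = {
    "inc": "Inc", "incorporated": "Inc",
    "llc": "LLC", "corp": "Corp", "corporation": "Corp",
    "ltd": "Ltd", "limited": "Ltd",
}


def standardize_company(name):
    """Collapse whitespace and apply the house form of a trailing designation."""
    words = name.split()
    if not words:
        return ""
    last = words[-1].rstrip(".").lower()
    if last in DESIGNATIONS:
        words[-1] = DESIGNATIONS[last]
        if len(words) > 1:
            # Drop the comma that often precedes a designation.
            words[-2] = words[-2].rstrip(",")
    return " ".join(words)


def standardize_us_phone(raw):
    """Format a 10-digit U.S. number; return None if it cannot be parsed."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip the country code
    if len(digits) != 10:
        return None
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


print(standardize_company("Example  Widgets, inc."))
print(standardize_us_phone("1-404-555-0100"))
```

Values that cannot be standardized, such as a phone number with too few digits, return `None` here so they can be routed to a data steward rather than silently guessed at.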
Step 6 – Focus on the Future
Once your data has been cleaned, you need to think about how to maintain it going forward. Here are some key points to consider:
- It’s never over. Too many organizations undertake a massive data cleaning initiative only to reduce the resources once they think the project is finished. The problem is that a CRM implementation is never really over because CRM is not a one-time initiative or project. It’s a fundamental change in how your organization manages its most important assets: its contacts and relationships. This means it never ends, and neither should the data cleaning. The right level of resources should be dedicated for ongoing maintenance to ensure that data doesn’t degrade again.
- Review new data regularly. Put processes in place to make ongoing data reviews routine. Go through bounced emails after each campaign or mailing and, at a minimum, remove them from lists. A better process is to have internal or outsourced data quality resources research the contacts to identify where they are now so you can keep in contact with them. An even better process is to regularly run your lists through an inexpensive automated email validation to identify bad addresses in advance. Then they can be researched and updated before each campaign to ensure that your information actually reaches your targets in a timely manner.
- It’s everyone’s job. Remember the adage, “many hands make light work”? Well, this is especially true with data. It’s important to communicate to everyone in the organization who deals with your data that if they just take a little time or effort to regularly review and update the data they are working with, it can reduce the ongoing data quality resources that are required and save the whole organization a lot of time and effort in the long run.
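The bounce review described above is easy to make routine with a small script. This Python sketch assumes you can export the bounced addresses from your email platform as a plain list and that each contact is a dictionary with an `email` key. Both are assumptions, and the function name is hypothetical.

```python
def apply_bounces(mailing_list, bounced):
    """Split a list into deliverable contacts and ones needing research."""
    bounced_set = {address.strip().lower() for address in bounced}
    keep, research = [], []
    for contact in mailing_list:
        email = (contact.get("email") or "").strip().lower()
        if email in bounced_set:
            research.append(contact)  # find out where this person is now
        else:
            keep.append(contact)
    return keep, research


mailing_list = [
    {"name": "Pat Example", "email": "pat@example.com"},
    {"name": "Lee Sample", "email": "Lee@Example.org"},
]
keep, research = apply_bounces(mailing_list, bounced=["lee@example.org"])
print([c["name"] for c in keep], [c["name"] for c in research])
```

Rather than deleting bounced contacts outright, the sketch sets them aside in a research queue so someone can track down their new details.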
SOME GOOD NEWS ABOUT BAD DATA
After learning all you need to know about data cleansing, it’s easy to feel overwhelmed. But rest assured, the task is absolutely manageable if you just dedicate the resources and follow the instructions above. Additionally, once your team begins regularly maintaining your data, the cleanup will get easier over time. And remember, because data cleaning never really ends, the good news is that this means you have forever to get better at it.
If you need additional resources for or assistance with your data quality project, the team at CLIENTSFirst Consulting can help. We provide both automated and manual U.S.-based data cleaning, and our team of almost 100 dedicated outsourced data quality professionals and experienced data quality consultants can assist you with any size or type of data quality project. We can also train your data stewards, create your styles and standards guide and help to put people and processes in place to help with ongoing data maintenance. Contact us for a complimentary assessment of your data quality needs at 404-249-9914 or firstname.lastname@example.org.