Data quality
How to ensure your data creates value
AI’s learning efficacy is not a given. It’s largely determined by an organization’s approach to data quality. Precision, completeness, harmony, timeliness, relevance — the attributes associated with data integrity also define generative AI’s reliability. And reliability, in turn, dictates generative AI’s ability to generate a meaningful return on your investment.
Why is it that only 23% of business leaders view data quality as a barrier to generative AI? Sheer novelty may be the answer.
“Most IT budgets did not even anticipate generative AI a year and a half ago,” explains David McCurdy, Chief Enterprise Architect and CTO at Insight. “Just about every business leader is fixated on how this technology can reinvent their operations and create new business models.”
In creating those new business models, the focus should be narrowed to actionable insights and scalable initiatives. To this end, Jason Rader, VP and CISO at Insight, suggests targeted data hygiene practices centered on specific AI applications.
“If you want to do a data clean up exercise, don’t get super-focused on the data estate as a whole,” says Rader. “What we’re discovering is that, based on a specific use case being optimized or whether the data is structured or unstructured, there are priorities that you can focus on rather than trying to tackle a massive mountain of data.” In other words, businesses should adopt a set of core data management strategies tailored to the needs of generative AI.
Key considerations for improving data quality for generative AI
Preserving data integrity or improving data quality for generative AI entails more than general housekeeping. It requires a strategy that firmly embeds data integrity into the fabric of AI initiatives. Primary areas to consider include:
Data auditing: Regularly review datasets for accuracy and completeness to ensure AI models are fed with high-fidelity data.
Continuous integration: Seamlessly merge data from various sources to maintain a unified, accessible and up-to-date data environment.
Data governance: Establish clear policies and procedures for data usage, privacy and security to build trust in AI outputs.
Quality over quantity: Focus on the relevance and quality of data rather than the volume by aligning datasets with specific AI objectives.
Data cleansing: Implement systematic processes to correct or remove incorrect, incomplete or irrelevant data from datasets.
Scalable infrastructure: Design a data architecture that can grow with your AI needs, accommodate new data sources and expand analytic capabilities.
Ethical data use: Create normative guidelines for data management practices that comply with ethical standards and remove implicit biases.
Training and enablement: Equip teams with the requisite skills and understanding to manage and use data effectively and responsibly for generative AI applications.
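As a rough illustration of the data auditing and data cleansing items above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the record layout (dictionaries with `id`, `customer` and `updated` fields), the function names `audit` and `cleanse`, and the one-year freshness window. It is not drawn from any Insight tooling, and a real pipeline would tailor the rules to the specific AI use case.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class AuditReport:
    total: int
    incomplete: int   # records missing at least one required field
    stale: int        # records last updated before the freshness cutoff
    duplicates: int   # records whose key repeats an earlier record


def is_incomplete(rec, required):
    # A field counts as missing if it is absent, None or an empty string.
    return any(rec.get(field) in (None, "") for field in required)


def audit(records, required, key, today, max_age_days=365):
    """Count completeness, timeliness and duplication problems (hypothetical rules)."""
    cutoff = today - timedelta(days=max_age_days)
    seen = set()
    incomplete = stale = duplicates = 0
    for rec in records:
        if is_incomplete(rec, required):
            incomplete += 1
        updated = rec.get("updated")
        if updated is not None and updated < cutoff:
            stale += 1
        if rec.get(key) in seen:
            duplicates += 1
        seen.add(rec.get(key))
    return AuditReport(len(records), incomplete, stale, duplicates)


def cleanse(records, required, key, today, max_age_days=365):
    """Keep only complete, fresh, first-seen records."""
    cutoff = today - timedelta(days=max_age_days)
    seen = set()
    kept = []
    for rec in records:
        updated = rec.get("updated")
        if is_incomplete(rec, required) or updated is None or updated < cutoff:
            continue
        if rec[key] in seen:
            continue
        seen.add(rec[key])
        kept.append(rec)
    return kept


# Made-up sample data, not taken from the case studies.
contracts = [
    {"id": "C-1", "customer": "Acme", "updated": date(2024, 3, 1)},
    {"id": "C-2", "customer": "", "updated": date(2024, 2, 1)},        # missing customer
    {"id": "C-3", "customer": "Globex", "updated": date(2021, 1, 1)},  # stale
    {"id": "C-1", "customer": "Acme", "updated": date(2024, 3, 1)},    # duplicate
]
required_fields = ("id", "customer", "updated")
today = date(2024, 6, 1)

report = audit(contracts, required_fields, key="id", today=today)
clean = cleanse(contracts, required_fields, key="id", today=today)
print(report)
print([rec["id"] for rec in clean])
```

Running the audit on a schedule and cleansing only the datasets tied to a given use case keeps the effort targeted, in line with Rader's advice not to tackle the whole data estate at once.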
Integrating these practices involves creating a systematic approach that fits an organization’s operational rhythms. To this end, assigning clear roles and responsibilities and using technology that automates and simplifies data tasks can greatly reduce the perceived burden of data management and enhance overall capabilities. Several real-life examples show the transformational impact of those enhanced capabilities.
Organizations with an AI Center of Excellence (CoE) have the distinct advantage of innovating in an environment where generative AI can thrive without overwhelming internal or external stakeholders.
These success stories demonstrate the strategic advantage of high-quality data in unleashing the full potential of gen AI.
Major electronics firm
Summary: Over six weeks, Insight co-developed a generative AI solution with this major electronics firm. The solution was similar to InsightGPT, consisting of a private instance of Azure® OpenAI® that pulls customer and supplier contracts, 10-Ks and other public financial documentation to be used by sales, marketing and other teammates.
Anticipated outcomes: Quick access to relevant information will support sales and marketing efforts, especially as they conduct customer-facing work. Further use cases are being identified to reduce employees’ menial workloads — and refocus their time on higher-priority work.
Future business: The Insight team has provided a 12-month roadmap that includes a sequence of projects to expand the client’s generative AI footprint to other enterprise systems and data strategy tasks.
Staffing firm
Summary: By using Insight Lens™, this client is enhancing its customer service with generative AI. Through the retrieval of job postings and other hiring metadata from several public APIs, the solution is streamlining the candidate-role match process.
Anticipated outcomes: With enhanced candidate-role matching, this client is expected to reduce time on both sides of the staffing equation — with job seekers able to find a fitting role and employers finding higher-quality matches in shorter periods.
Future business: Once ROI is validated with this initial project, the staffing firm plans to expand use cases across the enterprise.
Business conglomerate
Summary: This client had a problem with data that was difficult to access across different departments. They wanted a generative AI solution that could rapidly learn from many internal documents and feed them into applications and databases.
Anticipated outcomes: With a Proof of Concept (PoC) rollout, this business conglomerate will allow numerous departments to experiment with the generative AI solution in a private, secure instance to augment their workflows.
Future business: This client is eager to deploy more generative AI use cases across its many subsidiaries to extend these benefits.
How will organizations know when their generative AI capabilities are running at maximum efficiency for long-term success?
“You won’t need to ask, ‘What data do we have, and how are we going to use it to take advantage of generative AI?’” says Carm Taglienti, Chief Data Officer at Insight. “You’ll be in a position where you can ask those questions of your data and get the answers you’re looking for.”
That’s because the strategic prioritization of data quality transforms the potential of generative AI into tangible outcomes. By embedding industry-leading data management practices into their operational fabric, organizations can ensure that their AI systems are as reliable and effective as they are innovative and interactive.
Is data quality creating friction on your road to AI adoption? Discover how Insight experts can set up your data estate for generative AI success.