Data quality essential in training ChatGPT

Issue 7 2023 AI & Data Analytics

It is a year since OpenAI launched ChatGPT to the public, with adoption rates skyrocketing at an unprecedented pace. By February 2023, Reuters reported an estimated 100 million active users. Fast forward to September, and the ChatGPT website has attracted nearly 1,5 billion visitors, showcasing the platform’s immense popularity and integral role in today’s digital landscape.

Willem Conradie, CTO of PBT Group, reflects on this journey, noting the significant usage and adoption of ChatGPT across various sectors. “The rise of ChatGPT has highlighted significant concerns. These range from biased outputs, question misinterpretation, inconsistent answers, and lack of empathy to security issues. To navigate these, the concept of Responsible AI has gained momentum, emphasising the importance of applying AI with fair, inclusive, secure, transparent, accountable, and ethical intent. Adopting such an approach is vital, especially when dealing with fabricated information, where ChatGPT provides incorrect or outdated information,” says Conradie.

Of course, the platform’s versatility extends beyond public use. It serves as a powerful tool in corporate environments, enhancing various business processes such as customer service enquiries, email drafting, personal assistant tasks, keyword searches, and creating presentations. For the best performance, it is essential that ChatGPT provides accurate responses. This necessitates training on data that is relevant to the company, accurate, and timely.

“Consider a scenario where ChatGPT is employed to automatically service customer enquiries, with the aim of enhancing customer experience by delivering personalised responses. If the underlying data quality is compromised, ChatGPT may provide inaccurate responses, ranging from minor errors like incorrect customer names to major issues like providing incorrect self-help instructions on the company’s mobile app. Such inaccuracies could lead to customer frustration, ultimately damaging the customer experience and negating the intended positive outcomes.”

Addressing such data quality concerns is paramount. Ensuring relevance is the first step. This requires the data used for model training to align with the business context in which ChatGPT operates. Timeliness is another critical factor, as outdated data could lead to inaccurate responses. The data must also be complete. Ensuring the dataset is free from missing values, duplicates, or irrelevant entries is important, as these could also result in incorrect responses and actions.
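The checks described above (relevance, timeliness, completeness, and duplicates) can be illustrated with a minimal Python sketch. It is not drawn from PBT Group or OpenAI. The `Record` schema, the `filter_records` function, the topic names, and the 365-day age limit are all hypothetical, chosen only to show how such rules might be applied before data reaches a training pipeline.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class Record:
    """One candidate training example (hypothetical schema)."""
    customer_name: Optional[str]
    topic: str
    answer: Optional[str]
    last_updated: date


def filter_records(records, allowed_topics, max_age_days, today):
    """Split records into kept items and a count of rejections per reason."""
    kept = []
    rejected = {"irrelevant": 0, "outdated": 0, "incomplete": 0, "duplicate": 0}
    seen = set()
    cutoff = today - timedelta(days=max_age_days)
    for rec in records:
        if rec.topic not in allowed_topics:
            rejected["irrelevant"] += 1   # relevance: outside the business context
        elif rec.last_updated < cutoff:
            rejected["outdated"] += 1     # timeliness: older than the age limit
        elif not rec.customer_name or not rec.answer:
            rejected["incomplete"] += 1   # completeness: missing values
        elif rec in seen:
            rejected["duplicate"] += 1    # exact copy of a record already kept
        else:
            seen.add(rec)
            kept.append(rec)
    return kept, rejected


# Illustrative data only; the values are invented for this sketch.
today = date(2023, 11, 30)
records = [
    Record("A. Customer", "mobile_app", "Open Settings, then tap Reset PIN.", date(2023, 10, 1)),
    Record("A. Customer", "mobile_app", "Open Settings, then tap Reset PIN.", date(2023, 10, 1)),
    Record(None, "mobile_app", "Tap Help.", date(2023, 9, 15)),
    Record("B. Customer", "mobile_app", "Use the old menu.", date(2021, 1, 5)),
    Record("C. Customer", "staff_lunch_menu", "Soup.", date(2023, 11, 1)),
]
kept, rejected = filter_records(records, {"mobile_app"}, 365, today)
print(len(kept), rejected)
```

Counting rejections per reason, rather than silently dropping records, shows which quality problem dominates in a given dataset.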

Moreover, continuously improving the model through reinforcement learning, which incorporates user feedback into model retraining cycles, is essential. This helps ChatGPT, and conversational AI models in general, to learn from their interactions, adapt, and enhance their response quality over time.
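As a rough illustration of the data-collection side of such a feedback loop, the Python sketch below tallies helpful/unhelpful ratings per response and flags poorly rated ones for review before the next retraining cycle. It does not implement reinforcement learning itself. The function name `select_for_review`, the thresholds, and the response identifiers are assumptions made for this example.

```python
from collections import defaultdict


def select_for_review(feedback, min_votes=3, max_approval=0.5):
    """Return IDs of responses with enough votes and a low approval rate."""
    votes = defaultdict(lambda: [0, 0])  # response_id -> [helpful, total]
    for response_id, helpful in feedback:
        votes[response_id][1] += 1
        if helpful:
            votes[response_id][0] += 1
    flagged = []
    for response_id, (up, total) in votes.items():
        # Ignore responses with too few votes to judge reliably.
        if total >= min_votes and up / total <= max_approval:
            flagged.append(response_id)
    return sorted(flagged)


# Illustrative feedback log: (response identifier, was it rated helpful?)
feedback_log = [
    ("reset_pin", True), ("reset_pin", True), ("reset_pin", False),
    ("update_address", False), ("update_address", False), ("update_address", True),
    ("close_account", False),
]
flagged = select_for_review(feedback_log)
print(flagged)
```

The minimum-vote threshold is a design choice: it keeps a single negative rating from pulling a response into the review queue.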

“The data quality management practices highlighted here, while not exhaustive, serve as a practical starting point. They are applicable not just to ChatGPT, but to conversational AI and other AI applications like generative AI. All this reinforces the importance of data quality across the spectrum of AI technologies,” concludes Conradie.





















© Technews Publishing (Pty) Ltd. | All Rights Reserved.