Data quality essential in training ChatGPT

Issue 7 2023 AI & Data Analytics

It is a year since OpenAI launched ChatGPT to the public, with adoption rates skyrocketing at an unprecedented pace. By February 2023, Reuters reported an estimated 100 million active users. Fast forward to September, and the ChatGPT website has attracted nearly 1,5 billion visitors, showcasing the platform’s immense popularity and integral role in today’s digital landscape.

Willem Conradie, CTO of PBT Group, reflects on this journey, noting the significant usage and adoption of ChatGPT across various sectors. “The rise of ChatGPT has highlighted significant concerns. These include biased outputs, question misinterpretation, inconsistent answers, lack of empathy, and security issues. To navigate these, the concept of Responsible AI has gained momentum, emphasising the importance of applying AI with fair, inclusive, secure, transparent, accountable, and ethical intent. Adopting such an approach is vital, especially when dealing with fabricated information, where ChatGPT provides incorrect or outdated information,” says Conradie.

Of course, the platform’s versatility extends beyond public use. It serves as a powerful tool in corporate environments, enhancing various business processes such as customer service enquiries, email drafting, personal assistant tasks, keyword searches, and creating presentations. For the best performance, it is essential that ChatGPT provides accurate responses. This necessitates training on data that is relevant to the company, accurate, and timely.

“Consider a scenario where ChatGPT is employed to automatically service customer enquiries, with the aim of enhancing customer experience by delivering personalised responses. If the underlying data quality is compromised, ChatGPT may provide inaccurate responses, ranging from minor errors like incorrect customer names to major issues like providing incorrect self-help instructions on the company’s mobile app. Such inaccuracies could lead to customer frustration, ultimately damaging the customer experience and negating the intended positive outcomes.”

Addressing such data quality concerns is paramount. Ensuring relevance is the first step. This requires the data used for model training to align with the business context in which ChatGPT operates. Timeliness is another critical factor, as outdated data could lead to inaccurate responses. The data must also be complete. Ensuring the dataset is free from missing values, duplicates, or irrelevant entries is important, as these could also result in incorrect responses and actions.
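As a rough illustration of the checks described above, the following Python sketch flags records that are incomplete, duplicated, irrelevant to the business context, or outdated. It is a minimal example under stated assumptions, not a description of how ChatGPT or any particular company prepares training data: the Record fields, the quality_issues function, the list of allowed topics, and the 365-day age limit are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Record:
    """Hypothetical training record; the field names are assumptions."""
    customer_name: str
    topic: str
    answer: str
    last_updated: date


def quality_issues(records, allowed_topics, max_age_days, today):
    """Return (index, issue) pairs for every record that fails a check."""
    issues = []
    seen = set()
    cutoff = today - timedelta(days=max_age_days)
    for i, rec in enumerate(records):
        # Completeness: text fields must not be empty.
        if not rec.customer_name.strip() or not rec.answer.strip():
            issues.append((i, "missing value"))
        # Duplicates: same content as an earlier record.
        key = (rec.customer_name, rec.topic, rec.answer)
        if key in seen:
            issues.append((i, "duplicate"))
        seen.add(key)
        # Relevance: the topic must belong to the business context.
        if rec.topic not in allowed_topics:
            issues.append((i, "irrelevant topic"))
        # Timeliness: the record must have been updated recently enough.
        if rec.last_updated < cutoff:
            issues.append((i, "outdated"))
    return issues


# Made-up sample data, for illustration only.
records = [
    Record("Thandi", "mobile app", "Open Settings, then tap Reset PIN.", date(2023, 10, 1)),
    Record("Thandi", "mobile app", "Open Settings, then tap Reset PIN.", date(2023, 10, 1)),
    Record("", "mobile app", "Tap Help.", date(2023, 9, 15)),
    Record("Sipho", "office party", "Bring snacks.", date(2023, 10, 5)),
    Record("Lerato", "billing", "Invoices are emailed monthly.", date(2021, 1, 1)),
]
issues = quality_issues(records, {"mobile app", "billing"}, 365, date(2023, 11, 1))
for index, issue in issues:
    print(index, issue)
```

In practice, the thresholds and the definition of a relevant topic would depend on the business context, and flagged records would typically be reviewed or corrected rather than silently dropped.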

Moreover, continuously improving the model through reinforcement learning, with user feedback incorporated into model retraining cycles, is essential. This helps ChatGPT, and conversational AI models in general, to learn from their interactions, adapt, and enhance their response quality over time.

“The data quality management practices highlighted here, while not exhaustive, serve as a practical starting point. They are applicable not just to ChatGPT, but to conversational AI and other AI applications like generative AI. All this reinforces the importance of data quality across the spectrum of AI technologies,” concludes Conradie.





















© Technews Publishing (Pty) Ltd. | All Rights Reserved.