Data quality essential in training ChatGPT

Issue 7 2023 AI & Data Analytics

It is a year since OpenAI launched ChatGPT to the public, with adoption rates skyrocketing at an unprecedented pace. By February 2023, Reuters reported an estimated 100 million active users. Fast forward to September, and the ChatGPT website has attracted nearly 1,5 billion visitors, showcasing the platform’s immense popularity and integral role in today’s digital landscape.

Willem Conradie, CTO of PBT Group, reflects on this journey, noting the significant usage and adoption of ChatGPT across various sectors. “The rise of ChatGPT has highlighted significant concerns. These range from biased outputs, question misinterpretation, inconsistent answers, lack of empathy, and security issues. To navigate these, the concept of Responsible AI has gained momentum, emphasising the importance of applying AI with fair, inclusive, secure, transparent, accountable, and ethical intent. Adopting such an approach is vital, especially when dealing with fabricated information when ChatGPT provides incorrect or outdated information,” says Conradie.

Of course, the platform’s versatility extends beyond public use. It serves as a powerful tool in corporate environments, enhancing various business processes such as customer service enquiries, email drafting, personal assistant tasks, keyword searches, and creating presentations. For the best performance, it is essential that ChatGPT provides accurate responses. This necessitates training on data that is relevant to the company and accurate and timely.

“Consider a scenario where ChatGPT is employed to automatically service customer enquiries, with the aim of enhancing customer experience by delivering personalised responses. If the underlying data quality is compromised, ChatGPT may provide inaccurate responses, ranging from minor errors like incorrect customer names to major issues like providing incorrect self-help instructions on the company’s mobile app. Such inaccuracies could lead to customer frustration, ultimately damaging the customer experience and negating the intended positive outcomes.”

Addressing such data quality concerns is paramount. Ensuring relevance is the first step. This requires the data used for model training to align with the business context in which ChatGPT operates. Timeliness is another critical factor, as outdated data could lead to inaccurate responses. The data must also be complete. Ensuring the dataset is free from missing values, duplicates, or irrelevant entries is important, as these could also result in incorrect responses and actions.

Moreover, continuously improving the model through reinforcement learning incorporating user feedback into model retraining cycles, is essential. This assists ChatGPT, and conversational AI models in general, to learn from their interactions, adapt, and enhance their response quality over time.

“The data quality management practices highlighted here, while not exhaustive, serve as a practical starting point. They are applicable not just to ChatGPT, but to conversational AI and other AI applications like generative AI. All this reinforces the importance of data quality across the spectrum of AI technologies,” concludes Conradie.




Share this article:
Share via emailShare via LinkedInPrint this page



Further reading:

The future of security: intelligent automation
Access Control & Identity Management AI & Data Analytics IoT & Automation
As the security landscape evolves, businesses are no longer looking for stand-alone solutions, they want connected, intelligent systems that automate, streamline, and protect.

Read more...
Local is a lekker challenge
Secutel Technologies Technews Publishing AI & Data Analytics
There are a number of companies focused on producing solutions locally, primarily in the software arena, but we still have hardware producers churning out products, many doing business locally and internationally.

Read more...
AI and privacy to shape consumer cybersecurity landscape
AI & Data Analytics
A report from Kaspersky indicates that artificial intelligence will become an integral part of daily life in 2025, while privacy concerns around biometric data and advanced technologies will take centre stage.

Read more...
How can South African organisations fast-track their AI initiatives?
AI & Data Analytics Security Services & Risk Management
While the AI market in South Africa is anticipated to grow by nearly 30% annually over the next five years, tapping into the promise and potential of AI is not easy.

Read more...
Efficient, future-proof estate security and management
Technews Publishing ElementC Solutions Duxbury Networking Fang Fences & Guards Secutel Technologies OneSpace Technologies DeepAlert SMART Security Solutions Editor's Choice Information Security Security Services & Risk Management Residential Estate (Industry) AI & Data Analytics IoT & Automation
In February this year, SMART Security Solutions travelled to Cape Town to experience the unbelievable experience of a city where potholes are fixed, and traffic lights work; and to host the Cape Town SMART Estate Security Conference 2025.

Read more...
Milestone announces a platform to enable access to data and train AI models
Surveillance AI & Data Analytics
Milestone Systems has announced Project Hafnia to build services and democratise AI-model training with high-quality, compliant video data leveraging NVIDIA Cosmos Curator and AI model, fine-tuning microservices.

Read more...
Security industry embraces mobile credentials, biometrics and AI
AI & Data Analytics Access Control & Identity Management Integrated Solutions
As organisations navigate an increasingly complex threat landscape, security leaders are making strategic shifts toward unified platforms and emerging technologies, according to the newly released 2025 State of Security and Identity Report from HID.

Read more...
AI for retail risk management
Surveillance Retail (Industry) AI & Data Analytics
As businesses face mounting challenges in a volatile economic environment, Ares-i remains an essential tool for proactively identifying, assessing, and mitigating risks that threaten operational stability and customer satisfaction.

Read more...
Top five AIoT trends for 2025
Hikvision South Africa IoT & Automation AI & Data Analytics Facilities & Building Management
Hikvision highlights that with technological advances, AIoT (AI-powered Internet of Things) is transforming industries not just by enhancing security, but also by making the world smarter and more efficient.

Read more...
Threats, opportunities and the need for post-quantum cryptography
AI & Data Analytics Infrastructure
The opportunities offered by quantum computing are equalled by the threats this advanced computer science introduces. The evolution of quantum computing jeopardises the security of any data available in the digital space.

Read more...