AI-based video analysis holds the promise of a technological quantum leap, with significant customer benefits. But only if the competent (i.e. informed) user can appraise the technology correctly. This article is intended to set forth a few basic principles which will enable users to correctly evaluate its functionality, usability and the advantages for their own specific application.
Systems have learning difficulties too
Routines based on artificial intelligence (AI) have been growing increasingly widespread in video security technology for a long time. A steadily growing number of new applications and products rely on such algorithms to offer new analyses or make existing analyses significantly more reliable. The objective is to provide appreciable added value for users, and the results speak for themselves. To give just one example: not too long ago, a great deal of work was required before classic image processing was able to recognise a tree moving in the wind as a false alarm. Today AI does that effortlessly.
The essential point of distinction between image or video analyses with classic image processing and those which use AI is that algorithms are no longer ‘just’ programmed, they can also be ‘taught’ with the aid of large volumes of data. On the basis of this data, the system learns to detect patterns and accordingly recognise the difference between a tree and an intruder, for example. But the concept of machine learning also throws up new problems and challenges.
One well-known example of this is the difference in recognition quality across different ethnic groups, an issue which has even made headlines in the news. Yet the background is relatively simple: an AI system can only learn well if it is supplied with enough data, and if that data is sufficiently diverse and evenly distributed.
The quality of the AI system
All of this leads to the question of the performance capability of a system that uses artificial intelligence. What metrics allow a comparison between two routines, two different systems or two manufacturers, for example? What does it mean when a product brochure promises, for example, ‘95% detection accuracy’ or ‘reliable recognition’? How good is accuracy of 95%? And what is ‘reliable recognition’?
First of all, it is most important to understand how AI routines can be evaluated. The first step is the application- and customer-specific definition of what ‘incorrect’ and ‘correct’ mean, especially in borderline cases: For example, in a system set up to recognise persons, is a detection to be defined as correct if the image or video does not even show a real person, but instead just an advertisement depicting a person?
This and other parameters must be defined. Once these definitions have been established, a dataset is needed for which the correct results are known in advance. This dataset is then analysed by the AI to determine the percentages of correct and incorrect detections. Mathematics provides a wide variety of metrics for this, such as sensitivity (the percentage of expected detections which were actually detected, also known as recall) or hit accuracy (the percentage of detections that are actually correct, usually called precision). Ultimately, therefore, any statement about the quality of an AI is always a statistical statement about the evaluation dataset used.
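The two metrics named above can be computed from a labelled evaluation dataset in a few lines. The following is a minimal sketch, not any manufacturer's method: the function name `detection_metrics` and the example data are invented for illustration, and each sample is reduced to a simple yes/no decision.

```python
def detection_metrics(expected, detected):
    """Return (sensitivity, precision) for two parallel lists of booleans.

    expected[i] is True if sample i really contains the object (ground truth),
    detected[i] is True if the system reported a detection for sample i.
    """
    pairs = list(zip(expected, detected))
    true_pos = sum(1 for e, d in pairs if e and d)        # correct detections
    false_neg = sum(1 for e, d in pairs if e and not d)   # missed objects
    false_pos = sum(1 for e, d in pairs if not e and d)   # false alarms

    # Sensitivity (recall): share of expected detections that were found.
    sensitivity = (true_pos / (true_pos + false_neg)
                   if (true_pos + false_neg) else float("nan"))
    # Hit accuracy (precision): share of reported detections that are correct.
    precision = (true_pos / (true_pos + false_pos)
                 if (true_pos + false_pos) else float("nan"))
    return sensitivity, precision


# Invented example: 4 samples with a person, 4 without.
expected = [True, True, True, True, False, False, False, False]
detected = [True, True, True, False, True, False, False, False]
print(detection_metrics(expected, detected))  # (0.75, 0.75)
```

Both figures depend entirely on which samples are in the lists and on how 'correct' was defined beforehand, which is why a single percentage without this context says little.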
Summer or winter?
How useful this statement really is for the user or potential purchaser of a system depends on the composition of the dataset. An evaluation may attest to good recognition performance, but if the dataset consisted solely of image material from the summer months, the evaluation says nothing about the quality of the AI in winter, when light and weather conditions may be very different.
In general, it follows that statements about the quality of an AI analysis – particularly those quoting specific figures, such as ‘99.9%’ – are to be treated with caution if not all parameters are known. If the dataset, the metrics and the other parameters used are unknown, it is not possible to judge how representative the result is.
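One way to make the composition of a dataset visible is to break the result down by recording condition instead of reporting a single figure. The sketch below assumes each sample carries a condition tag such as 'summer' or 'winter'; the function name and all numbers are invented purely for illustration.

```python
from collections import defaultdict


def sensitivity_by_condition(samples):
    """Compute sensitivity separately for each recording condition.

    samples: iterable of (condition, expected, detected) tuples, where
    expected and detected are booleans as in a labelled evaluation set.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for condition, expected, detected in samples:
        if expected:                      # only real objects count here
            totals[condition] += 1
            if detected:
                hits[condition] += 1
    return {c: hits[c] / totals[c] for c in totals}


# Invented dataset: mostly summer footage, very little winter footage.
samples = (
    [("summer", True, True)] * 18 + [("summer", True, False)] * 2
    + [("winter", True, True)] * 1 + [("winter", True, False)] * 1
)
overall = sum(1 for _, e, d in samples if e and d) / len(samples)
print(round(overall, 3))                  # 0.864
print(sensitivity_by_condition(samples))  # {'summer': 0.9, 'winter': 0.5}
```

In this made-up case the overall figure looks respectable because summer footage dominates, while the winter result rests on only two samples and is much weaker.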
Exact specifications do not exist
Every system has its limits, and of course this is also true of AI systems. Knowing these limits is therefore the fundamental prerequisite for making sound decisions. But here too, statistics and reality collide, as the following example illustrates. Logically, the smaller an object is in the image or video, the less well an AI system is able to recognise it. So the first question users ask before buying a system relates to the maximum distance at which objects can be detected, since this has implications for the number of cameras needed and thus also for the cost of the total system.
However, it is impossible to specify an exact distance. There is simply no value up to which the analysis delivers 100% correct results, nor another value beyond which recognition is entirely impossible. An evaluation can only return statistics, for example detection accuracy as a function of object size.
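Such a statistic can be sketched by grouping evaluation samples into size classes and computing the detection rate per class. The example below is an assumption-laden illustration: object size is expressed as height in pixels, and the bin edges, function name and observations are all invented.

```python
def detection_rate_by_size(observations, bin_edges):
    """Detection rate per object-size bin.

    observations: (object_height_px, detected) pairs for real objects.
    bin_edges: ascending edges; bin i covers [bin_edges[i], bin_edges[i+1]).
    Returns a list of (lower_edge, upper_edge, rate); rate is None for
    bins that contain no observations.
    """
    counts = [[0, 0] for _ in range(len(bin_edges) - 1)]  # [detected, total]
    for height, detected in observations:
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= height < bin_edges[i + 1]:
                counts[i][1] += 1
                counts[i][0] += int(detected)
                break
    return [
        (bin_edges[i], bin_edges[i + 1], hit / total if total else None)
        for i, (hit, total) in enumerate(counts)
    ]


# Invented observations: small objects are missed more often.
observations = [
    (12, False), (15, False), (18, True),
    (25, True), (30, False), (35, True), (38, True),
    (60, True), (75, True), (90, True),
]
for low, high, rate in detection_rate_by_size(observations, [10, 20, 40, 100]):
    print(f"{low}-{high} px: {rate}")
```

The output is a curve rather than a threshold: the rate rises with object size, but there is no single size at which it jumps from zero to one.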
Best compare directly
Regarding the system limits, it is conventional practice, for example in product datasheets, to describe them using specific minimum and maximum values wherever possible. These include the minimum distance or a minimum resolution. This is sensible, because customers and installers need points of reference to enable them to rate the system.
Even so, many unknowns remain, for example whether the manufacturer was conservative or optimistic when specifying these limit values. Users are therefore well advised always to bear in mind that there are no sharply defined limits in video analysis.
For all systems, errors will inevitably occur even within the specified parameters, while useful results may still be returned outside the limits under favourable conditions.
Users who wish to find out the true quality of an AI-based analysis can really only do so by carrying out a direct comparison, because the figures and parameters quoted by the various manufacturers differ too widely. For such a comparison, of course, the boundary conditions and the input must be identical for all systems.
The best option for this is a live test with demo products, rented equipment or the like. This also reveals the performance of the system in the exact use case required. Incidentally, this is also the yardstick for evaluating the performance of AI systems generally: it all depends on the specific use case, which should be specified as precisely as possible. Only then is it possible to generate true added value for the customer with the right solution.
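The requirement that all systems receive identical input can be expressed as a small test harness. The following is only a sketch under stated assumptions: the two 'systems' are hypothetical stand-in functions that threshold an invented score, not real products, and the dataset is made up.

```python
def compare_systems(systems, dataset):
    """Run every system on the same labelled dataset and report metrics.

    systems: dict mapping a name to a callable(frame) -> bool.
    dataset: list of (frame, expected) pairs shared by all systems.
    """
    results = {}
    for name, detect in systems.items():
        tp = fp = fn = 0
        for frame, expected in dataset:
            detected = detect(frame)
            if detected and expected:
                tp += 1          # correct detection
            elif detected and not expected:
                fp += 1          # false alarm
            elif expected:
                fn += 1          # missed object
        results[name] = {
            "sensitivity": tp / (tp + fn) if tp + fn else None,
            "precision": tp / (tp + fp) if tp + fp else None,
        }
    return results


# Hypothetical stand-ins: each "system" thresholds an invented score.
systems = {
    "A": lambda frame: frame["score"] >= 0.5,
    "B": lambda frame: frame["score"] >= 0.8,
}
dataset = [
    ({"score": 0.9}, True),
    ({"score": 0.6}, True),
    ({"score": 0.7}, False),
    ({"score": 0.2}, False),
]
for name, metrics in compare_systems(systems, dataset).items():
    print(name, metrics)
```

In this invented case system A finds every object but raises a false alarm, while system B raises none but misses an object. Which of the two is 'better' depends on the specific use case.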
© Technews Publishing (Pty) Ltd | All Rights Reserved