Intelligent video analytics for CCTV

Issue 9 2020

Intelligent video analytics (IVA) is on my radar, having just finished audits of two residential estates. Both use thermal analytic technology on their perimeters, yet the two could not be more different in how this technology is applied and understood.

About IVA

In their most basic form, video analytics perform real-time monitoring: objects, their attributes, movement patterns and behaviour are detected and processed. The same analytics can also be used to go through historical data and mine it for forensic purposes.

Fixed-algorithm analytics and artificial intelligence (deep learning) analytics are similar in that both determine whether unwanted behaviour is occurring within the camera’s field of view. Facial recognition, the third common type of analytics, matches points on a face in real time against a sample of that face stored in a database.

However, IVA is also processor-intensive and so several approaches to this problem have been developed over time:

• IVA is processed on the camera itself (‘at the edge’).

• IVA is processed on the CCTV server, appliance, or NVR centrally.

• IVA is processed utilising third-party software installed either on the CCTV head-end or on a separate server.

• IVA is streamed to and processed offsite by a bureau (cloud service) providing IVA services.

Many manufacturers and system integrators nowadays take a hybrid approach to client solutions: processing on the cameras at the edge reduces bandwidth for real-time monitoring, while the resulting data is centralised at the head end for forensic analysis.

Edge analytics in focus

At the outset there was analogue CCTV, for which only very rudimentary analytics were sometimes implemented. Then came IP CCTV, which enabled video to be transmitted over network cable. IP cameras attached to IP networks could now analyse digital video using computing power and purpose-built algorithms: well-defined sequences of instructions designed to solve a particular class of problem (in this case, the intelligent analysis of CCTV footage).

Originally, the analytics took place at the server/appliance (or NVR) loaded with the relevant video management software, supplied either by the manufacturer of the cameras or by a third-party provider. This software became more sophisticated, adding sense and structure to the images the cameras were viewing, and producing alerts if what was being observed could potentially be classified as a threat.

But this came with challenges, given the large amount of processing power required to effectively analyse the footage.

So manufacturers turned their attention to analytics taking place on the camera itself. With these edge analytics, both the image and the metadata of the image are analysed by each camera, without video having to be sent across the network to the VMS; only the results of the analysis are sent. If a camera is fitted with motion detection, for example, it will only start sending images when and if motion is detected.
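
To illustrate the principle (and not any particular vendor’s implementation), the sketch below shows motion-gated event reporting in the abstract: frames are differenced locally and only a small metadata message is forwarded to the head end when enough pixels change. The thresholds, the EdgeEvent fields and the send_to_vms transport are all assumptions made for the example.

```python
import time
from dataclasses import dataclass, asdict
from typing import Iterable, Optional

import numpy as np

MOTION_THRESHOLD = 0.02   # fraction of changed pixels that counts as motion (assumed value)
PIXEL_DELTA = 25          # per-pixel change (0-255) regarded as significant (assumed value)

@dataclass
class EdgeEvent:
    """Metadata forwarded to the head end instead of raw video."""
    camera_id: str
    timestamp: float
    event_type: str
    changed_fraction: float

def changed_fraction(previous: np.ndarray, current: np.ndarray) -> float:
    """Fraction of pixels that changed noticeably between two greyscale frames."""
    delta = np.abs(current.astype(np.int16) - previous.astype(np.int16))
    return float((delta > PIXEL_DELTA).mean())

def send_to_vms(event: EdgeEvent) -> None:
    """Placeholder transport; a real camera would use ONVIF events, MQTT or similar."""
    print("event ->", asdict(event))

def run_edge_loop(camera_id: str, frames: Iterable[np.ndarray]) -> None:
    """Analyse frames locally and forward only event metadata when motion is seen."""
    previous: Optional[np.ndarray] = None
    for frame in frames:
        if previous is not None:
            fraction = changed_fraction(previous, frame)
            if fraction > MOTION_THRESHOLD:
                # Only this small message crosses the network; the video stays on the camera.
                send_to_vms(EdgeEvent(camera_id, time.time(), "motion", fraction))
        previous = frame
```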

Furthermore, cameras can now also record at the edge. Live viewing can be undertaken at relatively low resolution to conserve bandwidth, and high-resolution recordings can be made for forensic analysis and evidence. SD cards and the like take this a step further, enabling each camera to hold several terabytes of data.

The CCTV analytics world has been further shaped by artificial intelligence, particularly deep learning, which has taken analytics to an altogether new level. Deep learning means that the performance of the relevant algorithm is continually being improved: the camera ‘learns’ the environment and begins to be able to distinguish between what should and should not be the case for that environment.

This has not made cameras predictive quite yet, but it can significantly reduce nuisance alarms. In terms of processing, deep learning means that more analytic work must take place on the camera before a notification is sent to the video management software at the head end.

With my two residential estate audits in mind, the section to follow will briefly touch on detection principles and the effect on analytics of the physical installation.

The basics of IVA application

The Johnson criteria have, as their basic premise, the camera’s ability to discriminate between objects, and it has become common to refer to the following levels of differentiation using this metric:

• Detection, which means that an unidentifiable object has been detected.

• Classification, which means that the camera’s analytics can distinguish between an inanimate (vehicle) and an animate (person or animal) object.

• Recognition, which means that the object can be distinguished as a person specifically.

• Identification, which means that the identity of the person is easily distinguishable.

A camera’s detection capability is determined by a multitude of factors, but in practical terms, the primary consideration is the number of pixels associated with the object in question. This in turn is determined primarily by the focal length of the lens, the size of the sensor, and the resolution of the camera.

Now, as it relates to people, there is broad consensus among manufacturers that the following pixel counts are required for the various levels:

• Detection: 1.5 pixels.

• Classification: 6 pixels.

• Recognition: 12 pixels.

• Identification: 25 pixels.

Unfortunately, many manufacturers do not fully differentiate between recognition and identification. For this article, however, I am primarily interested in the detection of potential intruders, so the distinction between recognition and identification is not very important.

If we assume that 12 pixels is an appropriate pixel count for reliable recognition of a potential threat to a property, and assume a perimeter thermal camera with a 19 mm lens, this camera should be able to recognise a human at about 140 metres (or even slightly beyond). At that distance, however, the horizontal field of view is 29,5 metres across, and a human represents about 7% of that. Even though the camera can resolve a human, only a very vigilant control room operator, looking at a small number of screens, is likely to notice that human.
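
As a rough sanity check on figures like these, the simple pinhole relationship pixels ≈ (target size × focal length) / (distance × pixel pitch) can be used. The sketch below is an illustration under assumed sensor parameters (a hypothetical thermal core with a 17 µm pixel pitch); it is not a substitute for a manufacturer’s range tables.

```python
# Rough pinhole-model estimate of how many pixels an object occupies,
# and which Johnson-style level that pixel count supports.
PIXEL_LEVELS = {          # thresholds quoted above (pixels on target)
    "detection": 1.5,
    "classification": 6,
    "recognition": 12,
    "identification": 25,
}

def pixels_on_target(target_m: float, distance_m: float,
                     focal_mm: float, pixel_pitch_um: float) -> float:
    """Approximate pixels spanned by an object of size target_m at distance_m."""
    pitch_mm = pixel_pitch_um / 1000.0
    return (target_m * 1000.0 * focal_mm) / (distance_m * 1000.0 * pitch_mm)

def best_level(pixels: float) -> str:
    """Highest Johnson-style level supported by the given pixel count."""
    achieved = [name for name, needed in PIXEL_LEVELS.items() if pixels >= needed]
    return achieved[-1] if achieved else "none"

if __name__ == "__main__":
    # Assumed example: 19 mm lens, 17 um pitch, 1.8 m tall person at 140 m.
    px = pixels_on_target(target_m=1.8, distance_m=140, focal_mm=19, pixel_pitch_um=17)
    print(f"{px:.1f} pixels on target -> {best_level(px)}")   # roughly 14 pixels -> recognition
```

Under these assumed parameters a 1,8 m person at 140 m spans roughly 14 pixels, consistent with the recognition range of about 140 metres mentioned above; a different sensor resolution or pixel pitch will shift these numbers.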

So now we turn to using intelligent video analytics to assist human operators in surveying a scene. As part of this process, we need to consider that intrusion prevention is predicated on the Four Ds: Deter, Detect, Delay and Defend.

Deter, Detect, Delay and Defend

It is of paramount importance that any attempt at breaching the perimeter be detected as early and as reliably as possible. Detection should also reveal the nature and the exact location of the attempted breach. This matters for three reasons: knowing whether the attempt is real prevents ‘false response fatigue’; knowing its nature allows the right response to be chosen; and knowing its location tells responders exactly where to go.

The objective behind setting up video analytics detection is simple, but practical implementation can be difficult. The objective is to set up the analytics so that it is impossible for a human to approach the perimeter barrier from the outside without being detected, while at the same time ensuring that the analytics do not alarm on anything else.

For video analytics to work well, the area in which detection is required needs to be clear of obstacles, and the undulations of the area need to be such that no part of it falls below the camera’s direct line of sight (this principle applies to bends in the fence as well). The camera should not be required to provide accurate detection beyond its specified capability.

There is a difference between detecting an object and detecting an object which is recognisable as human. Cameras need to be positioned so that each covers the dead spot of the other, not in terms of view (how far the camera can see), but in terms of each camera’s detection capability. Analytics cannot reliably detect objects through fences, even wire fences or palisades that are easy to see through.

Detection without the ability to categorise the object will lead to a significant number of false alarms, which can easily overload an alarm stack and can also cause an unnecessary response or even system crashes.

Also, to meet the objective of the Four Ds, specifically the requirement for Delay, it follows that detection should be on the outer perimeter. Where this is not possible, it is important to rely on the barrier itself to provide adequate alarming, and for the corresponding camera to link to that alarm for visual confirmation. Again, this reduces nuisance alarms caused by the configuration of unnecessary detection rules, and streamlines the process.

A significant number of analytic algorithms (tasks) exist for various camera makes and models, ranging from object in field, to crossing line, to loitering, and so on. For most of these tasks, each thermal camera must be correctly calibrated for its position. This ‘teaches’ the camera about size and distance, which is then used to configure the analytics to trigger only for events that meet certain size criteria.

It is beyond the scope of this article to provide a detailed description of how the analytics should be optimally configured, but I will suggest several significant guidelines (a configuration sketch follows the list):

• Avoid the use of ‘Detect any object’.

• Use ‘Enter field’ rather than ‘Object in field’.

• Try to use directional ‘Crossing line’ (IVA flow) rather than just ‘Crossing line’.

• Use ‘Loitering’ with care and with suitable delays. A loitering condition should almost always be evaluated only after another alarm condition has triggered. This is the single most important point for avoiding copious nuisance alarms.
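
To make the guidelines concrete, here is a minimal sketch of how such rules might be expressed in a generic, vendor-neutral form. The rule names, fields and evaluation logic are assumptions made for illustration; real systems each have their own configuration interfaces and terminology.

```python
from dataclasses import dataclass

@dataclass
class DetectedObject:
    """Simplified analytics output for one tracked object (all fields assumed)."""
    height_px: float          # pixels on target, after camera calibration
    crossed_line: bool        # crossed the configured virtual line
    direction_deg: float      # direction of travel
    dwell_seconds: float      # time spent inside the zone

@dataclass
class CrossingLineRule:
    """Directional ‘crossing line’ rule with a minimum-size filter."""
    min_height_px: float = 12.0        # recognition threshold from the article
    direction_deg: float = 90.0        # inbound direction, towards the property (assumed)
    direction_tolerance: float = 45.0

    def triggers(self, obj: DetectedObject) -> bool:
        inbound = abs(obj.direction_deg - self.direction_deg) <= self.direction_tolerance
        return obj.crossed_line and inbound and obj.height_px >= self.min_height_px

@dataclass
class LoiteringRule:
    """Loitering evaluated only after a prior alarm, as recommended above."""
    min_dwell_seconds: float = 30.0

    def triggers(self, obj: DetectedObject, prior_alarm: bool) -> bool:
        return prior_alarm and obj.dwell_seconds >= self.min_dwell_seconds

# Example: a person-sized object crossing the line towards the fence raises an alarm;
# loitering is only considered once that alarm exists.
obj = DetectedObject(height_px=14, crossed_line=True, direction_deg=80, dwell_seconds=40)
crossing, loitering = CrossingLineRule(), LoiteringRule()
alarm = crossing.triggers(obj)
print("crossing alarm:", alarm)
print("loitering alarm:", loitering.triggers(obj, prior_alarm=alarm))
```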

In terms of the physical installation of each camera, it is important to note that on a CCTV camera with a 60 mm lens, a 0,05 cm (half a millimetre) movement of the camera around its axis will result in a relative 100 cm (one metre) shift of an object in the field of view at 300 metres. The implication for video analytics is obvious (a small sketch of this geometry follows the two points below). From the perspective of a camera on a pole, two things happen simultaneously in a strong wind:

• The first is that the camera’s direction is deflected by an amount directly correlated with the strength of the prevailing wind.

• The second is that harmonic vibration is likely to set in.
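
As a rough illustration of why this matters (a sketch under assumed values, not a reproduction of the exact figures above), the apparent shift of a distant, stationary object is simply the distance multiplied by the tangent of the deflection angle, and it can be converted into a pixel shift once the field of view and resolution are known.

```python
import math

def apparent_shift_m(deflection_deg: float, distance_m: float) -> float:
    """Lateral shift of a stationary object in the scene caused by the camera tilting."""
    return distance_m * math.tan(math.radians(deflection_deg))

def shift_in_pixels(shift_m: float, hfov_deg: float, distance_m: float, h_res_px: int) -> float:
    """Convert that shift into pixels, given the horizontal field of view and resolution."""
    scene_width_m = 2 * distance_m * math.tan(math.radians(hfov_deg) / 2)
    return h_res_px * shift_m / scene_width_m

# Assumed example: a 0.2 degree sway of the pole, object at 300 m,
# camera with a 6 degree horizontal field of view and 1920-pixel width.
shift = apparent_shift_m(0.2, 300)                    # roughly 1 metre
pixels = shift_in_pixels(shift, hfov_deg=6, distance_m=300, h_res_px=1920)
print(f"{shift:.2f} m shift, about {pixels:.0f} pixels of image movement")
```

Under these assumed values, even a fraction of a degree of sway moves the image by tens of pixels, which is more than enough to upset motion-based analytics.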


Lesley-Anne Kleyn.

Any pole, regardless of its construction material, will be affected to a greater or a lesser extent by the wind. This is often noticeable with the naked eye (think streetlamps in a strong wind). It is therefore important from the perspective of utilising video analytics that this effect is minimised.

In addition to the movement of the pole itself, the mounting bracket with which the camera is affixed to the pole also contributes to unwanted movement. Sway is usually noticed at the top of the pole (this is known as first-level vibration). Second-level vibration (Aeolian vibration) is caused by steady winds ranging from 2 to 15 metres per second (roughly 7 to 54 km/h) and produces frequencies of 2-20 Hz. This vibration is predominantly caused by vortices that form on the lee side of the pole as the steady stream of air passes across it.

The vortices originate from opposite sides of the pole and create alternating pressures that act at right angles to the direction of the airflow, causing a high-frequency, short-cycle harmonic reaction. While no pole is immune to these effects, tapered spun concrete poles perform better than steel, fibreglass and wood poles. Brackets need to be small and rigid and, where larger cameras are used, also damped.
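
For readers who want to estimate where their own poles sit in this band, the vortex-shedding frequency of a roughly cylindrical pole can be approximated with the standard Strouhal relation f ≈ St × U / D, with St ≈ 0,2 for circular sections. This is a general fluid-dynamics approximation rather than anything specified above, and the pole diameter in the sketch is an assumption.

```python
STROUHAL_NUMBER = 0.2  # typical value for a circular cylinder

def vortex_shedding_hz(wind_speed_ms: float, pole_diameter_m: float) -> float:
    """Approximate vortex-shedding (Aeolian) frequency for a cylindrical pole."""
    return STROUHAL_NUMBER * wind_speed_ms / pole_diameter_m

# Assumed 0.15 m diameter pole over the 2-15 m/s wind band mentioned above.
for wind in (2, 5, 10, 15):
    print(f"{wind:>2} m/s -> {vortex_shedding_hz(wind, 0.15):.1f} Hz")
```

With these assumptions the band comes out at roughly 3-20 Hz, consistent with the 2-20 Hz range quoted above.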

Kleyn Consulting is an independent risk, safety and physical security consultancy with experience in a range of verticals. Based in the Western Cape Winelands, Lesley-Anne travels across South Africa. Feel free to contact her on +27 64 410 8563 or at [email protected]



