Video and audio analytics

CCTV Handbook 2019 CCTV, Surveillance & Remote Monitoring, Integrated Solutions

To protect people and assets from accidents, incidents and terror attacks, the number of video security cameras and systems being used is increasing at a high rate. However, the number of security operators has not increased at the same rate, resulting in an average of 20 to 50 cameras being monitored by each operator.

Viewing many monitors and cameras simultaneously prevents security personnel from focusing on core monitoring duties, which can lead to an increased probability of missing critical situations due to viewing fatigue. For these reasons, interest in intelligent audio and video analysis technology for overcoming such limitations and for efficient monitoring is expanding.

Intelligent audio and video analytics is a technology that analyses video and audio information and alerts the operator to any abnormal activity it detects. It is designed to help prevent accidental or intentional harmful actions and to minimise damage through prompt response.

Intelligent analytics can also be used during recording and search operations. Recorded events are tagged with the event type and associated metadata. An operator can simply search recorded video for specific event types to quickly locate an incident, saving valuable time.
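As a rough illustration of searching tagged recordings, the sketch below filters a list of recorded events by event type. The `RecordedEvent` record, its field names and the event type strings are hypothetical and are not taken from any Hanwha Techwin interface.

```python
from dataclasses import dataclass


@dataclass
class RecordedEvent:
    """Hypothetical record for one tagged event in a recording."""
    camera: str
    event_type: str       # e.g. "tamper", "line_crossing", "audio"
    start_seconds: float  # offset of the event within the recording


def find_events(events, event_type):
    """Return the events of the requested type, earliest first."""
    matches = [e for e in events if e.event_type == event_type]
    return sorted(matches, key=lambda e: e.start_seconds)


recording = [
    RecordedEvent("gate", "line_crossing", 912.0),
    RecordedEvent("lobby", "audio", 40.5),
    RecordedEvent("gate", "line_crossing", 130.0),
]
hits = find_events(recording, "line_crossing")
```

Here `hits` holds the two line crossing events in time order, so an operator could jump straight to each one instead of scrubbing through the whole recording.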

Hanwha Techwin is continuing to invest in the research and development of intelligent analysis technology, and this white paper is designed to provide information on the intelligent audio and video source analysis technology featured in Hanwha Techwin network cameras.

The following information is a summary of the various analysis techniques Hanwha Techwin provides. The full white paper can be found at *hanwha1.

Hanwha Techwin’s analysis technology

Tamper detection

Tamper detection is a technology which detects events that disturb normal monitoring, and it is a crucial capability which all monitoring systems must provide. After a sudden change, the camera may no longer be able to perform normal monitoring. If any of the following changes occur, check the device on site and implement suitable measures:

• Camera direction changed due to impact.

• Camera focus significantly impaired.

• Camera view lost because the camera is covered by an object or spray painted.

• Camera video lost due to intentional blockage.

In normal monitoring environments, small sudden or gradual lighting changes may be present, or the camera may be subject to repeated vibration caused by wind or by the installation location. Furthermore, an object may temporarily appear on the screen, or portions of the screen may change repeatedly.

Hanwha Techwin’s tampering technology effectively excludes such elements in normal monitoring environments and is designed to detect only significant events.
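One simple way to separate a significant event from everyday scene changes can be sketched as follows. This is not Hanwha Techwin's algorithm; it is a minimal illustration that assumes tampering shows up as a change covering most of the image and persisting over several frames. The function names and the thresholds (`pixel_delta`, `area_threshold`, `hold_frames`) are assumptions made for the example.

```python
def changed_fraction(reference, frame, pixel_delta=40):
    """Fraction of pixels whose grey level differs from the reference by more than pixel_delta."""
    changed = sum(1 for r, f in zip(reference, frame) if abs(r - f) > pixel_delta)
    return changed / len(reference)


def detect_tamper(reference, frames, area_threshold=0.8, hold_frames=3):
    """Return the index of the frame at which tampering is declared, or None.

    Tampering is declared only when most of the view has changed for
    hold_frames consecutive frames, so brief or partial changes are ignored.
    """
    streak = 0
    for index, frame in enumerate(frames):
        if changed_fraction(reference, frame) >= area_threshold:
            streak += 1
            if streak >= hold_frames:
                return index
        else:
            streak = 0
    return None


reference = [120] * 100                    # flat grey scene of 100 pixels
passing_object = [120] * 70 + [20] * 30    # 30% of the view changes briefly
covered = [5] * 100                        # lens covered: whole view goes dark

frames = [reference, passing_object, reference, covered, covered, covered]
tamper_index = detect_tamper(reference, frames)
```

The passing object changes too little of the view to count, while the covered lens is reported once it has persisted for three frames.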

Face detection

Face detection is a technology which identifies human faces from video images by identifying the key features of human faces. There are a variety of methods used to detect faces, including:

Template matching method: This method develops templates based on facial information extracted and registers the relationship in the system. Then it calculates the similarity between faces in video images and the templates.

Feature invariant approach: This face detection method utilises facial features which are less influenced by rotation, size and lighting changes. It combines information about eyes, noses and mouths to determine the presence of a person’s face.

Boosting approach: This method uses basic patterns of faces which are compared to a classifier containing facial feature information for determining an individual’s face.
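To make the similarity calculation in the template matching method more concrete, the sketch below compares an image patch with a template using normalised cross-correlation. This is one common similarity measure chosen for illustration; the source does not say which measure is actually used, and the patch values are invented.

```python
import math


def normalised_correlation(patch, template):
    """Similarity in [-1, 1] between two equally sized lists of grey levels."""
    mean_p = sum(patch) / len(patch)
    mean_t = sum(template) / len(template)
    dp = [p - mean_p for p in patch]
    dt = [t - mean_t for t in template]
    numerator = sum(a * b for a, b in zip(dp, dt))
    denominator = math.sqrt(sum(a * a for a in dp) * sum(b * b for b in dt))
    # A completely flat patch has no structure to compare, so report no similarity.
    return numerator / denominator if denominator else 0.0


template = [10, 200, 10, 200, 50, 200]
brighter = [v + 30 for v in template]    # same pattern under brighter lighting
inverted = [255 - v for v in template]   # opposite pattern

score_brighter = normalised_correlation(brighter, template)
score_inverted = normalised_correlation(inverted, template)
```

Because the mean is subtracted before comparison, a uniformly brighter copy of the template still scores close to 1, while the inverted pattern scores close to -1.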

Face detection requires a significant amount of video information, and Hanwha Techwin’s X Series vastly improves face detection performance compared to previous products by collecting more detailed video from user-designated areas. The Wisenet X series cameras require only 25 x 25 pixels to detect a face, compared to 90 x 90 or 45 x 45 for previous generation cameras. The improved detection requires about 3,24 times fewer pixels (45 x 45 versus 25 x 25) for 2-megapixel detection and about 13 times fewer pixels (90 x 90 versus 25 x 25) for 5-megapixel detection. Thus, face detection can work on wider scenes and in cases where the subject is farther away from the camera. Furthermore, the face detection function can detect up to 35 faces at one time.
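The pixel savings can be checked directly from the minimum face sizes quoted above. The helper name `pixel_ratio` is only for illustration.

```python
def pixel_ratio(old_side, new_side=25):
    """How many times more pixels an old_side x old_side face needs than a new_side x new_side face."""
    return (old_side * old_side) / (new_side * new_side)


ratio_45 = pixel_ratio(45)   # 2025 pixels versus 625 pixels
ratio_90 = pixel_ratio(90)   # 8100 pixels versus 625 pixels
```

The first ratio works out to 3.24 and the second to 12.96, which rounds to 13.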

IVA (Intelligent Video Analysis)

The system can be set to generate an event and take an action in a case where movement is detected or in a situation that satisfies defined event rules.

Common settings include a user-defined minimum and maximum object size. To avoid detection errors due to noise and extraneous movements, set a suitable minimum/maximum detection size for the installation environment. However, as identical movement from identical locations may be detected differently, be sure to include margins in the minimum/maximum size limitations. A sensitivity setting adjusts the threshold at which movement is detected. Exclusion zones are also available to restrict detection to the desired areas and prevent false positives.
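The effect of adding margins to the size limits can be sketched as follows. The function, the 20% default margin and the pixel sizes are assumptions for illustration, not camera settings documented in the source.

```python
def within_size_limits(width, height, min_size, max_size, margin=0.2):
    """True if a detected object's bounding box fits the min/max limits widened by margin.

    min_size and max_size are (width, height) tuples in pixels.
    """
    min_w, min_h = min_size
    max_w, max_h = max_size
    lower_w, lower_h = min_w * (1 - margin), min_h * (1 - margin)
    upper_w, upper_h = max_w * (1 + margin), max_h * (1 + margin)
    return lower_w <= width <= upper_w and lower_h <= height <= upper_h


person_min = (40, 100)   # smallest expected person, in pixels
person_max = (80, 200)   # largest expected person, in pixels

# The same person measured slightly smaller on another pass.
with_margin = within_size_limits(36, 90, person_min, person_max)
without_margin = within_size_limits(36, 90, person_min, person_max, margin=0.0)
# A small patch of noise.
noise = within_size_limits(5, 5, person_min, person_max)
```

With the margin, the slightly smaller measurement is still accepted, whereas strict limits would reject it. The noise patch is rejected either way.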

Virtual line crossing detection: Objects crossing a designated virtual line can be detected. The direction of detection can be specified. Configuration options include defining virtual lines and direction.

Enter/exit detection: Objects entering/exiting a designated virtual area can be detected.

Intrusion detection: Intrusion detection can trigger an event when movement is detected within a designated virtual area.

Appearing detection: Objects appearing in a designated virtual area and holding their position for more than the set observation time are detected.

Disappearing detection: Objects disappearing from a designated virtual area and remaining absent for more than the set observation time are detected.

Loitering detection: Objects loitering in a designated virtual area for more than the set observation time are detected. The camera looks for movements of similar patterns that are contained within the virtual area. Once these patterns are observed for a specified duration, then loitering detection is triggered.
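Two of the rules above lend themselves to a short sketch: direction-aware line crossing and a dwell-time check for a virtual area. The geometry test is a standard segment intersection check. The dwell-time function is a deliberate simplification that only measures how long an object stays inside the area, rather than the movement pattern analysis described for loitering. All function names and the "a_to_b" and "b_to_a" labels are hypothetical.

```python
def side(a, b, p):
    """Cross product: positive if p is on one side of the directed line a->b, negative on the other."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def crossing_direction(line_start, line_end, prev_pos, curr_pos):
    """Return "a_to_b", "b_to_a" or None for one movement step of a tracked object.

    Side A is the side where side() is positive, side B where it is negative.
    """
    before = side(line_start, line_end, prev_pos)
    after = side(line_start, line_end, curr_pos)
    if before * after >= 0:
        return None  # both positions on the same side, or touching the line
    # The step must pass between the line's end points, not beyond them.
    if side(prev_pos, curr_pos, line_start) * side(prev_pos, curr_pos, line_end) > 0:
        return None
    return "a_to_b" if before > 0 else "b_to_a"


def dwell_exceeded(samples, observation_time):
    """samples: (timestamp_seconds, inside_area) pairs in time order.

    True once the object has stayed inside the area for observation_time seconds.
    """
    entered_at = None
    for timestamp, inside in samples:
        if not inside:
            entered_at = None          # leaving the area resets the timer
        elif entered_at is None:
            entered_at = timestamp
        elif timestamp - entered_at >= observation_time:
            return True
    return False


line = ((0, 0), (0, 10))
step_across = crossing_direction(*line, (-5, 5), (5, 5))
step_back = crossing_direction(*line, (5, 5), (-5, 5))
step_past_end = crossing_direction(*line, (-5, 20), (5, 20))

track = [(0, True), (5, True), (10, False), (12, True), (20, True), (23, True)]
loitering = dwell_exceeded(track, observation_time=10)
```

A rule configured for one direction only would raise an event for `step_across` and ignore `step_back`. The step that passes beyond the end of the line is not a crossing at all.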

Audio detection

Audio detection is a technology that detects audio levels which exceed the user-defined levels. As audio levels are greater in abnormal situations than in normal situations, audio levels exceeding set levels are detected as being abnormal.

Through audio detection technology, the camera is able to detect abnormal situations and notify the operator via event signals, allowing the operator to take suitable measures.

Hanwha Techwin’s audio detection technology calculates the absolute level of actual audio signals collected using the microphone, then normalises the levels in steps of 1 to 100. It defines the normalised level as the audio size, and audio levels exceeding the set level are detected as an event. Note that the audio size used for this purpose does not correlate to specific decibel (dB) values.
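The normalise-then-threshold idea can be sketched as below. The source does not describe the actual mapping, so the linear scaling, the 16-bit full-scale value and the function names are assumptions for illustration only.

```python
def audio_level(samples, full_scale=32768):
    """Map the mean absolute amplitude of a block of samples to a level from 1 to 100.

    Assumes signed 16-bit PCM samples and a linear mapping.
    """
    mean_abs = sum(abs(s) for s in samples) / len(samples)
    level = round(100 * mean_abs / full_scale)
    return max(1, min(100, level))


def audio_event(samples, threshold_level):
    """True if the normalised level of this block exceeds the user-defined level."""
    return audio_level(samples) > threshold_level


quiet = [100, -120, 90, -110]
loud = [20000, -22000, 21000, -19000]

quiet_level = audio_level(quiet)
loud_level = audio_level(loud)
```

As in the description above, the resulting level is a relative scale and says nothing about decibel values.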

Audio source classification

Audio source classification is a technology that classifies the audio picked up by the camera. Since the audio detection technology previously discussed generates alarms based simply on audio size, it may generate events even under normal situations. To overcome such limitations, technologies to classify audio source types have been developed.

When the camera classifies the audio source type satisfying the criteria defined by the operator, it then notifies the operator via an event trigger, allowing a suitable response to be initiated.

Hanwha Techwin features an audio source database which supports the classification of screams, gunshots, explosions and crashing glass. The camera extracts the characteristics of the audio source collected using the camera’s internal or externally connected microphone and calculates its likelihood based on the predefined database. It selects the audio source with the highest likelihood and generates an event.

Hanwha Techwin’s audio source classification technology available in X Series cameras features three customisable settings for category, noise cancellation and detection level for optimum performance in a variety of installation environments.
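The final selection step, choosing the most likely source among the enabled categories and applying a detection level, might look like the sketch below. Feature extraction and the database comparison are not shown. The likelihood values, the category names and the function signature are invented for illustration.

```python
def classify_audio(likelihoods, enabled_categories, detection_level):
    """Return the enabled category with the highest likelihood, or None.

    likelihoods: mapping of category name to a score between 0 and 1.
    None is returned when no enabled category reaches detection_level.
    """
    candidates = {
        name: score
        for name, score in likelihoods.items()
        if name in enabled_categories
    }
    if not candidates:
        return None
    best = max(candidates, key=candidates.get)
    return best if candidates[best] >= detection_level else None


# Made-up scores for one block of audio.
scores = {"scream": 0.12, "gunshot": 0.81, "explosion": 0.30, "glass": 0.05}

all_enabled = classify_audio(scores, {"scream", "gunshot", "explosion", "glass"}, 0.5)
only_scream_glass = classify_audio(scores, {"scream", "glass"}, 0.5)
```

Restricting the enabled categories or raising the detection level is one way such settings could reduce unwanted events in a noisy installation.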

Image stabilisation

Image stabilisation is a technology that compensates for image shaking caused by vibrations from the environment to produce a stable image. In general, image stabilisation technology is classified as either a hardware method, which utilises the camera lens or image sensor to compensate for shaking, or DIS (Digital Image Stabilisation), which uses software analysis of shaking based on the image. As DIS compensates for shaking with software, it can reduce a product’s price by reducing the amount of hardware in the product.

Hanwha Techwin’s image stabilisation is based on the software compensation method, DIS. The company developed its gyroscope sensor-integrated DIS technology to reduce malfunctioning and improve DIS accuracy. Independently operating gyroscope (gyro) sensors collect camera shake information aside from the movement vector information collected through image analysis, reducing the probability of malfunctioning.
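How an independent gyro reading could guard against a wrong image-based estimate is sketched below. The source does not describe how the two measurements are combined, so the fallback rule, the tolerance and the function names are purely assumptions for illustration.

```python
def estimate_shake(image_vector, gyro_vector, tolerance=3.0):
    """Combine two (dx, dy) shake estimates, both in pixels.

    If the image-based estimate disagrees strongly with the gyro, it is
    assumed to be unreliable (for example, a large moving object filling
    the frame) and the gyro estimate is used alone.
    """
    dx_i, dy_i = image_vector
    dx_g, dy_g = gyro_vector
    disagreement = max(abs(dx_i - dx_g), abs(dy_i - dy_g))
    if disagreement > tolerance:
        return gyro_vector
    return ((dx_i + dx_g) / 2, (dy_i + dy_g) / 2)


def compensation_offset(shake):
    """Shift to apply to the output window: the opposite of the estimated shake."""
    return (-shake[0], -shake[1])


agreeing = estimate_shake((4.0, -2.0), (5.0, -1.0))
conflicting = estimate_shake((30.0, 0.0), (1.0, 0.5))
offset = compensation_offset(agreeing)
```

In the conflicting case the image analysis reports a large shift that the gyro does not confirm, so the sketch follows the gyro instead of shifting the image by 30 pixels.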


The intelligent audio and video analysis technology featured in Hanwha Techwin’s network cameras automatically notifies the operator when predefined situations are detected. With these technologies, operators can not only monitor all cameras 24/7, but also work more efficiently by easily confirming and assessing the circumstances of an event. A single individual can monitor many more cameras and monitors while missing fewer critical events. Furthermore, operators can review recorded video quickly by filtering or skipping to specific event types, which eases the burden of reviewing all video or events and increases operational efficiency.

For more information contact Jaco De Wet, Hanwha Techwin (formerly Samsung Techwin), +27 79 843 4051.
