How deep learning benefits the security industry

CCTV Handbook 2018 Editor's Choice, Surveillance

Data storage devices across the security industry are routinely required to handle an enormous amount and many layers of raw data. As safe city projects in varying sizes become more prevalent, the number of surveillance nodes has reached the hundreds of thousands. And due to the widespread use of high-definition monitoring, the amount of data involved in security surveillance has increased dramatically in a short time.

Efficient collection, analysis, and application of data and the intelligent use of it are becoming ever more critical in this industry. Thus, improving video intelligence appears to be an inevitable, industry-wide goal.

Security users hope that their investment in new products will bring even more benefits beyond simply tracing and tracking persons of interest and evidence collection after a security event. Some examples of added benefits include using the latest technologies to replace the large amount of man-power previously required for searching surveillance footage, detecting anomalous data, and finding ever more efficient ways to allow surveillance to shift from post-incident tracing to alerts during incidents – or even pre-incident alerts.

In order to satisfy these demands, new technologies are required. Intelligent video surveillance has been available for many years. However, the outcomes of its application have not been ideal. The emergence of deep learning has enabled these demands to become reality.

The insufficiency of traditional intelligent algorithms

Traditional intelligent video surveillance has especially strict requirements for a scene’s background. The accuracy of intelligent recognition and analysis in comparable scenarios remains inconsistent. This is primarily due to the fact that traditional intelligent video analysis algorithms still have many flaws.

In an intelligent recognition and analysis process, such as human facial recognition, two key steps are required: First, features are extracted, and second, ‘classification learning’ is performed.

The degree of accuracy in this first step directly determines the accuracy of the algorithm. In fact, most of the system’s calculation and testing workload is consumed in this part. The features in traditional intelligent algorithms are designed by humans and have always been heavily subjective. More abstract features – those that humans have difficulty comprehending or describing – are inevitably missed. With shifting angles and lighting, and especially when the sample size is enormous, many features can be too difficult to detect. Therefore, while traditional intelligent algorithms perform well in very specific environments, subtle changes (image quality, environment, etc.) yield significant challenges to accuracy.

The second step – classification learning – mainly involves target detection and attribute recognition. As the number of available categories for classification rises, so does the difficulty level. Hence, traditional intelligent analysis technologies are highly accurate in vehicle analysis but not in human and object analysis. For example, in vehicle detection, a distinction is made between a vehicle and a non-vehicle, so the classification is simple and the level of difficulty is low. To recognise vehicle attributes requires recognition of different vehicle designs, logos, and so on.

However, there are relatively few of these, making the classification results generally accurate. On the other hand, if recognition is to be performed on human faces, each person is a classification of its own, and the corresponding categories will be extremely numerous – naturally leading to a very high level of difficulty.

Traditional intelligent algorithms generally use shallow learning models to handle situations with large amounts of data in complex classifications. The analysis results are far from ideal. Furthermore, these results directly restrict the breadth and depth of intelligent applications and further development. Hence the need for increasing the ‘depth’ of intelligence in big data for the security industry is arising.

The advantages of deep learning and its algorithms

Traditional intelligent algorithms are designed by humans. Whether or not they are designed well depends greatly on experience and even luck, and this process requires a lot of time. So, is it even possible to get machines to automatically learn some of the features? Yes it is. This is actually the objective of artificial intelligence (AI).

The inspiration for deep learning comes from a human brain’s neural networks. Our brains can be seen as a very complex deep learning model. Brain neural networks are comprised of billions of interconnected neurons; deep learning simulates this structure. These multi-layer networks can collect information and perform corresponding actions. They also possess the ability for object abstraction and recreation.

Deep learning is intrinsically different from other algorithms. The way it solves the insufficiencies of traditional algorithms is encompassed in the following aspects.

First, from ‘shallow’ to ‘deep’

The algorithmic model for deep learning has a much deeper structure than the two 3-layered structures of traditional algorithms. Sometimes, the number of layers can reach over a hundred, enabling it to process large amounts of data in complex classifications. Deep learning is very similar to the human learning process, and has a layer-by-layer feature-abstraction process. Each layer will have different ‘weighting,’ and this weighting reflects on what was learned about the images ‘components’.

The higher the layer level, the more specific the components. Simulating the human brain, an original signal in deep learning passes through layers of processing; next, it takes a partial understanding (shallow) to an overall abstraction (deep) where we can perceive the object.

Second, from ‘artificial features’ to ‘feature learning’

Deep learning does not require manual intervention but relies on a computer to extract features by itself. This way it is able to extract as many features from the target as possible, including abstract features that are difficult or impossible to describe. The more features there are, the more accurate the recognition and classification will be. Some of the most direct benefits that deep learning algorithms can bring include achieving comparable or even better-than-human pattern recognition accuracy, strong anti-interference capabilities, and the ability to classify and recognise thousands of features.

Key factors of deep learning

In total, there are three main reasons why deep learning only became popular in recent years and not earlier: the scale of data involved, computing power, and network architecture.

Improvements in data-driven algorithm performance have accelerated deep learning in various intelligent applications in a short space of time. Specifically, with the increase in data scale, algorithmic performance improved as well. Accordingly, user experience has improved and more users are involved, further facilitating a larger scale of data.

Video surveillance data makes up 60% of big data, and the amount is rising at 20% annually. The speed and scale of this achievement is due to the popularisation of high-definition video surveillance – HD 1080p is becoming more common, and 4K and higher resolutions are gradually being applied in many important applications.

Hikvision has operated in the security industry for many years with its own research and development capabilities, employing large amounts of real video and image data as training samples. With a large amount of good quality data, and over a hundred team members to label the video images, sample data with millions of categories have been accumulated. With this large amount of quality training data, human, vehicle, and object pattern recognition models will become more and more accurate for video surveillance use.

Furthermore, high performance hardware platforms enable higher computational power. The deep learning model requires a large amount of samples, making a large amount of calculations inevitable. In the past, hardware devices were incapable of processing complex deep learning models with over a hundred layers.

In 2011, Google’s DeepMind used 1000 devices with 16 000 CPUs to simulate a neural network with approximately 1 billion neurons. Today, only a few GPUs are required to achieve the same sort of computational power with even faster iteration. The rapid development of GPUs, supercomputers, cloud computing, and other high performance hardware platforms has allowed deep learning to become possible.

Finally, the network architecture plays its own role in advancing deep learning. Through constant optimisation of deep learning algorithms, better target-object recognition can be achieved. For more complex applications such as facial recognition or in scenarios with different lighting, angles, postures, expressions, accessories, resolutions, etc., network architecture will impact the accuracy of recognition, i.e., the more layers in deep learning algorithms, the better the performance.

In 2016, Hikvision achieved the number one position in the Scene Classification category at the ImageNet Large Scale Visual Recognition Challenge 2016. The team from Hikvision Research Institute used inception-style networks and not-so-deep residual networks that perform better in considerably less training time, according to Hikvision’s experiments for training and testing.

Furthermore, Hikvision’s Optical Character Recognition (OCR) Technology, based on deep learning and led by the company’s research institute, also won the first price in the ICDAR 2016 Robust Reading Competition. The Hikvision team substantially surpassed both strong domestic and foreign competitors in three word-recognition challenges, including born-digital images, focused scene text, and incidental scene text, demonstrating that the word recognition technology by Hikvision reached the world’s top level.

Application of deep learning products

In the past two years, deep learning technology has excelled in speech recognition, computer vision, voice translation, and much more. It has even surpassed human capabilities in the areas of facial verification and image classification; hence, it has been highly regarded in the field of video surveillance for the security industry.

In the application of intelligent video in target detection, tracking, and recognition, the rise of deep learning has had a profound influence. When applying those three functions, deep learning potentially touches upon every aspect of the security video surveillance industry: facial detection, vehicle detection, non-motor vehicle detection, facial recognition, vehicle brand recognition, pedestrian detection, human body feature detection, abnormal facial detection, crowd behaviour analysis, multiple target tracking, and so on.

These types of intelligent functions require a series of front-end surveillance cameras, back-end servers and other products that support deep learning algorithms. In small-scale applications, front-end cameras can directly operate structured human and vehicle feature extraction, and tens of thousands of human facial images can be stored within the front-end devices to implement direct facial comparison so as to reduce costs of communicating with a server. In large-scale applications, front-end cameras can work with back-end servers. Specifically, the structured video task is handled by front-end devices, reducing the workload for back-end devices, and this also improves the matching and searching efficiency of back-end servers.

Hikvision has introduced a series of products with deep learning technology, such as the DeepinView Series cameras which can accurately detect, recognise, and analyse human, vehicle, and object features and behaviour, and can be widely used in indoor and outdoor scenarios.

Another product worth mentioning is Hikvision’s DeepinMind Series of NVRs which incorporate advanced deep learning algorithms and imitate human thoughts and memory. The DeepinMind products feature an innovative NVR+GPU mode, retaining the advantages of traditional NVRs and additional structured video analysis functions, which together greatly improve the value of video.

Deep learning is the next level of AI development. It is beyond machine learning where supervised classification of features and patterns are set into algorithms. Deep learning incorporates unsupervised or ‘self-learning’ principles. Hikvision is developing this concept in its own analytics algorithms. Enhanced accuracy is the result of multi-layer learning and extensive data collection. Application of this algorithm into face recognition, vehicle recognition, human recognition, and other platforms will significantly advance the performance of analytics.

For more information contact Janis Roux, Hikvision South Africa, +27 (0)10 035 1172, support.africa@hikvision.com, www.hikvision.com



Credit(s)




Share this article:
Share via emailShare via LinkedInPrint this page



Further reading:

Genetec launches Cloudlink 2210
Genetec Infrastructure Surveillance
New cloud-managed appliance addresses the practical challenges when adopting a cloud-managed model at scale, including storage costs, support for devices that do not enable direct-to-cloud connectivity, and the need to maintain local operation during connectivity disruptions

Read more...
Enhancing control room operations
iFacts Security Services & Risk Management Surveillance
As South Africa faces complex and more advanced security challenges, the demand for advanced surveillance solutions, including CCTV and security control rooms, continues to surge, but what about the people in front of the screens?

Read more...
The AI goldrush has a credibility problem
Refraime Editor's Choice Surveillance AI & Data Analytics
The single most important question a surveillance buyer can ask is deceptively simple: “Was this system programmed or was it trained?” That question alone will reveal more about what you are evaluating than any feature list or marketing video.

Read more...
Crime behaviour insights more important than ever
Leaderware Editor's Choice Surveillance Training & Education AI & Data Analytics
Behavioural surveillance skills are as essential now as they have ever been, especially in situations where quick evaluation of context is needed. Training operators in behavioural recognition skills is a vital part of control room success.

Read more...
Security’s three defining forces for 2026
Milestone Systems AI & Data Analytics Surveillance IoT & Automation
As we move into 2026, several technology trends that were once mostly confined to research labs and conference keynotes are now becoming part of the daily reality of the security industry.

Read more...
Large-scale AI boosts manufacturing efficiency
Hikvision South Africa Surveillance Industrial (Industry) AI & Data Analytics
Video systems, once used mainly for security, are rapidly becoming one of the most valuable sources of operational data in factories and industrial parks, accelerating smart manufacturing process.

Read more...
From false alarm filtering to intelligent decision-making
DeepAlert AI & Data Analytics Surveillance
As AI continues to evolve, the most successful surveillance operations will be those that not only reduce nuisance alerts, but also derive meaningful business intelligence from video data.

Read more...
Proactive estate security in Cape Town
neaMetrics OneSpace Technologies Technews Publishing SMART Security Solutions Fang Fences & Guards ATG Digital Editor's Choice News & Events Integrated Solutions Infrastructure Residential Estate (Industry)
SMART Security Solutions started the year with our annual SMART Estate Security Conference in Cape Town on 26 February 2026. Held at Anna Beulah Farm, the conference saw a number of delegates enjoying the farm’s excellent cuisine, while listening to outstanding presenters.

Read more...
How AI video is reshaping real estate security
neaMetrics TRASSIR - neaMetrics Distribution Editor's Choice
Globally, property maintenance and facility operations spending is projected to grow to over US$145 billion by 2034, reflecting rising complexity, compliance pressures, and increased exposure to operational costs. AI systems can protect properties, automate access, and optimise building management.

Read more...
Open systems support hybrid surveillance
SMART Security Solutions Axis Communications SA neaMetrics Editor's Choice
Today, end users can select the most suitable surveillance solution for their needs, whether it is on-site, at the edge, or in the cloud; a hybrid approach combining different options is most effective depending on the scenario.

Read more...










While every effort has been made to ensure the accuracy of the information contained herein, the publisher and its agents cannot be held responsible for any errors contained, or any loss incurred as a result. Articles published do not necessarily reflect the views of the publishers. The editor reserves the right to alter or cut copy. Articles submitted are deemed to have been cleared for publication. Advertisements and company contact details are published as provided by the advertiser. Technews Publishing (Pty) Ltd cannot be held responsible for the accuracy or veracity of supplied material.




© Technews Publishing (Pty) Ltd. | All Rights Reserved.