Published: Apr 27, 2021

A few years ago, face mask identification was a largely theoretical face recognition task, considered a niche solution for cases when people occasionally wore face masks for various reasons. In Japan, China, Hong Kong and South Korea, for example, citizens sometimes wear face masks for protection from seasonal flu and air pollution. However, in the new business reality, face mask detection solutions are useful software with wide application.

Face-mask detection and facial recognition software in the pandemic reality

The pandemic has changed the entire world, along with business processes and personal habits. Face masks are now a compulsory attribute of corporate culture and an indicator of social responsibility. We have all witnessed why wearing face masks is vital and why we should wear them the right way: covering both nose and mouth. Despite all the controversy and anti-mask movements, imagine how much trouble one unprotected cough in a shopping mall may cause for thousands of people. Or think about the consequences that a maskless, symptom-free colleague, with whom you make small talk every day at the coffee point, may cause.

A single person not wearing a face mask, or wearing it the wrong way, threatens the wellbeing of every coworker in an office. Moreover, the absence of a face mask creates risks for business owners in both the short and the long term. A whole team, or even an entire floor, can fall ill or require two-week isolation at the same time. Also, depending on the sphere of operation and local regulations, some business owners may have to pay enormous penalties for employees and customers who do not wear masks or do not cover their nose and mouth. Legislation in many countries is stringent for public places and services, such as shopping malls, drug stores, hospitals, delivery, airports, transportation hubs, office buildings and others.

Compliance with mask-wearing rules is crucial for some industries and critical infrastructure. Some manufacturing businesses cannot stop production lines and have organized shift schedules in which workers are divided into shifts and isolated from each other to protect against possible contamination. Institutions of critical manufacturing and infrastructure, such as hospitals, emergency services, police, power stations and dams, have to follow face mask wearing rules to ensure continued operation. The farming sector, food production and delivery services are also strictly regulated in terms of personal protective equipment, including face masks.

Taking all these risks into account, many governments have applied measures to control face mask wearing. Cameras and surveillance are one of the approaches. Security departments monitor how employees and visitors wear their face masks in offices and shops, on streets and in public transport. However, manual monitoring is time- and resource-consuming, prone to human error and hard to scale. Government authorities and business owners are looking for automated solutions to control face mask wearing on a massive scale.
Let us illustrate the situation with face mask wearing in the USA.

At the beginning of 2021, face masks were required in almost all states in the USA.

USA mask

However, as strict restrictions always cause a reaction, obligatory face mask wearing has triggered controversy and opposition. People oppose mask wearing for multiple reasons, including limitation of freedom and religious constraints.

USA anti mask

Government institutions should care for public welfare and wellness on the one hand and preserve public liberties on the other. Not every country allows overall public surveillance and monitoring complemented with face recognition, as China does. Legal protection of sensitive data makes CCTV ineffective and inapplicable as a widely used tool for guaranteeing pandemic safety and correct mask wearing.

Digital automated solution and face-mask detection software, applied in public places with a high risk of contamination, may help increase protection by monitoring and controlling the presence of face masks and how people wear face masks automatically. For example, integrated with security and access systems, face mask detection software may not open the doors of the shopping center, hospital, office center or corporate building for a visitor without a mask. Or, a software solution may send an alert to a security guard of a school, assign a ticket to a high-level manager at stores, charge a penalty to an employee at open space or take other actions.

Why the idea of face mask detection software implementation appeared

In the new pandemic reality, besides meeting governmental requirements, face mask detection algorithms find applications in software solutions that are also capable of solving multiple business tasks:

  • avoid mass infection of employees and the closure of the enterprise for quarantine
  • avoid penalties and sanctions from the government for non-compliance with the prescribed norms
  • save resources on hiring employees for monitoring and controlling face masks wearing
  • share and promote an idea of the importance of wearing a face mask and demonstrate the effects of non-compliance with the rules.

We have decided to address new business reality demands and enhance our face recognition system integrated with an access system by expanding its functionality with face mask detection.

Our office access system uses ID cards and face recognition algorithms to identify employees and guests entering the premises. Adding a face mask detection feature helps a business owner protect a corporate building and monitor compliance with pandemic restrictions. With face mask detection functionality, an office keeper can configure and control access restrictions and take action against face mask rule violations.

In this article, we would like to share how we have implemented face mask detection step-by-step, what challenges we have faced in the process and what further advances we have planned to develop to widen its functionality and improve its performance.

Where and how to apply face mask detection software

Depending on its configuration, our face mask detection software can be applied both in companies that recognize their employees (by face or entry card) to provide personal access to premises and in public spaces where only violations of face mask wearing rules need to be identified.

Enterprises, corporate buildings, office centers, government institutions and other organizations that use an access system to limit or restrict access and have a database of employees’ photos, ID or entry cards may integrate our face recognition and mask detection software with their access and security systems. In shopping centers, stores, hospitals, schools, etc., where business owners focus on monitoring human flows for security, safety and protection purposes, our face mask detection solution may help enforce compliance with face mask wearing norms and rules.

A company can place cameras at access points to transmit data to the face mask detection software, which checks in a split second whether a mask is present and whether it is worn correctly.

Employees of enterprises or offices often use a personal ID card to access various premises. Our face mask detection software is easily integrated with an access system and facial recognition software to provide wider functionality.

If the system recognizes a person in a face mask not properly covering nose and mouth, it can also identify an employee by an ID or entry card and even a visible part of the face. In case you are interested in finding out more information about facial recognition and its implementation in various systems, you’re welcome to read articles in the AI, Data Science and Neural Networks category on our blog.

Our face mask detection system provides a possibility to identify uncovered faces, faces correctly covered with a mask and faces with incorrectly worn masks.

When the face mask detection software identifies a wrongly worn or missing face mask, it can take action. Various options are possible. The software can send a notification, alert or signal to an appointed employee, send an automatic warning to or charge a fine from the employee who violated the mask norms, or even deny access to assets or premises.
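
The mapping from the three detection outcomes to follow-up actions can be pictured with a minimal Python sketch. The class `MaskState`, the function `decide_actions` and the action names are hypothetical and chosen for illustration only; they are not the interface of the software described here.

```python
from enum import Enum


class MaskState(Enum):
    """The three outcomes the detector distinguishes."""
    NO_MASK = "no_mask"
    CORRECT = "mask_correct"
    INCORRECT = "mask_incorrect"


def decide_actions(state, employee_id=None):
    """Return a list of actions for one detection (hypothetical policy)."""
    if state is MaskState.CORRECT:
        return ["grant_access"]
    # Missing and incorrectly worn masks are treated alike in this sketch.
    actions = ["deny_access", "notify_security"]
    if employee_id is not None:
        # Only an identified employee can receive a personal warning.
        actions.append(f"warn_employee:{employee_id}")
    return actions


print(decide_actions(MaskState.INCORRECT, employee_id="E-1024"))
```

A real deployment would read the policy from configuration instead of hard-coding it, so that each customer can choose which actions follow which violation.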

The next section of this article provides a step-by-step guide to implementing face mask detection software. It contains technical details and specific knowledge and may be outside the interests of a broad audience.

How to develop a face mask detection software

1. Research of the ready-to-use dataset

We have created a new model for our face mask detection software (the term “model” in this article means a convolutional neural network) using experience gained from developing our office access system. The first step in training a model is finding a dataset and preparing it for the specific task: face mask detection in our case. Machine learning engineers can approach dataset selection differently: generate a new custom dataset from scratch or take a freely licensed one. We have chosen a ready-to-use dataset because it takes less effort and resources and ensures a quick start. As the project progresses, we will create a tailored dataset and experiment with several more ready-made datasets.

Faces with and without masks

After researching and comparing the available options that match our requirements, we have used the Masked Face Detection In the Wild dataset for our face mask detection solution. This dataset contains photos of human faces with and without face masks. After processing the dataset to select photos that fit our purposes and quality demands, we have extracted 1,811 images of faces wearing masks and 286 photos of faces without a mask. To balance the classes, our AI engineers have added 1,525 photos of unmasked faces, bringing both classes to 1,811 images.
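
The balancing arithmetic can be checked in a few lines of Python. The function name `images_needed_to_balance` is a hypothetical helper; only the three counts come from the text above.

```python
# Counts taken from the article.
masked = 1811
unmasked = 286
added_unmasked = 1525


def images_needed_to_balance(majority, minority):
    """How many minority-class images must be added to match the majority."""
    return max(majority - minority, 0)


needed = images_needed_to_balance(masked, unmasked)
print(needed)                               # 1525
print(unmasked + added_unmasked == masked)  # True
```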

The face photos were cropped and aligned to 112x112 px with the help of the RetinaFace-R50 detector. This tool detects faces by outputting face bounding boxes and five facial landmarks: the two eyes, the nose and the two mouth corners.
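
Alignment from five landmarks is usually done by fitting a similarity transform (scale, rotation, translation) that moves the detected points onto a fixed template. The sketch below shows only that fitting step in pure Python, using complex numbers for the 2-D least-squares solution. The coordinates in `TEMPLATE_112` are a reference layout commonly seen in open-source face recognition code for 112x112 crops; the article does not say which template was used, so treat them, and the function names, as assumptions. Warping the image itself is left to an image library.

```python
import cmath
import math

# Assumed 112x112 reference layout: left eye, right eye, nose,
# left mouth corner, right mouth corner.
TEMPLATE_112 = [
    (38.2946, 51.6963),
    (73.5318, 51.5014),
    (56.0252, 71.7366),
    (41.5493, 92.3655),
    (70.7299, 92.2041),
]


def fit_similarity(src_points, dst_points):
    """Least-squares fit of complex a, b such that dst ~= a * src + b.

    abs(a) is the scale and cmath.phase(a) the rotation angle.
    """
    src = [complex(x, y) for x, y in src_points]
    dst = [complex(x, y) for x, y in dst_points]
    n = len(src)
    src_mean = sum(src) / n
    dst_mean = sum(dst) / n
    num = sum((s - src_mean).conjugate() * (d - dst_mean)
              for s, d in zip(src, dst))
    den = sum(abs(s - src_mean) ** 2 for s in src)
    a = num / den
    b = dst_mean - a * src_mean
    return a, b


def apply_similarity(a, b, point):
    """Map one (x, y) point with the fitted transform."""
    z = a * complex(*point) + b
    return (z.real, z.imag)


# Made-up landmarks of a larger, slightly tilted face in a source photo.
detected = [(210.0, 180.0), (290.0, 172.0), (252.0, 225.0),
            (222.0, 270.0), (288.0, 263.0)]
a, b = fit_similarity(detected, TEMPLATE_112)
print(f"scale={abs(a):.3f}, rotation={math.degrees(cmath.phase(a)):.1f} deg")
```

The complex-number form can only represent rotation, uniform scale and translation, so the fit cannot produce a mirrored face.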

2. Transfer learning: training of the top layers and fine-tuning

Having prepared the dataset for training, we have used a transfer learning approach to speed up the training of a model for face mask detection. We have chosen MobileNet-v1-0.25 as the backbone, with an input size of 112x112 px, and used a model pre-trained on the ImageNet dataset. After removing the fully connected layers at the top of the network, we have added new fully connected layers that classify photos as masked / not-masked.

At the first step of transfer learning, we’ve frozen the convolutional backbone and trained the top of the network (fully connected layers for classifying). As a result, we have gained a face mask detecting network with a frozen backbone that is easy and fast to train and converge. We’ve achieved a fast convergence with a high accuracy level on validation and test sets — 90+% accuracy.

After training the classification top layers, we have proceeded with the second and final step of the transfer learning process: fine-tuning the model. Fine-tuning helps improve the performance and accuracy of the face mask detection software. To fine-tune the model, we’ve unfrozen and trained all layers, including the backbone. To get the best result, we used the cyclic learning rate technique. Readers can find more information about the cyclic learning rate technique on Medium and GitHub. It varies the learning rate during training, which helps to avoid getting stuck in local minima, and it also increases the model's accuracy and speeds up convergence.

Picture Cyclic learning rate
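
As a rough illustration, the basic "triangular" cyclic schedule can be written in a few lines of Python. The article does not state which variant or which bounds were used, so `base_lr`, `max_lr` and `step_size` below are assumed values for demonstration.

```python
import math


def triangular_clr(iteration, base_lr, max_lr, step_size):
    """Triangular cyclic learning rate.

    The rate climbs linearly from base_lr to max_lr over step_size
    iterations, then falls back over the next step_size iterations.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# Assumed bounds and half-cycle length, for illustration only.
for it in (0, 50, 100, 150, 200):
    print(it, round(triangular_clr(it, 1e-4, 1e-2, 100), 5))
```

In a training loop, the returned value would be assigned to the optimizer's learning rate before every batch.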

Considering that we have achieved fast convergence and high accuracy with the entire backbone, we've concluded that this convolutional neural network's capacity is very high and exceeds the needs of the task we intend to solve. So, we have replaced the face mask detection model with a smaller one of a more appropriate size, reducing the number of backbone layers by cutting off the extra upper blocks.

After testing the implemented face mask detection network on mobile devices, we have found that the network successfully classifies the presence and absence of a mask.

In some cases, the first version of the face mask detection model classifies incorrectly. For example, when a person covers the lower part of the face with a hand or another object (not a face mask), the model wrongly decides that a face mask is present. And vice versa, if a person is wearing a face mask with a specific print, especially one resembling a photo of a face, the model also makes a mistake and classifies the person as not wearing a face mask. However, such cases are uncommon and may even be called fraud attempts.

On mid-priced mobile devices, inference takes 6-33 ms (160-30 FPS) on a CPU with 4 threads and 6-11 ms (170-90 FPS) on a GPU.
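
The relation between per-frame latency and frame rate is simply FPS = 1000 / latency in milliseconds. A short sketch (the function name is hypothetical) shows that the FPS figures quoted above are rounded values of this conversion.

```python
def fps_from_latency_ms(latency_ms):
    """Frames per second achievable when one frame takes latency_ms."""
    return 1000.0 / latency_ms


# Latency bounds quoted in the article.
for label, ms in [("fastest", 6), ("CPU slowest", 33), ("GPU slowest", 11)]:
    print(f"{label}: {ms} ms -> {fps_from_latency_ms(ms):.0f} FPS")
```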

3. Enhancing the face mask detecting solution with new datasets

Analyzing errors and possibilities of fraud discovered in the testing process, we have assumed that the first version of the implemented face mask detection model classifies faces as masked and unmasked by indirect factors or non-valid hallmarks. To fix this, we have used more complex datasets to train a model. First of all, we have tried the synthetic dataset — MaskedFace-Net, which contains not only photos of faces with and without masks but also a set of photos of faces with three options of incorrectly worn face masks.

To improve classification, we’ve trained the MobileNet-v1-0.25 model again, using the same method described in the previous steps: dataset preparation (face alignment) and transfer learning, including training of the top layers and fine-tuning. With this dataset, the model’s accuracy after the first step was 89.9%. Fine-tuning in the second step raised it to 99.5%. As a result, the face mask detection model quickly converges to 99.5% accuracy on the validation and test sets.

However, after examining the model's operation on real rather than synthesized data, we have concluded that this face mask detection model does not cope with the task as needed. A possible reason is the poor quality of the synthetic data, as a single type of face mask, and the same image of it, was used to create all synthetic images in the dataset. Therefore, we’ve decided to create our own dataset for face mask detection software, combining the available face dataset from an access system with open-source solutions (facial bounding box and landmark detection with MediaPipe Face Mesh, adding different masks to a face with MaskTheFace, PRNet, etc.).

4. The synthesis of new datasets

Since the initially used synthetic dataset did not meet our face mask detection requirements, we’ve used the MaskTheFace code to generate a new one. Using this library, a software engineer can create a new dataset of photos of masked people with a wide variety of mask types, colors and patterns. The main purpose of creating a new dataset is to avoid overfitting, as happened with MaskedFace-Net, where a single image of a face mask is used.

To improve the quality of the newly created dataset, we’ve decided to change the way facial landmarks are found. Instead of the outdated dlib, we use the modern and more accurate MediaPipe Face Mesh.

We’ve synthesized a new dataset and used it to train the face mask detection model. The MaskTheFace library allows generating photos of faces with correctly worn masks only. That is why we’ve modified the library for synthesizing images with incorrectly worn masks. The usage of the synthesized dataset improves the quality of detection and operational parameters of the face mask detection model and widens its functionality.

After training on the generated dataset, we will fine-tune the model on real-life images to adapt the face mask detection model to real-life conditions. Unfortunately, we don’t have a large enough dataset to train a face mask detection model on real images only. The amount of real-life data is too small and includes only faces with and without a mask, and when trained on it, the model quickly overfits. Moreover, there is no ready-to-use dataset containing real-life images of the different ways to wear a face mask incorrectly.

Train a face mask recognition model on new custom datasets

At the start of the project, we did not find a free image dataset containing faces with incorrectly worn face masks. That is why we’ve decided to create a mixed dataset. Our new custom dataset includes almost all images of incorrectly worn face masks from the synthetic dataset, supplemented with synthesized images of faces correctly covered with face masks. As a result, we have 3 classes of images with 3,000 photos in each class.

Considering that our new custom dataset contains fewer images than the previously applied datasets, we’ve used the following techniques and methods.

  1. We’ve used the previous synthetic dataset at the first step of transfer learning — for training the face mask detection model’s top layers.
  2. At the second step of the transfer learning process, we’ve used the mixed dataset (synthetic + real-world data) and applied augmentation functions (the imgaug library) of adding Gaussian Noise, SaltAndPepper, Flip img (turn over left to right), translation and rotation.
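
To make the listed augmentations concrete, here is a framework-free Python sketch of four of them (left-right flip, Gaussian noise, salt-and-pepper noise and horizontal translation) on a grayscale image stored as a list of rows. It is a stand-in for illustration and does not use the imgaug API; the function names and parameter values are assumptions, and rotation is omitted for brevity.

```python
import random


def flip_lr(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise and clip to the 0..255 range."""
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in img]


def salt_and_pepper(img, amount, rng):
    """Replace roughly a fraction `amount` of pixels with 0 or 255."""
    return [[rng.choice((0, 255)) if rng.random() < amount else p
             for p in row] for row in img]


def translate_x(img, shift, fill=0):
    """Shift the image horizontally by `shift` pixels, padding with `fill`."""
    width = len(img[0])
    shift = max(-width, min(width, shift))
    if shift >= 0:
        return [[fill] * shift + row[:width - shift] for row in img]
    return [row[-shift:] + [fill] * (-shift) for row in img]


def augment(img, rng):
    """Apply a random combination of the augmentations above."""
    if rng.random() < 0.5:
        img = flip_lr(img)
    img = translate_x(img, rng.randint(-2, 2))
    img = add_gaussian_noise(img, sigma=5.0, rng=rng)
    return salt_and_pepper(img, amount=0.02, rng=rng)


rng = random.Random(42)
sample = [[(x * 20 + y * 10) % 256 for x in range(8)] for y in range(8)]
augmented = augment(sample, rng)
print(len(augmented), len(augmented[0]))
```

Each call produces a slightly different variant of the same face, which is what lets a small dataset go further during fine-tuning.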

The model size has been decreased to 9 blocks (out of 16) to reduce the number of calculations from 45.26 MFLOPs to 25.06 MFLOPs.

5. Lightweighting of a backbone

MobileNet-v1-0.25 and MobileNet-v2-0.35 have been tested and used as backbones for the current version of the face mask detection software. Since the chosen backbones learn very quickly, we consider it worthwhile to make them lighter by reducing the backbone size. MobileNet-v1-0.25 consists of 14 structural blocks and MobileNet-v2-0.35 consists of 16 blocks that differ in parameters. The blocks of a backbone are connected sequentially.

So, by removing the upper blocks, we can make the model lighter. Lightweighting reduces resource consumption (a smaller model file, less memory, fewer required computing resources) and increases the operation speed of the face mask detection model.
Let’s illustrate the impact of backbone lightweighting.

Weights number (params) and computational cost (FLOPs) for MobileNet-v1-0.25

Weights number and computational cost

The model’s performance in regards to a different number of blocks

The model’s performance
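
How the computational cost falls as upper blocks are removed can be estimated with a short Python sketch. It uses the standard MobileNet-v1 layout (one regular 3x3 convolution followed by 13 depthwise-separable blocks, 14 blocks in total) and counts only convolutional multiply-accumulates (MACs); batch normalization, activations and the classifier head are ignored. The output is therefore an approximation and is not expected to reproduce the measured figures in this article. The function and constant names are hypothetical.

```python
# (output channels at width multiplier 1.0, stride) for the 13
# depthwise-separable blocks of MobileNet-v1.
DW_BLOCKS = [
    (64, 1), (128, 2), (128, 1), (256, 2), (256, 1), (512, 2),
    (512, 1), (512, 1), (512, 1), (512, 1), (512, 1),
    (1024, 2), (1024, 1),
]


def cumulative_macs(input_size=112, alpha=0.25):
    """Cumulative conv MACs after each of the 14 blocks ('same' padding)."""
    def scaled(channels):
        return max(1, int(channels * alpha))

    size = -(-input_size // 2)        # first conv has stride 2 (ceil division)
    channels = scaled(32)
    total = size * size * 3 * 3 * 3 * channels   # 3x3 conv on an RGB input
    totals = [total]
    for out_channels, stride in DW_BLOCKS:
        size = -(-size // stride)
        out_channels = scaled(out_channels)
        depthwise = size * size * 3 * 3 * channels
        pointwise = size * size * channels * out_channels
        total += depthwise + pointwise
        totals.append(total)
        channels = out_channels
    return totals


totals = cumulative_macs(input_size=112, alpha=0.25)
for kept in (14, 11, 9):
    share = totals[kept - 1] / totals[-1]
    print(f"{kept} blocks: {totals[kept - 1] / 1e6:.2f} MMACs ({share:.0%} of full)")
```

One MAC is commonly counted as two floating-point operations, so multiplying these numbers by two gives an approximate FLOPs figure.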

6. Application of other backbones including MobileNet-v2 and MobileNet-v3

For the first version of face mask detection software, we used MobileNet-v1 as a backbone. Further, we’ve tried to apply MobileNet-v2 and MobileNet-v3, as we suppose the updated backbones can improve the accuracy and operation speed of the face mask detection model. And we’ve achieved better performance for our face mask detection model.

7. Backlog of features and further plans for face mask detection software improvements

  • Fraud detection and prevention

Our plans comprise developing strategies and activities to detect and prevent attempts to fool the face mask recognition system. We will experiment with different possible fraudulent acts, like covering a face with a hand or a sheet of paper in front of an entrance camera, or wearing face masks with photo prints of a human face, etc., that may lead to our implemented model's disorientation. Such experiments are intended to find algorithms, solutions and methods of fraud prevention and implement them into our face mask recognition software.

  • Generate the dataset using 3D data

MediaPipe Face Mesh can detect 3D facial landmarks. We can use such 3D facial landmarks to enhance the quality of generated images of faces with masks put on. Photos generated with the help of 3D facial landmarks can improve the performance of the face mask detection model. Since creating a dataset from 3D data is too resource-consuming, it remains a possible further improvement.

Final thoughts

Compliance with mask wearing norms is intended to protect public health. However, the control on keeping the rules may require too much effort for a business or a government. Machine learning algorithms and computer vision technology may help keep the social balance between protecting public liberties and ensuring public well-being and wellness. Face mask detection solutions are helpful for many purposes and applicable for many business tasks.

We’ve implemented face mask detection software to recognize the presence of a mask on a face and to identify whether the mask is worn correctly and covers the nose and mouth. In the short term, in the middle of the pandemic, face mask detection is a crucial digital transformation every business should implement:

  • to escape an outage, delays, shutdown or quarantine,
  • to avoid losses of employees and customers,
  • not to get penalty charges, fines and legal proceedings.

Business cases of face mask detection software application

Our face mask detection software is capable of fitting many industries and sectors.

IT companies engaged in software development, support and maintenance of software solutions, products or services. Almost 90% of IT companies’ premises are open spaces, which means close interaction of numerous employees sharing one office area. A single cough may cause a whole team to drop out of development and lead to delivery delays or contract termination.

Manufacturing and production of food, hygiene items, household items, etc. Such businesses also often use open-spaced production facilities with a large number of employees working along production lines. One infected employee may become a threat to a whole batch of goods and your customers that will use your products.

Service business (renting or selling apartments, repairing equipment, beauty salons, etc.). The service business model involves employees communicating with many clients daily. A business owner should care for both employees and clients and make sure they do not pose a danger of infection to each other. Failing this may lead to the closure of the business and the loss of clientele.

Small-sized retail companies (food and drug stores, clothing shops, equipment and appliance departments and so on). For this group of companies, violations of face mask wearing rules by customers are even more dangerous than non-compliance or errors by employees. Therefore, such businesses need to monitor face mask wearing and respond to violations immediately.

Large-sized retail premises (business centers, corporate buildings, office centers, coworking spaces, shopping centers and shopping malls). Flows of customers and employees are massive, and so are the protective measures needed to comply with public health regulations. Losses from closing such a business for even a single week are colossal. A company can use many approaches, actions and methods in any combination to control compliance with the face mask wearing norms. For a large-sized business, gathering, storing and analyzing the history of events and statistics is essential. Collected statistical data allows a business owner to analyze problems, find patterns, make data-driven decisions and take tailored actions for specific business tasks. For example, instead of closing the entire shopping center, the owner can restrict attendance by closing only specific departments.

How our face mask detection solution helps a business to monitor and control the correct wearing of face masks

The implemented face mask detection solution is intended for monitoring compliance with face mask rules. Our software lets a customer configure multiple features to control face mask wearing and set specific actions to be taken against violations. Let’s introduce some possible options.

Using a camera installed in front of the office building entrance, the software recognizes an employee either by an ID card or by face. If the face is not covered with a mask, for example, because the office entrance is right on the street, the face mask detection solution recommends that the person put a mask on. If the person does not cover their face immediately, or does it incorrectly, and enters the office space, this event is recorded by the camera as a violation of the mask wearing regulation by a specific employee. Depending on customer demands, further processing of such events can differ and is configured according to the customer’s business processes and tasks.

Often inside the office premises, there are many rooms with access restrictions. An employee uses an entry or ID card to enter such places. It is possible to install a camera at the entrance to restricted areas and combine an employee's identification with a check of mask wearing correctness. The face mask detection software may generate events of partial or complete violations by a specific employee.

Also, open office and industrial spaces may be equipped with cameras to monitor compliance with mask wearing norms. The system may process violation events with or without identifying a specific employee. It may recognize an employee whose face is not covered with a mask and generate an alert for that specific employee. Or the software may generate violation reports for a specific department or room so that appropriate measures can be introduced.

A business owner can configure white office zones where face masks are not required: a dining room, kitchen, coffee point, separated individual workplace. In these zones, the absence of a face mask will not be identified as a violation and an event will not be generated. However, the software solution may be used for other purposes, such as monitoring the allowed number of people. In this case, the system may issue an alert if the number of employees exceeds 10 at a small coffee point.
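
A zone configuration of this kind could be modelled as follows. This is a minimal Python sketch: the `Zone` class, its fields and the `zone_alerts` function are hypothetical names for illustration, and only the coffee-point limit of 10 people is taken from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Zone:
    name: str
    mask_required: bool = True        # False for "white" zones
    max_people: Optional[int] = None  # None means no occupancy limit


def zone_alerts(zone, people_count, unmasked_count):
    """Return alert messages for one snapshot of a zone."""
    alerts = []
    if zone.mask_required and unmasked_count > 0:
        alerts.append(f"{zone.name}: {unmasked_count} mask violation(s)")
    if zone.max_people is not None and people_count > zone.max_people:
        alerts.append(
            f"{zone.name}: occupancy {people_count} exceeds limit {zone.max_people}"
        )
    return alerts


coffee_point = Zone("coffee point", mask_required=False, max_people=10)
open_space = Zone("open space")

print(zone_alerts(coffee_point, people_count=12, unmasked_count=12))
print(zone_alerts(open_space, people_count=30, unmasked_count=2))
```

In the white zone, unmasked faces produce no alert, but the occupancy limit is still enforced.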

One more possible application of our face mask detection software is monitoring the behavior of customers, clients or guests with regard to face mask wearing norms. Usually, customers and guests do not use an ID card and cannot be recognized or identified. But the system can detect whether they cover their faces with masks and whether they do it right. We can set the software to locate where a violation takes place and how many people are involved. Alerts and notifications can be configured according to the business processes and needs of a particular user. For example, the system may issue a notification only once 5 unmasked customers are detected in an area that also holds 15 masked customers and 3 employees.

Events and actions as features of face mask detection software

We can tailor our face mask detection software to customer needs so that it generates several preset types of events. According to client needs, we can add and configure as many events as required to fit specific business processes. Usually, our face mask detection solution issues an event when it:

  • detects and recognizes an employee without a face mask
  • detects a recognized employee wearing a face mask incorrectly
  • detects an unrecognized person without a mask or wearing a face mask incorrectly
  • detects X number of people without a mask or wearing a face mask incorrectly in a Y location.
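
The event types above can be sketched in Python as follows. The `Detection` class, the `build_events` function and the message wording are hypothetical and only illustrate how per-person and per-location events might be derived from one batch of detections.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    location: str
    mask_state: str                    # "correct", "incorrect" or "none"
    employee_id: Optional[str] = None  # None for an unrecognized person


def build_events(detections, group_threshold):
    """Turn a batch of detections into violation event messages."""
    events = []
    violators_per_location = {}
    for d in detections:
        if d.mask_state == "correct":
            continue
        count = violators_per_location.get(d.location, 0)
        violators_per_location[d.location] = count + 1
        who = d.employee_id or "unrecognized person"
        kind = "no mask" if d.mask_state == "none" else "mask worn incorrectly"
        events.append(f"{who}: {kind} at {d.location}")
    # Group event: X or more violators in the same location Y.
    for location, count in violators_per_location.items():
        if count >= group_threshold:
            events.append(f"{count} people violating mask rules at {location}")
    return events


batch = [
    Detection("lobby", "none", employee_id="E-17"),
    Detection("lobby", "incorrect"),
    Detection("lobby", "correct", employee_id="E-42"),
    Detection("kitchen", "none"),
]
for event in build_events(batch, group_threshold=2):
    print(event)
```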

Collecting and analyzing data about any of the face mask rule violation events mentioned above may save a business. The software records the time, location, participants and other details of events, so a business owner will have enough information to make the right decisions. For instance, the system may issue an event in department A when, at a meeting from 9 a.m. till 10 a.m., 20 of the 25 employees were not wearing masks. Then, in the evening, one of them complains about their health and falls ill. Considering the events gathered by the face mask detection software, it is better to isolate the whole department, not only the sick person and the two colleagues they had communicated with directly.

Our face mask detection software may allow responding to violations of face mask rules. The listed are preset options, and the system is configurable, so we can tailor responses to fit your business demands. When an implemented software solution detects a person without a face mask or wearing a face mask incorrectly, it allows you to take the following actions:

  • control access to the premises, including denial and restriction
  • charge fines or penalties to employees
  • apply sanctions or preventive measures for a department, room or floor
  • send an alert to a security guard to come where a violation takes place
  • notify about recommendations to close or isolate certain rooms or premises based on face mask wearing statistics.

The implemented face mask detection system provides a wide choice of responses to face mask violation reports. Moreover, we can integrate, configure and tailor any complex software and hardware solution within your existing business environment.

Global face recognition market

SYTOSS is a software development company focusing on delivering complex integrated systems of enterprise-grade, driving digital transformation and implementing modern technologies, such as image recognition, facial recognition, face mask detection, emotion recognition, people counting solutions, and tailoring them to business-specific needs. If you are in need of a digital transformation and look for a mature IT partner to implement it, contact us, and we’ll discuss how we could help your business reach its goals.
