Abstract

In order to improve the accuracy and efficiency of human motion perception, a multisensor intelligent fall perception algorithm considering the precise classification of human behavior characteristics is proposed. Multisensor devices (smart watches, smart phones) collect data such as acceleration and heart rate of the human body to obtain human behavior data. On the basis of human behavior data collection, the acceleration characteristics of a falling state are extracted, and the SVM method is used to classify human behavior characteristics. Cuckoo search is used to optimize the width of the SVM kernel and improve the accuracy of human behavior recognition. Finally, based on the behavior recognition results, the intelligent perception of human falling behavior is realized through the exercise preparation potential. The experimental results show that the perceptual accuracy of this method is high, which has reached 90%, and the perception efficiency is higher. The minimum perception time is only 0.56 s, which fully verifies the effectiveness of this method. It can be widely used in human-computer interaction, machine vision, and other fields.

1. Introduction

With the development of computing technology, the machine-centered computing model is giving way to a human-centered computing model. The future development direction is to make people a part of computing, promote the integration of the physical world and the information world, and realize high-level human-computer interaction [1, 2]. Accurate perception and understanding of human behavior are essential technical supports for this goal. As an important branch of image understanding, human behavior perception has broad application prospects in computer vision fields such as video surveillance, human-computer interaction, and virtual reality. Estimating, recognizing, and perceiving human behavior from its characteristics has been both a research hotspot and a persistent difficulty in computer vision in recent years [3, 4].

Yang et al. [5] proposed a method of human fall perception based on SECNN (squeeze and excitation cellular neural network), built a data acquisition environment, used a wireless radio frequency tomography network node as the communication foundation of the network, and used the network node to build a complete wireless sensor network communication system. The processing method of wireless sensor network data is “transmission, reception, and storage.” According to the collected RF (radio frequency) signal strength value, effective links are extracted, denoising, wavelet transform, time-domain features, and wavelet domain features are extracted, and the SECNN model is trained using the extracted multidomain features. The XGBoost (Extreme Gradient Boosting) model is used to filter the multidomain features obtained by permutation and combination of the time-domain feature components and wavelet domain feature components to obtain the joint feature components with strong robustness, and the multidomain feature perception fingerprint database of the fall action is established. The SECNN can be trained by using the extracted multidomain features, thus realizing the perception of human fall. The experimental results show that this method is convenient and feasible and is suitable for most scenes. However, this algorithm currently only considers the single-person scenario in the detection environment, not the multiperson scenario. Hao et al. [6] proposed a highly robust human motion perception model based on WiFi signals. This model uses antenna diversity to eliminate random phase shift and takes the Doppler frequency shift and FFT (fast Fourier transform) value caused by human motion in the frequency domain as identification features. Finally, the application performance of the model is verified by experiments. 
However, the generalization ability of this algorithm is weak: it cannot recognize people’s actions in a multiperson environment, its feature extraction is slow, and its robustness is poor. Zheng et al. [7] proposed a human perception algorithm based on WiFi channel state information. A neural network converts the amplitude characteristics of the channel state information into a three-dimensional matrix structure so as to retain, to the greatest extent, the spatial, temporal, and frequency correlation carried by a single data sample. Then, 2D convolution is used to extract features from the 3D matrix, and dropout and batch normalization are used to reduce overfitting. Finally, multitask learning is used to perceive human gesture and identity in parallel. The experimental results show that this method has high recognition accuracy for 150 gestures. However, the algorithm is not combined with transfer learning, so it cannot obtain a high-performance training model, and the accuracy of identity recognition is low. Choi et al. [8] proposed a fall risk perception tool for patients in acute nursing hospitals. The study aims to develop a fall risk perception questionnaire for these patients and determine its reliability and validity, on the grounds that patients must accurately perceive their risk of falling in order to prevent falls during hospitalization. Although it achieved good results, it failed to use tools with established reliability and validity to regularly assess patients’ fall risk awareness. Si et al. [9] proposed a 3D perceptual reconstruction algorithm of the retina based on a visual light field image. As an important part of the eye, the retina is very important to the human body.
A large number of data were collected through a literature survey, and the related theories and algorithm processes of computer vision, light field imaging technology, and 3D reconstruction technology were introduced in detail, which laid a sufficient theoretical foundation for the 3D perceptual reconstruction of the retina. However, due to the uncertainty of the background image and the defects of current medical imaging materials, the quality of the reconstructed image is not very good. Rehman et al. [10] proposed the application of multimodal computer vision for suspicious activity identification based on the Internet of Things in smart city security. The proposed research model uses cross-combined human-computer interaction to implement an IoT-based architecture for efficient and real-time decision-making. A class dataset from the UCF Crime dataset is used for activity identification, and the dataset extracted for suspicious object detection involves human-object interaction. This research is also applied to the detection and recognition of human activities on campus, for real-time suspicious activity detection and automatic alarm. Experimental results show that the proposed multimodal method achieves significant activity detection and recognition accuracy. However, the pattern of this algorithm is not customizable, flexible, or extensible, and it has certain limitations. Şengül et al. [11] proposed applying smart watches to medical care through deep-learning-based fall detection. Falls are divided into falling from a chair and falling from a standing position. A mobile application is developed that collects acceleration and gyroscope sensor data and transmits them to the cloud. The rolling update method is used to calculate 38 statistical data features, which are used as the input of the classifier. Although it has achieved good results, the deep learning algorithm does not perform well on large and diverse datasets. Şengül et al. [12] proposed smart phone sensor data fusion for daily user activity classification. New mobile applications need to estimate user activity by using sensor data provided by smart wearable devices and provide context-awareness solutions for users living in an intelligent environment. This method is based on the matrix time series method for feature fusion as well as the improved better-than-optimal fusion method and the stochastic gradient descent algorithm and is used to build the optimal decision tree for classification. To estimate user activity, the statistical pattern recognition method is used together with the k-nearest neighbor and support vector machine classifiers. The classification accuracy rate of pedestrian motor traffic activities is 99.28%, but the algorithm has some limitations because more user activities are not classified and a large amount of sensor-fused data is not processed in time.

After the above analysis, although the previous algorithms have achieved good results, each still has problems. Therefore, this paper proposes a multisensor intelligent fall perception algorithm considering the fine classification of human behavior characteristics. Compared with previous research, the innovations of the proposed algorithm are as follows:
(1) The acceleration, heart rate, and other data of the human body are collected through multisensor equipment to obtain human behavior data
(2) On the basis of human behavior data collection, the acceleration features of the fall state are extracted, and the SVM method is used to precisely classify the human behavior features
(3) The cuckoo search algorithm is used to optimize the width of the SVM kernel to improve the accuracy of human behavior recognition
(4) Based on the results of behavior recognition, the intelligent perception of human fall behavior is realized through motion preparation potential

The research results of the algorithm are as follows:
(1) The perceptual accuracy of the method in this paper is high, reaching 90%
(2) The perception efficiency is higher, and the minimum perception time is only 0.56 s, which verifies the effectiveness of the method. It can be widely used in human-computer interaction, machine vision, and other fields

2. Perception of Human Fall Behavior

The foundation of intelligent perception is an intelligent platform with strong data analysis capability. Its design requires no additional wearable components; it only needs to collect human behavior data through multiple sensors to achieve intelligent data analysis. On the intelligent perception platform, the areas where people are prone to fall can be marked out in advance so that falls in those areas can be prevented and an early warning can be issued as soon as a fall occurs. This technology depends on accurate motion recognition: if normal actions are mistaken for falls, the perception result is compromised. Therefore, in the process of action perception, this paper maximizes the ability of intelligent perception analysis to ensure the accuracy of human fall behavior perception.

2.1. Human Behavior Data Collection

At present, most human behavior recognition is based on video images. In order to ensure sufficient and high-quality video image data, traditional human behavior recognition methods based on video images often need to install a large number of cameras in a fixed area. Even so, when collecting images, they cannot achieve full-time continuous tracking of users. At the same time, scripted training of data collection personnel is required before data collection, which leads to a far cry from the real-life state and the inability to record the real behavior data of individuals. In addition, privacy protection is also a big challenge in the process of actually collecting image information. With the intelligent development of sensor equipment, accelerometer, magnetometer, heart rate meter, and other sensors are built into smart phones and smart bracelets, which can track and collect human behavior data for a long time and fully protect personal privacy.

The structure of human behavior data acquisition based on multisensor is shown in Figure 1.

According to Figure 1, firstly, data such as acceleration and heart rate of the human body are collected by multisensor devices (smart watches and smart phones), and the original data are scanned in a fixed time window to obtain samples as the input of the recognition model. Then, the behavior recognition model is used to identify specific categories and generate individual behavior datasets. In this paper, in the data analysis of intelligent perception, common actions such as walking, running, and jumping are added, and the actions such as walking in situ, tiptoeing, and backing are selected as references, and the data of the above six actions are collected as the behavior data of normal falls.
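The fixed-time-window scan described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sample layout `(ax, ay, az, heart_rate)`, the function name `sliding_windows`, and the window and step sizes are all assumptions chosen for the example.

```python
from typing import List, Sequence, Tuple

# Hypothetical sample layout: three acceleration axes plus heart rate.
Sample = Tuple[float, float, float, float]


def sliding_windows(samples: Sequence[Sample], window: int, step: int) -> List[List[Sample]]:
    """Cut a raw sample stream into fixed-length windows.

    A trailing partial window is dropped so every model input has the same size.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [list(samples[start:start + window])
            for start in range(0, len(samples) - window + 1, step)]


# Ten synthetic samples, windows of 4 samples with 50 % overlap (step of 2).
stream = [(0.0, 0.0, 9.8, 70.0 + i) for i in range(10)]
windows = sliding_windows(stream, window=4, step=2)
```

Each resulting window would then be passed to the recognition model as one input sample; overlapping windows are a common choice so that an event straddling a window boundary is still seen whole in a neighboring window.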

The statistics of human behavior intelligent perception parameters are shown in Table 1.

Table 1 shows a group of data collected and saved to a file in plain text. The x-axis, y-axis, and z-axis values of the data are the original sampled data of intelligent perception. Through the definition of intelligent perception parameters, the action state of the human body at a given moment can finally be obtained.

The human behavior data collection method based on multisensor devices allows users to complete data collection simply by carrying a mobile phone and wearing a smart bracelet. The advantages of this data collection method are as follows:
(1) The data collection scenarios are diversified and are not limited by external conditions such as environment and weather
(2) The data are real and can record a person’s full-time motion data in a natural state
(3) The dataset is complete. Compared with video collection, data collection using multisensor devices (smart phones, smart bracelets) is not subject to time, place, or human constraints

2.2. Feature Extraction of Falls

On the basis of human behavior data collection, the acceleration characteristics of the fall state are extracted, as shown in the following:

Formula (1) describes the state of human motion change in terms of the directional acceleration signals along the x, y, and z axes. Based on the big data analysis capability of intelligent perception, a human fall can be analyzed through the acceleration along these three axes. This acceleration feature extraction method has a certain recognition effect, but its time-domain information is still seriously incomplete. Therefore, this paper converts the time-domain features, and the conversion formula is as follows:

Formula (2) involves a conversion coefficient, the robust feature in the time domain, the time-domain sequence, the time-domain sequence point, and a constant. The conversion coefficients differ between time-domain features. In addition, when the time-domain sequence undergoes feature conversion, the sequence length must be converted at the same time and must be an integer power of 2. Under the same feature conversion frequency, this paper plans the time-domain sequence of acceleration uniformly, and the feature extraction results under the ideal state are as follows:

Formula (3) gives the ideal feature extraction state of the fall after conversion.

After the above feature extraction and conversion, six motion features, such as walking, running, jumping, walking in place, tiptoeing, and backing, are extracted to ensure the effect of motion perception.
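As an illustration of the two ideas in this section, the sketch below combines the three axis accelerations into one resultant magnitude and pads a sequence so that its length is an integer power of 2. Both are assumptions: the resultant magnitude is a feature commonly used in fall detection and is not confirmed to be the paper's Formula (1), and zero-padding is only one way to satisfy the power-of-2 length requirement. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def magnitude(ax: Sequence[float], ay: Sequence[float], az: Sequence[float]) -> List[float]:
    """Per-sample resultant acceleration sqrt(ax^2 + ay^2 + az^2)."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]


def pad_to_power_of_two(seq: Sequence[float]) -> List[float]:
    """Zero-pad a sequence so that its length is an integer power of 2."""
    n = 1
    while n < len(seq):
        n *= 2
    return list(seq) + [0.0] * (n - len(seq))


# Three synthetic samples: at rest, a sharp impact, and rest again.
mags = magnitude([0.0, 3.0, 0.0], [0.0, 4.0, 0.0], [9.8, 0.0, 9.8])
padded = pad_to_power_of_two(mags)
```

A power-of-2 length is the usual requirement of radix-2 transforms, which is why the padding step sits directly before any frequency-domain conversion.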

2.3. Fine Classification of Human Behavior Characteristics

Based on the results of fall motion feature extraction, the human behavior features are precisely classified by the SVM (support vector machines) method. SVM has the characteristics of global optimization and good learning ability for small samples. It is a more suitable method for feature classification. The performance of SVM mainly depends on the kernel function. Due to different kernel functions, the performance of SVM will reflect different advantages and disadvantages. In the case of unknown data, the selection of the kernel function becomes a difficult problem. In the case of known data, the selection of an appropriate kernel function is the key problem to be solved. In order to construct an efficient SVM with good robustness, this paper proposes and uses an iterative method to improve the hyperkernel function that depends on training data, so as to reduce the learning risk and improve the convergence speed.

Support vector machines often use two types of kernel functions:
(1) Translation-invariant kernels, in which, as in Formula (4), the kernel is an arbitrary function of the difference between the two input vectors. Such kernel functions include the Gaussian kernel and the polyhedral kernel.
(2) Rotation-invariant kernels, in which, as in Formula (5), the kernel is an arbitrary function of the inner product of the two input vectors. These include the polynomial kernel and the two-layer neural network kernel.

This paper constructs a hyperkernel function to overcome the shortcomings of a single kernel function:

Formula (6) combines two terms weighted by classifier parameters. The first term is the commonly used Gaussian kernel function, and the second term is the second-order polynomial kernel function.
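A combined kernel of this kind can be sketched as a weighted sum of a Gaussian kernel and a second-order polynomial kernel. The exact form of Formula (6) is not reproduced here; the weights `w_gauss` and `w_poly`, the width `sigma`, and the `+ 1` offset inside the polynomial term are assumptions made for illustration. A non-negative weighted sum of two valid kernels is itself a valid kernel, which is why the weights are checked.

```python
import math
from typing import Sequence


def gaussian_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """Translation-invariant term: depends only on the difference x - y."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def poly2_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Rotation-invariant term: depends only on the inner product of x and y."""
    return (sum(a * b for a, b in zip(x, y)) + 1.0) ** 2


def hyper_kernel(x: Sequence[float], y: Sequence[float],
                 sigma: float, w_gauss: float, w_poly: float) -> float:
    """Weighted sum of the Gaussian and second-order polynomial kernels."""
    if w_gauss < 0 or w_poly < 0:
        raise ValueError("kernel weights must be non-negative")
    return w_gauss * gaussian_kernel(x, y, sigma) + w_poly * poly2_kernel(x, y)


k = hyper_kernel([1.0, 0.0], [1.0, 0.0], sigma=1.0, w_gauss=0.5, w_poly=0.5)
```

The Gaussian term responds mostly to nearby points, while the polynomial term captures global structure, so the two weights trade local fitting against generalization.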

For a classification problem, suppose there is a training set of input vectors drawn from the input space, each paired with a category index. The SVM uses a nonlinear mapping to map the data to a high-dimensional feature space and uses a hyperplane to separate the two classes, that is, it finds a linear classification equation in the feature space. From a set point of view, this mapping defines the transition from the input space to the feature space. When the feature space is a Euclidean or Hilbert space, a Riemannian metric can be defined:

By increasing the spatial resolution around the class boundary, the distance between classes can be increased, which improves the discriminating performance of the SVM and thus the classification accuracy of human behavior features.

2.4. Human Behavior Recognition

Although SVM can obtain the classification results of human behavior features when the number of training samples is small, its training time is long and its running efficiency is low. RVM (relevance vector machine) integrates the advantages of SVM and neural networks. It not only overcomes the “overfitting phenomenon” caused by the small number of neural network training samples but also has the good classification ability of SVM, providing a new research tool for human behavior recognition. However, the kernel function parameters are still a problem to be solved. Therefore, to obtain an excellent human behavior perception effect, it is necessary to optimize the kernel width parameter.

2.4.1. Cuckoo Search Optimization Algorithm

The cuckoo algorithm is a swarm intelligence algorithm modeled on the nest-seeking behavior of cuckoos in nature. Compared with similar algorithms such as the ant colony algorithm, the annealing algorithm, and the particle swarm algorithm, it can help a clustering algorithm escape local optima quickly and converge to a steady state. The cuckoo search optimization algorithm is proposed on the basis of the cuckoo search algorithm. It is a metaheuristic algorithm based on nest parasitism and a flight search mechanism, characterized by low complexity and good global performance [13–17]. The algorithm first makes three assumptions:
(1) Each cuckoo lays only one egg at a time and places it in a randomly chosen nest
(2) Some of the better nests are reserved for the next generation, and only some of the worse nests are updated
(3) The host recognizes a cuckoo egg with a fixed probability

The path and location update operation of cuckoo’s nest searching is as follows:

Formula (8) relates the nest position vectors of two successive generations through the random search path of the cuckoo’s flight, scaled by a regulating factor.

The relationship between the random search path and time follows a Lévy distribution, given in Formula (9), which is governed by a power coefficient.

In the working process of traditional algorithms, most bird nests are randomly updated, which makes the important information in the area near the bird’s nest not be fully utilized. Therefore, a selective elimination strategy is adopted, specifically:

Formula (10) is expressed in terms of nest positions at three different times.

The traditional algorithm has a strong global search ability but a weak local search ability. Therefore, a fine search strategy is adopted to optimize the kernel width parameter, specifically:

Formula (11) is expressed in terms of the best and worst nest positions. Suppose the current population contains a given number of adult cuckoos, and the probability of their nests being discovered is usually set to 0.25. Given the initial spatial distribution of the cuckoos, the flight iteration distance and flight mode of the current population, based on the moving step and the Lévy distribution, can be expressed as:

In this expression, the parameter of the Lévy distribution is usually set to 1.5, and a second parameter is usually set to 1. The remaining term is expressed as:

The population position is iteratively updated according to this expression, and the fitness function is recalculated. The optimal solution found so far is then compared with a preset value. If the preset condition is not met, the population position is updated again, and these operations are repeated until the optimal solution or the preset iteration threshold is reached; otherwise, the current iteration result is retained as the optimal solution. This completes the cuckoo search optimization algorithm.
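The overall loop can be sketched as below for a single scalar parameter such as a kernel width. This is the basic cuckoo search with Lévy flights, not the paper's modified variant: the selective elimination and fine search strategies are not reproduced. The discovery probability of 0.25 and the Lévy parameter of 1.5 follow the values quoted above; Mantegna's algorithm for drawing Lévy steps, the step scale `alpha`, the population size, and the toy cost function are assumptions. In practice the cost would be something like a cross-validated classification error.

```python
import math
import random
from typing import Callable, Tuple


def levy_step(rng: random.Random, beta: float = 1.5) -> float:
    """Draw one heavy-tailed step using Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0) or 1e-12  # guard against division by zero
    return u / abs(v) ** (1 / beta)


def cuckoo_search(cost: Callable[[float], float], low: float, high: float,
                  n_nests: int = 15, pa: float = 0.25, alpha: float = 0.01,
                  iters: int = 200, seed: int = 0) -> Tuple[float, float]:
    """Minimise cost(x) for a scalar x in [low, high]; returns (best_x, best_cost)."""
    rng = random.Random(seed)
    span = high - low
    nests = [rng.uniform(low, high) for _ in range(n_nests)]
    costs = [cost(x) for x in nests]
    for _ in range(iters):
        # Levy flight: propose a move for every nest, keep it only if it improves.
        for i in range(n_nests):
            cand = min(max(nests[i] + alpha * span * levy_step(rng), low), high)
            cand_cost = cost(cand)
            if cand_cost < costs[i]:
                nests[i], costs[i] = cand, cand_cost
        # Discovery: every nest except the current best is abandoned with probability pa.
        best_i = costs.index(min(costs))
        for i in range(n_nests):
            if i != best_i and rng.random() < pa:
                nests[i] = rng.uniform(low, high)
                costs[i] = cost(nests[i])
    best_i = costs.index(min(costs))
    return nests[best_i], costs[best_i]


def toy_cost(width: float) -> float:
    """Hypothetical stand-in for a validation error with its minimum at width 2.0."""
    return (width - 2.0) ** 2


best_width, best_cost = cuckoo_search(toy_cost, low=0.01, high=10.0)
```

Because the best nest is never abandoned and a move is accepted only when it lowers the cost, the best cost in the population cannot get worse from one iteration to the next.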

2.4.2. Steps of Human Behavior Recognition

(1) Normalize the human motion features extracted in Section 2.2.
(2) Determine the value range of the Gaussian kernel function parameter of the relevance vector machine, and initialize the nest position vector, which includes a subset of human behavior features and the kernel parameter.
(3) Calculate the fitness value of each nest position vector according to Formula (13), which combines the characteristic states with their weight values.
(4) Update some of the poor-quality nest position vectors to generate new nest positions.
(5) If the end condition of the algorithm is reached, obtain the human behavior feature subset and the kernel parameter from the globally optimal nest position vector.
(6) Build the human behavior recognition model from the feature subset and the kernel parameter [18, 19].
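The normalization in step (1) can be illustrated with min-max scaling, which maps each feature to the range [0, 1]. This is an assumption: min-max scaling is one common choice, and the paper's own normalization formula may differ. The function name is hypothetical.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale one feature column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


scaled = min_max_normalize([2.0, 4.0, 6.0])
```

Scaling matters here because kernel functions based on distances or inner products are dominated by whichever feature has the largest numeric range.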

The workflow of the human behavior recognition model, in which the cuckoo search algorithm optimizes the features and classifier parameters, is shown in Figure 2.

2.5. Intelligent Perception of Human Fall Behavior

On the basis of human behavior recognition, the process of intelligent perception of human falling behavior is shown in Figure 3.

Firstly, the human behavior data collected by multisensor is used as the data source to preprocess the sensor signal, and the multisensor equipment is used to determine the starting time of human motion. Secondly, assuming that the starting time of human motion is zero, the sensor signal of -3.0–3.0 s is defined as an effective data segment to complete data segmentation and event alignment. Thirdly, the motion artifact component is removed from the segmented data.
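The segmentation step can be sketched as cutting a fixed window around the detected motion onset. The 3.0 s before and after the onset come from the text above; the sampling rate, the function name `extract_epoch`, and the decision to discard onsets too close to the ends of the recording are assumptions.

```python
from typing import List, Optional, Sequence


def extract_epoch(signal: Sequence[float], onset_index: int, sample_rate_hz: float,
                  before_s: float = 3.0, after_s: float = 3.0) -> Optional[List[float]]:
    """Return the samples from before_s seconds before to after_s seconds after the onset.

    Returns None when the window would run past either end of the recording.
    """
    start = onset_index - int(round(before_s * sample_rate_hz))
    end = onset_index + int(round(after_s * sample_rate_hz))
    if start < 0 or end > len(signal):
        return None
    return list(signal[start:end])


# Synthetic 10 s recording at 100 Hz with the motion onset at sample 500.
recording = [float(i) for i in range(1000)]
epoch = extract_epoch(recording, onset_index=500, sample_rate_hz=100.0)
```

Aligning every segment on the onset means sample index 300 in each epoch corresponds to time zero, which is what makes later template matching across trials meaningful.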

After data processing is realized, an intelligent perception of human fall behavior is conducted based on motion readiness potential, and the steps are as follows:

First, according to the acquired characteristics of human behavior changes, the channel that conforms to the change rule of motor preparation potential is selected as the benchmark, and the falling edge of its motor preparation potential is taken as the matching template.

Secondly, the sliding step, threshold, and observation window are set for sliding matching. When the matching coefficient is greater than the set threshold, the motor readiness potential is considered to occur, and the time distance between the endpoint of the current observation window position and the starting time of the motion is calculated.

Finally, the number of channels that can be matched is counted. When it is greater than the set threshold for the number of channels, the human fall behavior is considered to have occurred in this experiment, and the preperceived time of the human fall is determined [20–25].
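The three steps above can be sketched as a sliding template match followed by a channel vote. The Pearson correlation coefficient is assumed as the matching coefficient, since the paper does not specify it here, and all names, thresholds, and the synthetic signals are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def first_match(signal: Sequence[float], template: Sequence[float],
                step: int, threshold: float) -> Optional[int]:
    """End index of the first window whose correlation exceeds threshold, else None."""
    width = len(template)
    for start in range(0, len(signal) - width + 1, step):
        if pearson(signal[start:start + width], template) > threshold:
            return start + width - 1
    return None


def fall_detected(channels: Sequence[Sequence[float]], template: Sequence[float],
                  step: int, threshold: float, min_channels: int) -> bool:
    """Declare a fall when more than min_channels channels match the template."""
    matched = sum(1 for ch in channels
                  if first_match(ch, template, step, threshold) is not None)
    return matched > min_channels


falling_edge = [3.0, 2.0, 1.0, 0.0]  # template: a falling edge
active = [1.0, 1.0, 1.0, 1.0, 3.0, 2.0, 1.0, 0.0, 0.0]
flat = [1.0] * 9
end_index = first_match(active, falling_edge, step=1, threshold=0.9)
```

The returned end index, divided by the sampling rate and compared with the motion onset time, gives the time distance used as the preperceived time.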

3. Experimental Analysis

In order to verify the practicability and effectiveness of the multisensor intelligent fall perception algorithm considering the precise classification of human behavior characteristics, experimental research was carried out. In the experiment, a method of human fall perception based on SECNN and a highly robust human motion perception model based on WiFi signals were used as comparison methods and compared with this method.

3.1. Experimental Design and Arrangement

A total of 6 college students (labeled S1-S6) were selected for the experiment, including 5 males and 1 female, aged between 22 and 25. None of the 6 students had any history of sensorimotor impairment or psychological disorder, and all signed an informed consent form before the experiment started.

The experimental arrangement is as follows: each subject needs to complete two groups of lower limb autonomous movement experiments, the left leg step and the right leg step. To ensure the accuracy and integrity of the experiment, the subjects stand naturally throughout. First, after the start of the experiment, the subject remains in a resting state for a certain time. Second, the subject starts the autonomous walking movement of the left or right leg and keeps it up for about 1-2 seconds; the movement start time is completely controlled by the subject, without any prompts or requirements. Third, the subject runs, jumps, tiptoes, and steps backward. Finally, the subject rests for about 10 s after completing the above actions and prepares for the next experiment. Following this arrangement, the subjects were tested many times, and the number of falls was counted. The process is shown in Figure 4.

3.2. Experimental Evaluation Index

In this experiment, three evaluation indexes are used: precision $A$, recall $R$, and the $F$ value. Precision is the proportion of true positive cases among all detected positive cases. It is the most common evaluation index, and its calculation is relatively simple:

$$A = \frac{TP}{TP + FP} \tag{16}$$

Recall is the proportion of all actual positive examples that are correctly judged as positive:

$$R = \frac{TP}{TP + FN} \tag{17}$$

The $F$ value is built on the two indexes above and evaluates the overall strengths and weaknesses of the perception algorithm. Precision or recall alone is not sufficient as an indicator for evaluating a method, so the $F$ value, the harmonic mean of precision and recall, is used to balance the two. It is also one of the commonly used evaluation indexes. The calculation method is:

$$F = \frac{2AR}{A + R} \tag{18}$$

In the above formulas, $TP$ is the number of true positives (positive samples correctly detected), $TN$ is the number of true negatives (negative samples correctly rejected), $FP$ is the number of false positives (negative samples wrongly detected as positive), and $FN$ is the number of false negatives (positive samples that were missed).
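The three indexes can be computed from the confusion counts as in the sketch below. The counts in the example are synthetic and chosen only to make the arithmetic easy to follow; they are not the paper's experimental data. Returning 0.0 when a denominator is zero is a convention assumed here.

```python
from typing import Tuple


def precision_recall_f(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and their harmonic mean from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_value = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_value


# Synthetic counts: 90 falls detected correctly, 10 false alarms, 30 falls missed.
precision, recall, f_value = precision_recall_f(tp=90, fp=10, fn=30)
```

With these counts the precision is 90/100 = 0.9 and the recall is 90/120 = 0.75, so the harmonic mean lies between the two and closer to the smaller one, which is the balancing behavior described above.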

3.3. Results and Analysis

Formulas (16)–(18) are used to evaluate the intelligent perception results of human fall behavior for the SECNN-based human fall perception method proposed by Yang et al. [5], the highly robust WiFi-based human motion perception model proposed by Hao et al. [6], and the method in this paper, as shown in Figures 5–7.

According to Figures 5–7, compared with the SECNN-based human fall perception method and the highly robust WiFi-based human motion perception model, this method has obvious advantages in all three evaluation indexes, which shows that it perceives human falling behavior better. The specific data show that the precision of this method reaches 90%, and the method also achieves good results in recall and the F value. The perception effect is better because cuckoo search is used to optimize the kernel width parameter, which improves the correct rate of human behavior recognition.

In order to further verify the effectiveness of the method in this paper, a method of human fall perception based on SECNN, a highly robust human motion perception model based on WiFi signals, and the method in this paper are compared with the action perception efficiency as an experimental indicator. The results are shown in Table 2.

According to Table 2, as the number of experiments increases, the perception time of all three methods gradually increases. In comparison, the perception time of this method is shorter. Its minimum value is only 0.56 s, which is 0.96 s and 1.04 s lower than that of the SECNN-based method and the WiFi-based model, respectively. Its maximum perception time is 1.13 s, which is 1.56 s and 1.42 s lower, respectively. The average perception time of this method is 0.82 s, whereas the averages of the SECNN-based method and the WiFi-based model are 2.138 s and 2.115 s, a difference of 1.318 s and 1.295 s, respectively. To sum up, the method in this paper is more efficient in motion perception: it can react to a human fall in a shorter time, which is conducive to a timely response and to avoiding serious physical injury.

4. Conclusion

Many scholars have been developing new technologies for more effective perception and analysis of human behavior, and many behavioral analysis technologies have gradually been applied in daily life. However, existing methods still suffer from low accuracy in motion perception. To improve the accuracy and efficiency of human motion perception, this paper proposes a multisensor intelligent fall perception algorithm considering the precise classification of human behavior characteristics. The main conclusions are as follows:
(1) Multisensor devices (smart watches, smart phones) are used to collect human acceleration, heart rate, and other data to obtain human behavior data and improve the comprehensiveness of the data
(2) Based on the results of human behavior data collection, the features of the fall state are extracted, and the SVM method is used to precisely classify human behavior features, which is conducive to improving the accuracy of motion perception
(3) Cuckoo search is used to optimize the width of the SVM kernel to improve the accuracy of human behavior recognition. Based on the behavior recognition results, an intelligent perception of human fall behavior is realized through motion preparation potential
(4) This method has high perception accuracy and efficiency. The A value has achieved excellent results, with the accuracy reaching 90%. The minimum perception time is only 0.56 s, which is 0.96 s and 1.04 s lower than that of the methods in references [5] and [6], respectively, which verifies the effectiveness and application value of the method

Although the method in this paper has achieved some research results, the miscalculation rate of fall behavior is high, and the amount of calculation is too large, which can be further studied in the future.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The author declares that there is no conflict of interest regarding the publication of this article.