Estimation of Parameters of Finite Mixture of Rayleigh Distribution by the Expectation-Maximization Algorithm

Mohammed, Noor; Ali, Fadhaa

doi:https://doi.org/10.1155/2022/7596449

Journal of Mathematics

Research Article | Open Access

Volume 2022 | Article ID 7596449 | https://doi.org/10.1155/2022/7596449

Noor Mohammed and Fadhaa Ali

Academic Editor: Ding-Xuan Zhou

Received 22 Sept 2022

Revised 16 Nov 2022

Accepted 02 Dec 2022

Published 30 Dec 2022

Abstract

In the lifetime processes of some systems, the data often do not come from a single population; they represent several subpopulations. In such cases, a single standard distribution cannot model the data adequately. Instead, a mixture of distributions is used to model the data and classify the observations into subgroups. A mixture of Rayleigh distributions is well suited to lifetime processes. This paper estimates the model parameters with the expectation-maximization (EM) algorithm applied to the maximum likelihood problem. The technique is applied to simulated data under several scenarios. Estimation accuracy is assessed by the average mean square error (AMSE) and the average classification success rate (ACSR). The results show that the method performs well in all simulation scenarios across the sample sizes considered.

1. Introduction

Fitting an appropriate distribution to data is a recurring challenge in research. One difficulty arises when the data belong to more than one population, and finite mixture models are employed to tackle it. In many applications, one large population is formed from several subpopulations that are mixed in unknown proportions. For example, a population of biological or electrical elements may be divided into two or more subpopulations according to the possible causes of failure or other categories such as sex and age group. Several inferential methods have also been introduced for data following heavy-tailed distributions [1, 2].

Many methods have been proposed for applying mixture models to data in the physical, biological, and engineering sciences and in other fields, where they serve as a valid machine learning technique. A mixture of normal and Laplace distributions has been fitted to wind shear data [3, 4]. In addition, the mixture model was used in modeling crime and justice data [5]. A mixture of exponential distributions was considered a valid tool to infer device failure by dividing the population into several subpopulations [6, 7]. Split-and-merge operations were used to improve the likelihood via the mixture of t-distributions model to study image compression and pattern recognition [8]. The mixture model was also used as a classifier to cluster individuals based on density functions estimated for each individual from blood cell data [9].

The mixture model can be built from densities belonging to the same family of distributions or to different ones. Either way, it is useful as a confirmatory or predictive tool. Maximum likelihood (ML) is the most popular method for inferring the model [10, 11]. However, the number of components and the observation memberships are unknown, which undermines the accuracy of estimation. Many techniques have been proposed over the years to tackle these issues. Determining the number of components by performing a relevant test is therefore considered an important starting point in applying the mixture model [12, 13]. The most common way of handling the unknown memberships is to introduce a latent variable that leads to a complete-data log likelihood rather than the incomplete one [14]; the expectation-maximization (EM) algorithm is then employed to estimate the model parameters [15–17].

Moreover, a hidden Markov model can be employed by adding component-indicator labels to the mixture model to determine observation memberships. This extends the Gaussian mixture model to k components of stationary or nonstationary autoregressive processes [18], or to time series data modeled by non-Gaussian mixtures [19]. Bayesian inference can also be used to infer the parameters of mixture models by assuming a prior distribution for the parameters, which in turn yields their posterior distribution. This framework can be implemented by Markov chain Monte Carlo (MCMC) [20, 21], and it can avoid many of the difficulties that arise in applications of ML estimation [22–24]. The authors in [25] introduced a general form of the inverse Rayleigh distribution, known as the exponentiated inverse Rayleigh distribution (EIRD), which provides a more flexible distribution for modeling life data.

In this paper, we derive the expectation-maximization (EM) algorithm to estimate the parameters of the mixture of Rayleigh distributions. Starting from the likelihood of the model, the EM algorithm updates the estimators iteratively until convergence. The accuracy of the inferred model is then examined by simulation under different scenarios and numbers of components, and the success rate of clustering is calculated accordingly. The rest of the paper is organized as follows. Sections 2 to 5 give the theoretical details. In Sections 6 and 7, the method is applied to simulated data under several scenarios and the results are reported. Section 8 presents the discussion and conclusion.

2. Materials and Methods

A random variable of Rayleigh distribution is defined as the time between two consecutive occurrences [26]. With scale parameter $\sigma > 0$, the probability density function is written as

$$f(x; \sigma) = \frac{x}{\sigma^{2}} \exp\left(-\frac{x^{2}}{2\sigma^{2}}\right), \quad x \ge 0.$$

In many lifetime scenarios, data can come from more than one component, and a single known distribution then fits the data poorly. In such a case, the mixture model can be used to handle the problem. For lifetime data, a finite mixture of Rayleigh distributions is used to model the data. The probability density function (PDF) of the mixture of Rayleigh distributions with $k$ components can be written as

$$f(x) = \sum_{j=1}^{k} \pi_j f_j(x; \sigma_j),$$

where $0 < \pi_j < 1$ and $\sum_{j=1}^{k} \pi_j = 1$.
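As a rough illustration of the two densities in this section, the following Python sketch evaluates a Rayleigh density and a finite Rayleigh mixture. It assumes the scale parametrization $f(x; \sigma) = (x/\sigma^{2})\exp(-x^{2}/(2\sigma^{2}))$; the function names and the numerical values are hypothetical and are not the settings used in the paper.

```python
import numpy as np

def rayleigh_pdf(x, sigma):
    """Rayleigh density (x / sigma^2) * exp(-x^2 / (2 sigma^2)) for x >= 0."""
    x = np.asarray(x, dtype=float)
    dens = (x / sigma**2) * np.exp(-x**2 / (2.0 * sigma**2))
    return np.where(x >= 0, dens, 0.0)

def mixture_pdf(x, weights, sigmas):
    """Density of a k-component Rayleigh mixture evaluated at the points x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    weights = np.asarray(weights, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # Shape (n, k): column j holds pi_j * f_j(x_i; sigma_j).
    comp = weights * rayleigh_pdf(x[:, None], sigmas)
    return comp.sum(axis=1)

# Illustrative values only (assumed, not taken from the paper).
grid = np.linspace(0.0, 40.0, 40001)
dens = mixture_pdf(grid, weights=[0.4, 0.6], sigmas=[1.0, 4.0])
# Trapezoid rule: a valid mixture density should integrate to about 1.
area = float(np.sum((dens[1:] + dens[:-1]) / 2.0 * np.diff(grid)))
```

Checking that `area` is close to 1 is a quick sanity test that the weights and component densities were combined correctly.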

3. Inferential Method

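The classical way of estimating the parameters of the mixture model is maximum likelihood (ML). For observations $x = (x_1, \dots, x_n)$, mixing weights $\pi_j$, and Rayleigh component densities $f_j(x; \sigma_j)$, the incomplete-data likelihood function of the $k$-component mixture model is defined as follows:

$$L(\Theta) = \prod_{i=1}^{n} \sum_{j=1}^{k} \pi_j f_j(x_i; \sigma_j),$$

where $\Theta = (\pi_1, \dots, \pi_k, \sigma_1, \dots, \sigma_k)$.

The allocations here are considered a hidden state that needs to be defined by a latent vector $z = (z_1, \dots, z_n)$. The variable $z_i$ is a Markovian vector, and it is defined as $z_i = (z_{i1}, \dots, z_{ik})$, where $i = 1, \dots, n$. The indicator $z_{ij}$ is defined as

$$z_{ij} = \begin{cases} 1, & \text{if } x_i \text{ belongs to component } j, \\ 0, & \text{otherwise.} \end{cases}$$

Hence, the complete-data likelihood function is written as

$$L_c(\Theta) = \prod_{i=1}^{n} \prod_{j=1}^{k} \left[\pi_j f_j(x_i; \sigma_j)\right]^{z_{ij}},$$

where $i = 1, \dots, n$ and $j = 1, \dots, k$.

Taking the logarithm of the complete-data likelihood results in

$$\log L_c(\Theta) = \sum_{i=1}^{n} \sum_{j=1}^{k} z_{ij} \left[\log \pi_j + \log f_j(x_i; \sigma_j)\right].$$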
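The difference between the incomplete-data and complete-data log-likelihoods can be sketched numerically as follows. This is a minimal example under the assumption that each component has the Rayleigh density $(x/\sigma^{2})\exp(-x^{2}/(2\sigma^{2}))$; the data, memberships, and parameter values are made up for illustration.

```python
import numpy as np

def log_rayleigh(x, sigma):
    """log of (x / sigma^2) * exp(-x^2 / (2 sigma^2)); requires x > 0."""
    return np.log(x) - 2.0 * np.log(sigma) - x**2 / (2.0 * sigma**2)

def log_terms(x, weights, sigmas):
    """Matrix with entries log(pi_j) + log f_j(x_i; sigma_j), shape (n, k)."""
    x = np.asarray(x, dtype=float)
    return np.log(np.asarray(weights, dtype=float)) + log_rayleigh(
        x[:, None], np.asarray(sigmas, dtype=float))

def observed_loglik(x, weights, sigmas):
    """Incomplete-data log-likelihood: sum_i log sum_j pi_j f_j(x_i)."""
    lt = log_terms(x, weights, sigmas)
    m = lt.max(axis=1, keepdims=True)  # log-sum-exp for numerical stability
    return float(np.sum(m[:, 0] + np.log(np.exp(lt - m).sum(axis=1))))

def complete_loglik(x, z, weights, sigmas):
    """Complete-data log-likelihood: sum_i sum_j z_ij [log pi_j + log f_j(x_i)]."""
    return float(np.sum(np.asarray(z) * log_terms(x, weights, sigmas)))

# Hypothetical data and one-hot memberships, for illustration only.
x = np.array([0.5, 1.2, 3.0, 6.5])
z = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
ll_obs = observed_loglik(x, [0.5, 0.5], [1.0, 4.0])
ll_comp = complete_loglik(x, z, [0.5, 0.5], [1.0, 4.0])
```

Because each observation contributes only one term of the inner sum when the memberships are fixed, `ll_comp` can never exceed `ll_obs` for the same parameters.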

Many methods have been proposed to estimate the parameters that maximize the complete-data log-likelihood. One of them is the expectation-maximization (EM) algorithm, which has been used with some mixture models [27]. In this paper, the EM algorithm is derived for the k-component mixture of Rayleigh distributions. The next section gives the details of the method.

4. EM Algorithm

The complete-data log-likelihood function is the quantity to be maximized given the data. The EM algorithm is used here to estimate the model parameters by treating the component memberships as missing data. It consists of two iterative steps, the E-step (expectation) and the M-step (maximization). For iterations $t = 0, 1, 2, \dots$, we repeat the following two steps.

4.1. E-Step

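The unobservable data (here the memberships $z$) are handled by the E-step, which takes the expectation of the complete-data log likelihood conditionally on the given observed data $x$, using the value of $\Theta$ at the current iteration. Let $\Theta^{(0)}$ be the value chosen initially for $\Theta$. The first iteration of the EM algorithm computes the conditional expectation of $\log L_c(\Theta)$ given $x$ by using $\Theta^{(0)}$ for $\Theta$, which can be written as

$$Q(\Theta; \Theta^{(0)}) = E_{\Theta^{(0)}}\left[\log L_c(\Theta) \mid x\right].$$

As $\log L_c(\Theta)$ is linear in the missing data $z_{ij}$, the E-step at the $(t+1)$th iteration requires the current expectation of the random variable $Z_{ij}$, which corresponds to $z_{ij}$, to be calculated conditionally on the given observed data $x$.

Then, we get

$$\tau_{ij}^{(t)} = E_{\Theta^{(t)}}\left[Z_{ij} \mid x\right] = \frac{\pi_j^{(t)} f_j(x_i; \sigma_j^{(t)})}{\sum_{h=1}^{k} \pi_h^{(t)} f_h(x_i; \sigma_h^{(t)})}.$$

The quantity $\tau_{ij}^{(t)}$ is the posterior probability that the $i$th element of the sample, with the observed value $x_i$, belongs to the $j$th component of the mixture. By substituting $z_{ij} = \tau_{ij}^{(t)}$ into the complete-data log-likelihood, the Q function becomes

$$Q(\Theta; \Theta^{(t)}) = \sum_{i=1}^{n} \sum_{j=1}^{k} \tau_{ij}^{(t)} \left[\log \pi_j + \log f_j(x_i; \sigma_j)\right].$$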
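The E-step can be sketched in a few lines of Python. The example below assumes the Rayleigh scale parametrization $(x/\sigma^{2})\exp(-x^{2}/(2\sigma^{2}))$ and works on the log scale to avoid underflow; the function name `e_step` and the input values are hypothetical.

```python
import numpy as np

def e_step(x, weights, sigmas):
    """Posterior probabilities tau_ij that observation i belongs to component j."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # log(pi_j) + log f_j(x_i; sigma_j), shape (n, k); requires x > 0.
    log_num = (np.log(weights) + np.log(x[:, None])
               - 2.0 * np.log(sigmas) - x[:, None]**2 / (2.0 * sigmas**2))
    # Subtract the row maximum before exponentiating for numerical stability.
    log_num -= log_num.max(axis=1, keepdims=True)
    tau = np.exp(log_num)
    return tau / tau.sum(axis=1, keepdims=True)

# Illustrative inputs (assumed): small values favour the component with small sigma.
tau = e_step([0.5, 1.2, 3.0, 6.5], weights=[0.5, 0.5], sigmas=[1.0, 4.0])
```

Each row of `tau` sums to 1, and observations far in the right tail are assigned almost entirely to the component with the larger scale.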

4.2. M-Step

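To get the updated estimate $\Theta^{(t+1)}$, the M-step on the $(t+1)$th iteration requires the global maximization of $Q(\Theta; \Theta^{(t)})$ with respect to $\Theta$ over the parameter space. Given the posterior probabilities $\tau_{ij}^{(t)}$ computed at the iteration $t$, the mixing weight $\pi_j$ can be estimated by taking the first derivative of $Q$ with respect to $\pi_j$, subject to $\sum_{j=1}^{k} \pi_j = 1$, and equating it to 0, which results in

$$\pi_j^{(t+1)} = \frac{1}{n} \sum_{i=1}^{n} \tau_{ij}^{(t)}.$$

The component parameter $\sigma_j$ at the iteration $t+1$ can be computed by differentiating $Q$ with respect to $\sigma_j$ and equating it to 0, and we get

$$\sum_{i=1}^{n} \tau_{ij}^{(t)} \left(\frac{x_i^{2}}{\sigma_j^{3}} - \frac{2}{\sigma_j}\right) = 0.$$

Hence, at the iteration $t+1$, the component parameter can be estimated as

$$\left(\sigma_j^{(t+1)}\right)^{2} = \frac{\sum_{i=1}^{n} \tau_{ij}^{(t)} x_i^{2}}{2 \sum_{i=1}^{n} \tau_{ij}^{(t)}}.$$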
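A minimal sketch of these closed-form updates is given below. It assumes the Rayleigh scale parametrization $(x/\sigma^{2})\exp(-x^{2}/(2\sigma^{2}))$, for which the weighted score equation gives $\sigma_j^{2} = \sum_i \tau_{ij} x_i^{2} / (2 \sum_i \tau_{ij})$; the function name `m_step` and the toy inputs are hypothetical.

```python
import numpy as np

def m_step(x, tau):
    """Update mixing weights and Rayleigh scales from posterior probabilities."""
    x = np.asarray(x, dtype=float)
    tau = np.asarray(tau, dtype=float)
    nk = tau.sum(axis=0)          # effective number of points per component
    weights = nk / x.size         # pi_j = (1/n) * sum_i tau_ij
    # sigma_j^2 = sum_i tau_ij x_i^2 / (2 * sum_i tau_ij)
    sigmas = np.sqrt((tau * x[:, None]**2).sum(axis=0) / (2.0 * nk))
    return weights, sigmas

# Toy example with hard (one-hot) posteriors so the result is easy to verify.
x = np.array([1.0, 1.0, 2.0, 2.0])
tau = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
weights, sigmas = m_step(x, tau)
```

With one-hot posteriors the update reduces to the single-component Rayleigh maximum likelihood estimate within each group, here $\sqrt{0.5}$ and $\sqrt{2}$.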

The algorithm is terminated as soon as convergence is reached or after a prespecified number of iterations. Note that, at the final iteration, the membership vector $z_i$ can be estimated by generating it from the multinomial distribution as

$$z_i \sim \text{Multinomial}\left(1; \tau_{i1}, \dots, \tau_{ik}\right).$$

The estimated value of $z_i$ can be used as a classifier of component members based on the definition of the indicator $z_{ij}$.

The elements of the $j$th component are represented by $C_j$, where

$$C_j = \{x_i : \hat{z}_{ij} = 1\}.$$
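Putting the E-step, the M-step, the stopping rule, and the final membership draw together, the whole procedure might look like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the Rayleigh scale parametrization $(x/\sigma^{2})\exp(-x^{2}/(2\sigma^{2}))$, the starting values, the tolerance, and the simulated data are all hypothetical choices.

```python
import numpy as np

def fit_rayleigh_mixture(x, k, max_iter=500, tol=1e-8):
    """EM for a k-component Rayleigh mixture; returns weights, sigmas, tau."""
    x = np.asarray(x, dtype=float)  # assumes all x > 0
    weights = np.full(k, 1.0 / k)
    # Hypothetical starting values: scales spread around the one-component MLE.
    sigmas = np.sqrt(np.mean(x**2) / 2.0) * np.linspace(0.5, 1.5, k)
    prev_loglik = -np.inf
    for _ in range(max_iter):
        # E-step: posterior probabilities under the current parameters.
        log_num = (np.log(weights) + np.log(x[:, None])
                   - 2.0 * np.log(sigmas) - x[:, None]**2 / (2.0 * sigmas**2))
        row_max = log_num.max(axis=1, keepdims=True)
        unnorm = np.exp(log_num - row_max)
        row_sum = unnorm.sum(axis=1, keepdims=True)
        tau = unnorm / row_sum
        loglik = float(np.sum(row_max + np.log(row_sum)))
        # M-step: closed-form updates of weights and scales.
        nk = tau.sum(axis=0)
        weights = nk / x.size
        sigmas = np.sqrt((tau * x[:, None]**2).sum(axis=0) / (2.0 * nk))
        # Stop when the observed-data log-likelihood no longer improves.
        if loglik - prev_loglik < tol:
            break
        prev_loglik = loglik
    order = np.argsort(sigmas)  # report components by increasing scale
    return weights[order], sigmas[order], tau[:, order]

def draw_memberships(tau, rng):
    """Draw one component label per observation from its posterior row."""
    u = rng.random(tau.shape[0])
    labels = (np.cumsum(tau, axis=1) < u[:, None]).sum(axis=1)
    return np.minimum(labels, tau.shape[1] - 1)  # guard against rounding

# Simulated two-component data with assumed parameters.
rng = np.random.default_rng(0)
true_labels = rng.choice(2, size=4000, p=[0.4, 0.6])
x = rng.rayleigh(scale=np.array([1.0, 5.0])[true_labels])
weights, sigmas, tau = fit_rayleigh_mixture(x, k=2)
labels = draw_memberships(tau, rng)
accuracy = float(np.mean(labels == true_labels))
```

Drawing the labels follows the multinomial step described above; taking `tau.argmax(axis=1)` instead is a common deterministic alternative that usually gives a slightly higher classification rate.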

5. Assessing the Number of Components

The EM algorithm described above assumes a known number of components $k$. Therefore, we assess the appropriate number of components by calculating the Bayesian information criterion (BIC) according to the formula [28]:

$$\text{BIC} = -2 \log L(\hat{\Theta}) + p \log n,$$

where $L(\hat{\Theta})$ is the maximized likelihood function, $p$ is the number of free parameters, and $n$ is the sample size. Observation memberships can then be updated, as previously described, once the number of components has been selected.
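The model-selection step can be illustrated with a short sketch. It assumes the common sign convention $\text{BIC} = -2 \log L + p \log n$ (smaller is better) and counts $p = 2k - 1$ free parameters for a $k$-component Rayleigh mixture ($k$ scales and $k - 1$ free weights); both are assumptions, and the log-likelihood values below are invented placeholders rather than results from the paper.

```python
import math

def bic(loglik, k, n):
    """BIC = -2 log L + p log n with p = 2k - 1 free parameters (assumed)."""
    p = 2 * k - 1
    return -2.0 * loglik + p * math.log(n)

# Hypothetical maximized log-likelihoods for k = 1, 2, 3 fitted to n = 200 points.
n = 200
logliks = {1: -520.4, 2: -481.7, 3: -480.9}
scores = {k: bic(ll, k, n) for k, ll in logliks.items()}
best_k = min(scores, key=scores.get)  # smallest BIC under this sign convention
```

In this made-up example the small gain in log-likelihood from two to three components does not offset the penalty for two extra parameters, so `best_k` is 2.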

6. Simulations

To examine the performance of the EM algorithm on the finite mixture of Rayleigh distributions, we generate data from two-component and three-component mixture models. Each scenario is defined by predefined mixing weights and component parameters, and data are generated for several sample sizes $n$. A comparison was made by calculating the average mean square error (AMSE) of the estimated model and the average classification success rate (ACSR) over $R$ replications by using the following formulas:

$$\text{AMSE} = \frac{1}{R} \sum_{r=1}^{R} \text{MSE}_r, \qquad \text{ACSR} = \frac{1}{R} \sum_{r=1}^{R} \text{CSR}_r.$$

Here, $\text{MSE}_r$ is the MSE of the model on the replication $r$.

For calculating ACSR, we let $\text{CSR}_r$ be the number of items with correctly estimated membership divided by the sample size $n$ on the replication $r$.
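The simulation design and the two summary measures can be sketched as follows. The sampler assumes the Rayleigh scale parametrization used by NumPy's `Generator.rayleigh`, and `amse` assumes that the MSE on a replication is the mean squared difference between the estimated and true parameter vectors, which is one possible reading of the definition; all names and values are hypothetical.

```python
import numpy as np

def sample_rayleigh_mixture(n, weights, sigmas, rng):
    """Draw n observations and their true component labels."""
    labels = rng.choice(len(weights), size=n, p=weights)
    x = rng.rayleigh(scale=np.asarray(sigmas, dtype=float)[labels])
    return x, labels

def amse(estimates, truth):
    """Average over replications of the per-replication parameter MSE."""
    estimates = np.asarray(estimates, dtype=float)  # shape (R, p)
    truth = np.asarray(truth, dtype=float)          # shape (p,)
    mse_per_rep = np.mean((estimates - truth)**2, axis=1)
    return float(mse_per_rep.mean())

def acsr(predicted, actual):
    """Average over replications of the share of correctly classified items."""
    rates = [np.mean(np.asarray(p) == np.asarray(a))
             for p, a in zip(predicted, actual)]
    return float(np.mean(rates))

# Toy inputs chosen so the results can be checked by hand.
rng = np.random.default_rng(1)
x, labels = sample_rayleigh_mixture(100, [0.3, 0.7], [1.0, 3.0], rng)
amse_value = amse([[1.0, 2.0], [3.0, 2.0]], [2.0, 2.0])
acsr_value = acsr([[0, 1, 1, 0], [1, 1, 0, 0]], [[0, 1, 0, 0], [1, 1, 0, 0]])
```

In a full study, each replication would generate a sample, fit the mixture, and store the estimated parameters and labels before these two functions are applied.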

7. Results

The results in Table 1 are for data generated from the two-component mixture model. They show that the EM algorithm estimates the model parameters well with respect to AMSE, meaning that the estimated parameters are close to the true values used to generate the data. The AMSE decreases as the sample size increases. However, the ACSR decreases when the component parameters are close to each other. Figure 1 compares the PDF curves of the model under the estimated parameters with the curve under the true parameter values used in the simulation.

The results of applying the EM algorithm to the three-component model are shown in Table 2. The AMSE is lower for the large sample sizes than for the small ones. However, as the component parameters get closer to each other, the ACSR decreases, which means that clustering the data becomes more difficult. This can be seen in Figure 2, where the PDF curves of the third scenario do not show a clear separation between the peaks.

8. Conclusion and Discussion

We derived the formulas for estimating the parameters of the finite mixture of Rayleigh distributions by the EM algorithm in a general form. The formulas are valid for the k-component mixture of Rayleigh distributions. The EM estimates become more accurate as the sample size increases. The underlying reason is that the membership variable is computed in the E-step given the observations, so more observations lead to more accurate estimates of the component parameters and weights. We applied the proposed framework to simulated data generated from two-component and three-component mixtures as special cases of the general model. The results show that the algorithm performs well when sample sizes are large and the component parameters are not too close to each other. We used the AMSE and ACSR to measure the accuracy of estimation and of clustering the data into groups. The method was also illustrated with graphs that compare the estimated density with the density under the true parameters used to generate the data in several scenarios.

Data Availability

All data were simulated, and no real data were used.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] S. Y. Huang, Y. L. Feng, and Q. Wu, “Learning theory of minimum error entropy under weak moment conditions,” Analysis and Applications, vol. 20, no. 01, pp. 121–139, 2022.
[2] F. S. Lv and J. Fan, “Optimal learning with Gaussians and correntropy loss,” Analysis and Applications, vol. 19, no. 01, pp. 107–124, 2021.
[3] P. N. Jones and G. J. McLachlan, “Laplace-normal mixtures fitted to wind shear data,” Journal of Applied Statistics, vol. 17, no. 2, pp. 271–276, 1990.
[4] G. K. Kanji, “A mixture model for wind shear data,” Journal of Applied Statistics, vol. 12, no. 1, pp. 49–58, 1985.
[5] C. M. Harris, “On finite mixtures of geometric and negative binomial distributions,” Communications in Statistics - Theory and Methods, vol. 12, no. 9, pp. 987–1007, 1983.
[6] G. S. Rao, “Estimation of reliability in multicomponent stress-strength based on generalized exponential distribution,” Revista Colombiana de Estadística, vol. 35, no. 1, pp. 67–76, 2012.
[7] N. Ueda, R. Nakano, Z. Ghahramani, and G. E. Hinton, “SMEM algorithm for mixture models,” Neural Computation, vol. 12, no. 9, pp. 2109–2128, 2000.
[8] D. Peel and G. J. McLachlan, “Robust mixture modelling using the t distribution,” Statistics and Computing, vol. 10, pp. 335–344, 2000.
[9] I. V. Cadez, C. E. McLaren, and P. Smyth, “Hierarchical models for screening of iron deficiency anemia,” in Proceedings of the Sixteenth International Conference on Machine Learning, pp. 77–86, Morgan Kaufmann, San Francisco, CA, USA, June 1999.
[10] E. L. Lehmann, “Efficient likelihood estimators,” The American Statistician, vol. 34, no. 4, pp. 233–235, 1980.
[11] E. L. Lehmann, Theory of Point Estimation, Wiley, New York, NY, USA, 1983.
[12] D. Karlis and E. Xekalaki, “On testing for the number of components in finite Poisson mixtures,” Annals of the Institute of Statistical Mathematics, vol. 5, pp. 149–162, 1999.
[13] J. Henna, “On estimating of the number of constituents of a finite mixture of continuous distributions,” Annals of the Institute of Statistical Mathematics, vol. 37, no. 2, pp. 235–240, 1985.
[14] C. Liu and D. B. Rubin, “Maximum likelihood estimation of factor analysis using the ECME algorithm with complete and incomplete data,” Statistica Sinica, vol. 8, pp. 729–747, 1998.
[15] T. A. Louis, “Finding the observed information matrix when using the EM algorithm,” Journal of the Royal Statistical Society: Series B, vol. 44, no. 2, pp. 226–233, 1982.
[16] C. Liu and D. B. Rubin, “ML estimation of the t distribution using EM and its extensions, ECM and ECME,” Statistica Sinica, vol. 5, pp. 19–39, 1995.
[17] F. Ali and J. Zhang, “Mixture model-based association analysis with case-control data in genome wide association studies,” Statistical Applications in Genetics and Molecular Biology, vol. 16, no. 3, pp. 173–187, 2017.
[18] C. S. Wong and W. K. Li, “On a mixture autoregressive model,” Journal of the Royal Statistical Society: Series B, vol. 62, no. 1, pp. 91–115, 2000.
[19] N. D. Le, R. D. Martin, and A. E. Raftery, “Modeling flat stretches, bursts, and outliers in time series using mixture transition distribution models,” Journal of the American Statistical Association, vol. 91, no. 436, pp. 1504–1514, 1996.
[20] M. A. Tanner and W. H. Wong, “The calculation of posterior distributions by data augmentation,” Journal of the American Statistical Association, vol. 82, no. 398, pp. 528–540, 1987.
[21] A. E. Gelfand and A. F. M. Smith, “Sampling-based approaches to calculating marginal densities,” Journal of the American Statistical Association, vol. 85, no. 410, pp. 398–409, 1990.
[22] M. K. Cowles and B. P. Carlin, “Markov chain Monte Carlo convergence diagnostics: a comparative review,” Journal of the American Statistical Association, vol. 91, no. 434, pp. 883–904, 1996.
[23] G. Celeux, M. Hurn, and C. P. Robert, “Computational and inferential difficulties with mixture posterior distributions,” Journal of the American Statistical Association, vol. 95, no. 451, pp. 957–970, 2000.
[24] S. P. Brooks, “On Bayesian analysis and finite mixtures for proportions,” Statistics and Computing, vol. 11, no. 2, pp. 179–190, 2001.
[25] G. S. Rao and S. Mbwambo, “Exponentiated inverse Rayleigh distribution and an application to coating weights of iron sheets data,” Journal of Probability and Statistics, vol. 2019, Article ID 7519429, 13 pages, 2019.
[26] S. L. Miller and D. Childers, Probability and Random Processes, Academic Press, Cambridge, MA, USA, 2012.
[27] J. M. G. Taylor, “Semi-parametric estimation in failure time mixture models,” Biometrics, vol. 51, no. 3, pp. 899–907, 1995.
[28] S. L. Sclove, “Application of model-selection criteria to some problems in multivariate analysis,” Psychometrika, vol. 52, no. 3, pp. 333–343, 1987.

Copyright

Copyright © 2022 Noor Mohammed and Fadhaa Ali. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
