Abstract

Honey is considered a premium food produced by honeybees. It is highly appreciated by consumers around the world, and ensuring its authenticity with respect to its production and botanical origin is now a major concern. In Lebanon, honey is mainly multifloral, which makes its authentication rather difficult. While mid-infrared (MIR) spectroscopy combined with multivariate analysis has proven successful in authenticating unifloral honey, its performance with multifloral honey, such as Lebanese honey, remains to be assessed. Therefore, this work aims to test the performance of common components analysis (CCA) applied to mid-infrared spectra in the authentication of multifloral Lebanese honey. For this purpose, 96 multifloral Lebanese honey samples of different floral sources were collected from different regions of the Lebanese territory and analyzed using MIR spectroscopy. CCA applied to the spectral data allowed a separation between honeydew honey samples and floral honey samples. In addition, honey samples collected from the Bekaa plain region were differentiated from the honey samples collected from all the other Lebanese geographical regions. This discrimination between the groups of honey samples is based essentially on their sugar composition.

1. Introduction

Honey is considered a major product in the food industry due to its valuable nutritional and medicinal benefits. It is highly valued for its potential beneficial effects on the human body as an anti-inflammatory, antidiabetic, and antibacterial agent [1].

The honey trade is a significant global industry that provides various economic and nutritional benefits, but it also faces challenges that need to be addressed to ensure the quality and authenticity of the product. In addition, food safety standards are very important in the honey trade. Various laws and regulations, such as the Codex Alimentarius standard (CODEX STAN 12-1981, Rev. 1–200) and the European Commission directive (2001/110/EC), establish guidelines for honey quality.

The production of honey and its quality highly depend on the floral and nectar sources foraged by honeybees. Thus, honey is divided into two categories: nectar honey and honeydew honey. Nectar, which is a sugar-based liquid, originates from the sap of plants and is secreted by special glands called “nectaries,” whereas honeydew is a secretion produced by parasitic insects that suck on the plant sap [2].

The main constituents of nectar, which are carbohydrates (especially sucrose and glucose) and water, are considered constant for a given plant species and have a direct effect on honey composition [3]. Glucose and fructose dominate the composition of honey, and other constituents are also present, such as enzymes, amino acids, organic acids, carotenoids, vitamins, minerals, and aromatic substances [4].

According to the Codex Alimentarius standard, “honey should not have any ingredients added; no particular constituent can be removed from it; it does not have any objectionable matter, flavor, aroma, or taint absorbed from foreign matter during processing and storage; and it should not be heated or processed to such an extent that its essential composition is changed and/or its quality impaired” [5].

Honey produced by bees can be either unifloral or multifloral (polyfloral). The latter contains notable quantities of nectar and/or honeydew derived from different plant species. In contrast, unifloral honey is composed of nectar or honeydew derived from a single plant species. In practice, honey produced by free-flying bees is very rarely purely unifloral [6]. In addition, honey composition, flavor, and aroma vary significantly depending on its botanical and geographical sources as well as the flowering and harvesting seasons, production methods, and storage conditions [7], which increases its complexity.

Nowadays, the major concern is to ensure honey authenticity, as the quality and economic value of honey depend on it. The authentication of honey involves two main aspects: honey production and honey origin. The first aspect is related to the practices used in honey processing, such as the addition of sugar syrup, filtration, heating, and the control of moisture content. The second aspect, honey origin, has a significant impact on market price and overall quality, and it includes both geographical and botanical origins. Monitoring and controlling the authenticity of honey origin is of great importance, and this is achievable through the combination of reliable analytical methods and advanced chemometric tools [8]. Over the years, many studies have investigated and developed new analytical techniques for this purpose. The authentication and characterization of honeys, based on their botanical and geographical origins, were usually performed through classical complementary methods: sensory, melissopalynological, and physicochemical analyses [9]. However, these methods are laborious and time consuming, and they require specialized skills.

The potential of mid-infrared spectroscopy coupled with different chemometric techniques has been evaluated for the authentication of foods [10]. Combining these methods was found to be a promising approach for the authentication of unifloral honey [11, 12]. For example, David et al. used MIR-ATR combined with PLS-DA to classify Romanian honey based on botanical origins and harvesting years [13], while Kasprzyk et al. employed FTIR-ATR followed by discriminant analysis to authenticate unifloral rape honey [11]. However, the characterization and authentication of polyfloral honey remain quite challenging.

In Lebanon, beekeeping is a main activity in the agricultural sector. It is one of the oldest professions in the country and has an important economic impact [14]. Although the country is small, its 10,452 km2 include many geographical and topographical morphologies, with altitudes ranging from 0 to 3000 meters and 4 distinct seasons. As a result, honey production in Lebanon is essentially multifloral [15].

According to Lebanon’s 5th national report to the Convention on Biological Diversity, Lebanon has around 9,116 known species, of which 4,486 are fauna species and 4,630 are flora species, distributed over five geographical regions: (1) the Coastal Zone, (2) Mount Lebanon, (3) the Bekaa Plain, (4) the Anti-Lebanon, and (5) South Lebanon [16].

Moreover, 4 types of honey are produced and commercialized in Lebanon: (1) Citrus blossom honey, (2) Multiple flowers honey, (3) Honeydew honey, and (4) Mountain honey (Jerdi) [17].

Beekeepers in Lebanon tend to move their hives to follow the blossoming flowers. This seasonal migration is known as transhumance. In Lebanon, vertical transhumance is the basis of professional beekeeping. It consists of moving hives from higher to lower altitudes according to the seasonal changes, the temperature variations, and the blooming of flowers [14]. This frequent practice makes the authentication of honeys according to their geographical and botanical origin even more challenging.

Consequently, the aim of this work is to study the efficiency of common components analysis (CCA), a novel exploratory chemometric tool, when applied to mid-infrared spectra for determining certain aspects of the authenticity of multifloral Lebanese honey.

2. Materials and Methods

A total of 96 Lebanese honey samples were collected directly from the beekeepers just after the extraction date and stored at 4°C until analysis. They originated from all Lebanese regions, at altitudes ranging from 0 to 1950 meters, and had different botanical origins. The distribution of the samples according to geographical and botanical origins is summarized in Table 1.

2.1. MIR Spectroscopy

Mid-infrared (MIR) spectra were recorded using a Thermo Scientific Nicolet iS5 spectrometer equipped with an attenuated total reflectance (ATR) ZnSe accessory. MIR measurements were done using a Globar IR source and a DTGS-KBr detector. The honey samples were analyzed in triplicate, without any prior preparation, with 64 scans per spectrum at a spectral resolution of 4 cm−1 in the wavenumber range from 500 to 5000 cm−1.

The MIR spectra were collected in a matrix of size (288, 2335), with 288 rows corresponding to the triplicate analyses of the 96 samples, and 2335 variables corresponding to the MIR wavenumbers (cm−1).
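
The layout of this matrix can be sketched as follows. This is only an illustration: the row ordering (replicates of one sample stored on consecutive rows) and the evenly spaced wavenumber axis are assumptions, not details given in the text, and the array `X` is an empty placeholder rather than measured data.

```python
import numpy as np

n_samples, n_replicates, n_vars = 96, 3, 2335  # values stated in the text

# Assumption: the three replicates of a sample occupy consecutive rows.
sample_id = np.repeat(np.arange(n_samples), n_replicates)

# Assumption: the variables are evenly spaced between 500 and 5000 cm-1.
wavenumbers = np.linspace(500.0, 5000.0, n_vars)
spacing = wavenumbers[1] - wavenumbers[0]

# Placeholder for the spectra (one row per recorded spectrum).
X = np.zeros((n_samples * n_replicates, n_vars))

print(X.shape)            # rows = 96 samples x 3 replicates
print(round(spacing, 3))  # point spacing in cm-1 under the assumption above
```

Under these assumptions the point spacing is a little under 2 cm−1, which is finer than the nominal 4 cm−1 resolution.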

2.2. Data Preprocessing

Data preprocessing methods are chosen based on the data analysis methods. All preprocessing methods aim to reduce random noise and systematic variations in the data to improve their analysis. For this reason, preprocessing is an important step for obtaining reliable results [18].

First, the wavenumber region 1830–5000 cm−1 of the acquired MIR spectra, related to water absorption bands, was removed for all samples. Then, four pretreatments were compared to choose the most suitable one: standard normal variate (SNV) with column centering, probabilistic quotient normalization (PQN) with column centering, first derivative calculation, and detrending with SNV. CCA was applied to each pretreated dataset, and the pretreatment giving the clearest group separation pattern was retained. The pretreatment thus chosen was SNV with column centering applied to the truncated spectral data. In SNV, each individual spectrum is normalized to zero mean and unit variance [19, 20] so as to eliminate baseline shifts and uncontrolled variations in global signal intensity. The mid-infrared data were then centered by column (Figure 1).
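
The retained pretreatment (truncation, SNV, then column centering) can be sketched as below. This is a minimal illustration on synthetic data, not the authors' MATLAB code; the function names, the evenly spaced wavenumber axis, the strict cut-off at 1830 cm−1, and the use of the sample standard deviation (`ddof=1`) are assumptions.

```python
import numpy as np

def truncate(X, wavenumbers, upper=1830.0):
    """Keep only the variables whose wavenumber is below `upper` (cm-1)."""
    keep = wavenumbers < upper
    return X[:, keep], wavenumbers[keep]

def snv(X):
    """Standard normal variate: scale each spectrum (row) to zero mean, unit variance."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)  # assumption: sample standard deviation
    return (X - mean) / std

def center_columns(X):
    """Subtract the mean of each variable (column)."""
    return X - X.mean(axis=0, keepdims=True)

# Synthetic stand-in for spectra: random signal plus a random offset per spectrum.
rng = np.random.default_rng(0)
wavenumbers = np.linspace(500.0, 5000.0, 2335)  # assumed axis
X = rng.random((6, wavenumbers.size)) + rng.random((6, 1))

X_trunc, wn_trunc = truncate(X, wavenumbers)
X_snv = snv(X_trunc)
X_pre = center_columns(X_snv)
print(X_trunc.shape, X_pre.shape)
```

SNV acts row by row, so it does not depend on the other spectra, whereas column centering depends on the whole dataset and must be recomputed if samples are added or removed.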

2.3. Common Components Analysis (CCA)

The pretreated matrix was analyzed by common components analysis (CCA) as a novel exploratory tool.

Common components analysis is a modification of the common components and specific weights analysis procedure (CCSWA or “ComDim”) [21].

The original ComDim algorithm aims to describe m data tables observed for the same n samples but possibly different numbers of variables. Extracted components correspond to the maximum amount of variance that is common to the largest number of tables. The method consists in determining a common space for all m data tables, with each matrix having a specific contribution (“Salience”) to the definition of each dimension of this common space. The procedure iteratively calculates for each successive common component a common scores vector (coordinates of the n samples along the direction defined by that common component) [22].

The idea of the ComDim algorithm is to calculate a weighted sum of the samples’ variance-covariance matrices, $W_G = \sum_{i=1}^{m} \lambda_i X_i X_i^{T}$, using an initial weighting or salience ($\lambda_i$) of 1 for all tables. The vector of scores of the first normed principal component is extracted from $W_G$ as an initial estimate of the first “common component” (CC). The salience of each block is then recalculated from these scores. The estimation of the global scores and saliences is optimized by iterative recalculation until convergence. After computation of the first CC, each original matrix is deflated, and the procedure is repeated for the calculation of the second CC, and so forth [21].
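
The iteration described above can be written out step by step. The notation is ours and follows the usual presentation of ComDim: $X_i$ is the $i$-th data table (assumed centered and scaled to unit norm), $\lambda_i$ its salience, and $q$ the common scores vector.

```latex
\begin{align*}
W_i &= X_i X_i^{T}, \qquad i = 1,\dots,m \\
W_G &= \sum_{i=1}^{m} \lambda_i W_i, \qquad \text{starting from } \lambda_i = 1 \\
q &= \text{first normed eigenvector of } W_G \\
\lambda_i &= q^{T} W_i \, q \\
X_i &\leftarrow \left(I - q q^{T}\right) X_i \qquad \text{(deflation, once } q \text{ and } \lambda_i \text{ have converged)}
\end{align*}
```

The second and third lines are repeated with the updated saliences from the fourth line until $q$ and the $\lambda_i$ stop changing; the last line removes the extracted component before the next one is computed.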

Common components analysis has been proposed as a modification of ComDim where each table contains a single variable. The CCs calculated by CCA are based on linear combinations of the individual variables with strong weightings for a given dispersion of the observations, hence grouping together strongly correlated variables [22].
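
A minimal sketch of this idea is given below: the ComDim iteration is run with every column of the data matrix treated as its own one-variable table. It is an illustration under stated assumptions, not the implementation used in this work. The function name `cca`, the unit-norm scaling of each column, the convergence settings, and the definition of the loadings as the projection of each column on the common scores are all assumptions; the data are synthetic.

```python
import numpy as np

def cca(X, n_components=2, max_iter=200, tol=1e-10):
    """Sketch of CCA as ComDim with one block per variable (column)."""
    X = np.asarray(X, dtype=float)
    # Assumption: each one-variable block is scaled to unit norm.
    norms = np.linalg.norm(X, axis=0)
    Xw = X / np.where(norms > 0, norms, 1.0)
    n, p = Xw.shape
    scores = np.zeros((n, n_components))
    loadings = np.zeros((p, n_components))
    for k in range(n_components):
        lam = np.ones(p)  # initial salience of 1 for every block
        for _ in range(max_iter):
            # Weighted sum over blocks: W_G = sum_j lam_j * x_j x_j^T
            W = (Xw * lam) @ Xw.T
            # eigh sorts eigenvalues in ascending order: take the last eigenvector
            q = np.linalg.eigh(W)[1][:, -1]
            new_lam = (Xw.T @ q) ** 2  # salience of block j: q^T x_j x_j^T q
            converged = np.linalg.norm(new_lam - lam) < tol
            lam = new_lam
            if converged:
                break
        scores[:, k] = q
        loadings[:, k] = Xw.T @ q          # assumed definition of the loadings
        Xw = Xw - np.outer(q, q @ Xw)      # deflate every block by q
    return scores, loadings, loadings ** 2  # saliences are the squared loadings here

# Synthetic example: 5 variables share a latent pattern t, 3 are pure noise.
rng = np.random.default_rng(1)
t = rng.normal(size=30)
X = rng.normal(size=(30, 8))
X[:, :5] = t[:, None] + 0.1 * X[:, :5]
X = X - X.mean(axis=0)

scores, loadings, saliences = cca(X)
print(np.round(saliences[:, 0], 2))
```

In this sketch the strongly correlated variables receive high saliences on the same component, which is the grouping behaviour described above. The sign of each scores vector is arbitrary, as for any eigenvector-based method.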

The advantages of using CCA in exploratory analysis lie in its ability to provide a holistic view of multivariate patterns and its robustness to noise in the data. While some exploratory methods focus on individual datasets or specific aspects of the data, CCA brings a unique perspective by emphasizing the commonality between sets of variables.

Both preprocessing and data treatment were performed using MATLAB R2017a (simulink).

3. Results and Discussion

3.1. Interpretation of the MIR Spectra

An example of a mid-infrared spectrum of a honey sample is shown in Figure 2, and the band assignments in the different zones of the spectrum are summarized in Table 2.

Earlier studies by Anjos et al. and Wang et al. allowed the differentiation of glucose, fructose, and sucrose by analyzing the spectral region 750–1500 cm−1. According to Anjos et al., sugars can be characterized by the peaks related to carbohydrate structure at 918 cm−1, which corresponds to C–H bending, and at 1043 cm−1 and 1254 cm−1, which correspond to the C–O stretch in the C–OH group as well as the C–C stretch. A small peak at 1110 cm−1 is linked to the stretching of the C–O bond of the C–O–C linkage. A peak at 1321 cm−1 is due to O–H bending of the C–OH group, and the peak at 1411 cm−1 is due to a combination of O–H bending of the C–OH group and C–H bending of alkenes [26]. A more detailed assignment was provided earlier by Wang et al., in which the characteristic absorption bands identified for glucose are at 991, 1031, 1080, 1107, 1151, and 1367 cm−1 with a key peak at 1032 cm−1, those for fructose are at 966, 1063, 1083, 1155, 1254, 1346, 1416, and 1456 cm−1 with a key peak at 1053 cm−1, and those for sucrose are at 995, 1055, 1113, and 1138 cm−1 with key peaks at 994 and 1049 cm−1. The characteristic absorption bands for carbohydrates in the region 904–1153 cm−1 were attributed to the stretching modes of C–O and C–C bonds, whereas those in the region 1199–1474 cm−1 were due to the bending modes of the O–C–H, C–C–H, and C–O–H groups [27].
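
The band positions quoted above from Wang et al. [27] can be kept in a small lookup table to list the sugars that have a characteristic band close to a given wavenumber. The sketch below is illustrative only: the function name and the tolerance values are assumptions. Because bands of different sugars can lie within a few cm−1 of one another, such a lookup returns candidate assignments rather than a definitive one.

```python
# Characteristic bands (cm-1) as quoted in the text from Wang et al. [27],
# key peaks included.
SUGAR_BANDS = {
    "glucose": [991, 1031, 1032, 1080, 1107, 1151, 1367],
    "fructose": [966, 1053, 1063, 1083, 1155, 1254, 1346, 1416, 1456],
    "sucrose": [994, 995, 1049, 1055, 1113, 1138],
}

def candidate_sugars(wavenumber, tol=3.0):
    """Return {sugar: [bands]} for the bands within `tol` cm-1 of `wavenumber`."""
    hits = {}
    for sugar, bands in SUGAR_BANDS.items():
        close = [b for b in bands if abs(b - wavenumber) <= tol]
        if close:
            hits[sugar] = close
    return hits

near_997 = candidate_sugars(997.0)            # tolerance of 3 cm-1 (assumed)
near_962 = candidate_sugars(962.3, tol=4.0)   # tolerance of 4 cm-1 (assumed)
print(near_997)
print(near_962)
```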

3.2. Interpretation of the CCA Results

Honey produced in Lebanon is mostly multifloral, as each honey sample contains contributions from several botanical sources and plant species. For this reason, two approaches were used to characterize the honey samples. The first classification is based on the floral sources that the bees fed on, according to the bees’ nutritional sheet provided with each honey sample. The second classification is based on the Lebanese geographical regions from which the samples were collected (Table 1).

The CCA results are presented in terms of scores and loadings plots. The scores plot represents the distribution of the samples and their repetitions in the space of the common components, while the loadings plot shows the contribution of the initial variables to this distribution.

The CCA scores plot of CC1 vs CC2, presented with the grouping based on the botanical origin of honey (Figure 3), shows a separation of the groups of honey samples mainly along CC2: the Honeydew and Citrus blossom honey samples are located on the positive side of the CC2 axis, while the multiple flowers and Mountain honey samples are positioned on the negative side.

The CCA loadings plot of CC2 (Figure 4) shows the contribution of the variables related to MIR wavenumbers characterizing fructose. This concerns the negative peaks at 1051 cm−1 and 962.3 cm−1 on the loadings plot of CC2 (second common component), which strongly influence the position of the honey samples on the negative side of the CC2 axis. This means that multiple flowers and Mountain honey samples are richer in fructose than Honeydew and Citrus blossom honey samples.
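
Reading negative peaks off a loadings vector can be sketched as follows. The loadings used here are a synthetic curve with two dips placed near the positions discussed above; they are not the loadings of Figure 4, and the function name and the peak-picking rule (deepest strict local minima) are assumptions.

```python
import numpy as np

def negative_peaks(wavenumbers, loadings, n_peaks=2):
    """Return the wavenumbers of the `n_peaks` deepest local minima of `loadings`."""
    inner = np.arange(1, len(loadings) - 1)
    is_min = (loadings[inner] < loadings[inner - 1]) & (loadings[inner] < loadings[inner + 1])
    idx = inner[is_min]
    deepest = idx[np.argsort(loadings[idx])[:n_peaks]]  # most negative first
    return wavenumbers[np.sort(deepest)]

# Hypothetical loadings: two Gaussian dips on a 1 cm-1 grid.
wn = np.arange(900.0, 1200.0, 1.0)
cc2_loadings = (-1.0 * np.exp(-((wn - 1051.0) / 8.0) ** 2)
                - 0.6 * np.exp(-((wn - 962.0) / 8.0) ** 2))

peaks = negative_peaks(wn, cc2_loadings)
print(peaks)
```

Samples with negative scores on a component are those whose spectra are relatively more intense at the wavenumbers carrying negative loadings, which is the reasoning used in the interpretation above.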

When examining the dispersion of the Honeydew honey samples along the first component CC1, it is worth mentioning that most of these samples are concentrated on the negative side of CC1. The loadings plot of CC1 (Figure 5) shows an intense negative peak at 997 cm−1. This wavenumber is attributed to sucrose. Hence, most of the Honeydew honey samples analyzed with mid-infrared spectroscopy have high sucrose levels, noting that the predominant type of Honeydew honey collected from Lebanese regions is oak honey.

This result is in agreement with the study of Poyrazoglu et al., who reported that pine and oak honeys (Honeydew honey samples) contained lower concentrations of fructose and higher concentrations of sucrose than the floral honeys, which all showed a relatively higher concentration of fructose [28].

This result also aligns with that of Gok et al., who reported a difference between honey samples originating from trees (honeydew) and those from flowers (floral). The difference was also due to the carbohydrate content [29].

In fact, the same sugars are present in the different honey types, but in different percentages and proportions, which are mainly related to the flora [30]. The main sugars found in honey are fructose and glucose, which are products of the hydrolysis of sucrose by enzymes deposited in the honey by the bees. According to the Codex Alimentarius, as well as the European Commission, Honeydew honey should contain a minimum of 45 g/100 g of the sum of fructose and glucose, while floral honey should contain a minimum of 60 g/100 g of the sum of fructose and glucose. Many studies, such as that of Bergamo et al., showed that Honeydew honey contains a higher concentration of di- and trisaccharides, as well as lower mean contents of glucose and fructose, than nectar honeys [31].
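
The two regulatory minima quoted above can be expressed as a simple check. The function name and the sugar contents in the usage lines are hypothetical and serve only to illustrate the arithmetic; they are not measurements from this study.

```python
# Minimum fructose + glucose content in g/100 g, as quoted in the text.
MIN_FRUCTOSE_PLUS_GLUCOSE = {"honeydew": 45.0, "floral": 60.0}

def meets_sugar_minimum(honey_type, fructose, glucose):
    """True if fructose + glucose (g/100 g) reaches the minimum for this honey type."""
    return fructose + glucose >= MIN_FRUCTOSE_PLUS_GLUCOSE[honey_type]

# Hypothetical contents, for illustration only.
honeydew_ok = meets_sugar_minimum("honeydew", fructose=25.0, glucose=22.0)  # 47 vs 45
floral_ok = meets_sugar_minimum("floral", fructose=30.0, glucose=28.0)      # 58 vs 60
print(honeydew_ok, floral_ok)
```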

Furthermore, the difference between the Lebanese geographical regions from where the honey samples were harvested is also discussed.

The CCA scores plot of CC1 vs CC2, presented with the grouping based on Lebanese geographical regions (Figure 6), shows that the honey samples collected from the coastal zone, Mount Lebanon, and the South of Lebanon are not separated from one another. The only separation seen concerns the samples collected from the Bekaa plain, which are located on the negative side of the CC2 axis.

The intense negative peaks (at 1051 cm−1, 975.8 cm−1, and 962.3 cm−1) observed on the CC2 loadings plot (Figure 4) correspond to the MIR variables that contribute significantly to the separation of the honey groups along the CC2 axis and strongly influence the position of the honey samples collected from the Bekaa plain on the negative side of CC2. This means that honey samples collected from the Bekaa plain are richer in fructose than the honey samples collected from the other regions.

According to the “second report on the state of plant genetic resources for food and agriculture” in Lebanon, the Bekaa plain is known for its cultivated field crops, which include cereals, potatoes, sugar beets, onions, forages, tobacco, and food legumes. These crops can be a source of nectar, which may explain the high concentration of fructose found in the honey samples collected from the Bekaa plain [32].

On the other hand, a closer look at the distribution of the honey samples collected from the coastal zone on the CCA scores plot of CC1 vs CC2 (Figure 6) clearly shows two distinct groups separated along the CC1 axis: one group located on the negative side, corresponding to the Honeydew honey, and another group located on the positive side, corresponding to the multiple flowers honey. The loadings plot of CC1 (Figure 5) shows again that the variable contributing significantly to this separation is the one that corresponds to the intense negative peak at 997 cm−1 characterizing sucrose. This variable describes the samples located on the negative side of the CC1 axis, which represent most of the samples collected from the coastal region and correspond to the Honeydew honey.

Again, according to the “second report on the state of plant genetic resources for food and agriculture” [32], pine trees, carob trees, storax, oak trees, and willows are mainly found in the coastal zone of Lebanon and could thus be the main source of honeydew. This would give most of the honey samples collected from the coastal region a higher concentration of sucrose and place them on the negative side of the CC1 axis.

This differentiation observed in the coastal region is not evident for the South Lebanon and Mount Lebanon regions. This is due to other factors related to altitude and climatic diversity inside each of these Lebanese zones.

Finally, in the case of Lebanese honey, it is very challenging to discriminate between all botanical origins and geographical zones. This is due to the seasonal migration activity, also called transhumance, performed by the beekeepers all year long. Beekeepers tend to move their hives following the different seasons and flower bloom to increase honey production [33]. According to a survey of Lebanese beekeepers [14], 87% of them practice transhumance, with the lowest rate of migratory beekeeping found for beekeepers of the Baalbeck-Hermel region, which may explain the distinction of the honey samples collected from this region. Considering the small area of Lebanon, fields with uniform vegetation do not exist. Honeybees are hence offered a diverse and variable floral environment composed of different plant species all year long. Therefore, the honey samples produced from different floral origins cannot all be differentiated, although a difference is seen between floral and Honeydew honey samples. Transhumance is one of the main reasons behind the heterogeneity of Lebanese honey.

4. Conclusions

This study allowed the characterization of Lebanese multifloral honeys based on their botanical and geographical origins with the aim of authenticating them, a task which has so far remained very challenging. The application of CCA as a novel exploratory tool to mid-infrared spectroscopy data showed a high potential for discriminating between the botanical origins of Lebanese honey. This difference was explained by variables that correspond to MIR wavenumbers related to specific vibrations of fructose and sucrose functional groups.

Moreover, the application of CCA to the MIR spectral data, with grouping based on the geographical origin, made it possible to discriminate the honey samples collected from the Bekaa plain from the honeys collected from the other Lebanese regions. Honey samples collected from the Bekaa plain were shown to be richer in fructose than the other honey samples. This specificity of the region is attributed to its low transhumance activity and to the types of crops cultivated there.

This work constitutes a step forward towards the authentication of multifloral honeys, which has until now been very difficult. The application of CCA to MIR data has been shown to be a promising and efficient method for the characterization of Lebanese multifloral honeys.

Data Availability

The data used to support the findings of this study are available on request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was partially supported by the Lebanese National Council for Scientific Research (CNRS-L).