Abstract

The main aim of this work is to determine the aromaticity of biochar from more easily accessible parameters (e.g., elemental composition). To this end, two machine learning models, an adaptive neurofuzzy inference system (ANFIS) and a least-squares support vector machine (LSSVM), were used to predict this parameter from 98 data points gathered from previously reported sources. The statistical parameters showed that the LSSVM model can estimate the target parameter with an R-squared value of 0.986 and a mean relative error of 3.821 for the overall dataset. In addition, a sensitivity analysis of the input parameters showed that the carbon percentage has the greatest effect on the target values, so particular attention should be paid to this parameter. Finally, a comparison of the methods proposed in this paper with models published in previous studies showed that our model predicts the target parameter with higher accuracy.

1. Introduction

Biochar is produced by the thermochemical conversion of biomass under oxygen-limited conditions [1, 2]. As a carbon-negative product and an efficient means of enhancing soil productivity, biochar has recently gained attention [3–5]. It also has other ecological benefits, such as storing carbon to mitigate the effects of climate change [6–8]. A significant characteristic of biochar is its aromaticity, which may enhance its durability and affect the soil ecosystem [9, 10]. Biochar with a higher proportion of aromatic C has greater chemical stability and resistance to biological degradation [11, 12]. Aromaticity refers to the proportion of carbon atoms found in aromatic rings [13, 14]. According to Han et al., the aromaticity of biochar C is linked to its ability to adsorb organic contaminants [15]. According to Singh et al. and Xu et al., when biochar is incubated over time, the quantity of CO2 produced correlates directly with the percentage of aromatic C present at the beginning of the process [16, 17].

Several chemical and physical techniques have been employed to quantify aromatic C in biochar, including solid-state 13C nuclear magnetic resonance (13C NMR) spectroscopy, near-edge X-ray absorption fine structure (NEXAFS) spectroscopy, benzene polycarboxylic acid (BPCA) assessment, and lipid profiling [18–21]. However, these methods are time-consuming, costly, and difficult to implement, and they are generally available only at research-intensive institutions [22]. Simple and reliable proxies have therefore been used to evaluate biochar aromaticity [23–25].

Because the aromaticity of biochar C can be predicted accurately from certain basic properties of biochar, numerical modeling may be a cost-effective and practical option for assessing the degree of aromaticity. Maroto-Valer and colleagues used the atomic H/C ratio to create a linear quantitative method for predicting the C aromaticity of bituminous char [26]. The linear Maroto-Valer formula, however, is applicable only for H/C ratios between 0.5 and 0.8. Mazumdar therefore created a more precise method for predicting C aromaticity, based on elemental compositions obtained from polycyclic aromatic hydrocarbons (PAHs, which contain only C and H) and an updated densimetric technique [27]. Compared with the Maroto-Valer approach, this method has higher predictive ability because it considers the structural information of the C and H atoms. Wang and colleagues subsequently developed a modified model that incorporates the heteroatoms H, O, and N to predict biochar aromaticity [28]. However, the extrapolation capability of the modified model is likewise restricted when the H/C ratio of biochar exceeds 1.0. Novel predictive models for biochar aromaticity with a wider range of applicability must therefore be developed.

Artificial intelligence strategies combined with machine learning can successfully simulate a variety of phenomena [29–31]. Such approaches have been employed to estimate biochar production, discover inorganic phosphor hosts, investigate the toxic effects of nanomaterials, and improve the architecture of magnetoelastic Fe-Ga alloys [13]. Artificial neural networks (ANN), support vector machines (SVM), Bayesian networks (BN), and random forests (RF) are examples of such techniques in practical use. Numerous idealized assumptions were introduced during the development of the modified Mazumdar model to incorporate heteroatom structural data (e.g., H, O, and N). For example, at high C/N ratios, one C=O bond was replaced by two C-H bonds and one N atom by a C atom [28]. When such idealized assumptions are adopted, essential model details are omitted, and the predictive ability of the modified Mazumdar model is not adequate for practical use. A predictive model with greater generality is therefore needed. The primary aims of this study are as follows: (1) to create elemental-composition-based prediction models with improved generalization capability for biochar aromaticity and (2) to uncover new relationships and rules between the elemental composition and biochar C aromaticity. In this paper, LSSVM and ANFIS models are used for the first time to predict biochar aromaticity with acceptable accuracy.

2. Least-Squares Support Vector Machine (LSSVM)

Drawing on the concepts of structural risk minimization and statistical learning theory, Vapnik created the SVM algorithm as a new machine learning technique [32–34]. The LSSVM method was built on this SVM framework by Suykens [35]. Two tuning parameters, γ and σ², must be determined for the LSSVM method [36]. More information on the LSSVM method and its formulas can be found in [37–39].
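As an illustration of how γ and σ² enter the method, the following minimal sketch solves the standard LSSVM regression linear system with a Gaussian (RBF) kernel. It is not the implementation used in this study: the function names, the kernel convention exp(−‖x − z‖²/σ²), and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """Gaussian kernel K[i, j] = exp(-||a_i - b_j||^2 / sigma2) (assumed convention)."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / sigma2)

def lssvm_fit(X, y, gamma, sigma2):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y] for b and alpha."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]  # bias b, multipliers alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma2):
    """Evaluate f(x) = sum_i alpha_i * K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b

# Synthetic demonstration data (hypothetical, not the biochar dataset).
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(40, 3))
y_demo = X_demo[:, 0] + 0.5 * X_demo[:, 1] ** 2
b_demo, alpha_demo = lssvm_fit(X_demo, y_demo, gamma=100.0, sigma2=1.0)
y_fit = lssvm_predict(X_demo, b_demo, alpha_demo, X_demo, sigma2=1.0)
```

A larger γ penalizes training errors more strongly, while σ² controls the width of the kernel; in practice both would be tuned, for example by cross-validation.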

3. Adaptive Neurofuzzy Inference System (ANFIS)

The ANFIS method combines the characteristics of fuzzy logic and ANN [40, 41]. The algorithm comprises five separate layers, each with a distinct function, which together convert linguistic terms into computational variables [42, 43]. More information on these layers can be found in [44, 45].
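The five layers can be illustrated with the forward pass of a first-order Sugeno-type ANFIS. The sketch below is only an assumption about the structure (Gaussian membership functions, one rule per row of the parameter arrays, product firing strengths); the membership shapes and rule base used in this study are not specified here, and all names are hypothetical.

```python
import numpy as np

def anfis_forward(x, centers, widths, consequents):
    """Forward pass of a first-order Sugeno ANFIS (illustrative structure).

    x           : (n_inputs,) input vector
    centers     : (n_rules, n_inputs) Gaussian membership centers
    widths      : (n_rules, n_inputs) Gaussian membership widths
    consequents : (n_rules, n_inputs + 1) linear coefficients, bias last
    """
    x = np.asarray(x, dtype=float)
    # Layer 1: fuzzification, membership degree of each input in each rule.
    mu = np.exp(-0.5 * ((x - centers) / widths) ** 2)
    # Layer 2: firing strength of each rule (product of its memberships).
    w = np.prod(mu, axis=1)
    # Layer 3: normalization of the firing strengths.
    w_bar = w / np.sum(w)
    # Layer 4: linear consequent of each rule, evaluated at x.
    f = consequents[:, :-1] @ x + consequents[:, -1]
    # Layer 5: overall output as the weighted sum of rule outputs.
    return float(np.sum(w_bar * f))

# Two rules on two inputs (hypothetical parameter values).
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
widths = np.ones((2, 2))
consequents = np.array([[1.0, 2.0, 0.5], [1.0, 2.0, 0.5]])
output = anfis_forward([0.3, 0.7], centers, widths, consequents)
```

During training, the membership parameters (layer 1) and the consequent coefficients (layer 4) are the quantities that are adjusted.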

4. Preanalysis Phase

Two alternative methods were utilized in this study to estimate and evaluate the aromaticity of biochar. For this purpose, 98 data points were collected from previously published articles [13]. The collected experimental data were used to train the models, with about 25% of the data points reserved for model validation. In addition, the dataset was normalized as follows:

In this formula, x represents the raw input value, and the left-hand side represents its normalized value.
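The normalization expression itself is not reproduced above. Purely as an illustration, the sketch below assumes column-wise min-max scaling to the interval [−1, 1], which is one common choice and may differ from the formula used in this study; the function name and the sample values are hypothetical.

```python
import numpy as np

def minmax_normalize(x, lower=-1.0, upper=1.0):
    """Scale each column of x linearly so its minimum maps to lower and its maximum to upper.

    Assumes every column contains at least two distinct values.
    """
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    return lower + (upper - lower) * (x - x_min) / (x_max - x_min)

# Hypothetical raw inputs: three samples, two features.
raw = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
normalized = minmax_normalize(raw)
```

The minimum and maximum should be computed on the training data only and then reused for the test data, so that no information leaks from the test set.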

5. Outlier Identification

Suspected data points, or outliers, that behave differently from the rest of the databank are common in large datasets [46, 47]. Such data points may affect the dependability and accuracy of the models [48]. Given the importance of the training dataset, it is critical to find these points when developing the models. Disregarding unexplained effects imposes limits on a model, and investigating the outliers may provide insight into these limits, which is one benefit of this analysis [49]. To identify outlier data, the leverage technique was used. This technique involves calculating the deviations of the model predictions from the corresponding experimental data, known as standardized cross-validated residuals, and building the hat matrix. The hat matrix is calculated in this study using the following formula [50, 51]:

$$H = X\left(X^{T}X\right)^{-1}X^{T},$$

in which X is a matrix of size N × P, where N and P are the numbers of data points and input parameters, respectively, and T and −1 denote the transpose and inverse operators, respectively. Furthermore, a warning leverage value, $H^{*}$, was defined; it is commonly taken as

$$H^{*} = \frac{3(P+1)}{N}.$$

The feasible region is specified as a rectangle bounded by the warning leverage value and by R = ±3, where R is the standardized residual. Only 3 suspected data points are found within the total dataset, as shown by the red dots in Figure 1.
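The leverage check can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the procedure used in this study: the warning leverage is taken as 3(P + 1)/N, the residuals are standardized simply by dividing by their sample standard deviation, and the function name and synthetic data are hypothetical.

```python
import numpy as np

def leverage_analysis(X, y_true, y_pred, r_limit=3.0):
    """Return leverages, standardized residuals, warning leverage, and outlier flags."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    # Diagonal of the hat matrix H = X (X^T X)^-1 X^T, without forming H explicitly.
    B = np.linalg.solve(X.T @ X, X.T)          # shape (p, n)
    h = np.sum(X * B.T, axis=1)
    h_star = 3.0 * (p + 1) / n                 # assumed warning leverage
    resid = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    r = resid / np.std(resid, ddof=1)          # simple standardization (assumption)
    suspected = (h > h_star) | (np.abs(r) > r_limit)
    return h, r, h_star, suspected

# Synthetic example with one deliberately bad prediction (hypothetical data).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(50, 3))
y_true_demo = np.zeros(50)
y_pred_demo = rng.normal(scale=0.01, size=50)
y_pred_demo[0] = 10.0
h, r, h_star, suspected = leverage_analysis(X_demo, y_true_demo, y_pred_demo)
```

Plotting r against h, with vertical and horizontal limits at the warning leverage and at ±3, gives the rectangle that delimits the feasible region.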

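The relevancy factor is calculated as [52, 53]

$$r_k = \frac{\sum_{i=1}^{N}\left(X_{k,i}-\bar{X}_k\right)\left(Y_i-\bar{Y}\right)}{\sqrt{\sum_{i=1}^{N}\left(X_{k,i}-\bar{X}_k\right)^{2}\,\sum_{i=1}^{N}\left(Y_i-\bar{Y}\right)^{2}}},$$

in which $X_{k,i}$ denotes the ith value of input k, $\bar{X}_k$ is the average of input k, $Y_i$ is output i, and $\bar{Y}$ stands for the average output.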
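Assuming the relevancy factor takes the usual correlation-type form (the covariance of an input and the output divided by the product of their spreads), it can be computed as in the short sketch below. The function name and the sample values are hypothetical and are not taken from the dataset of this study.

```python
import numpy as np

def relevancy_factor(x_k, y):
    """Correlation-type relevancy factor between one input column and the output."""
    x_k = np.asarray(x_k, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x_k - x_k.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

# Hypothetical values: an input that rises with the output and one that falls.
target = np.array([0.2, 0.4, 0.5, 0.7, 0.9])
rising_input = np.array([40.0, 55.0, 60.0, 72.0, 85.0])
falling_input = np.array([30.0, 22.0, 20.0, 12.0, 5.0])
r_rising = relevancy_factor(rising_input, target)
r_falling = relevancy_factor(falling_input, target)
```

The factor lies between −1 and +1; its sign indicates whether the input is directly or inversely related to the output, and its magnitude indicates the strength of the relationship.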

A larger relevancy factor implies a stronger influence of the corresponding parameter on the output. Figure 2 illustrates the influence of the inputs on the aromaticity of biochar. The relevancy factors obtained for N% and C% were 0.14 and 0.90, respectively. This suggests that these parameters make the strongest contributions to the estimated aromaticity of biochar and that both are directly related to the output; in particular, the target value increases with C%. O% was found to have a small relevancy factor, from which it can be concluded that it undergoes only small variations within the empirical dataset.

6. Verification Method

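Model development is not complete without verifying the accuracy and reliability of the conclusions drawn. To that end, the validity of the proposed models is evaluated using the coefficient of determination (R2), mean relative error (MRE), mean squared error (MSE), root-mean-square error (RMSE), and standard deviation (STD) [54–56].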
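The evaluation criteria can be computed as in the following sketch. The exact expressions used in this study are not reproduced above, so the definitions below are assumptions based on common usage: MRE is expressed in percent, and STD is taken as the sample standard deviation of the prediction errors. The function name and the sample values are hypothetical.

```python
import numpy as np

def evaluate_model(y_exp, y_pred):
    """Return R2, MRE (%), MSE, RMSE, and STD under commonly used definitions."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_exp
    mse = float(np.mean(err**2))
    ss_res = float(np.sum(err**2))
    ss_tot = float(np.sum((y_exp - y_exp.mean()) ** 2))
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MRE": float(100.0 * np.mean(np.abs(err / y_exp))),  # percent (assumption)
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "STD": float(np.std(err, ddof=1)),  # spread of the errors (assumption)
    }

# Hypothetical experimental and predicted values.
scores = evaluate_model([1.0, 2.0, 4.0], [1.1, 1.8, 4.4])
```

Reporting these indices separately for the training and testing subsets shows whether a model generalizes or merely fits the training data.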

7. Results and Discussion

To determine the aromaticity of biochar from its elemental composition, this research has established two models, namely, LSSVM and ANFIS. The dataset was divided into training and testing subsets, and 75% of the data points were used to train the models. In this step, the effectiveness of the training procedure is expressed by a comparison between the predicted and experimental values of biochar aromaticity. Comparing the predictions in the testing stage, on the other hand, gives insight into the accuracy of the models under unseen conditions, which is known as model generalization. Figure 3 presents a side-by-side comparison of the calculated and experimental values of biochar aromaticity for all trained models on the training and testing datasets. As demonstrated, the calculated values closely follow the experimental points while maintaining satisfactory performance.
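The division into training and testing subsets can be sketched as follows. The random shuffling, the seed, and the function name are assumptions for illustration, and the arrays are placeholders with the same number of rows as the dataset described here.

```python
import numpy as np

def split_train_test(X, y, train_fraction=0.75, seed=0):
    """Shuffle the rows and split them into training and testing subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_train = int(train_fraction * len(X))  # rounds down
    train_idx, test_idx = order[:n_train], order[n_train:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Placeholder arrays with 98 rows (values are not real measurements).
X_all = np.arange(98 * 3, dtype=float).reshape(98, 3)
y_all = np.arange(98, dtype=float)
X_train, X_test, y_train, y_test = split_train_test(X_all, y_all)
```

With 98 data points and a training fraction of 0.75, this sketch assigns 73 points to training and 25 to testing.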

Based on these results, it can be concluded that the proposed LSSVM model can calculate the biochar aromaticity with excellent confidence. The usefulness of the proposed models was evaluated using a variety of graphical and mathematical methods. Figure 4 compares the biochar aromaticity values estimated by the proposed models with the experimental data.

As can be seen, the data points are most densely concentrated along the bisector line. Figure 5 depicts the relative deviation plots of the ANFIS and LSSVM models, which show the relative deviation against the biochar aromaticity during the training and testing phases. The majority of the calculated deviations lie around the zero-error line. Furthermore, the deviations of the LSSVM model are more tightly clustered around this line than those of the ANFIS model.

Table 1 shows the statistical indices calculated for the proposed models. The proposed models have high R2 values and low RMSE, MRE, STD, and MSE values, indicating their excellent capability to estimate the biochar aromaticity.

8. Conclusion

Two accurate models, ANFIS and LSSVM, were proposed in this paper. These models operate on the elemental composition of biochar. Although experimental data are still limited, artificial-intelligence-based estimates of biochar aromaticity may help overcome this barrier. Researchers may propose novel measurement methods with the assistance of established prediction tools. With R2, MRE, MSE, RMSE, and STD values of 0.986, 3.821, 0.0008, 0.0203, and 0.0201, respectively, the LSSVM model was shown to be the more reliable of the two. In accordance with these findings, this approach demonstrated better precision, generalization, and reliability than previously studied mathematical techniques. The effect of the input parameters on the output values was also studied using a sensitivity analysis.

Data Availability

Data references are described in the text of the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was funded by the Fundamental Research Funds for the Central Universities (2017B685X14) and Postgraduate Research and Practice Innovation Program of Jiangsu Province (KYCX17_0426).