1. Introduction
Membrane filtration techniques, particularly rotating disk membrane (RDM) filtration systems, are commonly employed to provide higher separation quality with lower maintenance costs and easier operational control[1,2]. However, the performance of such processes is significantly affected by fouling, that is, cake build-up on the surface or in the pores of the membrane caused by the filtered effluent[3,4]. The main objectives of RDM filtration are therefore to increase the shear rate and reduce cake build-up in order to improve the permeate flux[1,5,6]. Various investigations have been carried out on the use of RDM for the filtration of many kinds of feed fluids, such as yeast suspensions[7], calcium carbonate suspensions[7,8], ferric hydroxide[9], and chicory juice[10,11]. In all cases, it was found that RDM could improve the filtration flux by increasing the shear, which reduces both concentration polarization in nanofiltration and ultrafiltration and cake build-up in microfiltration[5]. To save time and experimental cost, researchers have long sought efficient ways to understand the phenomenon without running new experiments, using modeling, simulation tools, and collected experimental databases[12,13]. Many mathematical models have been developed in the literature to evaluate the effect of the shear rate on membrane filtration through reduced cake accumulation on the membrane surface[6,14]. Nevertheless, these models could not simulate the membrane flux decline accurately because of the large number of fitting parameters related to the membrane, the feed fluid and its quality, and the filtration mechanism[12]. In recent years, artificial neural networks (ANN) have become one of the most powerful modeling tools in membrane filtration technology[12]. In this context, several studies have been carried out to develop mathematical models that predict the flux using prediction methods and optimization algorithms[15].
Soleimani et al.[16] used ANN to predict the flux decline and fouling resistance in oily wastewater treatment, together with a genetic algorithm (GA) to optimize the operating conditions. Jawad et al.[17] employed ANN and multiple linear regression (MLR) to study the permeate flux in forward osmosis, based on experimental data from the literature with nine input parameters; they found that ANN is better than MLR at relating inputs to outputs. Sahoo et al.[18] used genetic algorithms (GAs) to find the best geometry and internal parameter values for two ANN learning algorithms. They used four input parameters (pH, feed water, particle diameter, and ionic strength) and the flux decline as the output parameter. Bagheri et al.[19], in their critical review, assessed the performance of artificial intelligence (AI) and machine learning in controlling membrane fouling. They found that ANN can predict fouling with an R2 of 0.99 (an error close to zero). Liu et al.[20] predicted transmembrane pressure (TMP) fouling in microfiltration for water treatment with an ANN approach using five inputs. Their results showed good agreement between the ANN model and the experimental data. The support vector machine (SVM) is a relatively new and promising technique that has already shown good results in medical diagnostics, electric load prediction, and other domains[21]; it is a non-linear and nonparametric regression technique. To predict clogging in a membrane bioreactor (MBR), Li and Tao[22] first used a simulated annealing (SA) algorithm to optimize the three important SVM parameters and then employed the SVM to predict fouling in the MBR. SVM was also used by Hooman Adib et al.[23] to simulate the fouling resistance and permeate flux decline of oily wastewater in a tangential flow ultrafiltration membrane.
They used TMP, temperature, tangential velocity, and pH as input variables, and permeate flux and fouling resistance as output variables. The SVM results showed good agreement with the experimental data, with the same R2 of 0.99 for permeate flux decline and fouling resistance. Kui Gao et al.[24] employed SVM to predict the permeate flux of dead-end microfiltration membranes in a batch reactor. The hydraulic retention time, temperature, dissolved oxygen, mixed liquor suspended solids, transmembrane pressure, and operation time were all considered to affect the membrane permeate flux. They found excellent agreement between the experimental data and the SVM-predicted values. SVM and ANN were used by Nur Sakinah Ahmad Y et al.[25] to model and forecast membrane fouling. They validated their model with experimental data gathered from the filtration of palm oil mill effluent and found that SVM is just as good at making precise predictions as ANN. Although many studies have been published on the modeling of dynamic membranes using ANN[16-20], less research has addressed the combined use of SVM and ANN to model such systems. This study therefore aimed to model the decline of permeate flux in RDM using both ANN and SVM models. Moreover, because several experimental studies[10,26,27] have been carried out on RDM while only a few theoretical models exist, the main objective of this research was to identify the better approach, ANN or SVM, for predicting the flux decline in RDM. To this end, the main characteristics of the ANN and SVM models were compared and validated against experimental RDM results.
2. Theoretical Background
The filtrate flux J is written as:
$$J = \frac{1}{A}\,\frac{dV_p}{dt} \qquad (1)$$
where A is the effective membrane surface area, Vp is the total permeate volume, and t is the filtration time.
The local pressure on the membrane surface in the radial direction can be calculated as follows[8]:
where Pc is the peripheral pressure, ρ is the density of the fluid, k is the velocity factor (0.42 and 0.84 for the disk without and with vanes, respectively)[13], and ω is the angular velocity. The mean transmembrane pressure (TMP) can be found by integrating the local pressure over the membrane surface with respect to r[28]:
The permeability of pure water Lp can be calculated from the following equation:
The permeability of filtration in the case of the feed fluid varies with the resistance of the cake Rc:
where R represents the resistance of the membrane and μ represents the viscosity of the fluid.
The permeate flux can be deduced using Eq. 6:
In this equation, the cake resistance is not constant; it varies over time. Therefore, the decline of the permeate flux at the RDM surface is quite difficult to estimate. Nonlinear modeling with ANN or SVM is thus a possible alternative for describing the flux decline.
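As a rough numerical illustration of this section, the sketch below assumes the forms commonly used for rotating disk modules: a local pressure p(r) = Pc - (1/2)ρk²ω²(Rd² - r²), whose area average over the disk is TMP = Pc - (1/4)ρk²ω²Rd², and a resistance-in-series flux J = TMP / (μ(R + Rc)). The disk radius Rd, the function names, and all numerical values are assumptions introduced for illustration only; they are not taken from the dataset used in this work.

```python
import math

def local_pressure(p_c, rho, k, omega, r_disk, r):
    """Assumed local pressure (Pa) at radius r for a fluid core rotating at k*omega."""
    return p_c - 0.5 * rho * (k * omega) ** 2 * (r_disk ** 2 - r ** 2)

def mean_tmp(p_c, rho, k, omega, r_disk):
    """Closed-form area average of local_pressure over the disk."""
    return p_c - 0.25 * rho * (k * omega) ** 2 * r_disk ** 2

def mean_tmp_numeric(p_c, rho, k, omega, r_disk, steps=10000):
    """Midpoint-rule check: (2 / Rd^2) * integral of p(r) * r dr from 0 to Rd."""
    dr = r_disk / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += local_pressure(p_c, rho, k, omega, r_disk, r) * r * dr
    return 2.0 * total / r_disk ** 2

def permeate_flux(tmp, mu, r_membrane, r_cake):
    """Resistance-in-series flux (m/s): J = TMP / (mu * (R + Rc))."""
    return tmp / (mu * (r_membrane + r_cake))

# Hypothetical operating point: 2000 rpm, disk without vanes (k = 0.42).
omega = 2000 * 2 * math.pi / 60          # angular velocity in rad/s
tmp = mean_tmp(p_c=100e3, rho=1000.0, k=0.42, omega=omega, r_disk=0.075)

# The flux drops as the (hypothetical) cake resistance grows over time.
for r_cake in (0.0, 5e11, 2e12):
    flux = permeate_flux(tmp, mu=1e-3, r_membrane=1e12, r_cake=r_cake)
    print(f"Rc = {r_cake:.1e} 1/m, J = {flux:.3e} m/s")
```

Because Rc changes with time in a way that depends on the feed and the shear, a closed-form expression like this cannot be evaluated without a model for Rc(t), which is what motivates the data-driven approach.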
3. Dataset Collection
The experimental data used in this work were collected from previously published works[7-11] and are summarized in Table 1, which lists the maximum, minimum, mean, and standard deviation (SD) of each parameter. The dataset consists of 1284 data points for different feed fluids: chicory juice, calcium carbonate suspension, yeast suspension, and ferric hydroxide solution. The data were selected to cover different operating conditions and filtration parameters.
4. Results and Discussion
4.1. Model performances
The performance of the developed models was evaluated based on three statistical parameters: the correlation coefficient (R), the coefficient of determination (R2), and the average absolute relative deviation (AARD%). The mathematical definitions of R, R2, and AARD% are given by Eq. 7 to Eq. 9, respectively.
The average absolute relative deviation (AARD) is given by Eq. 9:
where N is the number of data points in the dataset for each phase (training, test, or whole dataset), ycal is the calculated (predicted) value of y for the training or the test set, yexp is the experimental (observed) value of y, ȳcal is the average of the calculated values of y, and ȳexp is the average of the experimental values of y.
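The three criteria can be sketched in a few lines. The block assumes the usual definitions: R as the Pearson correlation between experimental and calculated values, R2 as its square, and AARD% as the mean of |ycal - yexp| / yexp expressed in percent. The function names and the sample values are hypothetical.

```python
import math

def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)

def correlation_coefficient(y_exp, y_cal):
    """Pearson correlation R between experimental and calculated values."""
    m_exp, m_cal = mean(y_exp), mean(y_cal)
    cov = sum((e - m_exp) * (c - m_cal) for e, c in zip(y_exp, y_cal))
    var_exp = sum((e - m_exp) ** 2 for e in y_exp)
    var_cal = sum((c - m_cal) ** 2 for c in y_cal)
    return cov / math.sqrt(var_exp * var_cal)

def determination_coefficient(y_exp, y_cal):
    """R2 taken here as the square of R (one common convention)."""
    return correlation_coefficient(y_exp, y_cal) ** 2

def aard_percent(y_exp, y_cal):
    """Average absolute relative deviation in percent; y_exp must be non-zero."""
    total = sum(abs((c - e) / e) for e, c in zip(y_exp, y_cal))
    return 100.0 * total / len(y_exp)

# Hypothetical fluxes (arbitrary units), not taken from the dataset.
y_exp = [100.0, 200.0, 400.0]
y_cal = [110.0, 190.0, 400.0]
print(correlation_coefficient(y_exp, y_cal))
print(determination_coefficient(y_exp, y_cal))
print(aard_percent(y_exp, y_cal))
```

AARD% weights every point by its own magnitude, so it is more sensitive than R to errors on small flux values late in a filtration run.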
5. Support Vector Regression
Support vector machine (SVM) regression is a robust tool for solving nonlinear regression problems. The gradient descent algorithm is the most widely applied algorithm for selecting SVM parameters. In this study, the seven (07) selected inputs and a set of 1284 data points were used to build the SVM model, which was developed by writing a script in the MATLAB environment. To improve the forecasting accuracy of the SVM model, a gradient descent algorithm was used to search for the optimal parameters, namely the box constraint C, the epsilon ε, and the kernel scale σ of the kernel function. The RBF, Gaussian, and polynomial kernel functions were tested to select the model with the best statistical performance. The model was built with the SVM regression function of MATLAB® R2019b. The "holdout" cross-validation method was selected, in which 70% of the data were randomly assigned to training and 30% to testing. Since there is no rule for determining the appropriate parameter values of an SVM model, the final parameters were determined by trial and error during the test stage (Table 2).
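The ingredients named above can be illustrated independently of MATLAB. The following Python sketch shows a 70/30 holdout split of 1284 points, a Gaussian kernel written with a kernel scale σ (assuming the convention in which predictor differences are divided by σ before the squared distance is taken), and the ε-insensitive loss that SVM regression minimizes. It is not the script used in this work; function names, the seed, and the sample vectors are hypothetical.

```python
import math
import random

def holdout_split(n_points, train_fraction=0.7, seed=0):
    """Randomly partition point indices into training and test sets."""
    indices = list(range(n_points))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_points)
    return indices[:n_train], indices[n_train:]

def gaussian_kernel(x1, x2, kernel_scale):
    """Gaussian kernel exp(-||(x1 - x2) / sigma||^2) under the assumed scaling."""
    sq_dist = sum(((a - b) / kernel_scale) ** 2 for a, b in zip(x1, x2))
    return math.exp(-sq_dist)

def epsilon_insensitive_loss(y_exp, y_cal, epsilon):
    """Errors smaller than epsilon are ignored; larger ones grow linearly."""
    return max(0.0, abs(y_exp - y_cal) - epsilon)

train_idx, test_idx = holdout_split(1284)
print(len(train_idx), len(test_idx))

# Two hypothetical, already normalized seven-dimensional input vectors.
x_a = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
x_b = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.9]
print(gaussian_kernel(x_a, x_b, kernel_scale=1.0))
print(epsilon_insensitive_loss(10.0, 10.4, epsilon=0.5))
```

A larger C penalizes points outside the ε-tube more heavily, while σ controls how quickly the kernel similarity decays with distance, which is why the three parameters are tuned together.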
Fig. 1 shows the regression analysis between the values predicted by SVM and the experimental permeate fluxes for the training, test, and global sets. The best SVM model captures 99.84% of the variability of the permeate flux over the whole data set (Fig. 1-c), 99.85% for the training stage (Fig. 1-a), and 99.83% for the test stage (Fig. 1-b). The optimal SVM model therefore shows strong correlations at every stage, which means that it can accurately fit the permeate flux within the range of input variables covered during training. Table 3 summarizes the statistical criteria and errors of the optimized SVM model. All of them satisfy the acceptability conditions (correlation close to unity and errors close to zero).
6. Artificial Neural Networks
In this work, a multi-layer feed-forward back-propagation neural network (FFBP MLP-NN) was developed to model the permeate flux. The MLP-NN was trained with the Levenberg-Marquardt optimization algorithm. Different combinations of MLP-NN parameters were then tested, including the transfer function, the number of hidden layers, and the number of hidden neurons.
These parameters were selected by trial and error during the test stage, based on the statistical parameters cited above. An MLP with one hidden layer is capable of mapping the non-linear relationship between dependent and independent variables. Consequently, an MLP with one hidden layer, seven (07) input variables, and one output variable was adopted. The number of hidden neurons was varied from 1 to 30, and several transfer functions were tested together with the other parameters. The optimal settings found for the best ANN model are summarized in Table 4.
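To make the structure of such a network concrete, the sketch below evaluates the forward pass of a one-hidden-layer MLP with seven inputs and one output. It assumes ten hidden neurons (the 7-10-1 architecture reported in the conclusion), a hyperbolic tangent hidden transfer function, and a linear output; the transfer functions and the random weights are assumptions for illustration, not the trained parameters of this work.

```python
import math
import random

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: tanh hidden layer followed by a single linear output neuron."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Hypothetical, untrained weights for a 7-10-1 network.
rng = random.Random(0)
n_in, n_hidden = 7, 10
w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [rng.uniform(-1, 1) for _ in range(n_hidden)]
w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b_out = 0.0

# One normalized input vector (seven inputs, placeholder values).
x = [0.5] * n_in
flux_scaled = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
print(flux_scaled)
```

Training with Levenberg-Marquardt adjusts w_hidden, b_hidden, w_out, and b_out; once they are fixed, predicting a flux value costs only this single pass.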
Fig. 2 depicts the regression plots between experimental and predicted permeate fluxes using ANN for the four stages. The results indicate that the best model is characterized by a low AARD and high regression coefficients. Since the correlation coefficients exceed 0.99 for the training, validation, and test stages and for the entire data set, as shown in Fig. 2-a, Fig. 2-b, and Fig. 2-c, the developed ANN model can provide the permeate flux with high accuracy. Table 5 lists the statistical errors and correlations for the four stages, while Fig. 3 gives a quick comparison between ANN and SVM for estimating the permeate flux. Fig. 3 shows that the ANN and SVM models have almost the same correlation coefficient, but the ANN yields a lower AARD during the test stage. Consequently, ANN is the better option for predicting the permeate flux.
7. Comparison between Experimental and ANN Modeling
A further comparison between the experimental permeate flux (empty symbols) and the flux predicted by ANN (filled symbols) for different operating conditions is shown in Fig. 4. For a rotating velocity of 2000 rpm and a TMP of 100 kPa (Fig. 4-a), the ANN predictions match the experimental data closely over the entire filtration cycle, with an AARD of 1.126%. However, for TMP values of 80 kPa and 49 kPa, the AARD is 3.29% and 6.74%, respectively.
For a rotating velocity of 1500 rpm (Fig. 4-b) and a TMP of 50 kPa, the comparison between experimental and ANN results versus time showed a slight deviation in the time range of 0 to 300 min, with a very acceptable AARD of 0.74%. For a rotating velocity of 1000 rpm (Fig. 4-c), the ANN predictions performed best at a TMP of 27 kPa, with an AARD of 0.70%, compared with 1.46% and 0.85% at TMP values of 50 kPa and 80 kPa, respectively. The results presented in Table 6 show that the ANN model is very accurate, with an AARD of 0.7%, under the operating conditions Ω = 1000 rpm and TMP = 27 kPa, while it deviates considerably, with an AARD of 6.74%, for Ω = 2000 rpm and TMP = 49 kPa. Overall, the ANN models the permeate flux with a mean AARD of 1.82%, which is lower than that of the SVM model (Table 6). Similarly, Jawad et al.[17] reported a high R2 of 97.3% with an error of 16.422 using ANN, whereas linear regression gave an R2 of 49.3% and an error of 32.365.
8. Comparison between Experimental and SVM Modeling
Fig. 5 compares the values predicted by SVM (filled symbols) with the experimental permeate flux (empty symbols) for different operating conditions. The SVM model followed the trend of the experimental permeate flux accurately, with an AARD of 0.14% for Ω = 2000 rpm and TMP = 100 kPa, while it deviated more, with an AARD of 9.49%, for Ω = 1000 rpm and TMP = 279 kPa. In comparison, Kui Gao et al.[27] found that the error of the SVM model (3.43%) was slightly larger than that of the ANN model (2.62%).
9. Applicability Domain
The applicability domain using the Williams plot of the ANN model was performed to identify the dataset outliers. In the Williams diagram, the standardized residual parameter (δ) is plotted versus a distance called leverage (hi). These two parameters (δ and hi) can be calculated using Eq. 10 and Eq. 11, respectively:
where yi and ŷi are the experimental and the calculated values for the i-th compound, respectively, A is the number of descriptors, and n is the number of compounds.
The leverage value (hi) can be defined as[29]:
where xi is the descriptor vector of the i-th compound, xiT is the transpose of xi, X is the descriptor matrix, and XT is the transpose of X. The warning leverage value (h*) is calculated as:
where k is the number of predictor variables included in the model and n is the number of data points. The applicability domain (AD) of the ANN model is analyzed using a Williams plot (Fig. 6), where the vertical line marks the critical leverage value (h*) and the horizontal lines mark standardized residuals of ±3. From this figure, it can be seen that only 67 points (about 5%) lie outside the domain, while most of the dataset belongs to the AD area between the horizontal lines (limits of ±3). This means that about 95% of the whole data set is covered.
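The domain check behind a Williams plot can be sketched as follows. The block assumes the commonly used definitions h* = 3(k + 1)/n for the warning leverage and δi = (yi - ŷi) / sqrt(Σ(yi - ŷi)² / (n - A - 1)) for the standardized residual, and it treats a point as inside the domain when its leverage does not exceed h* and |δ| ≤ 3. Function names and the sample points are hypothetical.

```python
import math

def warning_leverage(k, n):
    """Assumed warning leverage h* = 3 * (k + 1) / n."""
    return 3.0 * (k + 1) / n

def standardized_residuals(y_exp, y_cal, n_descriptors):
    """Residuals divided by their root mean square with n - A - 1 degrees of freedom."""
    n = len(y_exp)
    residuals = [e - c for e, c in zip(y_exp, y_cal)]
    scale = math.sqrt(sum(r * r for r in residuals) / (n - n_descriptors - 1))
    return [r / scale for r in residuals]

def in_domain(leverage, std_residual, h_star, limit=3.0):
    """True when the point lies left of h* and between the +/- limit lines."""
    return leverage <= h_star and abs(std_residual) <= limit

# Seven predictors and 1284 data points, as in this study.
h_star = warning_leverage(k=7, n=1284)
print(round(h_star, 4))

# Hypothetical (leverage, standardized residual) pairs.
points = [(0.010, 1.2), (0.010, -3.5), (0.050, 0.4)]
print([in_domain(h, d, h_star) for h, d in points])
```

With n = 1284 the warning leverage is small, so the plot mainly flags points whose input combination is rare in the dataset or whose residual is unusually large.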
10. Sensitivity Analysis
The developed ANN model captures the dependency between the input and output parameters. To investigate how the inputs affect the output, a sensitivity analysis was carried out. The most influential input can be identified by the relevance factor (r), which ranges from -1 to +1 and is given by Eq. 13[29]:
where Xk,i is the i-th value of the k-th input vector, X̄k is its average, Yi is the i-th output value, Ȳ is its average, and n is the number of compounds.
The larger the absolute value of the relevance factor, the stronger the influence of the corresponding input on the output. As can be seen in Fig. 7, the permeate flux shows a direct dependency on density, dp, Ω, viscosity, TMP, and concentration, and an inverse dependency on time. Density and time are the most relevant input variables, with relevance factors of +0.56 and -0.14, respectively.
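The relevance factor can be computed directly from one input column and the output column. The sketch below assumes the usual form, r = Σ(Xk,i - X̄k)(Yi - Ȳ) / sqrt(Σ(Xk,i - X̄k)² · Σ(Yi - Ȳ)²), which is the Pearson correlation between the k-th input and the output. The function name and the sample values are hypothetical.

```python
import math

def relevance_factor(x_k, y):
    """Pearson-type relevance factor between one input column and the output."""
    n = len(y)
    x_mean = sum(x_k) / n
    y_mean = sum(y) / n
    num = sum((x - x_mean) * (v - y_mean) for x, v in zip(x_k, y))
    den = math.sqrt(
        sum((x - x_mean) ** 2 for x in x_k) * sum((v - y_mean) ** 2 for v in y)
    )
    return num / den

# Hypothetical example: a flux that declines as filtration time increases.
time_min = [0.0, 60.0, 120.0, 180.0, 240.0]
flux = [200.0, 150.0, 120.0, 105.0, 100.0]
print(relevance_factor(time_min, flux))
```

A negative value, as for time in this example, indicates that the output decreases as the input increases; the magnitude indicates how strong the linear association is.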
11. ANN Interface of Permeate Flux Calculation
A user-friendly and flexible MATLAB interface was designed, based on the best ANN parameters (weights and biases) and the selected inputs, to compute the permeate flux of the RDM (Fig. 8). It allows the user to calculate the permeate flux quickly and easily without detailed knowledge of ANN, the MATLAB software, or even the underlying physical phenomena.
12. Conclusion
This study aimed to investigate the non-linear behaviour of the flux decline during filtration with a rotating disk membrane using ANN and SVM approaches under different operating conditions. The best ANN was determined after testing different structures. The accuracy of the developed models was judged by the regression coefficients and the AARD% values, which were taken as criterion parameters. The results showed that the ANN with the 7-10-1 architecture led to an R2 > 0.99 and an average AARD of 1.82%, whereas the SVM model led to an average AARD of 3.96% with nearly the same correlations. This confirms that the optimal ANN model is more effective than the SVM approach at predicting the permeate flux with high accuracy. The applicability domain analysis showed that about 95% of the data set is covered by the model. In addition, a user-friendly graphical interface has been designed to facilitate the computation of the permeate flux from the input parameters without requiring knowledge of the underlying phenomena. The sensitivity of the best ANN model has also been examined. The results revealed that density has the strongest positive effect on the permeate flux, whereas time has a negative effect, followed by the other inputs with nearly the same effect.