Abstract: | Because the amino acid symbolic sequences of proteins always differ in length, no regression model can be used to predict the temperature of thermostable proteins without adequate preprocessing. Each amino acid symbolic sequence must first be converted into useful physicochemical quantities by means of the Hurst exponent; only then can regression models be considered. Combining the Hurst exponent with the Choquet integral regression model with respect to a well-known fuzzy measure, the L-measure, was first proposed last year. Although the L-measure is a multivalent measure and is better than the well-known fuzzy measures, the λ-measure and the P-measure, it neither contains the additive measure nor attains the largest fuzzy measure, the B-measure. To address these drawbacks, an improved L-measure, called the generalized L-measure, was proposed, but this new fuzzy measure has not yet been combined with the Hurst exponent to predict the temperature of thermostable proteins. In this paper, the sensitive comparison property between two completed fuzzy measures and some further properties of the generalized L-measure are discussed, a method combining the Hurst exponent and the Choquet integral regression model with respect to the generalized L-measure is proposed, and the models are compared by 5-fold cross-validation MSE. The experimental results show that the Choquet integral regression model based on the Hurst exponent and the generalized L-measure has the best performance: it is better than the Choquet integral regression models based on the Hurst exponent and other fuzzy measures, including the completed L-measure, the L-measure, the λ-measure, and the P-measure, and better than the traditional prediction models, ridge regression and multiple linear regression. |
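As an illustration of why it matters whether a fuzzy measure family contains the additive measure, the following minimal sketch computes the standard discrete Choquet integral with respect to an arbitrary normalized set function. It is not the authors' implementation and does not define the L-measure or the generalized L-measure, whose formulas are not given in this abstract. The names `choquet_integral`, `additive_measure`, and `all_ones_measure`, as well as the example weights and values, are hypothetical and chosen only for demonstration.

```python
from typing import Callable, FrozenSet, Sequence

Measure = Callable[[FrozenSet[int]], float]


def choquet_integral(values: Sequence[float], mu: Measure) -> float:
    """Discrete Choquet integral of `values` with respect to the set function `mu`.

    `mu` maps a frozenset of attribute indices to a number in [0, 1] and is
    assumed to be monotone with mu(all indices) = 1.
    """
    # Visit attributes in ascending order of their values.
    order = sorted(range(len(values)), key=lambda i: values[i])
    total = 0.0
    previous = 0.0
    for rank, i in enumerate(order):
        # Set of attributes whose value is at least values[i].
        upper_set = frozenset(order[rank:])
        total += (values[i] - previous) * mu(upper_set)
        previous = values[i]
    return total


# Hypothetical example data (not taken from the paper).
weights = [0.2, 0.3, 0.5]
values = [3.0, 1.0, 2.0]


def additive_measure(subset: FrozenSet[int]) -> float:
    # Additive measure: the measure of a set is the sum of its singleton weights.
    return sum(weights[i] for i in subset)


def all_ones_measure(subset: FrozenSet[int]) -> float:
    # Assigns 1 to every non-empty set, the largest value a normalized measure can take.
    return 1.0 if subset else 0.0


weighted_sum = sum(w * v for w, v in zip(weights, values))
additive_result = choquet_integral(values, additive_measure)
all_ones_result = choquet_integral(values, all_ones_measure)
```

With the additive measure the Choquet integral coincides with the ordinary weighted sum, which is the aggregation used in multiple linear regression; with the measure that assigns 1 to every non-empty set it returns the maximum of the inputs. A measure family that covers both ends can therefore represent a wider range of aggregation behaviour than one that excludes them.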