Abstract: | This dissertation is motivated by the need for computational methods that improve prognosis prediction from RNA-seq data, where RNA-seq refers to the collective measurement of transcript expression profiles. Many computational methods have previously been developed to predict the prognosis and diagnosis of cancer patients from RNA-seq data. However, methods for analyzing prognosis and diagnosis from high-dimensional datasets remain limited. Thus, a new statistical and machine learning strategy is proposed to develop prognostic and diagnostic models that can efficiently predict the prognosis and diagnosis of cancer patients. This dissertation contains three publications. The first paper uses Cox regression and the random survival forest algorithm to identify potential miRNAs that predict the prognosis of patients with ovarian cancer (OC); the study identified potential miRNA signatures correlated with the survival of OC patients. The second paper proposes a multivariate Cox regression model with three hybrid penalty approaches, namely the least absolute shrinkage and selection operator (lasso), adaptive lasso, and elastic net, followed by a best-subset prognostic model. The proposed method identified potential prognostic and diagnostic gene signatures that could discriminate high-risk hepatocellular carcinoma (HCC) patients using the TCGA dataset. The robustness of these results was verified on various public and experimental datasets. Finally, this dissertation presents and discusses a novel miRNA signature-based model for classifying risk and tumor stage, based on new statistical and machine learning approaches. The proposed method identified potential miRNAs related to both the prognosis and the tumor stage of clear cell renal cell carcinoma (CCRCC) patients, and this miRNA signature improves risk and diagnostic classification performance for these patients.
Overall, these studies use statistical and machine learning models to predict the prognosis, diagnosis, and tumor stage of cancer patients, and they report the biological functions of the biomarkers identified by the proposed methods to help elucidate the biological mechanisms and progression of cancer. |