Abstract: | Malignant tumors (cancer) are among the leading causes of death in humans. In Taiwan, for example, cancer has been the leading cause of death in recent years, and lung cancer has the highest mortality among the ten leading cancers. According to the World Health Organization (WHO), risk factors for lung cancer include smoking, second-hand smoke exposure, air pollution, occupational asbestos exposure, cooking oil fumes, a family history of lung cancer, and other related factors. The WHO has recently noted that outdoor air pollution is of growing importance in lung cancer, but smoking remains the main risk factor, accounting for about 70% of the global lung cancer burden.
Therefore, this study takes lung cancer as its subject. To explore how smoking contributes to lung cancer, we analyzed genome-wide gene expression levels and gene mutation rates in smokers and non-smokers and combined them with patient clinical data, with a view to identifying causative factors of lung cancer. We searched the NCBI-GEO database with the keyword "Lung Cancer" for experiments related to smoking and lung cancer and used GEO2R to obtain three sets of experimental data comprising 188 cases in total; the experiments compared gene expression between smokers and non-smokers in the large and small airways. Genes were then screened for a P-value ≦ 0.01 and at least a two-fold difference in expression. After screening all genes in the three data sets, we found 93 genes common to all three. The expression data for these common genes were imported into GENE-E to generate a heatmap, in which 91 genes showed broadly similar expression. Grouping by expression level identified six highly expressed genes (AKR1B10, UCHL1, CYP1B1, SLC7A11, SFRP2, CYP1A1) and three lowly expressed genes (APELA, ITLN1, PPP4R4). We also used TCGA to check the mutation rates of the 93 common genes and obtained 15 genes with higher mutation rates. The 93 common genes were analyzed with DAVID (The Database for Annotation, Visualization and Integrated Discovery) to explore their functional regions and the metabolic pathways in which they participate; among these pathways, cytochrome P450, aromatic hydrocarbons, and cyclophosphamide were clearly related to lung cancer risk and chemotherapy. We then used GeneCards to look up the functions and associated diseases of the highly expressed, lowly expressed, and frequently mutated genes, and found that AKR1B10, UCHL1, CYP1A1, SLC7A11, NAV3, and C3 are closely related to lung cancer.
These genes may be involved in the early development of lung cancer. We then used cBioPortal for Cancer Genomics to analyze prognosis and survival for the differentially expressed genes and the 15 genes with the highest mutation rates. Survival was markedly worse in cases involving NAV3, C3, or BPIFB2: survival months decreased by 15% with NAV3, by 29% with C3, and by 34% with BPIFB2. The combined analysis of NAV3, C3, and BPIFB2 also showed a negative trend. Based on the present results, the highly mutated genes NAV3, C3, and BPIFB2 are associated with poorer clinical prognosis in lung cancer with a P-value ≦ 0.05, so the survival analysis is statistically meaningful and these genes may serve as poor prognostic factors for lung cancer patients. |
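
The screening step described in the abstract (P-value ≦ 0.01, at least a two-fold difference, then the genes shared by all three data sets) can be illustrated with a minimal Python sketch. This is not the study's actual pipeline: the `GeneResult` layout, the function names, the assumption that fold change is reported on a log2 scale, and the toy values with placeholder gene names are all hypothetical and chosen only for illustration. The abstract also does not say whether raw or adjusted P-values were used, so the sketch simply takes whatever P-value is supplied.

```python
import math
from typing import Iterable, NamedTuple, Sequence, Set


class GeneResult(NamedTuple):
    """One row of a differential-expression table (hypothetical layout)."""
    symbol: str
    p_value: float
    log2_fold_change: float  # assumed to be log2(smoker / non-smoker)


def significant_genes(rows: Iterable[GeneResult],
                      p_cutoff: float = 0.01,
                      min_fold: float = 2.0) -> Set[str]:
    """Return symbols with p <= p_cutoff and at least min_fold change, up or down."""
    min_log2 = math.log2(min_fold)  # a two-fold change corresponds to |log2FC| >= 1
    return {
        row.symbol
        for row in rows
        if row.p_value <= p_cutoff and abs(row.log2_fold_change) >= min_log2
    }


def common_genes(datasets: Sequence[Iterable[GeneResult]],
                 p_cutoff: float = 0.01,
                 min_fold: float = 2.0) -> Set[str]:
    """Return the genes that pass the screen in every data set."""
    gene_sets = [significant_genes(rows, p_cutoff, min_fold) for rows in datasets]
    return set.intersection(*gene_sets) if gene_sets else set()


# Toy data with placeholder gene names and invented numbers (not from the study).
toy_datasets = [
    [GeneResult("GENE_A", 0.001, 2.3),
     GeneResult("GENE_B", 0.005, -1.4),
     GeneResult("GENE_C", 0.2, 3.0)],     # fails the P-value cutoff
    [GeneResult("GENE_A", 0.008, 1.1),
     GeneResult("GENE_B", 0.004, -0.5),   # fails the fold-change cutoff
     GeneResult("GENE_C", 0.001, 1.5)],
    [GeneResult("GENE_A", 0.0001, 1.8),
     GeneResult("GENE_B", 0.009, -1.2),
     GeneResult("GENE_C", 0.03, 2.0)],    # fails the P-value cutoff
]

shared = common_genes(toy_datasets)
print(sorted(shared))  # ['GENE_A']
```

Taking the absolute value of the log2 fold change keeps both up-regulated and down-regulated genes, which matches the abstract's report of both highly and lowly expressed genes among the common set.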