Abstract: | The circadian rhythm is a physiological phenomenon that runs continuously within an organism and regulates its functions, activities, and behavior on a cycle of roughly 24 hours. When this stable cycle is interrupted, the resulting rhythm disorder readily contributes to cardiovascular and other chronic diseases.
According to the World Health Organization (WHO), cancer is a leading cause of death worldwide, accounting for nearly 10 million deaths in 2020. Liver cancer caused about 830,000 of those deaths, making it the third leading cause of cancer death that year. The main reason is that early symptoms of liver cancer are not obvious: most cases are discovered at an advanced stage, after the opportunity for curative treatment has passed, and patients can only prolong life by more passive means.
In this study, literature mining in PubMed was used to build a gene set related to the circadian rhythm. Gene expression and mutation-site data for liver cancer patients were then retrieved from the TCGA database and used for two analyses: differential gene expression and mutation sites. For the differential expression analysis, the normal tissue (Normal), cancerous tissue (Tumor), and paired tissue (Paired) data obtained from TCGA were filtered to retain only genes in the circadian rhythm gene set. For the Normal and Tumor data, each gene's expression was compared against the mean value; together with the Paired data, log2 values were then calculated by formula. From these three data sets, genes showing at least a two-fold difference in expression in 70% or more of patients were selected as Differentially Expressed Genes (DEGs). The results were plotted as heatmaps, and a Venn diagram was used to obtain the genes common to the data sets.
For the mutation-site analysis, the gene mutation data obtained from TCGA were processed by a program that tabulated gene names and mutation counts by mutation-site position, yielding the mutated genes and their mutation counts. The mutation rate was calculated from the number of mutations at a single site and the total number of cases. Based on single nucleotide polymorphism changes, genes with a mutation rate of 1% or more were designated as differentially expressed mutation-site genes. DAVID Bioinformatics Resources was then used to analyze the genes obtained from the Venn diagram comparison and the differentially expressed mutation-site genes in terms of biological process, cellular component, molecular function, and pathway.
Finally, the cBioPortal for Cancer Genomics database was queried for significant survival prognosis with the P-value threshold set to ≤ 0.05, which yielded 14 genes: CENPA, CLSPN, MCM10, NETO2, SP6, BMPER, CRHBP, IRF4, PZP, TMEM132A, CXCR2, TP53, GJB1, and PAX3. The analysis indicates that these 14 circadian rhythm-related genes are generally associated with poor survival prognosis in liver cancer, showing a downward trend. They may serve as poor prognostic factors or predictors of poor prognosis for liver cancer patients and provide clinicians with a reference when deciding on treatment strategies. |
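
The selection rules described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the study's actual program: the abstract does not give the log2 formula, so a tumor/normal ratio with a pseudocount is assumed; the 70% criterion is assumed to apply to patients changing in the same direction; and all function names, gene names, and input values are hypothetical toy data.

```python
import math
from collections import Counter


def log2_fold_change(tumor, normal, pseudocount=1.0):
    """Assumed formula: log2 ratio of tumor to normal expression.
    The pseudocount (an assumption) avoids division by zero."""
    return math.log2((tumor + pseudocount) / (normal + pseudocount))


def is_deg(log2fcs, min_fraction=0.7, min_abs_log2fc=1.0):
    """True if at least min_fraction of patients show a two-fold change
    (|log2FC| >= 1) in the same direction (direction rule is an assumption)."""
    n = len(log2fcs)
    if n == 0:
        return False
    up = sum(1 for v in log2fcs if v >= min_abs_log2fc)
    down = sum(1 for v in log2fcs if v <= -min_abs_log2fc)
    return max(up, down) / n >= min_fraction


def common_genes(*gene_sets):
    """Genes shared by every data set (the center of the Venn diagram)."""
    return set.intersection(*(set(g) for g in gene_sets))


def recurrent_site_genes(mutations, total_cases, min_rate=0.01):
    """mutations: iterable of (case_id, gene, site) records.
    Returns {gene: highest single-site mutation rate} for genes with a
    site mutated in at least min_rate of all cases."""
    distinct = set(mutations)  # count each case at most once per site
    site_counts = Counter((gene, site) for _, gene, site in distinct)
    rates = {}
    for (gene, _site), count in site_counts.items():
        rate = count / total_cases
        if rate >= min_rate:
            rates[gene] = max(rate, rates.get(gene, 0.0))
    return rates


# Toy usage with hypothetical values.
paired = {
    "GENE_A": [1.2, 1.5, 2.0, 0.3, 1.1, 1.0, 1.4, 0.2, 1.3, 1.9],
    "GENE_B": [1.2, 0.1, 0.0, 0.3, 1.1, 1.0, 0.4, 0.2, 1.3, 1.9],
}
paired_degs = {gene for gene, values in paired.items() if is_deg(values)}
toy_mutations = [
    ("case1", "GENE_X", "chr1:100"),
    ("case2", "GENE_X", "chr1:100"),
    ("case3", "GENE_X", "chr1:100"),
    ("case4", "GENE_Y", "chr2:200"),
]
recurrent = recurrent_site_genes(toy_mutations, total_cases=200)
```

Counting up-regulated and down-regulated patients separately keeps a gene that rises in some patients and falls in others from passing the 70% filter; if the original analysis used absolute change only, `is_deg` would need to be adjusted accordingly.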