10  Data summary

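This tutorial summarizes the clinical phenotype information of three hepatocellular carcinoma (HCC) transcriptome datasets, LIRI-JP, LIHC-US/TCGA-LIHC, and GSE14520, together with one HCC single-cell dataset, GSE149614. These datasets provide gene expression data with matched clinical annotations, which helps researchers study the molecular mechanisms and clinical characteristics of HCC.

## Load R packages {#sec-gtsummary-packages}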

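Use rm(list = ls()) to clear all variables from the environment. This removes objects only; packages that are already attached stay attached.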

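# Data wrangling, ExpressionSet handling, summary tables, and Excel export
library(tidyverse)
library(Biobase)
library(gtsummary)
library(xlsx)

# Clear all objects from the workspace (attached packages are unaffected)
rm(list = ls())
options(stringsAsFactors = FALSE)
# Raise the size limit for globals exported by the future framework to 10000 MiB
options(future.globals.maxSize = 10000 * 1024^2)

# Group labels, colors, and point shapes
grp_names <- c("Early Stage", "Late Stage")
grp_colors <- c("#8AC786", "#B897CA")
grp_shapes <- c(15, 16)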

10.1 Import data


ExprSet_LIRI_JP <- readRDS("./data/result/ExpSetObject/Final_ExpSet_LIRI_JP_TrueCounts.RDS")
ExprSet_TCGA_LIHC <- readRDS("./data/result/ExpSetObject/Final_ExpSet_TCGA_LIHC_TrueCounts.RDS")

ExprSet_dis <- readRDS("./data/result/ExpSetObject/MergeExpSet_VoomSNM_VoomSNM_LIRI-JP_TCGA-LIHC.RDS")
ExprSet_val <- readRDS("./data/result/ExpSetObject/GSE14520_ExpSet_counts.RDS")

meatdata_sc <- readxl::read_xlsx("./data/GSE149614_scRNA/SupplementaryData/Supplementary Data 1.xlsx", sheet = 1)
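
The import step above assumes every input file is already in place. As a minimal sketch, a small guard can turn a missing file into a clear error before readRDS() is called. The helper name `read_rds_checked` is hypothetical and not part of the original tutorial, and the demonstration uses a temporary file rather than the project data.

```r
# Hypothetical helper (an assumption, not from the tutorial): read an RDS file,
# but stop with a clear message when the path does not exist.
read_rds_checked <- function(path) {
  if (!file.exists(path)) {
    stop("Input file not found: ", path, call. = FALSE)
  }
  readRDS(path)
}

# Demonstration on a temporary file so the sketch runs anywhere
demo_path <- tempfile(fileext = ".RDS")
saveRDS(data.frame(SampleID = c("S1", "S2")), demo_path)
demo_obj <- read_rds_checked(demo_path)

# A missing path is turned into an informative error message
missing_msg <- tryCatch(
  read_rds_checked(file.path(tempdir(), "does_not_exist.RDS")),
  error = function(e) conditionMessage(e)
)
```

In the tutorial itself the same helper could wrap each readRDS() call, for example read_rds_checked("./data/result/ExpSetObject/GSE14520_ExpSet_counts.RDS").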

10.2 Summary table

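# Phenotype tables of the discovery and validation cohorts
meta_dis <- pData(ExprSet_dis)
meta_val <- pData(ExprSet_val)

# Single-cell metadata: drop incomplete rows, then take the column names
# from the second row of the raw sheet
meta_sc <- meatdata_sc %>%
  na.omit()
colnames(meta_sc) <- as.character(meatdata_sc[2, ])

# `metadata` is not created in the code shown here; it is assumed to be the
# combined per-sample phenotype table that contains a ProjectID column
meta_tbl <- metadata %>%
  dplyr::select(-all_of(c("SampleID", "Status", "Time"))) %>%
  tbl_summary(by = "ProjectID") %>%
  add_p() %>%
  bold_labels()

meta_tbl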

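Result: distribution of the clinicopathological characteristics of the hepatocellular carcinoma (HCC) patients covered in this tutorial.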
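
The summary step above can be tried on toy data. The sketch below stacks two made-up cohort tables on their shared columns and passes the result to tbl_summary(), add_p(), and bold_labels(). The cohort names, column names other than SampleID and ProjectID, and all values are assumptions for illustration only; this is one possible way to build a combined table, not necessarily how the tutorial builds its own.

```r
library(gtsummary)

# Toy phenotype tables standing in for two cohorts (all values are made up)
set.seed(123)
cohort_a <- data.frame(
  SampleID = paste0("A", 1:30),
  ProjectID = "CohortA",
  Age = round(rnorm(30, mean = 60, sd = 10)),
  Gender = sample(c("Female", "Male"), 30, replace = TRUE),
  Batch = "b1"
)
cohort_b <- data.frame(
  SampleID = paste0("B", 1:30),
  ProjectID = "CohortB",
  Age = round(rnorm(30, mean = 55, sd = 12)),
  Gender = sample(c("Female", "Male"), 30, replace = TRUE)
)

# Keep only the columns both cohorts share, then stack the rows
shared_cols <- intersect(colnames(cohort_a), colnames(cohort_b))
toy_metadata <- rbind(cohort_a[, shared_cols], cohort_b[, shared_cols])

# Drop the identifier column, summarise the rest by cohort,
# and add between-group p-values
summary_cols <- setdiff(shared_cols, "SampleID")
toy_tbl <- toy_metadata[, summary_cols] |>
  tbl_summary(by = ProjectID) |>
  add_p() |>
  bold_labels()

toy_tbl
```

Restricting the tables to shared columns before stacking avoids columns that exist in only one cohort, which would otherwise show up as entirely missing for the other cohort in the summary.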

10.3 Output results


# Create the output directory if it does not exist yet
if (!dir.exists("./data/result/metadata/")) {
  dir.create("./data/result/metadata/", recursive = TRUE)
}

# Path of the Excel file for the summary table
filename <- paste0("./data/result/metadata", "/HCC_summary_metadata", ".xlsx")
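
The code above builds the output path but does not show the write step. One possible way to export a gtsummary table is to flatten it with as_tibble() and write it with xlsx::write.xlsx(). The sketch below does this on made-up data and writes to a temporary directory; the toy table, the sheet name, and the choice of write.xlsx() are assumptions, since the tutorial's own export call is not shown.

```r
library(gtsummary)
library(xlsx)

# Toy data standing in for the real phenotype table (all values are made up)
toy_meta <- data.frame(
  ProjectID = rep(c("CohortA", "CohortB"), each = 10),
  Age = c(52, 61, 58, 70, 66, 49, 73, 55, 64, 60,
          47, 59, 62, 68, 51, 57, 65, 71, 54, 63),
  Gender = rep(c("Female", "Male"), times = 10)
)
toy_tbl <- tbl_summary(toy_meta, by = ProjectID)

# Flatten the gtsummary object into a plain data frame
toy_df <- as.data.frame(as_tibble(toy_tbl, col_labels = TRUE))

# Write to a temporary directory so the sketch does not touch project files
out_dir <- file.path(tempdir(), "metadata")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
out_file <- file.path(out_dir, "HCC_summary_metadata.xlsx")
write.xlsx(toy_df, file = out_file, sheetName = "Summary", row.names = FALSE)
```

Converting to a base data frame first keeps the input to write.xlsx() simple, and row.names = FALSE avoids an extra index column in the sheet.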

10.4 Summary

In total, … patients were used in this tutorial.
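
A patient total like the one referred to above can be derived from the phenotype table rather than typed by hand. The sketch below counts distinct patients overall and per cohort on a made-up table. The `PatientID` column is hypothetical; the real tables may identify patients differently.

```r
# Toy phenotype table (made up): one row per sample, and a patient may
# contribute more than one sample
toy_metadata <- data.frame(
  SampleID = c("S1", "S2", "S3", "S4", "S5", "S6"),
  PatientID = c("P1", "P1", "P2", "P3", "P4", "P5"),
  ProjectID = c("CohortA", "CohortA", "CohortA", "CohortB", "CohortB", "CohortB")
)

# Total number of distinct patients
n_patients <- length(unique(toy_metadata$PatientID))

# Number of distinct patients within each cohort
patients_per_cohort <- tapply(
  toy_metadata$PatientID,
  toy_metadata$ProjectID,
  function(ids) length(unique(ids))
)
```

Counting distinct patient identifiers instead of rows matters when a cohort contains several samples from the same patient, as is common in single-cell datasets.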

System information
sessionInfo()
#> R version 4.3.3 (2024-02-29)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Sonoma 14.2
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: Asia/Shanghai
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices datasets  utils     methods   base     
#> 
#> other attached packages:
#>  [1] xlsx_0.6.5          gtsummary_1.7.2     Biobase_2.62.0     
#>  [4] BiocGenerics_0.48.1 lubridate_1.9.3     forcats_1.0.0      
#>  [7] stringr_1.5.1       dplyr_1.1.4         purrr_1.0.2        
#> [10] readr_2.1.5         tidyr_1.3.1         tibble_3.2.1       
#> [13] ggplot2_3.5.1       tidyverse_2.0.0    
#> 
#> loaded via a namespace (and not attached):
#>  [1] gt_0.10.1            xlsxjars_0.6.1       utf8_1.2.4          
#>  [4] generics_0.1.3       renv_1.0.0           xml2_1.3.6          
#>  [7] stringi_1.8.4        hms_1.1.3            digest_0.6.35       
#> [10] magrittr_2.0.3       evaluate_0.23        grid_4.3.3          
#> [13] timechange_0.3.0     sysfonts_0.8.9       fastmap_1.1.1       
#> [16] broom.helpers_1.15.0 jsonlite_1.8.8       BiocManager_1.30.23 
#> [19] fansi_1.0.6          scales_1.3.0         cli_3.6.2           
#> [22] rlang_1.1.3          munsell_0.5.1        withr_3.0.0         
#> [25] yaml_2.3.8           tools_4.3.3          tzdb_0.4.0          
#> [28] colorspace_2.1-0     vctrs_0.6.5          R6_2.5.1            
#> [31] lifecycle_1.0.4      htmlwidgets_1.6.4    pkgconfig_2.0.3     
#> [34] rJava_1.0-11         pillar_1.9.0         gtable_0.3.5        
#> [37] glue_1.7.0           xfun_0.43            tidyselect_1.2.1    
#> [40] rstudioapi_0.16.0    knitr_1.46           htmltools_0.5.8.1   
#> [43] rmarkdown_2.26       compiler_4.3.3