Single-Cell Analysis (20): inferCNV Analysis

InferCNV Analysis Notes

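1. Analysis Goal

InferCNV (Inference of Copy Number Variations) is a method that infers copy number variations (CNVs) from single-cell transcriptome data, giving an estimate of the genomic alterations carried by each cell.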

2. Data Preparation

2.1 Loading the Data

library(Seurat)
setwd(here::here())  # set the working directory to the project root
scRNA1 = readRDS('./scRNA_harmony_EP.rds')

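Load the single-cell RNA-seq data (already batch-corrected with Harmony).
scRNA1 is a Seurat object containing the metadata and the expression matrix.

2.2 Selecting the Cells to Analyze

2.2.1 Defining Cell Types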

library(dplyr)

meta_data <- scRNA1@meta.data %>%
  mutate(cell_type = case_when(
    RNA_snn_res.0.6 %in% c("8", "18") ~ "Normal",  # reference cells (normal/immune)
    TRUE ~ "Tumor"                                  # observation cells (tumor)
  ))

Normal cells: clusters 8 and 18 of RNA_snn_res.0.6 (the reference cells).
Tumor cells: all remaining clusters.
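
Before moving on, it can help to confirm how many cells end up under each label. The following is a minimal base-R sketch of the same labeling rule on a toy metadata table; the barcodes and cluster assignments are made up for illustration, and `toy_meta` and `ref_clusters` are hypothetical names that do not appear in the original workflow.

```r
# Toy metadata: row names are cell barcodes, one column holds cluster labels.
# All barcodes and cluster assignments here are made up for illustration.
toy_meta <- data.frame(
  RNA_snn_res.0.6 = c("0", "8", "3", "18", "8", "5"),
  row.names = paste0("cell", 1:6)
)

ref_clusters <- c("8", "18")  # clusters treated as reference (normal) cells

# Base-R equivalent of the case_when() rule:
# "Normal" for reference clusters, "Tumor" for everything else
toy_meta$cell_type <- ifelse(
  toy_meta$RNA_snn_res.0.6 %in% ref_clusters, "Normal", "Tumor"
)

# Count cells per label to check the split
label_counts <- table(toy_meta$cell_type)
print(label_counts)
```

If the "Normal" count is zero or very small, the reference clusters are probably misnamed or too sparse to serve as a baseline.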

2.2.2 Subsampling Only the Tumor Cells and Keeping All Reference Cells

# Extract cell names (barcodes are stored as the row names of the metadata,
# so they are taken with rownames() rather than pull())
tumor_cells  <- rownames(meta_data)[meta_data$cell_type == "Tumor"]
normal_cells <- rownames(meta_data)[meta_data$cell_type == "Normal"]

# Randomly subsample the tumor cells (at most 3000)
set.seed(123)
tumor_sample <- sample(tumor_cells, min(3000, length(tumor_cells)), replace = FALSE)

# Combine all reference cells (not subsampled) with the sampled tumor cells
selected_cells <- c(tumor_sample, normal_cells)

# Build a new Seurat object
new.scRNA <- subset(scRNA1, cells = selected_cells)

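All reference cells are kept (they serve as the baseline for CNV normalization).
Tumor cells are subsampled (at most 3000) to keep the computational cost manageable.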
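
The sampling rule can be wrapped in a small function so the cap and the seed are explicit. This is only a sketch in base R on made-up cell names; `subsample_tumor` and its arguments are hypothetical and not part of Seurat or infercnv.

```r
# Hypothetical helper: keep every reference cell, cap the number of tumor cells.
subsample_tumor <- function(cell_type, cell_names, max_tumor = 3000, seed = 123) {
  tumor  <- cell_names[cell_type == "Tumor"]
  normal <- cell_names[cell_type == "Normal"]
  set.seed(seed)
  # Sample indices without replacement; if there are fewer tumor cells
  # than the cap, all of them are kept
  keep <- tumor[sample.int(length(tumor), min(max_tumor, length(tumor)))]
  c(keep, normal)
}

# Toy input: 7 tumor cells and 3 reference cells (made-up names)
cell_names <- paste0("cell", 1:10)
cell_type  <- c(rep("Tumor", 7), rep("Normal", 3))

selected <- subsample_tumor(cell_type, cell_names, max_tumor = 4)
print(selected)
```

With the cap set to 4, the result holds 4 of the 7 tumor cells plus all 3 reference cells.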

3. Building the InferCNV Object

3.1 Extracting the Expression Matrix and Cell Annotations

exprMatrix <- as.matrix(GetAssayData(new.scRNA, slot='counts'))
cellAnnota <- subset(new.scRNA@meta.data, select='RNA_snn_res.0.6')

exprMatrix: the gene expression matrix (raw counts).
cellAnnota: the cell classification (used by InferCNV for grouping).
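
A quick consistency check on these two inputs can catch mismatches before the InferCNV object is built. The sketch below uses toy stand-ins with made-up gene and cell names, and assumes that every matrix column needs an annotation row and that each reference label must occur among the annotations. `toy_counts`, `toy_annot` and `ref_groups` are hypothetical names.

```r
# Toy stand-in for exprMatrix: genes in rows, cells in columns (made-up values)
toy_counts <- matrix(
  c(0, 2, 1, 5, 0, 3),
  nrow = 2,
  dimnames = list(c("GeneA", "GeneB"), c("cell1", "cell2", "cell3"))
)

# Toy stand-in for cellAnnota: one column of group labels, cells as row names
toy_annot <- data.frame(
  RNA_snn_res.0.6 = c("8", "0", "18"),
  row.names = c("cell1", "cell2", "cell3")
)

ref_groups <- c("8", "18")

# Every cell in the matrix should have an annotation row
cells_annotated <- all(colnames(toy_counts) %in% rownames(toy_annot))

# Every reference label should occur at least once among the annotations
refs_present <- all(ref_groups %in% toy_annot$RNA_snn_res.0.6)

print(c(cells_annotated = cells_annotated, refs_present = refs_present))
```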

3.2 Creating the InferCNV Object

library(infercnv)

infercnv_obj = CreateInfercnvObject(
  raw_counts_matrix = exprMatrix,
  annotations_file = cellAnnota,
  delim = "\t",
  gene_order_file = "/home/jianpeng/yard/BC_TLLZ/TLLZ02_LB/BC-TLLZ02-LB-process/data/gencode_v19_gene_pos.txt",
  ref_group_names = c("8", "18")
)

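gene_order_file: the chromosomal position of each gene (gencode_v19).
ref_group_names=c("8", "18"):
- The reference group (non-malignant cells).
- Only the Normal cells are used as the reference, which improves the accuracy of CNV inference.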
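
For orientation, a gene position file of this kind is a tab-delimited table without a header, with one row per gene giving the gene name, chromosome, start and end. The sketch below writes a tiny file in that layout to a temporary path and reads it back. The gene names and coordinates are made up, and the column names passed to `read.table` are labels chosen here for readability, not something the file itself contains.

```r
# Made-up gene names and coordinates, purely to show the layout
toy_lines <- c(
  "GeneA\tchr1\t1000\t5000",
  "GeneB\tchr1\t8000\t12000",
  "GeneC\tchr2\t300\t900"
)
pos_file <- tempfile(fileext = ".txt")
writeLines(toy_lines, pos_file)

# Read it back: no header line, four tab-separated columns
gene_pos <- read.table(
  pos_file, sep = "\t", header = FALSE,
  col.names = c("gene", "chr", "start", "stop"),
  stringsAsFactors = FALSE
)

# Genes in the expression matrix that are absent from the position table
# cannot be placed on the genome
expr_genes <- c("GeneA", "GeneC", "GeneX")
unplaced <- setdiff(expr_genes, gene_pos$gene)
print(unplaced)
```

A large number of unplaced genes usually points to a mismatch between the gene identifiers in the count matrix and those in the position file (for example, symbols versus Ensembl IDs).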

4. Running InferCNV

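infercnv_obj = infercnv::run(
  infercnv_obj,
  cutoff = 0.1,                          # 0.1 is recommended for 10x data, 1 for Smart-seq
  out_dir = './output/cnv_epithelial/',
  cluster_by_groups = FALSE,             # whether to cluster cells within each annotation group
  hclust_method = "ward.D2",
  plot_steps = FALSE,
  HMM = FALSE,
  denoise = TRUE,                        # apply denoising
  write_expr_matrix = TRUE
)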

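cutoff=0.1: the gene expression threshold (minimum average count per gene across the reference cells); 0.1 is appropriate for 10x data.
denoise=T: enables denoising to improve the signal-to-noise ratio.
hclust_method="ward.D2": the hierarchical clustering method, a suitable choice for CNV analysis.

5. Computing the CNV Score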

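library(tidyverse)

# Read the InferCNV results (genes in rows, cells in columns)
obs <- read.table("./output/cnv_epithelial/infercnv.observations.txt", header = TRUE)
ref <- read.table("./output/cnv_epithelial/infercnv.references.txt", header = TRUE)

# Combine observation and reference cells
expr <- cbind(obs, ref)
expr.scale <- scale(t(expr))  # z-score each gene across cells (cells x genes)

# Rescale each gene to [-1, 1], then sum the squares per cell
tmp1 <- sweep(expr.scale, 2, apply(expr.scale, 2, min), '-')
tmp2 <- apply(expr.scale, 2, max) - apply(expr.scale, 2, min)
expr_1 <- t(2 * sweep(tmp1, 2, tmp2, "/") - 1)  # back to genes x cells

cnv_score <- as.data.frame(colSums(expr_1 * expr_1))
colnames(cnv_score) = "cnv_score"
cnv_score <- rownames_to_column(cnv_score, var = 'cell')
# read.table() converts "-" in the barcodes to "."; restore it to match the Seurat cell names
cnv_score$cell <- gsub("\\.", "-", cnv_score$cell)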

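This computes the CNV score (the more variation, the higher the score).
After standardization, each gene is rescaled to [-1, 1] and the squared values are summed per cell to quantify the extent of CNV.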
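
To see what the score does, the same steps can be run on a tiny matrix. The following is a base-R sketch on made-up values; `cnv_score_from_matrix` is a hypothetical helper that mirrors the steps above, and the toy matrix simply stands in for the InferCNV output (genes in rows, cells in columns).

```r
# Hypothetical helper mirroring the scoring steps on a genes x cells matrix
cnv_score_from_matrix <- function(expr) {
  z <- scale(t(expr))                 # cells x genes, z-scored per gene
  gene_min <- apply(z, 2, min)
  gene_max <- apply(z, 2, max)
  # Min-max rescale each gene to [-1, 1]
  shifted  <- sweep(z, 2, gene_min, "-")
  rescaled <- 2 * sweep(shifted, 2, gene_max - gene_min, "/") - 1
  rowSums(rescaled^2)                 # one score per cell
}

# Made-up values: two reference-like cells near 1, two cells with a gain or loss
toy_expr <- matrix(
  c(1.00, 1.00, 1.20, 0.80,
    1.00, 1.05, 1.30, 0.70,
    1.00, 0.95, 1.25, 0.75),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), c("ref1", "ref2", "tumor1", "tumor2"))
)

scores <- cnv_score_from_matrix(toy_expr)
print(round(scores, 3))
```

In this toy case the two cells sitting at each gene's extremes reach the maximum of 3 (one unit per gene), while the cells near the middle of the range score close to 0. Because the min-max rescaling is unaffected by the preceding z-scoring, the score depends only on where each cell falls within each gene's observed range, so it is a relative measure within one run rather than an absolute one.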

6. Visualizing the Results

6.1 Joining with the Seurat Metadata

scRNA_sample = subset(scRNA1, cells = cnv_score$cell)

meta <- scRNA_sample@meta.data %>%
  rownames_to_column(var = 'cell') %>%
  inner_join(cnv_score, by = 'cell') %>%
  column_to_rownames(var = 'cell')

The CNV scores are merged into the metadata for visualization.

6.2 Plotting the CNV Score

library(ggpubr)

p = ggboxplot(meta, "RNA_snn_res.0.6", "cnv_score", fill = "RNA_snn_res.0.6") +
  scale_y_continuous(limits = c(0, 3000)) +  # cells with scores outside 0-3000 are dropped from the plot
  xlab("Cluster") +
  ylab("CNV Score") +
  theme(panel.border = element_rect(colour = "black", fill = NA, size = 0.5)) +
  theme(legend.position = "none")

ggsave(p, filename = './output/epithelial_cell/CNV/CNV_cluster_RNA_snn_res.0.6.pdf', width = 7, height = 4)

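The box plot shows how the CNV score relates to each cell cluster.
It can be used to compare CNV trends across clusters.

7. Interpreting the Results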

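Clusters with higher CNV scores may contain malignant tumor cells.
Reference cells (e.g., immune cells) should have low CNV scores, which helps separate normal cells from tumor cells.
Other single-cell methods (NicheNet, SCENIC) can be combined with these results to dissect the gene regulatory networks associated with CNVs in more depth.

8. Summary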

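Optimized sampling strategy:
- Only the tumor cells are subsampled (at most 3000) to keep the data size manageable.
- All reference cells (Normal cells) are kept so that the InferCNV calculation stays accurate.
Main InferCNV workflow:
1. Data filtering and preprocessing (stratified sampling, choosing the reference cells).
2. Creating the InferCNV object (setting the reference groups, loading the gene position information).
3. Running the InferCNV analysis (clustering, denoising, computing CNVs).
4. Computing the CNV score (standardization, sum of squares).
5. Visualization (showing the distribution of CNV scores).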
