重測序關系矩陣構建方式匯總

樣本間親緣關系矩陣（kinship matrix）和同源性矩陣（IBS matrix）構建的方式

1. 可以使用plink的–make-rel計算個體之間的親緣關系（強調個體之間的遺傳相似性）

/opt/software/plink --bfile vcf_bfile--make-rel --out relatedness_matrix # 得到親緣關系距離矩陣：
# relatedness_matrix.rel

2. kinship

# 利用tassel計算
run_pipeline.pl -Xmx1536m-Xms512m -SortGenotypeFilePlugin -inputFile 你的vcf文件 -outputFile outvcf -fileType VCF
run_pipeline.pl-Xmx1536m -Xms512m -importGuess outvcf -KinshipPlugin -methodCentered_IBS -endPlugin -export tassel_kinship.txt -exportType SqrMatrix
# 利用gcta計算
使用 --make-grm-alg 1 或 --make-grm 0
gcta --make-grm --make-grm-alg 1 --out snp.gcta --bfile vcf_bfile snp --autosome-num 90

3. IBS

/opt/software/plink --bfile  vcf_bfile --make-bed --out IBS_matrix --maf 0.05 --recode --double-id --allow-extra-chr --chr-set 90  --distance square ibs
要計算遺傳距離，使用1-ibs

群體關系矩陣如何構建？

轉換方法：平均IBS（1-IBS）（個體對間均值）

計算所有個體兩兩之間的IBS（1-IBS）均值，反映群體內遺傳相似性、遺傳距離。

python3 /home/huguang/script/GWAS/group_dis.py distance_matrix group_labels # 具體group_dis.py代碼如下

import sys
import numpy as np
import pandas as pd
from itertools import combinationsdef compute_intergroup_distances(distance_matrix, group_labels):"""計算組間遺傳距離矩陣（組間平均距離）參數：distance_matrix: 全樣本的遺傳距離矩陣 (n x n)group_labels: 樣本分組標簽列表 (length = n)，例如 [0,0,1,1,2,2]返回：group_dist_matrix: 組間距離矩陣 (k x k)，k為組數"""# 參數檢查assert distance_matrix.shape[0] == distance_matrix.shape[1], "距離矩陣必須是方陣"assert len(group_labels) == distance_matrix.shape[0], "標簽數與矩陣維度不匹配"groups = np.unique(group_labels)k = len(groups)group_dist_matrix = np.zeros((k, k))# 計算每對組間的平均距離for i, j in combinations(range(k), 2):# 獲取組i和組j的樣本索引idx_i = np.where(group_labels == groups[i])[0]idx_j = np.where(group_labels == groups[j])[0]# 提取子矩陣submatrix = distance_matrix[np.ix_(idx_i, idx_j)]# 計算平均距離（可替換為中位數等其他統計量）avg_dist = np.nanmean(submatrix)# 對稱賦值group_dist_matrix[i, j] = avg_distgroup_dist_matrix[j, i] = avg_dist# 對角線（組內距離）特殊處理for i in range(k):idx = np.where(group_labels == groups[i])[0]submatrix = distance_matrix[np.ix_(idx, idx)]group_dist_matrix[i, i] = np.nanmean(submatrix[np.triu_indices(len(idx), k=1)])return group_dist_matrix# 示例用法
if __name__ == "__main__":full_matrix = pd.read_csv(sys.argv[1], sep='\t', header=None, index_col=None,  engine='c').to_numpy()groups = pd.read_csv(sys.argv[2], sep="\t", header=None, index_col=None).iloc[:, 0].to_numpy()# 計算組間距離矩陣result = compute_intergroup_distances(full_matrix, groups)print("組間遺傳距離矩陣：")print(np.round(result, 4))

本文來自互聯網用戶投稿，該文觀點僅代表作者本人，不代表本站立場。本站僅提供信息存儲空間服務，不擁有所有權，不承擔相關法律責任。
如若轉載，請注明出處：http://www.pswp.cn/bicheng/79452.shtml
繁體地址，請注明出處：http://hk.pswp.cn/bicheng/79452.shtml
英文地址，請注明出處：http://en.pswp.cn/bicheng/79452.shtml

如若內容造成侵權/違法違規/事實不符，請聯系多彩編程網進行投訴反饋email:809451989@qq.com，一經查實，立即刪除！