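Introduction
PTXQC is an R package published in the Journal of Proteome Research (J Proteome Res) in 2016. It extracts and processes MaxQuant output to produce a quality assessment of the proteomics results.
Installation
Install from GitHub; the installation builds the vignettes (tutorials) automatically.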
devtools::install_github("cbielow/PTXQC", build_vignettes = TRUE, dependencies = TRUE)
library(PTXQC)
- View the help documentation, which is displayed as HTML
help(package="PTXQC")
browseVignettes(package = 'PTXQC')
Input files
The input files are found in the txt folder of the MaxQuant results: /combined/txt
- parameters.txt
- summary.txt
- proteinGroups.txt
- evidence.txt
- msms.txt
- msmsScans.txt
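Before running PTXQC, it can help to confirm that the txt folder actually contains these files. The following is a minimal sketch using only base R; `check_txt_folder` is a hypothetical helper name, not part of PTXQC, and the file list simply mirrors the one above. PTXQC can work with an incomplete folder depending on the YAML config, so a missing file is a warning sign rather than a hard error.

```r
## Hypothetical helper (not part of PTXQC): report which of the
## expected MaxQuant txt files are present in a folder.
expected_files <- c("parameters.txt", "summary.txt", "proteinGroups.txt",
                    "evidence.txt", "msms.txt", "msmsScans.txt")

check_txt_folder <- function(txt_folder, files = expected_files) {
  present <- file.exists(file.path(txt_folder, files))
  names(present) <- files
  missing <- files[!present]
  if (length(missing) > 0) {
    warning("Missing in '", txt_folder, "': ", paste(missing, collapse = ", "))
  }
  invisible(present)
}

## Demo on a temporary folder that contains only two of the files
demo_dir <- file.path(tempdir(), "txt_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("summary.txt", "evidence.txt")))
status <- suppressWarnings(check_txt_folder(demo_dir))
print(status)
```

With a real MaxQuant run, `demo_dir` would be replaced by the path to the combined/txt folder.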
Running
Point to the directory that contains the input files listed above, then call the createReport function.
r = createReport(txt_folder)
cat(paste0("\nReport generated as '", r$report_file, "'\n\n"))
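PTXQC also lets you change the report theme or choose which evaluation steps are run. This requires editing the yaml_file; the following script can serve as a reference.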
require(PTXQC)
require(yaml)

## the next require() is needed to prevent a spurious error in certain R versions (might be a bug in R or a package)
## error message is:
## Error in Scales$new : could not find function "loadMethod"
require(methods)

## specify a path to a MaxQuant txt folder
## Note: This folder can be incomplete, depending on your YAML config
if (1) {
  ## we will use an example dataset from PRIDE (dataset 2 of the PTXQC publication)
  local_zip = tempfile(fileext=".zip")
  download.file("ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2015/11/PXD003133/txt_20min.zip", destfile = local_zip)
  unzip(local_zip, exdir = tempdir()) ## extracts content
  txt_folder = file.path(tempdir(), "txt_20min")
} else {
  ## if you have local MaxQuant output, just use it
  txt_folder = "c:/Proteomics/MouseLiver/combined/txt"
}

## use a YAML config inside the target directory if present
fh_out = getReportFilenames(txt_folder)
if (file.exists(fh_out$yaml_file))
{
  cat("\nUsing YAML config already present in target directory ...\n")
  yaml_config = yaml.load_file(input = fh_out$yaml_file)
} else {
  cat("\nYAML config not found in folder '", txt_folder, "'. The first run of PTXQC will create one for you.", sep="")
  yaml_config = list()
}

r = createReport(txt_folder, mztab_file = NULL, yaml_obj = yaml_config)
cat(paste0("\nReport generated as '", r$report_file, "'\n\n"))
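To change which steps run, the YAML config can be edited as a nested R list before it is passed to createReport. The sketch below uses only base R so that it runs without PTXQC installed; `set_config_value` is a hypothetical helper, and the key path File, MsMsScans, enabled is an assumption about the YAML layout that should be checked against the YAML file PTXQC writes on its first run.

```r
## Hypothetical helper: set a value at a nested key path in a config list,
## creating intermediate levels when they do not exist yet.
set_config_value <- function(config, path, value) {
  if (length(path) == 1) {
    config[[path]] <- value
    return(config)
  }
  child <- config[[path[1]]]
  if (is.null(child)) child <- list()
  config[[path[1]]] <- set_config_value(child, path[-1], value)
  config
}

## Stand-in for the list returned by yaml.load_file(); key names are assumptions.
yaml_config <- list(File = list(Evidence = list(enabled = TRUE)))

## Switch off the msmsScans-based metrics (assumed key path)
yaml_config <- set_config_value(yaml_config,
                                c("File", "MsMsScans", "enabled"), FALSE)
str(yaml_config)
```

The modified list would then be passed as the yaml_obj argument of createReport, in place of the unmodified yaml_config.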
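Results
The report can be produced in HTML or PDF format, as shown in the figure.
PTXQC evaluates the following parts of the workflow:
- Sample preparation (1-5);
- Liquid chromatography separation of peptides (6-9);
- Mass spectrometry (10-18);
- Protein identification (19-22).
The colors indicate how well each step of the experiment performed. The figure shows that sample preparation and mass spectrometry received the best scores, while protein identification scored worst. The reason is that our samples are blood exosome proteins, which are scarce compared with whole blood and therefore cannot reach the thresholds set by the package (Peptide Count > 15,000; Protein Count > 3,500). Even so, the Average Overall Quality is on the favorable side (leaning dark green). In addition, we used FAIMS separation during mass spectrometry, which is why each file shows three bars (voltages 40, 60, and 80).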
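The threshold comparison described above can be sketched as follows. This is an illustration only: `check_id_thresholds` is a hypothetical helper, the counts passed in are made-up numbers rather than our results, and the check is not how PTXQC computes its scores. Only the two threshold values come from the text.

```r
## Hypothetical helper: compare identification counts with the thresholds
## quoted above (Peptide Count > 15,000; Protein Count > 3,500).
check_id_thresholds <- function(peptide_count, protein_count,
                                peptide_min = 15000, protein_min = 3500) {
  c(peptides = peptide_count > peptide_min,
    proteins = protein_count > protein_min)
}

## Made-up counts for a low-abundance sample (not real results)
low_input <- check_id_thresholds(peptide_count = 4000, protein_count = 600)
print(low_input)
```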
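During spectrum acquisition, FAIMS (High-Field Asymmetric Waveform Ion Mobility Spectrometry), introduced only in 2018, is often used to apply different voltages: after ESI ionization and before entering the mass spectrometer, peptides undergo a fast gas-phase separation, which increases the peak capacity. Running MaxQuant quantification directly on raw data acquired at multiple voltages is incorrect, because MaxQuant can only handle single-voltage raw data. The FAIMS MzXML Generator software is therefore needed to convert the raw data into one MzXML file per voltage.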
List of metrics
System information
sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: CentOS Linux 8 (Core)

Matrix products: default
BLAS/LAPACK: /disk/share/anaconda3/lib/libopenblasp-r0.3.10.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] PTXQC_1.0.12 tibble_3.1.5 dplyr_1.0.7

loaded via a namespace (and not attached):
[1] tinytex_0.32 tidyselect_1.1.1 xfun_0.24 bslib_0.2.5.1 reshape2_1.4.4 purrr_0.3.4
[7] colorspace_2.0-2 vctrs_0.3.8 generics_0.1.0 viridisLite_0.4.0 htmltools_0.5.1.1 yaml_2.2.1
[13] utf8_1.2.1 rlang_0.4.11 jquerylib_0.1.4 pillar_1.6.4 glue_1.4.2 DBI_1.1.1
[19] gdtools_0.2.2 RColorBrewer_1.1-2 lifecycle_1.0.0 plyr_1.8.6 stringr_1.4.0 munsell_0.5.0
[25] gtable_0.3.0 rvest_0.3.6 kableExtra_1.3.4 evaluate_0.14 knitr_1.33 UpSetR_1.4.0
[31] fansi_0.5.0 Rcpp_1.0.7 scales_1.1.1 webshot_0.5.2 jsonlite_1.7.2 systemfonts_0.3.2
[37] gridExtra_2.3 ggplot2_3.3.5 digest_0.6.27 stringi_1.4.6 ade4_1.7-18 cowplot_1.1.0
[43] grid_4.0.2 tools_4.0.2 magrittr_2.0.1 sass_0.4.0 ggdendro_0.1.22 R6P_0.2.2
[49] seqinr_4.2-4 crayon_1.4.1 tidyr_1.1.4 pkgconfig_2.0.3 ellipsis_0.3.2 MASS_7.3-54
[55] data.table_1.14.0 xml2_1.3.2 assertthat_0.2.1 rmarkdown_2.9 svglite_1.2.3.2 httr_1.4.2
[61] rstudioapi_0.13 R6_2.5.0 compiler_4.0.2
Reference
- Proteomics quality control: quality control software for MaxQuant results
- PTXQC