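The crop_chains function in AlphaFold3's feature_processing_multimer module crops the MSA (multiple sequence alignment) features and template features of each chain in a multi-chain protein structure prediction task. Cropping caps the number of MSA sequences and templates passed to the model, so that the input stays within the model's size limits and computation remains efficient.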
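To make the idea of cropping concrete, here is a minimal sketch on toy data. It is not the module's implementation: toy_crop_chain and paired_budget are hypothetical helpers, and the feature keys "msa" and "template_aatype" are assumed for illustration, along with the assumption that the leading array axis indexes sequences or templates. Only the expression min(num_alignments_all_seq, msa_crop_size // 2) is taken from the source shown in this post.

```python
import numpy as np


def toy_crop_chain(chain, msa_crop_size, max_templates):
    """Hypothetical simplified crop: keep only the leading rows of each feature."""
    cropped = dict(chain)
    # Keep at most `msa_crop_size` MSA rows (assumed: axis 0 indexes sequences).
    cropped["msa"] = chain["msa"][:msa_crop_size]
    cropped["num_alignments"] = np.asarray(cropped["msa"].shape[0])
    # Keep at most `max_templates` templates (assumed: axis 0 indexes templates).
    cropped["template_aatype"] = chain["template_aatype"][:max_templates]
    return cropped


def paired_budget(num_alignments_all_seq, msa_crop_size):
    """Row budget for paired sequences, mirroring the visible source line."""
    return int(np.minimum(num_alignments_all_seq, msa_crop_size // 2))


# Toy chain: 10 MSA rows and 4 templates over 5 residues.
chain = {
    "msa": np.zeros((10, 5), dtype=np.int32),
    "num_alignments": np.asarray(10),
    "template_aatype": np.zeros((4, 5), dtype=np.int32),
}

cropped = toy_crop_chain(chain, msa_crop_size=6, max_templates=2)
print(cropped["msa"].shape, cropped["template_aatype"].shape)
print(paired_budget(num_alignments_all_seq=3, msa_crop_size=6))
```

In this sketch the crop is a plain slice along the first axis, so the residue dimension is untouched. In pairing mode, the paired rows can take at most half of the total MSA budget.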
Source code:
def crop_chains(
    chains_list: List[Mapping[str, np.ndarray]],
    msa_crop_size: int,
    pair_msa_sequences: bool,
    max_templates: int
) -> List[Mapping[str, np.ndarray]]:
    """Crops the MSAs for a set of chains.

    Args:
        chains_list: A list of chains to be cropped.
        msa_crop_size: The total number of sequences to crop from the MSA.
        pair_msa_sequences: Whether we are operating in sequence-pairing mode.
        max_templates: The maximum templates to use per chain.

    Returns:
        The chains cropped.
    """
    # Apply the cropping.
    cropped_chains = []
    for chain in chains_list:
        cropped_chain = _crop_single_chain(
            chain,
            msa_crop_size=msa_crop_size,
            pair_msa_sequences=pair_msa_sequences,
            max_templates=max_templates)
        cropped_chains.append(cropped_chain)

    return cropped_chains


def _crop_single_chain(chain: Mapping[str, np.ndarray],
                       msa_crop_size: int,
                       pair_msa_sequences: bool,
                       max_templates: int) -> Mapping[str, np.ndarray]:
    """Crops msa sequences to `msa_crop_size`."""
    msa_size = chain['num_alignments']

    if pair_msa_sequences:
        msa_size_all_seq = chain['num_alignments_all_seq']
        msa_crop_size_all_seq = np.minimum(msa_size_all_seq, msa_crop_size // 2)

        # We reduce the number of un-paired sequences, by the nu