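In AlphaFold3, the MmcifObject class is the core data structure for a parsed mmCIF file. It stores the parsed protein structure information, including the PDB header, the structure parsed by Biopython, and the per-chain sequence information.
The code below defines the Monomer, AtomSite, ResiduePosition, ResidueAtPosition, MmcifObject, and ParsingResult data classes.
Source code: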
# Imports needed by this excerpt:
import dataclasses
from typing import Any, Mapping, Optional, Sequence, Tuple

from Bio import PDB

# Type aliases:
ChainId = str
PdbHeader = Mapping[str, Any]
PdbStructure = PDB.Structure.Structure
SeqRes = str
MmCIFDict = Mapping[str, Sequence[str]]


@dataclasses.dataclass(frozen=True)
class Monomer:
  id: str
  num: int


# Note - mmCIF format provides no guarantees on the type of author-assigned
# sequence numbers. They need not be integers.
@dataclasses.dataclass(frozen=True)
class AtomSite:
  residue_name: str
  author_chain_id: str
  mmcif_chain_id: str
  author_seq_num: str
  mmcif_seq_num: int
  insertion_code: str
  hetatm_atom: str
  model_num: int


# Used to map SEQRES index to a residue in the structure.
@dataclasses.dataclass(frozen=True)
class ResiduePosition:
  chain_id: str
  residue_number: int
  insertion_code: str


@dataclasses.dataclass(frozen=True)
class ResidueAtPosition:
  position: Optional[ResiduePosition]
  name: str
  is_missing: bool
  hetflag: str


@dataclasses.dataclass(frozen=True)
class MmcifObject:
  """Representation of a parsed mmCIF file.

  Contains:
    file_id: A meaningful name, e.g. a pdb_id. Should be unique amongst all
      files being processed.
    header: Biopython header.
    structure: Biopython structure.
    chain_to_seqres: Dict mapping chain_id to 1 letter amino acid sequence. E.g.
      {'A': 'ABCDEFG'}
    seqres_to_structure: Dict; for each chain_id contains a mapping between
      SEQRES index and a ResidueAtPosition. e.g. {'A': {0: ResidueAtPosition,
                                                        1: ResidueAtPosition,
                                                        ...}}
    raw_string: The raw string used to construct the MmcifObject.
  """
  file_id: str
  header: PdbHeader
  structure: PdbStructure
  chain_to_seqres: Mapping[ChainId, SeqRes]
  seqres_to_structure: Mapping[ChainId, Mapping[int, ResidueAtPosition]]
  raw_string: Any


@dataclasses.dataclass(frozen=True)
class ParsingResult:
  """Returned by the parse function.

  Contains:
    mmcif_object: A MmcifObject, may be None if no chain could be successfully
      parsed.
    errors: A dict mapping (file_id, chain_id) to any exception generated.
  """
  mmcif_object: Optional[MmcifObject]
  errors: Mapping[Tuple[str, str], Any]
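To make the role of seqres_to_structure more concrete, here is a minimal sketch that uses only the standard library. It repeats the ResiduePosition and ResidueAtPosition definitions from the listing above and builds the mapping for one chain by hand. The example residues and the helper observed_indices are hypothetical illustrations, not part of AlphaFold, and the real parser fills this mapping from the mmCIF file rather than from literals.

```python
import dataclasses
from typing import List, Mapping, Optional


@dataclasses.dataclass(frozen=True)
class ResiduePosition:
  chain_id: str
  residue_number: int
  insertion_code: str


@dataclasses.dataclass(frozen=True)
class ResidueAtPosition:
  position: Optional[ResiduePosition]
  name: str
  is_missing: bool
  hetflag: str


# Hand-built example (assumption): chain 'A' has SEQRES "MKV", and the
# lysine at SEQRES index 1 is unresolved, so it carries no position.
seqres_to_structure: Mapping[str, Mapping[int, ResidueAtPosition]] = {
    'A': {
        0: ResidueAtPosition(ResiduePosition('A', 1, ' '), 'MET', False, ' '),
        1: ResidueAtPosition(None, 'LYS', True, ' '),
        2: ResidueAtPosition(ResiduePosition('A', 3, ' '), 'VAL', False, ' '),
    }
}


def observed_indices(chain_map: Mapping[int, ResidueAtPosition]) -> List[int]:
  """Hypothetical helper: SEQRES indices whose residue is not missing."""
  return [i for i, res in sorted(chain_map.items()) if not res.is_missing]


observed = observed_indices(seqres_to_structure['A'])

# frozen=True makes instances immutable, so attribute assignment raises
# dataclasses.FrozenInstanceError instead of silently changing the record.
try:
  seqres_to_structure['A'][0].name = 'ALA'
  mutated = True
except dataclasses.FrozenInstanceError:
  mutated = False
```

Because every record is frozen, a parsed result can be shared between pipeline stages without one stage accidentally changing what another sees.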
Code walkthrough:
Type Aliases
A type alias is a short name for a more complex type. It makes the code easier to read and maintain.
ChainId = str
PdbHeader = Mapping[str, Any]
PdbStructure = PDB.Structure.Structure
SeqRes = str
MmCIFDict = Mapping[str, Sequence[str]]
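As a small illustration of how such aliases are used, the sketch below annotates a function with ChainId and SeqRes. The function chain_lengths is a hypothetical helper written for this example, not something taken from AlphaFold. At runtime an alias such as ChainId is simply the str type itself; the alias only documents intent for readers and type checkers.

```python
from typing import Mapping

ChainId = str
SeqRes = str


def chain_lengths(
    chain_to_seqres: Mapping[ChainId, SeqRes]) -> Mapping[ChainId, int]:
  """Hypothetical helper: number of SEQRES residues in each chain."""
  return {chain_id: len(seq) for chain_id, seq in chain_to_seqres.items()}


lengths = chain_lengths({'A': 'MKV', 'B': 'GS'})

# The alias is the very same object as the built-in type.
alias_is_plain_str = ChainId is str
```

The signature Mapping[ChainId, SeqRes] tells the reader what the keys and values mean, which a bare Mapping[str, str] would not.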
ChainId = str
- The ID of a protein chain, for example "A" or "B".
PdbHeader = Mapping[str, Any]
- 表