R: Split a PDB File Into Separate Files, One For Each Chain.

pdbsplit {bio3d}

R Documentation

Split a PDB File Into Separate Files, One For Each Chain.

Description

Split a Protein Data Bank (PDB) coordinate file into separate files, one for each chain.

Usage

pdbsplit(pdb.files, ids = NULL, path = "split_chain", overwrite=TRUE,
         verbose = FALSE, mk4=FALSE, ncore = 1, progress = NULL, ...)

Arguments

`pdb.files`	a character vector of PDB file names.
`ids`	a character vector of PDB and chain identifiers (of the form: ‘pdbId_chainId’, e.g. ‘1bg2_A’). Used for filtering chain IDs for output (in the above example only chain A would be produced).
`path`	output path for chain-split files.
`overwrite`	logical, if FALSE an input PDB structure is neither read nor re-written when its split files already exist.
`verbose`	logical, if TRUE details of the PDB header and chain selections are printed.
`mk4`	logical, if TRUE output filenames will use only the first four characters of the input filename (see `basename.pdb` for details).
`ncore`	number of CPU cores used for the calculation. `ncore>1` requires the ‘parallel’ package to be installed.
`progress`	progress bar for use with a shiny web app.
`...`	additional arguments passed to `read.pdb`. Useful, for example, for parsing multi-model PDB files (`multi=TRUE`) or for including ALT records in the output files.

Details

This function produces single-chain PDB files from multi-chain input files. By default, the filenames of all chain-split files are returned. To output only a subset of selected chains, supply the optional ‘ids’ argument to filter the output (e.g. to fetch only chain C of a PDB entry, ignoring its additional chains A and B). See the Examples section for further details.
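
The ‘pdbId_chainId’ convention can be illustrated with a short base-R sketch. The helper `parse_ids` is hypothetical and not part of bio3d; it only shows how such identifiers decompose and makes no claim about how `pdbsplit` parses them internally. An entry without a chain part (here ‘1F9J’) yields a missing chain ID, which corresponds to requesting all chains of that entry.

```r
# Hypothetical helper (NOT a bio3d function): split 'pdbId_chainId' strings
# into their PDB and chain components.
parse_ids <- function(ids) {
  parts <- strsplit(ids, "_", fixed = TRUE)
  data.frame(
    pdb   = vapply(parts, function(p) p[1], character(1)),
    # No '_chainId' part means no chain restriction: record it as NA
    chain = vapply(parts,
                   function(p) if (length(p) > 1) p[2] else NA_character_,
                   character(1)),
    stringsAsFactors = FALSE
  )
}

sel <- parse_ids(c("1YX5_A", "3NOB_B", "1F9J"))
print(sel)
```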

Note that multi-model atom records are split into individual PDB files only if `multi=TRUE` is supplied (passed on to `read.pdb` via `...`); otherwise they are omitted. See the Examples section.

Value

Returns a character vector of chain-split file names.
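
As an illustration of working with the returned vector, the sketch below recovers ‘pdbId_chainId’ labels from file names. The values in `chain.files` are made up for the example, and the `<pdbId>_<chainId>.pdb` naming is an assumption rather than something stated on this page; check the actual return value of `pdbsplit` before relying on it.

```r
# Illustrative file names (assumed naming, not real pdbsplit output)
chain.files <- c("split_chain/1YX5_A.pdb", "split_chain/3NOB_B.pdb")

# Drop the directory and the '.pdb' extension to get 'pdbId_chainId' labels
labels <- sub("\\.pdb$", "", basename(chain.files))
print(labels)
```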

Author(s)

Barry Grant

References

Grant, B.J. et al. (2006) Bioinformatics 22, 2695–2696.

For a description of the PDB format (version 3.3) see:
http://www.wwpdb.org/documentation/format33/v3.3.html.

Examples

## Not run: 
  ## Save separate PDB files for each chain of a local or on-line file
  pdbsplit( get.pdb("2KIN", URLonly=TRUE) )


  ## Split several PDBs by chain ID and multi-model records
  raw.files <- get.pdb( c("1YX5", "3NOB") , URLonly=TRUE)
  chain.files <- pdbsplit(raw.files,  path=tempdir(), multi=TRUE)
  basename(chain.files)


  ## Output only the desired pdbId_chainId combinations;
  ## for the last entry (1F9J), which has no chain ID, fetch all chains
  ids <- c("1YX5_A", "3NOB_B", "1F9J")
  raw.files <- get.pdb( ids , URLonly=TRUE)
  chain.files <- pdbsplit(raw.files, ids, path=tempdir())
  basename(chain.files)

## End(Not run)

[Package bio3d version 2.4-4 Index]