create_structure_contact_map {protti} | R Documentation |
Creates a contact map of all atoms from a structure file
Description
Creates a contact map of a subset or of all atom or residue distances in a structure or
AlphaFold prediction file. Contact maps are a useful tool for the identification of protein
regions that are in close proximity in the folded protein. Additionally, regions that are
interacting closely with a small molecule or metal ion can be easily identified without the
need to open the structure in programs such as PyMOL or ChimeraX. For large datasets (more
than 40 contact maps) it is recommended to use the parallel_create_structure_contact_map()
function instead, regardless of whether the maps are created in parallel or sequentially.
Usage
create_structure_contact_map(
  data,
  data2 = NULL,
  id,
  chain = NULL,
  auth_seq_id = NULL,
  distance_cutoff = 10,
  pdb_model_number_selection = c(0, 1),
  return_min_residue_distance = TRUE,
  show_progress = TRUE,
  export = FALSE,
  export_location = NULL,
  structure_file = NULL
)
Arguments
data |
a data frame containing at least a column with PDB ID information of which the name
can be provided to the |
data2 |
optional, a data frame that contains a subset of regions for which distances to regions
provided in the |
id |
a character column in the |
chain |
optional, a character column in the |
auth_seq_id |
optional, a character (or numeric) column in the |
distance_cutoff |
a numeric value specifying the distance cutoff in Angstrom. All pairwise distances are calculated, but only values smaller than this cutoff are returned in the output. For example, if a cutoff of 5 is selected, only residues with a distance of 5 Angstrom and less are returned. Using a small value can drastically reduce the size of the contact map and is therefore recommended. The default value is 10. |
pdb_model_number_selection |
a numeric vector specifying which models from the structure files should be considered for contact maps. For example, NMR structure files often contain many models. The default for this argument is c(0, 1), which means that the first model of each structure file is selected for contact map calculations. For AlphaFold predictions the model number is 0 (only .pdb files), so this case is also covered by the default. |
return_min_residue_distance |
a logical value that specifies whether the contact map should be returned for all atom distances or for minimum residue distances. Contact maps of minimum residue distances are smaller in size. If atom distances are not strictly needed, it is recommended to set this argument to TRUE. The default is TRUE. |
show_progress |
a logical value that specifies if a progress bar will be shown (default is TRUE). |
export |
a logical value that indicates if contact maps should be exported as ".csv". The
name of the file will be the structure ID. Default is |
export_location |
optional, a character value that specifies the path to the location in
which the contact map should be saved if |
structure_file |
optional, a character value that specifies the path to the location and
name of a structure file in ".cif" or ".pdb" format for which a contact map should be created.
All other arguments can be provided as usual with the exception of the |
Value
A list of contact maps for each PDB or UniProt ID provided in the input is returned.
If the export argument is TRUE, each contact map will be saved as a ".csv" file in the
current working directory or the location provided to the export_location argument.
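The relationship between atom distances, the distance cutoff and minimum residue distances can be illustrated with a small base R sketch. This is not the implementation used by protti; the toy coordinates, the column names and the choice to skip atom pairs within the same residue are assumptions made only for illustration. The cutoff is applied inclusively here, following the "5 Angstrom and less" wording in the argument description.

```r
# Hypothetical toy coordinates (in Angstrom): five atoms in three residues.
atoms <- data.frame(
  residue = c(1, 1, 2, 2, 3),
  atom = c("N", "CA", "N", "CA", "N"),
  x = c(0, 1.5, 3.8, 5.0, 20.0),
  y = c(0, 0, 0, 0, 0),
  z = c(0, 0, 0, 0, 0)
)

# All pairwise Euclidean atom distances as a square matrix.
dist_matrix <- as.matrix(dist(atoms[, c("x", "y", "z")]))

# Long format: one row per ordered atom pair.
pairs <- expand.grid(i = seq_len(nrow(atoms)), j = seq_len(nrow(atoms)))
pairs$distance <- dist_matrix[cbind(pairs$i, pairs$j)]
pairs$residue_i <- atoms$residue[pairs$i]
pairs$residue_j <- atoms$residue[pairs$j]

# Keep atom pairs from different residues that lie within the cutoff
# (excluding same-residue pairs is an assumption of this sketch).
distance_cutoff <- 10
atom_contacts <- pairs[
  pairs$residue_i != pairs$residue_j & pairs$distance <= distance_cutoff,
]

# Collapse atom distances to the minimum distance per residue pair.
residue_contacts <- aggregate(
  distance ~ residue_i + residue_j,
  data = atom_contacts,
  FUN = min
)

print(residue_contacts)
```

In this toy case the atom-level table has several rows per residue pair, while the residue-level table has one, which is why minimum residue distances give a more compact contact map.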
Examples
# Create example data
data <- data.frame(
  pdb_id = c("6NPF", "1C14", "3NIR"),
  chain = c("A", "A", NA),
  auth_seq_id = c("1;2;3;4;5;6;7", NA, NA)
)
# Create contact map
contact_maps <- create_structure_contact_map(
  data = data,
  id = pdb_id,
  chain = chain,
  auth_seq_id = auth_seq_id,
  return_min_residue_distance = TRUE
)
str(contact_maps[["3NIR"]])
contact_maps
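
The example data above encodes the residue positions of interest as a single semicolon-separated string per structure. When positions are available as numeric vectors, such strings might be built as in the following sketch. The helper collapse_positions() and the object names are hypothetical and not part of protti, and using NULL to mean "no restriction" is an assumption of this sketch.

```r
# Residue positions of interest per structure; NULL stands for
# "no residue restriction" (an assumption of this sketch).
positions <- list(
  "6NPF" = 1:7,
  "1C14" = NULL,
  "3NIR" = NULL
)

# Hypothetical helper: collapse positions into "1;2;3" style strings,
# returning NA when no positions were given.
collapse_positions <- function(x) {
  if (is.null(x)) NA_character_ else paste(x, collapse = ";")
}

example_input <- data.frame(
  pdb_id = names(positions),
  chain = c("A", "A", NA),
  auth_seq_id = unname(vapply(positions, collapse_positions, character(1))),
  stringsAsFactors = FALSE
)

print(example_input)
```

The resulting data frame has the same columns and values as the example data and could be passed to the data argument in the same way.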