
Cleanup Domain Architectures

Cleans the DomArch column by replacing/removing certain domains

This function cleans the 'DomArch' column of a data frame by renaming domains according to a second data frame (domains_rename). Domains can also be removed according to an additional data frame (domains_ignore). The original data frame is returned with the cleaned domain architectures in the column named by 'new' (default 'DomArch') and the original domain architectures in the column named by 'old' (default 'DomArch.orig').

Usage

cleanup_domarch(
  prot,
  old = "DomArch.orig",
  new = "DomArch",
  domains_keep,
  domains_rename,
  repeat2s = TRUE,
  remove_tails = FALSE,
  remove_empty = FALSE,
  domains_ignore = NULL
)

Arguments

prot

A data frame containing a 'DomArch' column.

old

Name of the column that stores the original, uncleaned domain architectures. Default is 'DomArch.orig'.

new

Name of the column that holds the cleaned domain architectures. Default is 'DomArch'.

domains_keep

A data frame containing the domain names to be retained.

domains_rename

A data frame containing the domain names to be replaced in a column 'old' and the corresponding replacement values in a column 'new'.

repeat2s

Boolean. If TRUE, repeated domains in 'DomArch' are condensed. Default is TRUE.

remove_tails

Boolean. If TRUE, 'ClustName' will be filtered based on domains to keep/remove. Default is FALSE.

remove_empty

Boolean. If TRUE, rows with empty/unnecessary values in 'DomArch' are removed. Default is FALSE.

domains_ignore

A data frame containing the domain names to be removed in a column called 'domains'.
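
The helper tables described in the arguments above might be built roughly as follows. This is a minimal sketch under stated assumptions: the domain names are made up, the 'domains' column name for domains_keep is an assumption (only domains_ignore is documented with that name), and '+' is assumed as the separator between domains. The base R rename/ignore pass only illustrates the idea and is not the implementation used by cleanup_domarch.

```r
# Hypothetical helper tables; all domain names are illustrative only.
domains_rename <- data.frame(
  old = c("PspA_IM30", "Snf7_1"),
  new = c("PspA", "Snf7"),
  stringsAsFactors = FALSE
)
domains_ignore <- data.frame(domains = "DUF1234", stringsAsFactors = FALSE)
# Column name 'domains' is an assumption for domains_keep.
domains_keep <- data.frame(domains = c("PspA", "Snf7"), stringsAsFactors = FALSE)

# Toy input; '+' as the domain separator is an assumption.
prot <- data.frame(
  AccNum = c("P1", "P2"),
  DomArch = c("PspA_IM30+DUF1234", "Snf7_1+PspA_IM30"),
  stringsAsFactors = FALSE
)

# Rename pass: replace each 'old' name with its 'new' counterpart.
clean <- prot$DomArch
for (i in seq_len(nrow(domains_rename))) {
  clean <- gsub(domains_rename$old[i], domains_rename$new[i], clean, fixed = TRUE)
}

# Ignore pass: split each architecture, drop ignored domains, rejoin.
clean <- vapply(strsplit(clean, "+", fixed = TRUE), function(parts) {
  paste(parts[!parts %in% domains_ignore$domains], collapse = "+")
}, character(1), USE.NAMES = FALSE)

print(clean)
```

Fixed-string matching is used here so that characters such as '.' in domain names are not treated as regular-expression metacharacters.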

Value

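The original data frame is returned with the cleaned domain architectures in the column named by 'new' (default 'DomArch') and the original domain architectures in the column named by 'old' (default 'DomArch.orig').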
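
The shape of the returned data frame can be pictured with a small base R sketch. It uses the default column names from the usage ('DomArch.orig' and 'DomArch') and a hypothetical helper, condense_repeats, that mimics what repeat2s = TRUE might do. The '+' separator and the '(s)' suffix for condensed repeats are assumptions made for illustration, not behaviour confirmed by this page.

```r
# Toy input; '+' as the domain separator is an assumption.
prot <- data.frame(
  DomArch = c("A+A+A+B", "B+C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse consecutive runs of the same domain
# into "<domain>(s)". The '(s)' suffix is an assumed convention.
condense_repeats <- function(x, sep = "+") {
  vapply(strsplit(x, sep, fixed = TRUE), function(parts) {
    runs <- rle(parts)
    labels <- ifelse(runs$lengths > 1, paste0(runs$values, "(s)"), runs$values)
    paste(labels, collapse = sep)
  }, character(1), USE.NAMES = FALSE)
}

# Keep the original values in the 'old' column and write the
# cleaned values to the 'new' column.
result <- prot
result[["DomArch.orig"]] <- prot[["DomArch"]]
result[["DomArch"]] <- condense_repeats(prot[["DomArch"]])

print(result)
```

Keeping the untouched values alongside the cleaned ones makes it easy to audit what a cleanup pass changed.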

Examples

if (FALSE) { # \dontrun{
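# Name the arguments: positional TRUE/FALSE would otherwise bind to 'old' and 'new'.
cleanup_domarch(
  prot,
  domains_keep = domains_keep,
  domains_rename = domains_rename,
  repeat2s = TRUE,
  remove_tails = FALSE,
  domains_ignore = NULL
)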
} # }