Evaluates whether a drug-bug combination should be skipped because it has too few samples, or too much class imbalance, to support cross-validation or train/val/test splits.

skipImbalancedMatrix(
  genome_ids,
  phenotype_counts,
  n_fold = 5,
  split = c(1, 0),
  min_total_obs = 40,
  verbosity = c("minimal", "debug")
)

Arguments

genome_ids

The subsetted genome IDs for a given matrix combination

phenotype_counts

The counts of AMR phenotype observations for that combination (e.g. a data frame with phenotype and count columns, as in the example below)

n_fold

Number of cross-validation (CV) folds (default 5)

split

For CV: c(1, 0) (the default); for classical train/val/test splits: c(train_prop, val_prop)

min_total_obs

Minimum total observations required before attempting CV (default 40)

verbosity

"minimal" or "debug"; "debug" reports more detail for each processed matrix

Value

Logical; TRUE if the matrix should be skipped, FALSE if the matrix should be generated

Examples

if (FALSE) { # \dontrun{
# Check if a drug-bug combination should be skipped
genome_ids <- c("genome1", "genome2", "genome3")
phenotype_counts <- data.frame(
  phenotype = c("Resistant", "Susceptible"),
  count = c(2, 1)
)

# For 5-fold CV
skip <- skipImbalancedMatrix(
  genome_ids = genome_ids,
  phenotype_counts = phenotype_counts,
  n_fold = 5,
  split = c(1, 0),
  min_total_obs = 40,
  verbosity = "minimal"
)
} # }
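
The exact rules applied by skipImbalancedMatrix() are not spelled out on this page. As a rough illustration of the kind of check described above, the following minimal sketch uses a hypothetical helper, sketch_should_skip(), which is not part of the package. It assumes that phenotype_counts is a data frame with a count column, that a combination is skipped when the total falls below min_total_obs, and that for CV the minority class needs at least n_fold observations. All three rules are assumptions made for illustration only.

```r
# Hypothetical helper for illustration; NOT the implementation of
# skipImbalancedMatrix(). The thresholds and rules below are assumptions.
sketch_should_skip <- function(phenotype_counts, n_fold = 5, min_total_obs = 40) {
  counts <- phenotype_counts$count

  # Assumed rule 1: too few observations overall.
  if (sum(counts) < min_total_obs) {
    return(TRUE)
  }

  # Assumed rule 2: only one phenotype class is present.
  if (length(counts) < 2) {
    return(TRUE)
  }

  # Assumed rule 3: the minority class cannot contribute at least one
  # observation to every CV fold.
  if (min(counts) < n_fold) {
    return(TRUE)
  }

  FALSE
}

# Three made-up inputs: too small, adequately balanced, and too lopsided.
small <- data.frame(phenotype = c("Resistant", "Susceptible"), count = c(2, 1))
balanced <- data.frame(phenotype = c("Resistant", "Susceptible"), count = c(30, 25))
lopsided <- data.frame(phenotype = c("Resistant", "Susceptible"), count = c(57, 3))

sketch_should_skip(small)     # TRUE: 3 observations is below min_total_obs
sketch_should_skip(balanced)  # FALSE: 55 observations, minority class of 25
sketch_should_skip(lopsided)  # TRUE: minority class of 3 is below n_fold
```

Under these assumed rules, a combination can be skipped either because it is small overall or because one phenotype is too rare to appear in every fold, even when the total count is large.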