Splits an ML-ready tibble into training and testing (and in some cases validation) sets.

splitMLInputTibble(ml_input_tibble, split = c(0.6, 0.2), seed = 5280)

Arguments

ml_input_tibble

An ML-ready tibble generated by loadMLInputTibble(). It must contain exactly one target variable column: either genome_drug.resistant_phenotype ("Resistant" or "Susceptible" classification for one bug/drug combination) or resistant_classes (multi-class classification of the drug classes to which each genome is resistant).

split

pillar::num Vector of length 2 giving the proportions of the data to assign to the training and validation sets, respectively. Defaults to c(0.6, 0.2).

seed

pillar::num Random seed, set so that the split is reproducible. Defaults to 5280.
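
To illustrate how split and seed interact, here is a minimal base-R sketch. It is not the package's implementation: the toy data frame, the variable names, and the assumption that rows left over after training and validation form the test set are all illustrative.

```r
# Hypothetical stand-in for an ML-ready tibble: 10 genomes, one target column.
toy <- data.frame(
  genome_id = sprintf("genome_%02d", 1:10),
  genome_drug.resistant_phenotype = rep(c("Resistant", "Susceptible"), times = 5)
)

split <- c(0.6, 0.2)  # training and validation proportions
set.seed(5280)        # fixing the seed fixes the shuffle below

n <- nrow(toy)
shuffled <- sample(n)            # random permutation of the row indices
n_train <- round(split[1] * n)   # number of training rows
n_val   <- round(split[2] * n)   # number of validation rows

train_idx <- shuffled[seq_len(n_train)]
val_idx   <- shuffled[n_train + seq_len(n_val)]
# Assumption: whatever is not training or validation is treated as test data.
test_idx  <- setdiff(shuffled, c(train_idx, val_idx))

sizes <- c(
  train = length(train_idx),
  validation = length(val_idx),
  test = length(test_idx)
)
print(sizes)
```

Re-running the block with the same seed gives the same three index sets; changing the seed reshuffles the rows but leaves the set sizes unchanged.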

Value

An rsplit object
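
rsplit is the split class used by the rsample package, so rsample's accessors should apply to the returned object. The sketch below builds a stand-in rsplit with rsample::initial_split() on the built-in mtcars data. This is an assumption for illustration only: it covers a two-way split and says nothing about how splitMLInputTibble() stores a validation set.

```r
library(rsample)

set.seed(5280)
# Stand-in rsplit object; it is not produced by splitMLInputTibble().
demo_split <- initial_split(mtcars, prop = 0.6)

train_set <- training(demo_split)  # rows assigned to the training set
test_set  <- testing(demo_split)   # the remaining rows

# The object inherits from "rsplit", and the two sets partition the data.
print(inherits(demo_split, "rsplit"))
print(nrow(train_set) + nrow(test_set) == nrow(mtcars))
```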