A collection of summary functions for the MolEvolvR package.
Function to summarize and retrieve counts by Domains & Domains+Lineage
Function to retrieve counts of how many lineages a DomArch appears in
Creates a data frame with a totalcount column
This function is designed to sum the counts column by either Genomic Context or Domain Architecture and creates a totalcount column from those sums.
Usage
summarizeByLineage(prot = "prot", column = "DomArch", by = "Lineage", query)
summarizeDomArch_ByLineage(x)
summarizeDomArch(x)
summarizeGenContext_ByDomArchLineage(x)
summarizeGenContext_ByLineage(x)
summarizeGenContext(x)
totalGenContextOrDomArchCounts(
prot,
column = "DomArch",
lineage_col = "Lineage",
cutoff = 90,
RowsCutoff = FALSE,
digits = 2
)Arguments
- prot
A data frame that must contain columns:
count
- column
Character. The column to summarize, default is "DomArch".
- by
A string representing the grouping column (e.g.,
Lineage). Default is "Lineage".- query
A string specifying the query pattern for filtering the target column. Use "all" to skip filtering and include all rows.
- x
A dataframe or tibble containing the data. It must have columns named
GenContext,DomArch,Lineage, andcount.- lineage_col
Character. The name of the lineage column, default is "Lineage".
- cutoff
Numeric. Cutoff for total count. Counts below this cutoff value will not be shown. Default is 0.
- RowsCutoff
Logical. If TRUE, filters based on cumulative percentage cutoff. Default is FALSE.
- digits
Numeric. Number of decimal places for percentage columns. Default is 2.
Value
A tibble summarizing the counts of occurrences of elements in
the column, grouped by the by column. The result includes the number
of occurrences (count) and is arranged in descending order of count.
A tibble summarizing the counts of unique domain architectures
(DomArch) per lineage (Lineage). The resulting table contains three
columns: DomArch, Lineage, and count, which indicates the frequency
of each domain architecture for each lineage. The results are arranged in
descending order of count.
A tibble summarizing each unique DomArch, along with the following
columns:
totalcount: The total occurrences of eachDomArchacross all lineages.totallin: The total number of unique lineages in which eachDomArchappears. The results are arranged in descending order oftotallinandtotalcount.
A tibble summarizing each unique combination of GenContext,
DomArch, and Lineage, along with the following columns:
GenContext: The genomic context for each entry.DomArch: The domain architecture for each entry.Lineage: The lineage associated with each entry.count: The total number of occurrences for each combination ofGenContext,DomArch, andLineage.
The results are arranged in descending order of count.
A tibble summarizing each unique combination of GenContext and Lineage,
along with the count of occurrences. The results are arranged in descending order of count.
A tibble summarizing each unique GenContext, along with the following columns:
totalcount: The total count for eachGenContext.totalDA: The number of distinctDomArchfor eachGenContext.totallin: The number of distinctLineagefor eachGenContext.
The results are arranged in descending order of totalcount, totalDA, and totallin.
A data frame with the following columns:
{{ column }}: Unique values from the specified column.totalcount: The total count of occurrences for each unique value in the specified column.IndividualCountPercent: The percentage of eachtotalcountrelative to the overall count.CumulativePercent: The cumulative percentage of total counts.
Examples
if (FALSE) { # \dontrun{
library(tidyverse)
tibble(DomArch = c("a+b", "a+b", "b+c", "a+b"), Lineage = c("l1", "l1", "l1", "l2")) |>
summarizeByLineage(query = "all")
} # }
if (FALSE) { # \dontrun{
summarizeDomArch_ByLineage(data1)
} # }
if (FALSE) { # \dontrun{
summarizeDomArch(data1)
} # }
if (FALSE) { # \dontrun{
summarizeGenContext_ByDomArchLineage(your_data)
} # }
if (FALSE) { # \dontrun{
summarizeGenContext_ByLineage(your_data)
} # }
if (FALSE) { # \dontrun{
summarizeGenContext(data1)
} # }
if (FALSE) { # \dontrun{
totalGenContextOrDomArchCounts(pspa - gc_lin_counts, 0, "GC")
} # }