All functions

BinaryDomainNetwork()
Domain Network
DownloadAssemblySummary()
Download the combined assembly summaries of GenBank and RefSeq
GCA2lin()
Function to map GCA_ID to TaxID, and TaxID to Lineage
GenContextNetwork()
Genomic Context Directed Network
LineagePlot()
LineagePlot
RepresentativeAccNums()
Function to generate a vector of one Accession number per distinct observation from 'reduced' column
acc2fa()
Convert protein accession numbers to FASTA format
acc2lin()
acc2lin
add_leaves()
Adding Leaves to an alignment file with accessions
add_lins()
Add Lineages
add_name()
Add Name
add_tax()
add_tax
advanced_opts2est_walltime()
Given MolEvolvR advanced options and number of inputs, calculate the total estimated walltime for the job
alignFasta()
Perform a Multiple Sequence Alignment on a FASTA file.
assert_count_df()
assert_count_df
assign_job_queue()
Decision function to assign job queue
clean_clust_file()
Clean Cluster File
clean_string()
Clean String
cleanup_GeneDesc()
Cleanup GeneDesc
cleanup_clust()
Cleanup Clust
cleanup_domarch()
Cleanup DomArch
cleanup_fasta_header()
Cleanup FASTA Header
cleanup_gencontext()
Cleanup Genomic Contexts
cleanup_lineage()
Cleanup Lineage
cleanup_species()
Cleanup Species
combine_files()
Download the combined assembly summaries of GenBank and RefSeq
combine_full()
Combining full_analysis files
combine_ipr()
Combining clean ipr files
convert_aln2fa()
Adding Leaves to an alignment file with accessions
convert_fa2tre()
convert_fa2tre
count_bycol()
Count Bycol
count_to_sunburst() count_to_treemap()
Create an interactive plotly from count data
create_all_col_params()
create_all_col_params
create_lineage_lookup()
Create a lookup table that maps TaxID to Lineage
create_one_col_params()
create_one_col_params
df_iprscan_domains2fasta()
Using the table returned from make_df_iprscan_domains(), construct a domain FASTA for a single accession number in the original FASTA (i.e., the original fasta argument to make_df_iprscan_domains())
domain_network()
Domain Network
efetch_ipg()
efetch_ipg
elements2words()
Elements 2 Words
exec_interproscan()
exec_interproscan
fasta2fasta_domain()
fasta2fasta_domain
filter_by_doms()
Filter by Domains
filter_freq()
Filter Frequency
find_paralogs()
Find Paralogs
find_top_acc()
Group by lineage + DA then take top 20
format_job_args()
Format job arguments into HTML-formatted key/value pairs, for including in an email
gc_undirected_network()
Domain Network
generate_all_aln2fa()
Adding Leaves to an alignment file with accessions
generate_fa2tre()
generate_fa2tre
generate_msa()
Function to generate MSA using kalign
generate_trees()
generate_trees
get_accnums_from_fasta_file()
Get accnums from fasta file
get_df_ipr_col_names()
Constructor function for InterProScan column names (based upon the global variable written in molevol_scripts/R/colnames_molevol.R)
get_df_ipr_col_types()
Construct column types for reading InterProScan output TSVs (based upon the global variable written in molevol_scripts/R/colnames_molevol.R)
get_job_message()
Produces a mail message that can be sent to a user when their job is accepted. Used by the send_job_status_email() method.
get_proc_medians()
Scrape MolEvolvR logs and calculate median processes
get_proc_weights()
Quickly get the runtime weights for MolEvolvR backend processes
ipg2lin()
ipg2lin
ipr2viz()
IPR2Viz
ipr2viz_web()
IPR2Viz Web
lineage.DA.plot()
Lineage Plot: Heatmap of Domains/DAs/GCs vs Lineages
lineage.Query.plot()
Lineage Plot: Heatmap of Queries vs Lineages
lineage.domain_repeats.plot()
Lineage Domain Repeats Plot
lineage.neighbors.plot()
Lineage Plot for top neighbors
lineage_sunburst()
Lineage Sunburst
make_accnums_unique()
Make accnums unique
make_df_iprscan_domains()
For a given accession number, get the domain sequences using an InterProScan output table and the original FASTA file
make_job_results_url()
Given a pin_id, returns the URL where the user can check the status of their job
make_opts2procs()
Construct list where names (MolEvolvR advanced options) point to processes
map_acc2name()
Default rename_fasta() replacement function. Maps an accession number to its name
map_advanced_opts2procs()
Use MolEvolvR advanced options to get associated processes
msa_pdf()
Multiple Sequence Alignment
pick_longer_duplicate()
Pick Longer Duplicate
plot_estimated_walltimes()
Plot the estimated runtimes for different advanced options and number of inputs
prot2tax()
prot2tax
prot2tax_old()
prot2tax_old
read_iprscan_tsv()
Read an InterProScan output TSV with standardized column names and types
remove_astrk()
Remove Astrk
remove_empty()
Remove Empty
remove_tails()
Remove Tails
rename_fasta()
Rename the labels of FASTA files
repeat2s()
repeat2s
replaceQMs()
Replace QMs
reveql()
reveql
reverse_operon()
reverse_operon
run_deltablast()
Run DELTABLAST to find homologs for proteins of interest
run_rpsblast()
Run RPSBLAST to generate domain architectures for proteins of interest
send_job_status_email()
Sends a "job accepted" email to a user when their job is accepted, including details about the job submission and how to check its status.
shorten_lineage()
Shorten Lineage
sink.reset()
Sink Reset
stacked_lin_plot()
Stacked Lineage Plot
string2accnum()
string2accnum
summ.DA()
summ.DA
summ.DA.byLin()
summ.DA.byLin
summ.GC()
summ.GC
summ.GC.byDALin()
summ.GC.byDALin
summ.GC.byLin()
summ.GC.byLin
summarize_bylin()
Summarize by Lineage
theme_genes2()
Theme Genes2
to_titlecase()
Changing case to 'Title Case'
total_counts()
Total Counts
upset.plot()
UpSet Plot
wordcloud2_element()
Wordclouds for the predominant domains, domain architectures
wordcloud3()
Wordcloud3
wordcloud_element()
Wordclouds for the predominant domains, domain architectures
words2wc()
Words 2 Word Counts
write.MsaAAMultipleAlignment()
Write MsaAAMultipleAlignment objects as aligned FASTA sequences
write_proc_medians_table()
Write a table of 2 columns: 1) process and 2) median seconds
write_proc_medians_yml()
Compute median process runtimes, then write a YAML list of the processes and their median runtimes in seconds to the path specified by 'filepath'.