Function to map protein accession numbers to lineage
This function combines 'efetch_ipg()' and 'ipg2lin()' to map a set of protein accessions to their assembly (GCA_ID), tax ID, and lineage.
Usage
acc2lin(
accessions,
assembly_path,
lineagelookup_path,
ipgout_path = NULL,
plan = "multicore"
)
Arguments
- accessions
Character vector of protein accessions
- assembly_path
String of the path to the assembly_summary file. This file can be generated using the "DownloadAssemblySummary()" function.
- lineagelookup_path
String of the path to the lineage lookup file (taxid to lineage mapping). This file can be generated using the
- ipgout_path
Path to write the results of the efetch run of the accessions against the ipg database. If NULL, the file will not be written. Defaults to NULL.
- plan
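Examples
The call below is a minimal sketch of how the arguments listed under Usage fit together, not an example taken from the package itself. The accession strings and both file names are hypothetical placeholders, and the guard simply skips the call when 'acc2lin()' is not loaded or the lookup files are missing. The structure of the returned object is not described in this section, so the sketch only inspects it with 'str()'.

```r
# Hypothetical inputs: these accessions and file names are placeholders,
# not values taken from the package documentation.
accessions <- c("WP_000000001.1", "WP_000000002.1")
assembly_path <- "assembly_summary.txt"
lineagelookup_path <- "lineage_lookup.tsv"

# Only attempt the call when the function is loaded and both files exist.
can_run <- exists("acc2lin", mode = "function") &&
  file.exists(assembly_path) &&
  file.exists(lineagelookup_path)

if (can_run) {
  lineages <- acc2lin(
    accessions = accessions,
    assembly_path = assembly_path,
    lineagelookup_path = lineagelookup_path,
    ipgout_path = NULL,  # NULL: the efetch results are not written to disk
    plan = "multicore"   # default value shown under Usage
  )
  # Inspect the result without assuming a particular return type.
  str(lineages)
} else {
  message("acc2lin() or its input files are not available; skipping the call.")
}
```

To keep the intermediate efetch results, pass a file path as 'ipgout_path' instead of NULL.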