API reference
The package consists of just one module.
dbotu module
class dbotu.DBCaller(seq_table, records, max_dist, min_fold, threshold_pval, log=None, debug=None)
Bases: object
Object for processing the sequence table and distance matrix into an OTU table.
ga_matches(candidate)
OTUs that meet the genetic and abundance criteria
- candidate: OTU
- sequence to evaluate
returns: nothing
write_fasta(output)
Write the output fasta with dbOTU representative sequences.
output: filehandle
returns: nothing
class dbotu.OTU(name, sequence, counts)
Bases: object
Object for keeping track of an OTU’s distribution and computing genetic distances.
dbotu.call_otus(seq_table_fh, fasta_fn, output_fh, gen_crit, abund_crit, pval_crit, log=None, membership=None, debug=None)
Read in input files, call OTUs, and return output.
- seq_table_fh: filehandle
- sequence count table, tab-separated
- fasta_fn: str
- sequences fasta filename
- output_fh: filehandle
- place to write main output OTU table
- gen_crit, abund_crit, pval_crit: float
- threshold values for genetic criterion, abundance criterion, and distribution criterion (pvalue)
- log, membership, debug: filehandles
- places to write supplementary output
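A minimal usage sketch based only on the signature above. The wrapper name `run_call_otus`, the file paths passed to it, and the default threshold values are illustrative assumptions, not documented defaults of the package. The import is done lazily inside the function so the sketch can be defined even where `dbotu` is not installed.

```python
def run_call_otus(table_path, fasta_path, output_path,
                  gen_crit=0.1, abund_crit=10.0, pval_crit=0.0005):
    """Hypothetical wrapper: open the files and hand them to dbotu.call_otus.

    The threshold defaults here are placeholders chosen for illustration.
    """
    import dbotu  # lazy import; requires the dbotu package at call time

    # seq_table_fh and output_fh are filehandles; fasta_fn is a filename (str).
    with open(table_path) as seq_table_fh, open(output_path, "w") as output_fh:
        dbotu.call_otus(seq_table_fh, fasta_path, output_fh,
                        gen_crit, abund_crit, pval_crit)
```

The optional `log`, `membership`, and `debug` filehandles are left at their `None` defaults here; they could be opened and passed as keyword arguments in the same way.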
dbotu.read_sequence_table(fn)
Read in a table of sequences. The table must be tab-separated and have exactly one header line: a first field that labels the sequence column (e.g., “OTU”, “OTU_ID”, “seq”), followed by the sample names. In each following row, the first field is the sequence name and the remaining cells are the counts of that sequence in each sample.
fn: filename (or handle)
returns: pandas.DataFrame
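The table layout described for read_sequence_table can be illustrated with a small standard-library parser. This is a sketch of the file format only, not the package's implementation (which returns a pandas.DataFrame); the function name `parse_sequence_table` and the example data are hypothetical.

```python
import csv
import io


def parse_sequence_table(handle):
    """Parse a tab-separated sequence table into (samples, counts).

    counts maps sequence name -> {sample name: integer count}.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)      # the single header line
    samples = header[1:]       # first field only labels the sequence column
    counts = {}
    for row in reader:
        if not row:            # skip blank lines
            continue
        name, cells = row[0], row[1:]
        counts[name] = {s: int(c) for s, c in zip(samples, cells)}
    return samples, counts


# Hypothetical two-sample table with two sequences.
example = "OTU_ID\tsample1\tsample2\nseq1\t10\t0\nseq2\t3\t7\n"
samples, counts = parse_sequence_table(io.StringIO(example))
print(samples)
print(counts["seq2"]["sample2"])
```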