API reference

The package consists of just one module.

dbotu module

class dbotu.DBCaller(seq_table, records, max_dist, min_fold, threshold_pval, log=None, debug=None)

Bases: object

Object for processing the sequence table and distance matrix into an OTU table.

ga_matches(candidate)

OTUs that meet the genetic and abundance criteria

candidate: OTU: sequence to evaluate

returns: nothing
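As a rough illustration of what "genetic and abundance criteria" could mean, the sketch below filters a list of stand-in OTUs. The `Otu` tuple, the `ga_matches_sketch` function, the exact form of both comparisons, and the closest-first ordering are all assumptions made for this example, not the library's code. The sketch returns the matches as a list purely so the result can be inspected.

```python
from collections import namedtuple

# Hypothetical stand-in for an OTU: a name, a total abundance, and a
# precomputed distance to the candidate sequence.
Otu = namedtuple("Otu", ["name", "abundance", "dist_to_candidate"])


def ga_matches_sketch(otus, candidate_abundance, max_dist, min_fold):
    """Return OTUs passing both criteria, closest first (assumed ordering)."""
    # Abundance criterion (assumed form): the OTU must be more than
    # min_fold times as abundant as the candidate.
    abundant = [o for o in otus if o.abundance > min_fold * candidate_abundance]
    # Genetic criterion (assumed form): the OTU must be closer than max_dist.
    close = [o for o in abundant if o.dist_to_candidate < max_dist]
    return sorted(close, key=lambda o: o.dist_to_candidate)


otus = [Otu("a", 1000, 0.05), Otu("b", 1000, 0.30), Otu("c", 15, 0.01)]
matches = ga_matches_sketch(otus, candidate_abundance=10, max_dist=0.10, min_fold=10.0)
print([o.name for o in matches])  # ['a']
```

Here "b" fails the genetic criterion and "c" fails the abundance criterion, so only "a" remains.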

otu_table()

Generate OTU table.

returns: pandas.DataFrame

run()

Process all the input sequences in order of their abundance.

returns: nothing
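The ordering step can be pictured with a small, dependency-free sketch. It assumes that "in order of their abundance" means most abundant first, and that abundance is the total count across samples; the table contents and the name-based tie-break are hypothetical.

```python
# Hypothetical sequence table: sequence name -> counts per sample.
seq_table = {
    "seq1": [5, 0, 2],
    "seq2": [40, 35, 50],
    "seq3": [10, 12, 9],
}

# Total abundance of each sequence across all samples.
totals = {name: sum(counts) for name, counts in seq_table.items()}

# Most abundant first; ties are broken by name so the order is
# deterministic (the tie-break rule is an assumption of this sketch).
order = sorted(totals, key=lambda name: (-totals[name], name))
print(order)  # ['seq2', 'seq3', 'seq1']
```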

write_fasta(output)

Write the output fasta with dbOTU representative sequences.

output: filehandle

returns: nothing

write_membership(output)

Write the QIIME-style OTU mapping information to a file.

output: filehandle

returns: nothing

write_otu_table(output)

Write the QIIME-style OTU table to a file.

output: filehandle

returns: nothing

class dbotu.OTU(name, sequence, counts)

Bases: object

Object for keeping track of an OTU’s distribution and computing genetic distances

absorb(other)

Add another OTU’s counts to this one

other: OTU

returns: nothing

distance_to(other)

Length-adjusted Levenshtein “distance” to other OTU

other: OTU: the OTU to measure the distance to

returns: float
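To make "length-adjusted Levenshtein distance" concrete, here is a minimal pure-Python sketch. The edit distance is the standard dynamic-programming algorithm. The normalization, dividing by the mean of the two sequence lengths, is an assumption chosen for illustration, and both function names are hypothetical.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def length_adjusted_distance(a, b):
    # Assumed normalization: divide by the mean of the two lengths.
    # Both sequences are expected to be non-empty.
    return levenshtein(a, b) / (0.5 * (len(a) + len(b)))


print(length_adjusted_distance("ACGTACGT", "ACGTTCGT"))  # 0.125
```

One substitution between two 8-base sequences gives 1 / 8 = 0.125, which is why the result is a float and not an integer edit count.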

distribution_pval(other)

P-value from the likelihood ratio test comparing the distribution of the abundances of two OTU objects. See docs for explanation of the test.

other: OTU

returns: float

dbotu.call_otus(seq_table_fh, fasta_fn, output_fh, gen_crit, abund_crit, pval_crit, log=None, membership=None, debug=None)

Read in input files, call OTUs, and write output.

seq_table_fh: filehandle: sequence count table, tab-separated
fasta_fn: str: sequences fasta filename
output_fh: filehandle: place to write main output OTU table
gen_crit, abund_crit, pval_crit: float: threshold values for genetic criterion, abundance criterion, and distribution criterion (pvalue)
log, membership, debug: filehandles: places to write supplementary output

dbotu.read_sequence_table(fn)

Read in a table of sequences. The table must be tab-separated with exactly one header line. The first field of the header names the sequence column (e.g., “OTU”, “OTU_ID”, “seq”); the remaining fields are the sample names. In each following row, the first field is the sequence name and the remaining cells are the counts of that sequence in each sample.

fn: filename (or handle)

returns: pandas.DataFrame
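The expected layout can be illustrated by parsing a tiny table with the standard library alone. This is a sketch of the file format only: it builds plain dictionaries where the function documented above returns a pandas.DataFrame, and the table contents are hypothetical.

```python
import csv
import io

# Hypothetical table contents; in practice this would come from a file.
text = "OTU_ID\tsample1\tsample2\nseq1\t5\t0\nseq2\t40\t35\n"

reader = csv.reader(io.StringIO(text), delimiter="\t")
header = next(reader)   # exactly one header line
samples = header[1:]    # sample names follow the first field

# Map each sequence name to its per-sample counts.
counts = {row[0]: dict(zip(samples, map(int, row[1:]))) for row in reader}

print(samples)                    # ['sample1', 'sample2']
print(counts["seq2"]["sample2"])  # 35
```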
