- works with all types of tokens, not just text
- allows purging of low-frequency tokens (for performance)
- uses log probabilities to avoid underflow
- optionally assumes a uniform prior distribution over classes
- supports a customizable constant for Laplace (additive) smoothing
- allows for multiple categories
- optional binarized mode (each token counts at most once per document)
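
To make the list above concrete, here is a minimal sketch of how these features can fit together in a multinomial naive Bayes classifier. It is not this library's actual API: the class name `NaiveBayesSketch`, its methods (`train`, `purge`, `log_scores`, `classify`) and its parameters (`smoothing`, `uniform_prior`, `binarized`) are all hypothetical names chosen for illustration. The sketch assumes that tokens are any hashable values, that purging removes tokens whose total count falls below a threshold, and that tokens absent from the training vocabulary are skipped at classification time.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSketch:
    """Hypothetical multinomial naive Bayes over arbitrary hashable tokens."""

    def __init__(self, smoothing=1.0, uniform_prior=False, binarized=False):
        self.smoothing = smoothing          # additive (Laplace) constant
        self.uniform_prior = uniform_prior  # ignore class frequencies if True
        self.binarized = binarized          # count each token once per document
        self.token_counts = defaultdict(Counter)  # category -> token -> count
        self.doc_counts = Counter()               # category -> documents seen

    def _prepare(self, tokens):
        # In binarized mode, repeated tokens in one document collapse to one.
        return set(tokens) if self.binarized else list(tokens)

    def train(self, tokens, category):
        self.doc_counts[category] += 1
        self.token_counts[category].update(self._prepare(tokens))

    def purge(self, min_count):
        """Drop tokens whose total count across categories is below min_count."""
        totals = Counter()
        for counts in self.token_counts.values():
            totals.update(counts)
        for counts in self.token_counts.values():
            for token in [t for t in counts if totals[t] < min_count]:
                del counts[token]

    def log_scores(self, tokens):
        """Return log P(category) + sum of log P(token | category) per category."""
        vocab = set()
        for counts in self.token_counts.values():
            vocab.update(counts)
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for category, counts in self.token_counts.items():
            if self.uniform_prior:
                score = -math.log(len(self.doc_counts))
            else:
                score = math.log(self.doc_counts[category] / total_docs)
            denom = sum(counts.values()) + self.smoothing * len(vocab)
            for token in self._prepare(tokens):
                if token in vocab:  # tokens never seen in training are skipped
                    # Summing logs avoids the underflow of multiplying
                    # many small probabilities together.
                    score += math.log((counts[token] + self.smoothing) / denom)
            scores[category] = score
        return scores

    def classify(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


clf = NaiveBayesSketch()
clf.train(["cheap", "pills", "cheap"], "spam")
clf.train(["meeting", "at", "noon"], "ham")
clf.train([404, 500, 404], "error")  # tokens need not be text
print(clf.classify(["cheap", "pills"]))
print(clf.classify([404]))
```

Note that in this sketch a `smoothing` value of 0 would raise a math domain error whenever a known token has a zero count in some category, which is the situation additive smoothing exists to prevent.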