Function `baltic.make_tree`

Barney Potter edited this page Oct 14, 2024 · 1 revision

make_tree() Function

Description

The make_tree() function in BALTIC parses a tree string, such as a Newick string, and builds a BALTIC tree object from it. The resulting object can then be manipulated, analyzed, and plotted.

Syntax

def make_tree(data, ll=None, verbose=False)

Parameters

  • data (str): The tree string to be parsed.
  • ll (tree or None): An existing tree object to populate with the parsed tree. If None, a new tree object is created. Default is None.
  • verbose (bool): If True, prints verbose output during the process. Default is False.

Return Value

  • tree: The tree object populated from the parsed tree string (the object passed as ll, if one was given).

Functionality

  1. Defines regular expression patterns for identifying different tree components (e.g., tips, nodes, comments).
  2. Advances an index through the tree string, testing the patterns at the current position and skipping ahead past each match.
  3. Identifies and processes different tree elements:
    • Nodes
    • Tips (in BEAST or non-BEAST format)
    • Branch lengths
    • Comments and annotations
    • Reticulate branches (for network-like structures)
  4. Constructs the tree structure as it parses the string.
  5. Assigns traits and other attributes to nodes and tips based on the parsed information.
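
The steps above can be illustrated with a small standalone sketch. It is not BALTIC's implementation: `SketchNode`, `parse_newick_sketch`, and `tips` are hypothetical names, and the sketch assumes a plain Newick string with no annotations, no internal node labels, and no reticulate branches. It only shows the general pattern of advancing an index, matching a regular expression at the current position, and moving a "current node" pointer up and down as the tree is built.

```python
import re


class SketchNode:
    """Hypothetical stand-in for a tree element; not a BALTIC class."""

    def __init__(self, parent=None, name=None):
        self.parent = parent
        self.name = name
        self.length = None
        self.children = []


# Both patterns are matched at the current index only, never searched ahead.
TIP_NAME = re.compile(r"'[^']*'|\"[^\"]*\"|[^\s():,;\[\]'\"]+")
BRANCH_LENGTH = re.compile(r":\s*([0-9.eE+\-]+)")


def add_child(parent, name=None):
    """Create a node, attach it to parent (if any), and return it."""
    child = SketchNode(parent=parent, name=name)
    if parent is not None:
        parent.children.append(child)
    return child


def parse_newick_sketch(data):
    """Parse a plain Newick string into SketchNode objects; return the root."""
    root = None
    current = None
    i = 0
    while i < len(data):
        ch = data[i]
        if ch == ";":  # end of the tree string
            break
        if ch.isspace():
            i += 1
        elif ch == "(":  # a new internal node opens
            current = add_child(current)
            if root is None:
                root = current
            i += 1
        elif ch in ",)":  # finished with the current element: step up
            if current is None or current.parent is None:
                raise ValueError(f"unexpected {ch!r} at position {i}")
            current = current.parent
            i += 1
        elif ch == ":":  # branch length of the current element
            match = BRANCH_LENGTH.match(data, i)
            if match is None:
                raise ValueError(f"bad branch length at position {i}")
            current.length = float(match.group(1))
            i = match.end()
        else:  # anything else is read as a tip name
            match = TIP_NAME.match(data, i)
            if match is None:
                raise ValueError(f"unexpected {ch!r} at position {i}")
            current = add_child(current, match.group(0).strip("'\""))
            if root is None:
                root = current
            i = match.end()
    return root


def tips(node):
    """Collect the tips below node, left to right."""
    if not node.children:
        return [node]
    return [tip for child in node.children for tip in tips(child)]


root = parse_newick_sketch("(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);")
print([(tip.name, tip.length) for tip in tips(root)])
```

After a `)` the pointer rests on the node that was just closed, so a branch length that follows is assigned to that internal node. This is why a single "current node" pointer is enough for plain Newick.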

Use Cases

  1. Converting Newick format strings into BALTIC tree objects.
  2. Parsing tree strings from various sources (e.g., phylogenetic software output).
  3. Creating tree objects for further analysis or visualization in BALTIC.
  4. Handling annotated trees with additional metadata in the tree string.
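
For the annotated-tree use case, it may help to see what turning a `[&key=value,...]` comment into a dictionary involves. The sketch below is a standalone illustration, not the code BALTIC uses: `parse_annotation_sketch` and `convert` are hypothetical helpers, and the value types they produce (floats for numeric values, lists for `{...}` sets) are an assumption that may differ from what make_tree() stores in `traits`.

```python
import re

# One key=value pair; the value is a {...} set, a quoted string, or a bare token.
PAIR = re.compile(r"""([^=,\[\]&]+)=(\{[^}]*\}|"[^"]*"|'[^']*'|[^,\]]+)""")


def convert(value):
    """Strip quotes and return a float when the text parses as one."""
    value = value.strip().strip("\"'")
    try:
        return float(value)
    except ValueError:
        return value


def parse_annotation_sketch(comment):
    """Turn a comment such as '[&a=1,b="x",c={1,2}]' into a plain dict."""
    traits = {}
    for key, raw in PAIR.findall(comment):
        if raw.startswith("{"):
            # A brace-delimited set becomes a list of converted values.
            traits[key.strip()] = [convert(v) for v in raw[1:-1].split(",")]
        else:
            traits[key.strip()] = convert(raw)
    return traits


print(parse_annotation_sketch('[&trait=1,label="AB",range={0.5,1.5}]'))
```

Brace-delimited sets are matched before bare tokens so that the commas inside `{0.5,1.5}` are not mistaken for separators between key=value pairs.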

Example

import baltic as bt

# Simple Newick string
newick_string = "(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);"

# Create a tree object
tree = bt.make_tree(newick_string, verbose=True)

# Print some basic information about the tree
print(f"Number of tips: {len(tree.getExternal())}")
print(f"Number of internal nodes: {len(tree.getInternal())}")

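# Visualize the tree
import matplotlib.pyplot as plt

# make_tree() only parses the string; as in BALTIC's own loaders, compute
# node heights and tip positions before plotting
tree.traverse_tree()
tree.sortBranches()

fig, ax = plt.subplots(figsize=(10, 8))
tree.plotTree(ax)
tree.addText(ax)
plt.show()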

# Parse a more complex tree with annotations
complex_tree_string = "((A[&trait=1]:0.1,B[&trait=2]:0.2)[&label=AB]:0.3,(C[&trait=3]:0.4,D[&trait=4]:0.5)[&label=CD]:0.6)[&label=root];"
complex_tree = bt.make_tree(complex_tree_string, verbose=True)

# Access annotations
for node in complex_tree.Objects:
    if 'trait' in node.traits:
        print(f"Node {node.index}: trait = {node.traits['trait']}")

Notes

  • This function can handle various tree formats, including those with annotations (e.g., BEAST output).
  • It supports the parsing of reticulate branches, allowing for network-like structures.
  • The function is robust to different naming conventions for tips and can handle quoted names.
  • Annotations and comments in square brackets are parsed and added to the traits dictionary of each node.
  • For large or complex trees, setting verbose=True can help in debugging by showing the parsing process.
  • The function can handle multitype tree singletons and both incoming and outgoing reticulation branches.
  • If the input string is not a valid tree representation, the function may raise exceptions or produce unexpected results.
  • This function is often used internally by other BALTIC functions that load trees from files or strings.