Welcome to treelib’s documentation!¶
🌳 Introduction¶
Tree is a fundamental data structure in computer science, essential for organizing hierarchical data efficiently. treelib provides a comprehensive, high-performance implementation of tree data structures in Python.
Why choose treelib?
🚀 Blazing Fast: O(1) node lookup and access operations
🎨 Rich Visualization: Beautiful tree display with multiple formatting options
🔧 Flexible Operations: Comprehensive tree manipulation (add, move, copy, delete)
📊 Export Ready: JSON, dictionary, and GraphViz export capabilities
🔍 Advanced Search: Powerful filtering and traversal algorithms
💾 Memory Efficient: Optimized for both small and large tree structures
Perfect for:
File system representations and directory scanning
Organizational charts and hierarchical structures
Decision trees and machine learning models
Menu systems and navigation structures
Category taxonomies and classification systems
Family trees and genealogical data
Abstract syntax trees for parsers
Game tree structures and AI algorithms
📦 Installation¶
Install treelib using pip for the latest stable version:
pip install treelib
Or install from source for the latest development features:
git clone https://github.com/caesar0301/treelib.git
cd treelib
pip install poetry
poetry install
- System Requirements:
Python 3.7+
🚀 Quick Start Guide¶
Ready to build your first tree? Let’s start with a simple example:
Basic Tree Creation¶
from treelib import Tree
# Create a new tree
tree = Tree()
# Add root node
tree.create_node("Company", "company")
# Add departments
tree.create_node("Engineering", "eng", parent="company")
tree.create_node("Sales", "sales", parent="company")
tree.create_node("HR", "hr", parent="company")
# Add team members
tree.create_node("Alice (CTO)", "alice", parent="eng")
tree.create_node("Bob (Developer)", "bob", parent="eng")
tree.create_node("Carol (Sales Manager)", "carol", parent="sales")
tree.create_node("Dave (HR Manager)", "dave", parent="hr")
# Display the tree
tree.show()
Output:
Company
├── Engineering
│ ├── Alice (CTO)
│ └── Bob (Developer)
├── Sales
│ └── Carol (Sales Manager)
└── HR
└── Dave (HR Manager)
Working with Custom Data¶
Store rich data in your tree nodes:
from treelib import Tree
# Employee data structure
class Employee:
def __init__(self, name, role, salary):
self.name = name
self.role = role
self.salary = salary
def __str__(self):
return f"{self.name} ({self.role})"
# Create tree with custom data
tree = Tree()
tree.create_node("Company", "company")
# Add employees with rich data
tree.create_node("Alice", "alice", parent="company",
data=Employee("Alice Johnson", "CTO", 150000))
tree.create_node("Bob", "bob", parent="alice",
data=Employee("Bob Smith", "Senior Developer", 120000))
# Display with custom data property
tree.show(data_property="role")
Output:
Company
├── CTO
└── Senior Developer
🎯 Core Concepts¶
Understanding Nodes¶
Nodes are the building blocks of trees. Each node contains:
Identifier: Unique ID for referencing (auto-generated if not provided)
Tag: Human-readable label for display
Data: Optional custom payload (any Python object)
Parent/Children: Relationships to other nodes
from treelib import Node, Tree
# Create nodes with different configurations
node1 = Node("Simple Node", "n1")
node2 = Node("Rich Node", "n2", data={"type": "folder", "size": 1024})
node3 = Node("Hidden Node", "n3", expanded=False) # Initially collapsed
Understanding Trees¶
Trees manage collections of nodes with these key properties:
Single Root: Every tree has exactly one root node (or is empty)
Hierarchy: Each non-root node has exactly one parent
Unique IDs: Node identifiers must be unique within the tree
Efficient Access: O(1) lookup time for any node
tree = Tree()
# Tree properties
print(f"Tree size: {tree.size()}") # Number of nodes
print(f"Tree depth: {tree.depth()}") # Maximum depth
print(f"Root node: {tree.root}") # Root identifier
print(f"Is empty: {len(tree) == 0}") # Empty check
📚 Comprehensive API Guide¶
Tree Creation and Basic Operations¶
Creating Trees
# Empty tree
tree1 = Tree()
# Copy existing tree (shallow)
tree2 = Tree(tree1)
# Deep copy with independent data
tree3 = Tree(tree1, deep=True)
# Tree with custom identifier
tree4 = Tree(identifier="my_tree")
Adding Nodes
# Basic node creation
tree.create_node("Root", "root")
tree.create_node("Child", "child", parent="root")
# Node with custom data
tree.create_node("Data Node", "data", parent="root",
data={"key": "value"})
# Pre-created node
node = Node("Pre-made", "premade")
tree.add_node(node, parent="root")
Tree Modification¶
Moving and Reorganizing
# Move node to new parent
tree.move_node("source_id", "new_parent_id")
# Remove node and all descendants
removed_count = tree.remove_node("node_id")
# Link past a node (remove node but keep children)
tree.link_past_node("node_id")
Copying and Merging
# Create subtree
subtree = tree.subtree("root_of_subtree")
# Remove subtree (returns removed tree)
removed_tree = tree.remove_subtree("node_id")
# Paste another tree
tree.paste("target_node", another_tree)
# Merge another tree (paste children only)
tree.merge("target_node", another_tree)
Advanced Features¶
Filtering and Analysis
# Filter nodes by condition
large_files = tree.filter_nodes(lambda node:
hasattr(node.data, 'size') and node.data.size > 1000)
# Get all leaf nodes
leaves = tree.leaves()
# Get nodes at specific level
level_2_nodes = [node for node in tree.all_nodes()
if tree.level(node.identifier) == 2]
# Get all paths from root to leaves
all_paths = tree.paths_to_leaves()
Tree Metrics
# Basic metrics
total_nodes = tree.size()
max_depth = tree.depth()
# Level-specific metrics
nodes_at_level_2 = tree.size(level=2)
# Custom analysis
def analyze_tree(tree):
analysis = {
'total_nodes': tree.size(),
'depth': tree.depth(),
'leaves': len(tree.leaves()),
'internal_nodes': tree.size() - len(tree.leaves()),
'branching_factor': sum(len(tree.children(node.identifier))
for node in tree.all_nodes()) / tree.size()
}
return analysis
🎨 Visualization and Display¶
Rich Display Options¶
# Basic display
tree.show()
# Custom line styles
tree.show(line_type="ascii-em") # Double lines
tree.show(line_type="ascii-emv") # Mixed vertical
tree.show(line_type="ascii") # Simple ASCII
# Show node IDs
tree.show(idhidden=False)
# Sort nodes at each level
tree.show(key=lambda x: x.tag, reverse=True)
# Display custom data property
tree.show(data_property="name")
Available Line Styles:
ascii: |-- Child
ascii-ex: ├── Child (default)
ascii-exr: ├── Child (rounded)
ascii-em: ╠══ Child (double)
ascii-emv: ╟── Child (mixed vertical)
ascii-emh: ╞══ Child (mixed horizontal)
Conditional Display¶
# Hide specific branches
tree.show(filter=lambda x: x.identifier != "hidden_branch")
# Show only certain node types
tree.show(filter=lambda x: hasattr(x.data, 'type') and x.data.type == "folder")
# Custom formatting function
def custom_filter(node):
# Show only nodes with tags starting with uppercase
return node.tag[0].isupper()
tree.show(filter=custom_filter)
💾 Export and Persistence¶
JSON Export/Import¶
# Export to JSON string
json_string = tree.to_json()
# Export with data included
json_with_data = tree.to_json(with_data=True)
# Pretty printed JSON
import json
formatted_json = json.dumps(json.loads(tree.to_json()), indent=2)
Dictionary Conversion¶
# Convert to dictionary
tree_dict = tree.to_dict()
# Include data in dictionary
tree_dict = tree.to_dict(with_data=True)
# Custom sorting
tree_dict = tree.to_dict(sort=True, reverse=True)
File Operations¶
# Save tree structure to file
tree.save2file("tree_structure.txt")
# Custom formatting when saving
tree.save2file("tree.txt", line_type="ascii-em", data_property="name")
GraphViz Export¶
# Export to DOT format for GraphViz
tree.to_graphviz("tree.dot")
# Custom node shapes
tree.to_graphviz("tree.dot", shape="box")
# Directed graph
tree.to_graphviz("tree.dot", graph="digraph")
🏗️ Real-World Examples¶
File System Scanner¶
Build a directory tree scanner:
import os
from treelib import Tree
def scan_directory(path, tree=None, parent=None, max_depth=3, current_depth=0):
"""Scan directory and build tree structure."""
if tree is None:
tree = Tree()
tree.create_node(os.path.basename(path) or path, path)
parent = path
if current_depth >= max_depth:
return tree
try:
for item in sorted(os.listdir(path)):
item_path = os.path.join(path, item)
if os.path.isdir(item_path):
tree.create_node(f"📁 {item}", item_path, parent=parent)
scan_directory(item_path, tree, item_path, max_depth, current_depth + 1)
else:
size = os.path.getsize(item_path)
tree.create_node(f"📄 {item} ({size} bytes)", item_path, parent=parent)
except PermissionError:
pass
return tree
# Usage
file_tree = scan_directory("/path/to/directory", max_depth=2)
file_tree.show()
Organization Chart¶
Create a company organizational structure:
from treelib import Tree
class Employee:
def __init__(self, name, title, department, email=None):
self.name = name
self.title = title
self.department = department
self.email = email
def __str__(self):
return f"{self.name} - {self.title}"
def build_org_chart():
org = Tree()
# CEO
org.create_node("CEO", "ceo",
data=Employee("John Smith", "Chief Executive Officer", "Executive"))
# VPs
org.create_node("VP Engineering", "vp_eng", parent="ceo",
data=Employee("Sarah Johnson", "VP Engineering", "Engineering"))
org.create_node("VP Sales", "vp_sales", parent="ceo",
data=Employee("Mike Wilson", "VP Sales", "Sales"))
# Engineering Team
org.create_node("Engineering Manager", "eng_mgr", parent="vp_eng",
data=Employee("Alice Brown", "Engineering Manager", "Engineering"))
org.create_node("Senior Developer", "senior_dev", parent="eng_mgr",
data=Employee("Bob Davis", "Senior Developer", "Engineering"))
org.create_node("Junior Developer", "junior_dev", parent="eng_mgr",
data=Employee("Carol White", "Junior Developer", "Engineering"))
# Sales Team
org.create_node("Sales Manager", "sales_mgr", parent="vp_sales",
data=Employee("Dave Green", "Sales Manager", "Sales"))
org.create_node("Sales Rep", "sales_rep", parent="sales_mgr",
data=Employee("Eve Black", "Sales Representative", "Sales"))
return org
# Usage
org_chart = build_org_chart()
# Display with titles
org_chart.show(data_property="title")
# Find all engineering employees
engineering_staff = [node for node in org_chart.all_nodes()
if node.data.department == "Engineering"]
Decision Tree¶
Implement a simple decision tree:
from treelib import Tree
class DecisionNode:
def __init__(self, question=None, answer=None, condition=None):
self.question = question
self.answer = answer
self.condition = condition
def __str__(self):
if self.answer:
return f"Answer: {self.answer}"
return f"Question: {self.question}"
def build_decision_tree():
"""Build a simple decision tree for weather activities."""
tree = Tree()
# Root decision
tree.create_node("Weather Decision", "root",
data=DecisionNode("Is it sunny?"))
# Sunny branch
tree.create_node("Sunny", "sunny", parent="root",
data=DecisionNode("Is it hot?"))
tree.create_node("Go Swimming", "swim", parent="sunny",
data=DecisionNode(answer="Go to the beach!"))
tree.create_node("Go Hiking", "hike", parent="sunny",
data=DecisionNode(answer="Perfect for a hike!"))
# Not sunny branch
tree.create_node("Not Sunny", "cloudy", parent="root",
data=DecisionNode("Is it raining?"))
tree.create_node("Stay Inside", "inside", parent="cloudy",
data=DecisionNode(answer="Movie day!"))
tree.create_node("Light Activity", "light", parent="cloudy",
data=DecisionNode(answer="Good for shopping!"))
return tree
# Usage
decision_tree = build_decision_tree()
decision_tree.show()
📊 Performance and Best Practices¶
Performance Characteristics¶
- Time Complexity:
Node access: O(1)
Node insertion: O(1)
Tree traversal: O(n)
Search operations: O(n)
Subtree operations: O(k) where k is subtree size
- Memory Usage:
Each node: ~200 bytes + data size
Tree overhead: ~100 bytes + node dictionary
Shallow copy: Shares node references (minimal memory)
Deep copy: Duplicates all data (2x memory usage)
Best Practices¶
Choosing Identifiers
# Good: Meaningful, unique identifiers
tree.create_node("User Profile", "user_123")
tree.create_node("Settings", "user_123_settings", parent="user_123")
# Avoid: Generic or potentially conflicting IDs
tree.create_node("Item", "1") # Too generic
tree.create_node("Data", "data") # Might conflict
Memory Management
# For large trees, consider lazy loading
def load_children_on_demand(tree, node_id):
if not tree.is_branch(node_id): # No children loaded yet
# Load children from database/file
load_node_children(tree, node_id)
# Use shallow copies when possible
backup_tree = Tree(original_tree, deep=False)
# Clean up references when done
del large_tree
Error Handling
from treelib.exceptions import NodeIDAbsentError, DuplicatedNodeIdError
try:
tree.create_node("New Node", "existing_id")
except DuplicatedNodeIdError:
print("Node ID already exists!")
try:
node = tree["nonexistent"]
except NodeIDAbsentError:
print("Node not found!")
# Safe alternative
node = tree.get_node("might_not_exist")
if node is not None:
print(f"Found: {node.tag}")
🔧 Advanced Topics¶
Custom Node Classes¶
Extend the Node class for specialized functionality:
from treelib import Node, Tree
class FileNode(Node):
def __init__(self, tag, identifier=None, size=0, file_type="unknown"):
super().__init__(tag, identifier)
self.size = size
self.file_type = file_type
@property
def size_mb(self):
return self.size / (1024 * 1024)
def is_large_file(self):
return self.size > 10 * 1024 * 1024 # 10MB
# Use custom node class
file_tree = Tree(node_class=FileNode)
file_tree.create_node("Large File", "big_file", size=50*1024*1024, file_type="video")
Tree Algorithms¶
Implement custom tree algorithms:
def find_path(tree, start_id, end_id):
"""Find path between two nodes."""
# Get path from start to root
start_path = list(tree.rsearch(start_id))
# Get path from end to root
end_path = list(tree.rsearch(end_id))
# Find common ancestor
common_ancestors = set(start_path) & set(end_path)
if not common_ancestors:
return None
# Find lowest common ancestor
lca = min(common_ancestors, key=lambda x: tree.level(x))
# Construct path
start_to_lca = start_path[:start_path.index(lca)]
lca_to_end = end_path[:end_path.index(lca)]
return start_to_lca + [lca] + lca_to_end[::-1]
def tree_statistics(tree):
"""Calculate comprehensive tree statistics."""
stats = {}
# Basic stats
stats['total_nodes'] = tree.size()
stats['depth'] = tree.depth()
stats['leaves'] = len(tree.leaves())
# Level distribution
level_counts = {}
for node in tree.all_nodes():
level = tree.level(node.identifier)
level_counts[level] = level_counts.get(level, 0) + 1
stats['level_distribution'] = level_counts
# Branching factor
branching_factors = []
for node in tree.all_nodes():
children_count = len(tree.children(node.identifier))
if children_count > 0:
branching_factors.append(children_count)
if branching_factors:
stats['avg_branching_factor'] = sum(branching_factors) / len(branching_factors)
stats['max_branching_factor'] = max(branching_factors)
return stats
Multi-Tree Operations¶
Work with multiple trees simultaneously:
def merge_trees(tree1, tree2, merge_point):
"""Merge two trees at specified point."""
if not tree1.contains(merge_point):
raise ValueError(f"Merge point {merge_point} not found in tree1")
# Create deep copy to avoid modifying original
merged_tree = Tree(tree1, deep=True)
tree2_copy = Tree(tree2, deep=True)
# Merge at specified point
merged_tree.paste(merge_point, tree2_copy)
return merged_tree
def compare_trees(tree1, tree2):
"""Compare two trees structurally."""
def get_structure(tree, node_id=None):
if node_id is None:
node_id = tree.root
children = tree.children(node_id)
if not children:
return tree[node_id].tag
return {
'tag': tree[node_id].tag,
'children': [get_structure(tree, child.identifier) for child in children]
}
return get_structure(tree1) == get_structure(tree2)
🚨 Troubleshooting¶
Common Issues and Solutions¶
Problem: “NodeIDAbsentError” when accessing nodes
# Problem: Node doesn't exist
try:
node = tree["nonexistent_id"]
except NodeIDAbsentError:
print("Node not found!")
# Solution: Use safe access
node = tree.get_node("nonexistent_id")
if node is None:
print("Node not found!")
Problem: “DuplicatedNodeIdError” when creating nodes
# Problem: ID already exists
tree.create_node("First", "duplicate_id")
# tree.create_node("Second", "duplicate_id") # This will fail!
# Solution: Check existence first
if "duplicate_id" not in tree:
tree.create_node("Second", "duplicate_id")
Problem: Memory issues with large trees
# Problem: Deep copying large trees
# large_tree_copy = Tree(large_tree, deep=True) # Uses lots of memory
# Solution: Use shallow copy when possible
large_tree_copy = Tree(large_tree, deep=False) # Shares references
Problem: Performance issues with frequent modifications
# Problem: Adding many nodes one by one
for i in range(10000):
tree.create_node(f"Node {i}", f"node_{i}", parent="root")
# Solution: Batch operations when possible
root_id = "root"
nodes_to_add = [(f"Node {i}", f"node_{i}") for i in range(10000)]
for tag, node_id in nodes_to_add:
tree.create_node(tag, node_id, parent=root_id)
Debug and Inspection Tools¶
def debug_tree(tree, node_id=None):
"""Print detailed tree debug information."""
print(f"Tree Debug Info:")
print(f" Tree ID: {tree.identifier}")
print(f" Root: {tree.root}")
print(f" Size: {tree.size()}")
print(f" Depth: {tree.depth()}")
if node_id:
if tree.contains(node_id):
node = tree[node_id]
print(f"\nNode '{node_id}' Debug:")
print(f" Tag: {node.tag}")
print(f" Level: {tree.level(node_id)}")
print(f" Is Leaf: {node.is_leaf(tree.identifier)}")
print(f" Is Root: {node.is_root(tree.identifier)}")
print(f" Children: {len(tree.children(node_id))}")
print(f" Parent: {tree.parent(node_id)}")
else:
print(f"Node '{node_id}' not found!")
def validate_tree_integrity(tree):
"""Validate tree structure integrity."""
issues = []
# Check root existence
if tree.root and not tree.contains(tree.root):
issues.append("Root node reference is invalid")
# Check parent-child consistency
for node_id, node in tree.nodes.items():
parent_id = node.predecessor(tree.identifier)
if parent_id:
if not tree.contains(parent_id):
issues.append(f"Node {node_id} has invalid parent {parent_id}")
else:
parent_children = tree.children(parent_id)
if node not in parent_children:
issues.append(f"Parent-child relationship inconsistent for {node_id}")
return issues
📖 API Reference¶
For detailed API documentation of all methods and classes, see the automatically generated documentation sections below.
🤝 Contributing and Support¶
- Found a bug or have a feature request?
Open an issue on GitHub
- Want to contribute?
Fork the repository and submit a pull request!
- Need help?
Check out the examples directory for comprehensive usage examples.
- Performance benchmarks and advanced examples:
See the tree_algorithms.py example for algorithmic implementations and performance analysis.