One of the most common and intriguingdata structuresin computer science is the tree structure. Just like an ordinary tree, the tree structure has leaves and branches. But there are also nodes involved, as well as parents, children, and siblings sort of like a family tree.
The interconnected nature of the tree offers some unique benefits when it comes to representing data. There are also a variety of operations and algorithms used with trees, some of which are specifically used with them. We re going to dive into how trees are structured in this article, along with examples of their use.
What Is the Tree Data Structure?
The tree structure is very similar to a real-life tree. Elements are represented by nodes, which are connected through edges, or branches. This leads to a hierarchical structure, where the nodes forming below a node are referred to as child nodes, and the node atop is called a parent node.
The top node with no parent nodes is the root node, and the children nodes with no children themselves are known as sibling nodes.
By connecting nodes in this way, we can represent relationships between the data and organize it logically. This can be helpful to illustrate step-by-step processes or interlinked events. The degree to which nodes can branch is determined by the tree type, of which there are many kinds.
What Types of Trees Are There?
The most common types are the binary, balanced, and general trees. A binary tree can only have nodes with two children each, much like binary only has two digits. Heap trees are a subtype of binary trees, which are used for implementing heaps orpriority queues.
They possess the heap property, where the value of a node is either greater than or equal to or less than or equal to the value of its children nodes, depending on whether it s a max heap or min heap tree.
A balanced tree has the same height on its left and right branches, whereas a general tree can have nodes with any number of child nodes.
Apart from these types, there are also n-ary trees, b-trees, heap trees, quadtrees, and octrees:
- An n-ary tree is similar to a general tree but is restricted in each case to n number of child nodes. Naturally, these are used to represent relationships with varying degrees of hierarchy.
- B-Trees are also known as self-balancing trees because they readjust their nodes when elements are inserted or deleted to maintain a balanced structure. These are most often used in situations where we need fast access to large datasets, such as databases, and file systems.
- Quad and octrees are rather unique as they re used to represent 2-dimensional and 3-dimensional space respectively. They do this by splitting regions into quadrants or octants, representing either rectangular or cuboid regions in space. As such, these are used to represent spatial data.
What Operations Are Used With the Tree Data Structure?
Trees share many operations that are common to linear data structures, such as insertion, deletion and traversal. However, there are some operations unique to trees.
These include search operations, height and depth calculations, subtree operations, balancing operations, count operations, and serialization and deserialization. A brief description of these is given in the table below, along with their general syntax.
|Insertion||This adds a new node to the tree, depending on the tree rules||insert(key, value); this inserts a node with a specific key of a specific value|
|Deletion||Removes a node from the tree while maintaining its properties||delete(key)|
|Search||Finds a particular node in the tree based on its key or value||search(key)|
|Traversal||Visits each node in the tree in some order, usually inorder, preorder, or postorder||inorder(node), preorder(node), postorder(node)|
|Height/Depth||Calculates the height/depth of the tree, which is the number of edges between the root and final leaf node||height()|
|Subtree||extractSubtree(node) where the node is the root node; mergeSubtrees(tree1, tree2) where tree1 and tree2 are the subtrees you wish to merge||Includes rotation, reordering, or rebalancing to maintain a balanced tree|
|Count||Counts the number of nodes||size()|
|Balancing||Includes rotation, reordering, or rebalancing to maintain a balanced tree||balance()|
|Serialization||Converts the tree into a linear representation ready for transmission||serialize(root) where root is the root node|
|Deserialization||Reconstructs the tree from the linear sequence||deserialize(serializedTree) where serializedTree is the sequence|
Traversal Methods for the Tree Data Structure
It s worth noting you have three general methods for traversing a tree. These are:
- Inorder traversal The left subtree is visited recursively, followed by the root and then the right subtree in ascending order. Can be used to create a list of sorted elements.
- Preorder traversal The root node is visited first, then the left and right subtrees in order. This can be used to create trees due to the natural traversal order, or with the DFS (Depth-First Search) algorithm.
- Postorder traversal The left and right subtrees are visited before the root node. Commonly used for tree deletion.
Algorithms Used with the Tree Data Structure
Now that we ve covered various tree operations, let s move on to the algorithms we can use with them. The major ones are summarized in the table.
|DFS||Depth-First Search: this is where we traverse the nodes as deep as possible before moving on to the next branch.|
|BFS||Breadth-First Search: the tree is traversed as wide as possible before moving on to the next depth level.|
|Huffman Coding||Constructs a binary tree so that data can be compressed.|
If you want to know more about these particular algorithms and how they work, check out our articles onDFS,BFS, andHuffman Coding.
Tree Data Structure in Action
At this stage, it only makes sense to see how these operations work in practice. First, we re going to demonstrate how to use the operations with a simple binary tree example. Consider this code block in Python:
class Node: def __init__(self, key): self.key = key self.left = None self.right = None class BinarySearchTree: def __init__(self): self.root = None def insert(self, key): self.root = self._insert_recursive(self.root, key) def _insert_recursive(self, node, key): if node is None: return Node(key) if key < node.key: node.left = self._insert_recursive(node.left, key) elif key > node.key: node.right = self._insert_recursive(node.right, key) return node def delete(self, key): self.root = self._delete_recursive(self.root, key) def _delete_recursive(self, node, key): if node is None: return node if key < node.key: node.left = self._delete_recursive(node.left, key) elif key > node.key: node.right = self._delete_recursive(node.right, key) else: if node.left is None: return node.right elif node.right is None: return node.left else: successor = self._find_min(node.right) node.key = successor.key node.right = self._delete_recursive(node.right, successor.key) return node def _find_min(self, node): current = node while current.left is not None: current = current.left return current def search(self, key): return self._search_recursive(self.root, key) def _search_recursive(self, node, key): if node is None or node.key == key: return node if key < node.key: return self._search_recursive(node.left, key) return self._search_recursive(node.right, key) def inorder(self): self._inorder_recursive(self.root) def _inorder_recursive(self, node): if node is not None: self._inorder_recursive(node.left) print(node.key) self._inorder_recursive(node.right) def height(self): return self._height_recursive(self.root) def _height_recursive(self, node): if node is None: return 0 left_height = self._height_recursive(node.left) right_height = self._height_recursive(node.right) return max(left_height, right_height) + 1 def size(self): return self._size_recursive(self.root) def _size_recursive(self, node): if node is None: return 0 return self._size_recursive(node.left) + self._size_recursive(node.right) + 1 bst = BinarySearchTree() bst.insert(5) bst.insert(3) bst.insert(8) bst.insert(2) bst.insert(4) print("Inorder traversal:") bst.inorder() print("Height of the tree:", bst.height()) print("Size of the tree:", bst.size()) print("Searching for key 3:", bst.search(3)) bst.delete(3) print("Inorder traversal after deleting key 3:") bst.inorder()
Explanation of Code
It s a pretty long block of code, but shows these operations quite effectively.
We start by defining the Node class, and the constructor method, __init__ . This takes the key parameter of the node as an argument, which is assigned to the key attribute of the node. Initially, the node has no left or right children as dictated by the next lines.
Next, we declare the BinarySearchTree class and its constructor, as we re using the binary tree data structure. The root attribute is initialized to None, as the tree is empty.
Insertion and Deletion
We then define the insert method and recursive method, which takes the current node parameter and the key for the node to be inserted. We then have code to insert nodes at the left and right branches, with nodes greater than the current being inserted to the right and vice versa for the left.
Similarly, delete is defined, and checks the value of the node in the same way. Once the key of the current node is equal to the key to be deleted, the node is removed. If the node has children, its successor is found and replaced with this.
Find and Search
Find_min is defined after, which is a variant of the search operation, used to find the node with the minimum key. We then define search , which is used recursively until the matching node is found.
After this, we have the inorder traversal method. The implementation for the other traversals is similar, but the order in which the nodes are traversed differs. In all cases, the node parameter represents the current node.
We then have the height method. Both branches of the tree are traversed, and then the height is calculated from this.
Count, or size , is used next. The tree is traversed and counted, with 1 being added to account for the node which we started at.
Lastly, we create a binary tree, bst . We insert 5 keys into the tree, perform inorder traversal, and print the height, size, and a searched node. To finish, we delete key 3 and print the traversal again to reflect this. The results can be seen in the image below.
In conclusion, trees are a very flexible hierarchical structure, often used for their efficient use of insertion, deletion, and search operations. Each tree has a root node, children nodes, and sibling nodes, as well as branches connecting the nodes.
Binary trees are one of the most common tree types, but there are also balanced trees, n-ary trees, quadtrees, octrees, and general trees. Hierarchical relationships are common in computer and data science, so understanding how to implement and operate tree structures will be a valuable tool in your project.