🥇Binary Tree or how to prepare a binary search tree

Prelude

This article is about binary search trees. I recently wrote an article about data compression by the Huffman method. There I did not really pay attention to binary trees, because the methods of searching, inserting, deleting were not relevant. Now I decided to write an article about trees. Perhaps we'll start.

A tree is a data structure consisting of nodes connected by edges. We can say that a tree is a special case of a graph. Here is an example tree:

This is not a binary search tree! Everything is under the cut!

Vocabulary

Root

Tree root is the topmost node. In the example, this is node A. In the tree, only one path can lead from the root to any other node! In fact, any node can be considered as the root of the subtree corresponding to this node.

Parents/offspring

All nodes except the root have exactly one edge leading up to another node. The node above the current node is called parent this node. A node located below the current one and connected to it is called descendant this node. Let's take an example. Take node B, then its parent will be node A, and its children will be nodes D, E, and F.

Sheet

A node that has no children is called a leaf of the tree. In the example, nodes D, E, F, G, I, J, K will be leaves.

This is the basic terminology. Other concepts will be discussed later. So, a binary tree is a tree in which each node will have no more than two children. As you guessed, the tree from the example will not be binary, because nodes B and H have more than two children. Here is an example of a binary tree:

The nodes of the tree can contain any information. A binary search tree is a binary tree that has the following properties:

Both left and right subtrees are binary search trees.
All nodes of the left subtree of an arbitrary node X have data key values less than the data key value of node X itself.
All nodes of the right subtree of an arbitrary node X have data key values greater than or equal to the value of the data key of node X itself.

Key - some characteristic of the node (for example, a number). The key is needed in order to be able to find the element of the tree to which this key corresponds. Binary search tree example:

tree view

As I go along, I will include some (perhaps incomplete) pieces of code in order to improve your understanding. The full code will be at the end of the article.

The tree is made up of nodes. Node structure:

public class Node<T> {
    private T data;
    private int key;
    private Node<T> leftChild;
    private Node<T> rightChild;

    public Node(T data, int key) {
        this.data = data;
        this.key = key;
    }
    public Node<T> getLeftChild() {
        return leftChild;
    }

    public Node<T> getRightChild() {
        return rightChild;
    }
//...остальные методы узла
}

Each node has two children (it is quite possible that the leftChild and/or rightChild children will be null). You probably understood that in this case the number data is the data stored in the node; key - node key.

We figured out the knot, now let's talk about the pressing problems about trees. Here and below, the word "tree" will mean the concept of a binary search tree. Binary tree structure:

public class BinaryTree<T> {
     private Node<T> root;

    //методы дерева
}

As a class field, we only need the root of the tree, because from the root, using the getLeftChild() and getRightChild() methods, you can get to any node of the tree.

Tree Algorithms

Search

Let's say you have a built tree. How to find element with key key? You need to sequentially move from the root down the tree and compare the value of key with the key of the next node: if key is less than the key of the next node, then go to the left descendant of the node, if more - to the right, if the keys are equal - the desired node is found! Relevant code:

public Node<T> find(int key) {
    Node<T> current = root;
    while (current.getKey() != key) {
        if (key < current.getKey())
            current = current.getLeftChild();
        else
            current = current.getRightChild();
        if (current == null)
            return null;
    }
    return current;
}

If current becomes null, then iteration has reached the end of the tree (at a conceptual level, you are in a non-existent place in the tree - a child of a leaf).

Consider the efficiency of the search algorithm on a balanced tree (a tree in which nodes are distributed more or less evenly). Then the search efficiency will be O(log(n)), and the base 2 logarithm. See: if there are n elements in a balanced tree, then this means that there will be log(n) base 2 levels of the tree. And in the search, for one step of the cycle, you go down one level.

Insert

If you have grasped the essence of the search, then it will not be difficult for you to understand the insertion. You just need to go down to the leaf of the tree (according to the rules of descent described in the search) and become its descendant - left or right, depending on the key. Implementation:

   public void insert(T insertData, int key) {
        Node<T> current = root;
        Node<T> parent;
        Node<T> newNode = new Node<>(insertData, key);
        if (root == null)
            root = newNode;
        else {
            while (true) {
                parent = current;
                if (key < current.getKey()) {
                    current = current.getLeftChild();
                    if (current == null) {
                         parent.setLeftChild(newNode);
                         return;
                    }
                }
                else {
                    current = current.getRightChild();
                    if (current == null) {
                        parent.setRightChild(newNode);
                        return;
                    }
                }
            }
        }
    }

In this case, in addition to the current node, it is necessary to store information about the parent of the current node. When current becomes null, the parent variable will contain the sheet we need.
The insertion efficiency will obviously be the same as that of the search - O(log(n)).

Removal

Deletion is the most complex operation that will need to be done with a tree. It is clear that first it will be necessary to find the element that we are going to remove. But then what? If we simply set its reference to null, then we will lose information about the subtree whose root is this node. Tree removal methods are divided into three cases.

First case. The node to be removed has no children.

If the node to be deleted has no children, it means that it is a leaf. Therefore, you can simply set the leftChild or rightChild fields of its parent to null.

Second case. The node to be removed has one child

This case is also not very difficult. Let's go back to our example. Suppose we need to delete an element with key 14. Agree that since it is the right child of a node with key 10, then any of its descendants (in this case, the right one) will have a key greater than 10, so you can easily “cut” it from the tree, and connect the parent directly to the child of the node being deleted, i.e. connect the node with key 10 to node 13. The situation would be similar if we had to delete a node that is the left child of its parent. Think about it for yourself - an exact analogy.

Third case. Node has two children

The most difficult case. Let's take a look at a new example.

Search for a successor

Let's say we need to remove the node with the key 25. Whom shall we put in its place? One of his followers (descendants or descendants of descendants) must become successor(the one who will take the place of the removed node).

How do you know who should be the successor? Intuitively, this is the node in the tree whose key is the next largest from the node being removed. The algorithm is as follows. You need to go to its right child (always to the right one, because it was already said that the key of the successor is greater than the key of the node being deleted), and then go through the chain of left children of this right child. In the example, we must go to the node with key 35, and then go down the chain of its left children to the leaf - in this case, this chain consists only of the node with key 30. Strictly speaking, we are looking for the smallest node in the set of nodes greater than the desired node.

Successor search method code:

    public Node<T> getSuccessor(Node<T> deleteNode) {
        Node<T> parentSuccessor = deleteNode;//родитель преемника
        Node<T> successor = deleteNode;//преемник
        Node<T> current = successor.getRightChild();//просто "пробегающий" узел
        while (current != null) {
            parentSuccessor = successor;
            successor = current;
            current = current.getLeftChild();
        }
        //на выходе из цикла имеем преемника и родителя преемника
        if (successor != deleteNode.getRightChild()) {//если преемник не совпадает с правым потомком удаляемого узла
            parentSuccessor.setLeftChild(successor.getRightChild());//то его родитель забирает себе потомка преемника, чтобы не потерять его
            successor.setRightChild(deleteNode.getRightChild());//связываем преемника с правым потомком удаляемого узла
        }
        return successor;
    }

The complete code of the delete method:

public boolean delete(int deleteKey) {
        Node<T> current = root;
        Node<T> parent = current;
        boolean isLeftChild = false;//В зависимости от того, является ли  удаляемый узел левым или правым потомком своего родителя, булевская переменная isLeftChild будет принимать значение true или false соответственно.
        while (current.getKey() != deleteKey) {
            parent = current;
            if (deleteKey < current.getKey()) {
                current = current.getLeftChild();
                isLeftChild = true;
            } else {
                isLeftChild = false;
                current = current.getRightChild();
            }
            if (current == null)
                return false;
        }

        if (current.getLeftChild() == null && current.getRightChild() == null) {//первый случай
            if (current == root)
                current = null;
            else if (isLeftChild)
                parent.setLeftChild(null);
            else
                parent.setRightChild(null);
        }
        else if (current.getRightChild() == null) {//второй случай
            if (current == root)
                root = current.getLeftChild();
            else if (isLeftChild)
                parent.setLeftChild(current.getLeftChild());
            else
                current.setRightChild(current.getLeftChild());
        } else if (current.getLeftChild() == null) {
            if (current == root)
                root = current.getRightChild();
            else if (isLeftChild)
                parent.setLeftChild(current.getRightChild());
            else
                parent.setRightChild(current.getRightChild());
        } 
        else {//третий случай
            Node<T> successor = getSuccessor(current);
            if (current == root)
                root = successor;
            else if (isLeftChild)
                parent.setLeftChild(successor);
            else
                parent.setRightChild(successor);
        }
        return true;
    }

The complexity can be approximated to O(log(n)).

Finding the maximum/minimum in a tree

Obviously, how to find the minimum / maximum value in the tree - you need to sequentially go through the chain of left / right elements of the tree, respectively; when you get to the leaf, it will be the minimum/maximum element.

    public Node<T> getMinimum(Node<T> startPoint) {
        Node<T> current = startPoint;
        Node<T> parent = current;
        while (current != null) {
            parent = current;
            current = current.getLeftChild();
        }
        return parent;
    }

    public Node<T> getMaximum(Node<T> startPoint) {
        Node<T> current = startPoint;
        Node<T> parent = current;
        while (current != null) {
            parent = current;
            current = current.getRightChild();
        }
        return parent;
    }

Complexity - O(log(n))

Symmetric Bypass

Traversal is a visit to each node of the tree in order to do something with it.

Recursive symmetric traversal algorithm:

Make an action on the left child
Make an action with yourself
Make an action on the right child

Code:

    public void inOrder(Node<T> current) {
        if (current != null) {
            inOrder(current.getLeftChild());
            System.out.println(current.getData() + " ");//Здесь может быть все, что угодно
            inOrder(current.getRightChild());
        }
    }

Conclusion

Finally! If I didn’t explain something or have any comments, then I’m waiting in the comments. As promised, here is the complete code.

Node.java:

public class Node<T> {
    private T data;
    private int key;
    private Node<T> leftChild;
    private Node<T> rightChild;

    public Node(T data, int key) {
        this.data = data;
        this.key = key;
    }

    public void setLeftChild(Node<T> newNode) {
        leftChild = newNode;
    }

    public void setRightChild(Node<T> newNode) {
        rightChild = newNode;
    }

    public Node<T> getLeftChild() {
        return leftChild;
    }

    public Node<T> getRightChild() {
        return rightChild;
    }

    public T getData() {
        return data;
    }

    public int getKey() {
        return key;
    }
}

BinaryTree.java:

public class BinaryTree<T> {
    private Node<T> root;

    public Node<T> find(int key) {
        Node<T> current = root;
        while (current.getKey() != key) {
            if (key < current.getKey())
                current = current.getLeftChild();
            else
                current = current.getRightChild();
            if (current == null)
                return null;
        }
        return current;
    }

    public void insert(T insertData, int key) {
        Node<T> current = root;
        Node<T> parent;
        Node<T> newNode = new Node<>(insertData, key);
        if (root == null)
            root = newNode;
        else {
            while (true) {
                parent = current;
                if (key < current.getKey()) {
                    current = current.getLeftChild();
                    if (current == null) {
                         parent.setLeftChild(newNode);
                         return;
                    }
                }
                else {
                    current = current.getRightChild();
                    if (current == null) {
                        parent.setRightChild(newNode);
                        return;
                    }
                }
            }
        }
    }

    public Node<T> getMinimum(Node<T> startPoint) {
        Node<T> current = startPoint;
        Node<T> parent = current;
        while (current != null) {
            parent = current;
            current = current.getLeftChild();
        }
        return parent;
    }

    public Node<T> getMaximum(Node<T> startPoint) {
        Node<T> current = startPoint;
        Node<T> parent = current;
        while (current != null) {
            parent = current;
            current = current.getRightChild();
        }
        return parent;
    }

    public Node<T> getSuccessor(Node<T> deleteNode) {
        Node<T> parentSuccessor = deleteNode;
        Node<T> successor = deleteNode;
        Node<T> current = successor.getRightChild();
        while (current != null) {
            parentSuccessor = successor;
            successor = current;
            current = current.getLeftChild();
        }

        if (successor != deleteNode.getRightChild()) {
            parentSuccessor.setLeftChild(successor.getRightChild());
            successor.setRightChild(deleteNode.getRightChild());
        }
        return successor;
    }

    public boolean delete(int deleteKey) {
        Node<T> current = root;
        Node<T> parent = current;
        boolean isLeftChild = false;
        while (current.getKey() != deleteKey) {
            parent = current;
            if (deleteKey < current.getKey()) {
                current = current.getLeftChild();
                isLeftChild = true;
            } else {
                isLeftChild = false;
                current = current.getRightChild();
            }
            if (current == null)
                return false;
        }

        if (current.getLeftChild() == null && current.getRightChild() == null) {
            if (current == root)
                current = null;
            else if (isLeftChild)
                parent.setLeftChild(null);
            else
                parent.setRightChild(null);
        }
        else if (current.getRightChild() == null) {
            if (current == root)
                root = current.getLeftChild();
            else if (isLeftChild)
                parent.setLeftChild(current.getLeftChild());
            else
                current.setRightChild(current.getLeftChild());
        } else if (current.getLeftChild() == null) {
            if (current == root)
                root = current.getRightChild();
            else if (isLeftChild)
                parent.setLeftChild(current.getRightChild());
            else
                parent.setRightChild(current.getRightChild());
        } 
        else {
            Node<T> successor = getSuccessor(current);
            if (current == root)
                root = successor;
            else if (isLeftChild)
                parent.setLeftChild(successor);
            else
                parent.setRightChild(successor);
        }
        return true;
    }

    public void inOrder(Node<T> current) {
        if (current != null) {
            inOrder(current.getLeftChild());
            System.out.println(current.getData() + " ");
            inOrder(current.getRightChild());
        }
    }
}

PS

Degeneration to O(n)

Many of you may have noticed: what if you make the tree become unbalanced? For example, put nodes in the tree with increasing keys: 1,2,3,4,5,6... Then the tree will be somewhat reminiscent of a linked list. And yes, the tree will lose its tree structure, and hence the efficiency of data access. The complexity of search, insertion, and deletion operations will become the same as that of a linked list: O(n). This is one of the most important, in my opinion, disadvantages of binary trees.

Only registered users can participate in the survey. Sign in, you are welcome.

I have not been on Habré for quite a long time, and I would like to know what articles on what topics would you like to see more?

Data structures
Algorithms (DP, recursion, data compression, etc.)
Application of data structures and algorithms in real life
Programming android applications in Java
Java Web Application Programming

2 users voted. 1 user abstained.

Source: www.habr.com

Binary Tree or how to prepare a binary search tree