Sunday, May 08, 2011

Computing the Cumulative sum of a range of elements in a binary tree

Let us assume the alternative definition of a BST (Binary Search Tree), namely: "The left subtree shall contain elements less than or equal to the element at the root and the right subtree shall contain elements greater than or equal to the element at the root".

This means that if we want to compute the cumulative sum of all numbers ≤ to a given number, we need to update some metadata every time a node is added to the tree or removed from the tree (and for a balanced BST also when rebalancing occurs). Fortunately, this implementation of the AVL Tree allows us to do just this by way of hook functions that you can pass to it via the constructor.

We define a function binary_tree_summer that computes the cumulative sum of a node and its child nodes and stores in the sum attribute of the node.

To query the cumulative sum, we define a function query_sum that basically employs the following strategy (given our definition of the BST above):
  • If the value at the node is ≤ than the value we are looking for, we take the cumulative sum at the node, subtract from it the sum in it's right subtree (since we don't know how many elements in the right subtree will match our query) and then add the result of querying the right subtree
  • If the value at the node is > the value we are looking for, we are assured that the complete range lies in the node's left subtree, so we query the left subtree

You can see that at every stage, we decide to go either left or right, which means that our runtime is bounded by the height of the tree, which fortunately in case of an AVL Tree is at most O(log n).

The code for such a use-case is shown below:

function cumulative_sum() {
    var items = [ 4, 9, 2, 5, 4, 2, 1, 2, 3, 2, 1, 7, 3, 2 ];

    function binary_tree_summer(node) {
        var _s = (node.left ? node.left.sum : 0) + 
         (node.right ? node.right.sum : 0) + node.value;
        node.sum = _s;
    }

    var tree = new algo.AVLTree(algo.cmp_lt, binary_tree_summer);
    for (var i = 0; i < items.length; ++i) {
        tree.insert(items[i]);
    }

    function query_sum(node, value) {
        // We do the query like we do for a Segment Tree
        if (!node) {
            return 0;
    }

    if (node.value <= value) {
        var sub = node.right ? node.right.sum : 0;
        return node.sum - sub + query_sum(node.right, value);
    }
    else {
        // node.value > value
        return query_sum(node.left, value);
    }
}

console.log("<= 2:", query_sum(tree.root, 2));
console.log("<= 5:", query_sum(tree.root, 5));

Let's look at summing all numbers ≤ 2. The diagram below shows us how the call flow looks like. Note:
  • The green lines are the paths followed down the root
  • The red lines are the paths that are rejected (not followed)
  • The numbers in blue above each node denote the value of the sum attribute for that node


The sum is obtained in the following manner for values ≤ 2: 47-39+4


The diagram below shows us the call flow for summing all numbers ≤ 5.


The sum is obtained in the following manner for values ≤ 5: 47-39+39-25+25-16

No comments: