Friday, March 08, 2013

Weight Balance for fun and profit

According to Dr. Bender, Weight Balance (different from the wikipedia article on weight-balanced tree) is one of the most important topics in Computer Science, and we were lucky enough to learn it from him! Others might not be so lucky, so let me try and do a job that's hopefully a constant factor (or fraction) as good - since it won't matter because all we care about are the asymptotics.

Pre-requisite: This note on Amortized Analysis.

We'll use weight balance to implement a dictionary structure and examine how the guts of one such structure, the weight-balanced tree work.

A dictionary data structure is one that supports the following dictionary data structure operations:
  1. Insert
  2. Delete
  3. Find
  4. Predecessor
  5. Successor
Points 4 & 5 are what distinguish a dictionary data structure from a keyed data structure such as a Hash Table.

Now, you might have heard of the following data structures that (efficiently) support the operations mentioned above:
  1. Treaps
  2. Skip-Lists
It would surprise (or maybe not) you to know that both these structures work on the principle (guess) of weight-balance!!

So what exactly do we mean when we talk about the weight of a sub-tree in a BST? Well, as it turns out, the weight of the sub-tree in a BST is just the count of the number of nodes in the sub-tree rooted at that node (including the node itself).

For example, the following tree (image courtesy wikipedia) has a weight of 9
and the sub-tree rooted at node 10 has a weight of 3.

A weight-balanced tree rooted at node u is one in which (either):
  1. The weights of the left and right children of a sub-tree are within constant factors of each other:
    weight(Left-Child(u)) + 1 = Θ(weight(Right-Child(u) + 1)
    Note that the +1 is important for pedantic reasons as far as the order-notation is concerned.
  2. The weights of the left and right children of a sub-tree are within constant factors of the weight of the complete sub-tree
    weight(Left-Child(u)) + 1 = Θ(weight(u) + 1) AND
    weight(Right-Child(u)) + 1 = Θ(weight(u) + 1)
It turns out that both these definitions are equivalent.

More realistically, if we stick to the second definition, we have:
weight(Child(u)) + 1 ≥ 0.25 * (weight(u) + 1) AND
weight(Child(u)) + 1 ≤ 0.75 * (weight(u) + 1)

where, Child(u) denotes both the left & right child of u.

For example, if we consider the following example tree, which is clearly out of weight-balance (don't ask me how we got there because this example is made-up), we re-balance it to be perfectly in balance (if we have an odd number of nodes or almost perfectly balanced otherwise).

We should be careful about how we re-balance these N nodes, because if the cost is any worse than Θ(N), then we won't get the update costs that we desire. The easiest way to perform the re-balance with a cost of Θ(N) is to perform an in-order traversal of the subtree rooted at node u, and write out the sorted nodes to an array. We can then re-create a perfectly balanced BST from that array either using recursion or the Day–Stout–Warren algorithm.

This is where the fun starts!!

Q 1. How many nodes need to be inserted under a sub-tree rooted at node v to bring the sub-tree out of balance (assuming it is perfectly balanced to start off with)? Let's assume that the sub-tree originally contains N nodes.
A. You need to insert some constant fraction of the weight of that sub-tree! which is Ω(N).

Q 2. What is the cost to rebalance the sub-tree rooted at node v if we know that that sub-tree has a weight of N?
A. Well, we already answered this above. The answer is Θ(N).

Q 3. How many sub-trees potentially go out of balance when you insert a node?
A. We know that a node is inserted at the leaf level, so potentially all the sub-trees that are rooted at the nodes on the leaf-to-root path with the newly inserted node as the leaf node can potentially go out of balance. This happens to be Θ(log N) nodes.

∴ the amortized cost to insert a new node into the balanced tree is:
Ω(N)/Θ(N) * Θ(log N) = Θ(log N).

Now, that's a fairly straight-forward algorithm to get the same (amortized) costs as the worst-case costs for updates with a more complicated beast such as an RB-Tree or an AVL-Tree. Though, I feel that Treaps are much simpler to implement.

Tuesday, March 05, 2013

Amortized Analysis or: How I learned to stop worrying and love averages

Amortized Analysis is usually seen as a pretty scary term, and I've seen a lot of people confuse it with Average Case Analysis, but let's try and de-mistyfy it, one step at a time.

We've performed amortized analysis at some point or another in our lives without actually knowing it. A few lame examples follow:
  • For the stock broker: Purchasing shares at a lower price to average out the cost price of all the holdings of a given stock
  • For the fitness enthusiast: Working out thrice as much at the gym today because [s]he missed 2 days before today
  • For the reader: Reading a few more pages of a book so that you can take a break tomorrow and still complete it on time
There are situations where amortized analysis doesn't work too. For example, suppose you're in charge of cooking for your family, you can't say that you'll cook thrice as much on the 3rd day from today and not cook at all between now and then. Sorry, but sometimes it just doesn't work!! These are cases where you must use worst-case bounds.

Let's try some exercise and learn by example!
  1. Ex-1: You're jogging 16 miles every day for 8 days, and your friend jogs 8 miles and 24 miles on every odd and even numbered day respectively (starting from day #1). Who jogs more over a period of 8 days? Here is a graphical representation of how much you and your friend ran over a period of 8 days:

  2. Ex-2:You're jogging 16 miles every day for 7 days, and your friend jogs in the following manner:
    DayMiles Jogged
    Who jogs more over a period of 7 days?

  3. Ex-3: You're playing a game where you have a graph and you start at the node with the symbol S and finish at the node with the symbol F. The constraints on your moves are that you must take EXACTLY ONE blue coloured edge in every move, but you can take as many (or zero) red coloured edges in a move. A move contains a combination of red and blue edges.
    An example graph is shown here:
    One trace of a successful completion of a game that visited 11 blue edges and 4 red edges is shown here:
    Can you identify the maximum number of edges of any colour that will ever be visited if I tell you that exactly 107 blue edges were taken in a particular successful completion of the game?

If you got this far, and were able to solve all the exercises - congratulations! - you've understood what amortized analysis is all about! And as an added benefit, Ex-2 is how one would go about analyzing the insertion cost for Dynamic Arrays, and Ex-3 is actually how one would analyze the running time for the KMP string matching algorithm!

PV=nRT or: How I learned to stop worrying and love cooking under pressure

I'll try to be as brief as possible here. Just going to talk about the physics behind Pressure Coking and why it's so cool (and hipster). Hey, it saves fuel and hence damages the environment to a much lesser much lesser extent than does normal cooking.

From How Does A Pressure Cooker Work?: "Simply put, water boils at 212°F (100°C). At this point, no matter how long you continue to boil, it always stays the same temperature. As the water evaporates and becomes steam it is also the same temperature, 212°F.

The only way to make the steam hotter (and/or to boil the water at a higher temperature) is to put the system under pressure. This is what a pressure cooker does. If we fit an absolutely tight cover to the pan so no steam can escape while we continue to add heat, both the pressure and temperature inside the vessel will rise. The steam and water will both increase in temperature and pressure, and each fluid will be at the same temperature and pressure as the other. "

To explain the last paragraph above, let's turn to physics and the Ideal Gas Law, which states that PV=nRT where,
and we don't really care about what the other symbols mean.

This means that Pressure and Temperature vary directly with each other, and if you raise the pressure of a fluid, then the temperature at which it changes state will also increase!!