The Online Knapsack Problem
Last semester I took a course on online and approximation algorithms, and I want to present some really neat proofs related to the online knapsack problem. The course was taught by Hans-Joachim Bökenhauer and Dennis Komm, wo incidentally also first discovered the results I will present below. All results mentioned (and more) can be found in the original paper.
An online algorithm is given some input sequence $x1, x2, …, xn$, and must respond to each input $xi$ with some output $yi$, which may depend both on the current (and previous) inputs and the algorithm’s previous outputs. The algorithm does not know in advance how long the input sequence is, which presents an additional challenge.
Online algorithms are judged by their competitive ratio, which is the ratio between the score of an optimal solution of offline variant and the score of the solution produced by the algorithm. If this ratio is less than some number $k$, then we call the algorithm $k$-competitive. If the competitive ratio in unbounded, then we say the algorithm is not competitive.
Normally an offline algorithm is deterministic, which means it cannot access any other information than the inputs its given. A randomized online algorithm can additionally access an infinite stream of bits, each independently either $0$ or $1$ with probability exactly $1/2$. In this case we score the algorithm by its expected competitive factor, since its exact score may vary based on the randomness. Finally, an online algorithm with advice is permitted to access an infinite stream of advice bits, which an oracle has computed based on the input it is given. When analyzing randomized online algorithms or online algorithms with advice we are interested in minimizing the number of random or advice bits used to achieve specific (expected) competitive ratios.
We will focus on the simple online knapsack problem, which is the following: Given a backpack of capacity $1$ and $n$ objects with weights $w1, …, wn$, fill the backpack as much as possible, without exceeding the capacity. We process the weights in order, and for each weight we have to immediately decide whether we pack it or not, and we cannot change our choice at a later point in time. Restricting ourselves to capacity $1$ seems arbitrary, but we can easily adjust for any backpack capacity $b$ by scaling the weights accordingly.
First of all we notice that any deterministic algorithm for the online knapsack problem is not competitive. We can “trick” any algorithm $A$ using the following procedure: First we offer an object of weight $eps$, where $eps$ can be arbirarily small. If $A$ takes this object, we offer an object of weight $1$ and no further objects. Clearly an optimal solution can entirely fill the back, while $A$ only achieved a fill of $eps$. Since $eps$ can be arbirarily small, the competitive ratio is unbounded, and $A$ is not competitive. If on the other hand $A$ refuses the element with weight $eps$ then we do not offer any other objects, and again the competitive ratio is unbounded.
Clearly, some help is needed. How much help do we need to be optimal? Using $n$ advice bits the online algorithm is trivially optimal: we simply encode for each item one bit whether to take it or not. It turns out that we also need at least $n-1$ advice bits to be optimal: Imagine the weight sequence $1/2, 1/4, …, 1/2^(n-1), wb$. Let $b$ be a bit string of length $n-1$, and define the weight $wb$ as follows:
wb = 1 - sum (i = 1 to n-1) b_i * 1/2^i
For each of the $2^(n-1)$ possible bitstrings, the weight of $wb$ uniquely determines1 which of the previous $n-1$ weights must be taken in order to achieve the optimal fill of $1$. Thus any optimal algorithm must need at least $n-1$ advice bits to distinguish the $2^(n-1)$ identical prefixes.
Being optimal however is quite an ambitious goal. Can we do better using fewer advice bits? It turns out that we can: There exists a 2-competitive algorithm using only one advice bit. What information do we encode in this one bit? Take a few minutes to think about this if you wish.
What we need to know is whether there exists an object with weight at least $1/2$. If yes, then we simply wait until the first such object occurs and pack it. Since the optimal solution can pack at most $1$ and we pack at least $1/2$ we are 2-competitive. Now what if there does not exist such an object, and all objects weigh less than $1/2$? Either the total sum of weights is less than $1$, then we can pack all items into the bag greedily and are even optimal. If the total weight is larger than $1$, then we claim that greedy is 2-competitive. There must exist at least one element that greedy cannot pack since it overfills the backpack. But by our assumption this object must weigh less than $1/2$, meaning that our backpack must contain at least objects of total weight $1/2$, which is 2-competitive by the same reasoning as in the first case.
Until now we have only considered advice algorithms, it seems natural to ask how a randomized algorithm performs. Naturally we expect a randomized algorithm to perform a bit worse than an advice algorithm given the same number of bits. Indeed, if we execute the same algorithm as above and guess the advice bit (by sampling a random bit), then in half the cases we are 2-competitive, which implies that we are 4-competitive overall.
However it turns out that there exists a randomized algorithm which is 2-competitive using only one advice bit. I consider this algorithm and its correctness proof to be one of the most elegant constructions I have seen to date. Again I urge you to take some time to think about this yourself.
The 2-competitive randomized online algorithm $A$ works as follows: We have two algorithms $A1$ and $A2$, and $A$ simply chooses one of them uniformly at random. Algorithm $A1$ uses a greedy approach: Pack every item that fits until the backpack is full. We have already learned that this strategy by itself is not competitive, the beauty comes in combination of $A2$: algorithm $A2$ simulates $A1$, but does not pack any items. Once $A1$ cannot pack an item for the first time, $A2$ begins to pack items greedily into its backpack.
This algorithm is 2-competitive: If the sum of all weights are less than $1$, then $A1$ is optimal and $A2$ never packs an item - in expectation $A$ achieves half of the optimal solution and we are done. If the total weight is larger than $1$, then the sum of weights packed by both algorithms must be greater than $1$, and again on average we will have packed at least $1/2$ total weight, which makes $A$ 2-competitive. I find this algorithm and it’s correctness proof both amazingly simple and exceptionally clever, a beautiful combination that awakes a motivation to delve deeper into this subject matter.
If you want to learn more about the online knapsack problem then I recommend you read the original paper by Bökenhauer, Komm, Královič and Rossmanith, the section on the simple2 knapsack problem is quite accessible even with little theoretical background. We have barely scratched the surface, and the paper deals with further interesting problems. If you are interested in online algorithms in general then I recommend Komm’s An Introduction to Online Computation, which holds its promise and is also quite accessible with basic theoretical computer science knowledge.
I hope I could convey some of the appreciation I have for this topic and you learned a little bit about online algorithms. If you have any questions or comments feel free to reach out to me via my public inbox.