The Integer Knapsack Problem
You are given a set of items with weight and value for each. Some items are more worthy than others, but there is no straight relationship. For example, you could have a laptop worth $1000 with a weight of 3kg, and two ipads weighing 700 grams but worth $800. You can maximize the worth in your backpack by putting in two ipads rather than the notebook. This actually is the main problem: given this set of items (weight, value), maximimize the value in a given weight (or cost) range.
For example, take this input data:
Item | #1 | #2 | #3 | #4 |
---|---|---|---|---|
Cost | 35 | 28 | 11 | 20 |
Value | 8 | 8 | 2 | 5 |
Convince yourself that the best value you can get is achieved by
- Taking 5 items #2: 5x28 = cost of 140, value of 40
- plus 1 item of #4: 1x20 = 20, value=5
- Total 6 items with a total cost of 160 and a total value of 45
This problem seems quite straightforward, but it is actually NP-Complete. See the Wikipedia article for more information.
Ask yourself: Do I understand the problem?
- Input is an array A of tuples (cost, value) values, and a maximum cost C.
- Don't think of cost and value as dollar amounts each: they are not exchangable. (Few people would buy 5 items at $28 each for a total value of $8 each). Think of cost and value as being in different media. For example, cost could be weight in Kilogram, and value could be value for you (as in: how do you pack your airtravel luggage without exceeding maximum weight?).
- The output should be a count for each item: how many of each do you want to have. It is trivial to calculate from this the total cost, the total value and the "gap" (sometimes it is better not to try to fill your luggage to its maximum extent).
Setting up a test harness
As usual, our first step is to prepare a testbed. Our code strategy for the test will be this:
- We create an array of random tuples (cost, value)
- We call several different methods to calculate the result, comparing the execution time and result quality
Here is the code for this:
#! -*- Encoding: Latin-1 -*-
def selftest():
C = 167
A = [(random.randrange(5, 50), random.randrange(2, 10), ) for i in range(40)]
print A
methods = { 'brute-force:' : ik_with_brute_force,
'using-sorting' : ik_with_sorting,
'using-simulated-annealing' : ik_with_simulated_annealing,
}
for method in methods:
t0 = time.time()
result = methods[method](A, C)
print "%s: %.5f: %r" % (method, time.time() - t0, result, )
if __name__ == "__main__":
selftest()
Brute-Force approach
Since the problem is NP complete we might as well give up and just enumerate all possibilities. Actually, the algorithm for doing so is pretty straightforward:
- For every item in the input array:
- Try every possible value n from 0 .. (max. items that will fit in C)
- For each such combination, calculate the total value. If the total value of this solution exceeds the best known total value so far, use it as a new best-solution.
This translates into code pretty straightforward:
def ik_with_brute_forc(A, C):
# we start with an empty solution
solution = [0] * len(A)
# and obviously, there is no best solution yet
best_solution = [ [], 0, 0, 0 ]
# helper method: evaluate the quality of a single solution
def evaluate_solution(C):
# given the solution indices, calculate the total value and total cost for this solution
total_value = 0
total_cost = 0
for index, item in enumerate(solution):
cost, value = A[index]
total_value += item * value
total_cost += item * cost
# a solution is choosen if it results in a higher value than the last known solution
if total_value > best_solution[1]:
best_solution[0] = solution[:]
best_solution[1] = total_value
best_solution[2] = total_cost
best_solution[3] = C
# helper method: recurse through all permutations of possible solutions
def recurse(n, C):
# we've reached a viable solution
if n == len(A):
evaluate_solution(C)
else:
# look at current item and try every possible count of this item (from 0..n), recursively
cost, value = A[n]
for k in range((C/cost)+1):
solution[n] = k
recurse(n+1, C-(k*cost))
recurse(0, C)
return best_solution
If you try this you'll notice that a) you can use it only for small input sets, and b) - more surprisingly - execution time drastically differs for different input values. For example, take this test output:
[(10, 2), (7, 8), (5, 5), (47, 9), (38, 4)] brute-force:: 0.02300: [[0, 23, 1, 0, 0], 189, 166, 1]
[(41, 5), (19, 8), (49, 5), (40, 6), (45, 3)] brute-force:: 0.00000: [[0, 8, 0, 0, 0], 64, 152, 15]
(for a total max. cost of 167 each). This means that the first input is dramatically slower than the second (we are talking about a list with a whopping 5 (five) input items each!). If you look at the numbers, it becomes clear why: costs of individual items in the first list are on average much smaller than in the second. If the cost of an individual item is smaller you can use more of them, so you must test more of them. For example, in the first list the third item has a cost of 5, which means you can use - and must test for - count in the range 0..33, whereas the third item in the second list has a cost of 49, so you need test only 0..3.
But at any rate, it becomes clear that brute-force is a non-starter if you have larger input sizes. I want to show you two alternative heurisics that have quite good results.
Using Simulated Annealing
The first idea is to use a combination of a greedy algorithm, with an idea from simulated annealing thrown in. One day I will open the algorithm section on pyalda, then it will all become much more clear, but for now just bear with me.
- For each item in A, use as many as possible. This is the greedy bit. For example, given this input
Greedy would use 110/20=5 items of cost 20 (value 100), and one item of cost 10 (value 10), giving you a total cost of 110 (and a total value of 110). To see that greedy can fail, take a look at this input instead:
A = [(20,20), (10,10)], C = 110
Greedy would use 110/50=2 items of cost 50 (value 100) - and then leave a cost of 10 unused. Whereas you can see that the following combination is better: 1 items of cost 50 (value 50) and 3 items of cost 20 (value 60) = total value 110.A = A = [(50,50), (20,20)], C = 110
- So, and this is the simulated-annealing bit, also try using less than the maximum, say 1, 2 or 3 items less.
It must be clear that this is only a heuristic. There is no guarantee that it will result in the best (or even in a good) result - but we'll take a look at how good this one actually performs.
Given that description, we modify the brute force algorithm and have something like the following:
def ik_with_simulated_annealing(A, C):
# we start with an empty solution
solution = [0] * len(A)
# and obviously, there is no best solution yet
best_solution = [ [], 0, 0, 0 ]
# helper method: evaluate the quality of a single solution
def evaluate_solution(C):
# given the solution indices, calculate the total value and total cost for this solution
total_value = 0
total_cost = 0
for index, item in enumerate(solution):
cost, value = A[index]
total_value += item * value
total_cost += item * cost
# a solution is choosen if it results in a higher value than the last known solution
if total_value > best_solution[1]:
best_solution[0] = solution[:]
best_solution[1] = total_value
best_solution[2] = total_cost
best_solution[3] = C
if C == 0:
return True
# helper method: recurse through all permutations of possible solutions
def recurse(n, C):
# we've reached a viable solution
if n == len(A):
if evaluate_solution(C):
return True
else:
cost, value = A[n]
# and this is the greedy/annealing bit: calculate the maximum value, and also try up to 5 items less
max = C/cost
min = max - 5
if min < 0:
min = 0
if max:
for k in range(max, min-1, -1):
solution[n] = k
recurse(n+1, C-(k*cost))
else:
# special case : doesn't fit at all
solution[n] = 0
if recurse(n+1, C):
return True
recurse(0, C)
return best_solution
Using Sorting
Actually, we can take this a little bit further, by combining a couple of observations
- We sort the input list so that items with a bigger value are tested first.
- Also, we realize that this way we don't need recursion because we assume that any item later in the list will have less value anyway
- A small but viable optimization is to look at items that have the same price, but a different value. This can happen easily given pseudo-random numbers with a limited value range. We'll want to fold those to use the item with the best price/value ratio always
Note: This is still only a heuristic, not a proof-of-best-solution
The algorithm now looks like this:
def ik_with_sorting(A, C):
# fold items with same price, but different value
fold_price_value = {}
for index, item in enumerate(A):
cost, value = item
try:
existing_value = fold_price_value[cost][0]
except KeyError:
existing_value = -value
if existing_value < value:
fold_price_value[cost] = (value, index, )
# there is no best solution yet
best_solution = [ [], 0, 0, 0 ]
# helper method: evaluate the quality of a single solution
def evaluate_solution(C):
# given the solution indices, calculate the total value and total cost for this solution
total_value = 0
total_cost = 0
for index, item in enumerate(solution):
cost, value = A[index]
total_value += item * value
total_cost += item * cost
# a solution is choosen if it results in a higher value than the last known solution
if total_value > best_solution[1]:
best_solution[0] = solution[:]
best_solution[1] = total_value
best_solution[2] = total_cost
best_solution[3] = C
# another way to formulate the "simulated annealing" bit: use k = 0..3 differences
for k in range(3):
for i in range(len(A)):
remainder = C
solution = [0] * len(A)
for j, cost in enumerate(sorted(fold_price_value)):
value, index = fold_price_value[cost]
n = remainder / cost
if (i == j) and (n >= k):
n -= k
solution[index] = n
remainder -= n * cost
evaluate_solution(remainder)
return best_solution
Results, please
Well, the good thing is that very often all three algorithms give equal results. Example: given these values
[(10, 8), (39, 2), (14, 4), (30, 2), (47, 2), (25, 6), (29, 8), (49, 2), (28, 2), (24, 9), (34, 8), (21, 4), (5, 4), (28, 3), (7, 3)]
The results are:
Method | Runtime | Result |
---|---|---|
Simulated Annealing | 0.00400 | [[16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0], 132, 165, 2] |
Sorting | 0.00000 | [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 33, 0, 0], 132, 165, 2] |
Brute-Force | 2.81400 | [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 33, 0, 0], 132, 165, 2] |
But for this data-set:
[(47, 7), (34, 6), (9, 5), (29, 3), (17, 7), (18, 4), (29, 6), (38, 2), (34, 6), (7, 6), (6, 7), (42, 8), (6, 5), (9, 4), (39, 3)]
the results look like this:
Method | Runtime | Result |
---|---|---|
Simulated Annealing | 0.04200 | [[0, 0, 13, 0, 0, 0, 0, 0, 0, 2, 6, 0, 0, 0, 0], 119, 167, 0] |
Sorting | 0.00100 | [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 27, 0, 0, 0, 0], 189, 162, 5] |
Brute-Force | 13.58600 | [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 27, 0, 0, 0, 0], 189, 162, 5] |
This is an example, by the way, of why the brute-force algorithm really isn't feasible: 13.6 seconds, and the same result as the sorting algorithm...
Both of the heuristic algorithms are way faster, but generally speaking the sorting-approach just is much better for most input data I saw. In the end, both are heuristic versions, and you might even try both and compare the results (if that is feasible for your input size).