CS212 Unit 4
1 CS212 Unit 4
Contents
1. 01 l Water Pouring Problem
2. 03 q Combinatorial Complexity
3. 03 s Combinatorial Complexity
4. 04 q Exploring the Space
5. 04 Exploring the Space
6. 05 Pouring Solution
7. 06 Doctest
8. 07 Bridge Problem
9. 08 q Representing State
10. 08 s Representing State
11. 09 p Bridge Successors
12. 09 s Bridge Successors
13. 10 q Bridge Solution
14. 10 s Bridge Solution
15. 11 q Debugging
16. 11 s Debugging
17. 12 q Did it work
18. 12 s Did it work
19. 13 q Improving the Solution
20. 13 s Improving the Solution
21. 14 p Modify Code
22. 14 s Modify Code
23. 15 p Adding Tests
24. 16 p Refactoring Paths
25. 16 s Refactoring Paths
26. 17 p Calculating Costs
27. 17 s Calculating Costs
28. 18 l Putting it Together
29. 19 l Generalizing
30. 20 q Missionaries and Cannibals
31. 20 s Missionaries and Cannibals
32. 21 q Generalized State
33. 21 s Generalized State
34. 22 p csuccessors
35. 22 s csuccessors
36. 23 l mc problem
37. 24 q Shortest Path Search
38. 24 s Shortest Path Search
39. 25 p sps function
40. 25 s sps function
41. 26 p Cleaning up MC Problem
42. 26 s Cleaning up MC Problem
43. 27 p Lowest Cost Search
44. 27 s Lowest Cost Search
45. 28 p Back to Bridge Problem
46. 28 s Back to Bridge Problem
47. 29 l Summary
CS212 Unit 4
06/05/12 11:49:22
pouring actions, and the pouring can be from one glass to another. It can go in the other direction. It can go from the faucet into each of the glasses. And it can go from the glasses down the drain. That's six different actions we can take, and we want to find a sequence of actions that arrives at this goal of 6 oz. Of course, we can generalize the problem and put in any numbers rather than 9, 4, and 6.

As usual, let's make our inventory of concepts that we're going to be dealing with. We have the glass, and the glass has a capacity and a current level. This glass would have capacity 9, current level 5. We're also going to need a collection of glasses, probably a pair of glasses. I guess we can say that the pair of glasses and their current levels represent a complete state of the world. We'll think of that as a state of the world: everything we need to know about where we are in the problem. Then we have a goal that we're trying to reach. We have the pouring actions--1, 2, 3, 4, 5, 6. That breaks down into emptying, filling, and transferring.

The transferring, I think, is a little bit tricky, because there are two ways to do it. When we were transferring from the 9 oz into the 4 oz--so we transfer from x to y--we can do that until y is full. That's what happened here. The 4 oz was full. Or we could do it until x is empty. If we were starting to pour back 4 oz from here into an empty one, we could do it until it was empty.

Anything else in the inventory? Oh, well, we certainly need a notion of a solution. A solution is going to be a sequence of steps--to pour from here to here, then from here to the drain, then fill up, then pour again, and so on. What this unit is really all about is techniques for finding these solutions, which are sequences of steps. Again, we're always talking about managing complexity in this class. The complexity we're trying to manage here is a complexity
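The two ways a transfer can end (the destination fills up, or the source runs dry) collapse into a single rule: pour the smaller of "what the source holds" and "what the destination has room for". The sketch below illustrates that idea only; the helper name `transfer` and its argument order are assumptions made for this example, not something defined in the lecture.

```python
def transfer(x, y, Y):
    """Pour from a glass holding x into a glass holding y with capacity Y.

    Hypothetical helper for illustration. Pouring stops when the
    destination is full or the source is empty, whichever comes first.
    Returns the new (source_level, destination_level) pair.
    """
    amount = min(x, Y - y)   # limited by the source's contents and the destination's free room
    return (x - amount, y + amount)

# Pouring a full 9 oz glass into an empty 4 oz glass stops when the 4 oz glass is full.
print(transfer(9, 0, 4))
# Pouring 4 oz into an empty 9 oz glass stops when the source is empty.
print(transfer(4, 0, 9))
```

Either way the total amount of water is unchanged, which is a handy sanity check on any transfer rule.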
memory. Computer exploration is more like a collection of explorers all collectively expanding the frontier. Our next move can be to say we'll take one of these explorers, say the one in this state here, and say now tell me what's next. You've got 6 actions from there. Where do they go to? Maybe some of them explore the world and generate new states that we haven't seen before. Maybe some of them go to a state that we already know is on the frontier. Maybe some of them regress backwards into previously explored territory. But we can keep on going, expanding out our frontier, and when it eventually overlaps the goal, then we've got a solution. Now, in exploration problems like this, there are two problems that we have to worry about. One problem is that there is no solution at all, that the goal is not connected to the start state. So there's no path from here to there. Then what we want to do is do the exploration we need and report back that it's impossible. The other problem is that if there is some path that eventually makes it to the goal, we want to make sure that we find it in a reasonable amount of time. That means we want to be efficient about the way we explore the space. It also means that we don't want to get stuck in an infinite loop. Now, if there is a finite number of states and they are connected, then we should be able to find the path. But if we aren't clever, we may miss the solution even though it's possible to find it. For example, if we had a strategy that says first I'm going to explore in this direction--say this is pouring from cup x into cup y--and then I go in this direction, pouring from cup y back into cup x, and then I pour the water back again--so I'm continually just taking water and pouring it between two different cups back and forth--those are all legal steps to take, but I'm ending up with an infinitely long path and I'm not making any progress.
We'd like to come up with a strategy for exploration, and the strategy corresponds to deciding which path to expand next. The strategy always comes down to this: there's some path--let's say this one--and we say that's the one we're going to explore from next. To avoid this type of infinite loop, here are some possibilities. One possibility would be don't reverse an action. If you come from state A to state B, don't allow the action that goes immediately back to state A. Another strategy would be to say always take the shortest path first. Out of all the paths that you've built so far, when we go to choose which one we're going to expand next, always choose one of the shortest ones. That way we might start to build up an infinitely long path, but at least we won't continue it. We'll expand another, shorter path before we come back to that one. Then another strategy would be don't re-explore. That is, if we're on the frontier--let's say we're here on the frontier--and we have a move that moves us back out of the frontier into the previously explored zone, then we should not allow that path. My question is: check all the strategies that would eventually lead us to the goal. Don't worry about the efficiency of getting to the goal, but which ones will eventually get us there and won't get stuck in an infinite loop.
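To make the infinite-loop risk concrete, here is a minimal sketch on a made-up four-state graph (the graph, the state names, and both function names are assumptions for illustration only). An explorer with no memory that always takes the first available move bounces between two states forever, while an explorer that expands paths in order and refuses to re-explore known states reaches the goal.

```python
# Hypothetical state graph: from 'A' the first listed move leads to 'B',
# and the only move from 'B' leads straight back to 'A'.
GRAPH = {'A': ['B', 'C'], 'B': ['A'], 'C': ['goal'], 'goal': []}

def naive_walk(start, goal, max_steps=20):
    """Always follow the first successor, with no memory of visited states.
    max_steps only exists so the demonstration terminates."""
    state = start
    for _ in range(max_steps):
        if state == goal:
            return True
        state = GRAPH[state][0]
    return state == goal

def explore_without_reexploring(start, goal):
    """Expand states oldest-first and never return to an explored state."""
    explored = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop(0)
        if state == goal:
            return True
        for nxt in GRAPH[state]:
            if nxt not in explored:
                explored.add(nxt)
                frontier.append(nxt)
    return False

print(naive_walk('A', 'goal'))                   # stuck going A, B, A, B, ...
print(explore_without_reexploring('A', 'goal'))  # finds the goal via C
```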
            for (state, action) in successors(x, y, X, Y).items():
                if state not in explored:
                    explored.add(state)
                    path2 = path + [action, state]
                    if goal in state:
                        return path2
                    else:
                        frontier.append(path2)
        return Fail
    Fail = []
I'm saying the inputs to this pour_problem function are X and Y, which are the capacities of the two glasses. Then the goal, which is going to be an integer, like 6, to say that's how much I'm trying to get to. That can be in either one of the glasses. Then the start state, which I'm defaulting to 0 and 0, saying both glasses have current level 0, but if you wanted you could generalize the problem and pass in something else as what we're starting with. I'm using lowercase x and lowercase y to indicate the current levels of the glasses. Here I check and see are we done before we even get going? Did you give me a start state and say the goal is to have a glass with zero in it? Then we're done before doing any actions. Go ahead and return that. What I'm going to return is called a "path." The path is an alternation of states and arrows: a state, then an arrow, which gives a name to each action, then the state that it goes to, and we alternate state, action, state, and so on. Here, if there's nothing to do, it's just a state with no actions. We're going to keep track of the states that we've already explored, and that's going to be a set. We're going to keep track of the frontier. Conceptually, that's a set too, but we're going to pull the items off of the frontier one at a time, so I've made it an ordered list rather than a set. I know which element of the frontier I want to explore first. So the explored is a set of states, and the frontier is an ordered list of paths. The only path we have so far is the trivial path that says we're starting at the start, and we haven't gone anywhere else yet. That's what we start our frontier with. While there is frontier left, while there are still frontier states that we haven't explored from yet, we pop off the first one. pop(0) says take the 0th element of the list, so we're going to pull elements off of the front of the list and push them onto the end of the list.
Then say the current state is the last element of the path, so the path goes from one state to the next, and the last element of the path is the current state. Let's take x and y from there. Then I've defined a successor function that gives me all the successor states and the actions we used to get there. There should be six of those. Then if that new state is not explored, then it's something new. If it was explored, there is nothing left to do. We've already explored from there. If it hasn't been explored yet, then add it to the explored set and make up a new path, which consists of the old path plus the action we follow to get to the new state. If the goal number is somewhere in that state, so the goal is 6 and the state is the two levels of the glasses, say 6 and 3, yes, 6 is in 6 and 3. Then we're done. Return that path as the winner, the path that reached the goal. Otherwise, just add this path onto the frontier, and we'll pull something off the frontier later. If we go all the way through and we run out of frontier to explore from, then we can't reach the goal and we return Fail. You could have Fail be None. I decided to make it the empty list, because all the other things we're returning were lists. Either way, None or the empty list, both are equivalent to False in Python if statements. So probably either one would do fine. Here's my successor function.

    def successors(x, y, X, Y):
        """Return a dict of {state:action} pairs describing what can be reached from
        the (x, y) state and how."""
        assert x <= X and y <= Y ## (x, y) is glass levels; X and Y are glass sizes
        return {((0, y+x) if y+x <= Y else (x-(Y-y), y+(Y-y))): 'X->Y',
                ((x+y, 0) if x+y <= X else (x+(X-x), y-(X-x))): 'X<-Y',
                (X, y): 'fill X',
                (x, Y): 'fill Y',
                (0, y): 'empty X',
                (x, 0): 'empty Y'
               }
It takes the current levels of the glasses and the maximum capacity of the glasses. What it's going to return is a dictionary of state-action pairs. The state is just an x-y pair of what the levels of the glasses are going to be, and the action is how you got there. We're just going to use strings to represent those actions, so it's just something that we can print out that is otherwise unimportant in the operation of the program. First I wanted to check that this is a legal state: that the fill level of x is no more than its capacity and the same for y. Then I said here are the six possibilities. The pouring is complicated. Let's do the filling first. The filling says: You can fill X up to its capacity--capital X. You can fill Y up to its capacity--capital Y. You can empty X. That'll become 0. You can empty Y. It will become 0. Then the pour--there are two cases. If the total amount of water fits within the capacity Y, then you can take all the water in the first glass, which is x, and add it into y, so you get y plus x. Same thing in the other direction. But if the total amount of water is more than the capacity of the destination that you're trying to pour it into, then you can only pour as much as will fill up the other glass. We can see that there is conservation of water here. The total amount is still x + y: one glass loses this difference and the other gains the same difference. I got the definition of my program pretty much just by following out the implications of this diagram.
We're going to keep track of an explored set, never try to return there, expand the frontier, pop off one element of the frontier, add in the new elements, and check when we get to the goal. Then that was all kind of generic for any exploration problem. Then for the specific water problem, the successor function and the way that was laid out was specific to what we're doing with the glasses.
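Putting the pieces together, the search can be assembled into one runnable sketch. The body of the loop and the `successors` function follow the listings in this section; the first lines of `pour_problem` (the signature, the early return, and the setup of `explored` and `frontier`) are reconstructed here from the narration, so treat them as an assumption rather than the lecture's exact code.

```python
Fail = []

def successors(x, y, X, Y):
    """Return a dict of {state: action} pairs reachable from levels (x, y)
    in glasses of capacity X and Y."""
    assert x <= X and y <= Y
    return {((0, y+x) if y+x <= Y else (x-(Y-y), y+(Y-y))): 'X->Y',
            ((x+y, 0) if x+y <= X else (x+(X-x), y-(X-x))): 'X<-Y',
            (X, y): 'fill X', (x, Y): 'fill Y',
            (0, y): 'empty X', (x, 0): 'empty Y'}

def pour_problem(X, Y, goal, start=(0, 0)):
    """Breadth-first search for a path [state, action, state, ...] that
    ends in a state where either glass holds `goal`."""
    if goal in start:
        return [start]               # done before taking any action
    explored = set()                 # states we have already generated
    frontier = [[start]]             # ordered list of paths
    while frontier:
        path = frontier.pop(0)       # oldest (and therefore shortest) path first
        (x, y) = path[-1]
        for (state, action) in successors(x, y, X, Y).items():
            if state not in explored:
                explored.add(state)
                path2 = path + [action, state]
                if goal in state:
                    return path2
                frontier.append(path2)
    return Fail

solution = pour_problem(4, 9, 6)
print(len(solution) // 2, "actions, ending in state", solution[-1])
```

Because paths leave the frontier in the order they were added, the first path that reaches the goal uses the fewest actions; when the goal is unreachable the frontier eventually empties and the function returns `Fail`.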
1.7 7. 06 Doctest
Now that was a lot of code again, so I'm really going to need some tests to make sure I got this right. Rather than write the types of tests that we had before with the assert statements, I'm going to introduce a new type of test. This comes from the standard Python module called "doctest." It stands for documentation test. The idea is that you can write comments--the sort of comments that go with your classes and with your functions--and then automatically have them run as tests. The tests look just like something that you would type into the Python interpreter. The way doctest knows that you've got a test is that you have the three-arrow prompt; an expression is the input, and the following lines are the output that comes back from that expression. It tests to see if what comes back when you run the test is what was expected. Here I've typed in what I've done at an interactive session, what the results should be, and then when I make a change to my program I can run it again and make sure I haven't messed anything up.

    import doctest

    class Test:
        """
        >>> successors(0, 0, 4, 9)
        {(0, 9): 'fill Y', (0, 0): 'empty Y', (4, 0): 'fill X'}

        >>> successors(3, 5, 4, 9)
        {(4, 5): 'fill X', (4, 4): 'X<-Y', (3, 0): 'empty Y', (3, 9): 'fill Y', (0, 5): 'empty X', (0, 8): 'X->Y'}

        >>> successors(3, 7, 4, 9)
        {(4, 7): 'fill X', (4, 6): 'X<-Y', (3, 0): 'empty Y', (0, 7): 'empty X', (3, 9): 'fill Y', (1, 9): 'X->Y'}

        >>> pour_problem(4, 9, 6)
        [(0, 0), 'fill Y', (0, 9), 'X<-Y', (4, 5), 'empty X', (0, 5), 'X<-Y', (4, 1), 'empty X', (0, 1), 'X<-Y', (1, 0), 'fill Y', (1, 9), 'X<-Y', (4, 6)]

        ## What problem, with X, Y, and goal < 10 has the longest solution?
        ## Answer: pour_problem(7, 9, 8) with 14 steps.

        >>> def num_actions(triplet): X, Y, goal = triplet; return len(pour_problem(X, Y, goal)) // 2

        >>> def hardness(triplet): X, Y, goal = triplet; return num_actions((X, Y, goal)) - max(X, Y)
        >>> max([(X, Y, goal) for X in range(1, 10) for Y in range(1, 10)
        ...                   for goal in range(1, max(X, Y))], key = num_actions)
        (7, 9, 8)

        >>> max([(X, Y, goal) for X in range(1, 10) for Y in range(1, 10)
        ...                   for goal in range(1, max(X, Y))], key = hardness)
        (7, 9, 8)

        >>> pour_problem(7, 9, 8)
        [(0, 0), 'fill Y', (0, 9), 'X<-Y', (7, 2), 'empty X', (0, 2), 'X<-Y', (2, 0), 'fill Y', (2, 9), 'X<-Y', (7, 4), 'empty X', (0, 4), 'X<-Y', (4, 0), 'fill Y', (4, 9), 'X<-Y', (7, 6), 'empty X', (0, 6), 'X<-Y', (6, 0), 'fill Y', (6, 9), 'X<-Y', (7, 8)]
        """

    print(doctest.testmod())
    # TestResults(failed=0, attempted=9)
For example, at the start here I just want to test out what are the successors of the start state with both glasses empty, when one glass has capacity 4 and the other has capacity 9. In general there are six actions, but here a lot of them end up being the same, because if you pour zero into zero either way or if you empty out either of them, it all comes out the same. We only end up with three states, and they happen to have these labels--(0, 9) filling Y, (0, 0)--we called that emptying Y, but of course emptying 0 gives you 0. It could have been the no-op, but that's just the way the successor function works out. Then (4, 0) is filling X. More interestingly, if you have 3 and 5--so this is testing when we aren't exceeding the capacity, and this test is when we do exceed the capacity. We can see they work out to the right numbers. Then we solve a problem and come up with a solution and so on. Doctest is a nice capability that allows you to write tests this way. You can sprinkle them throughout your program, and then you can run the tests. Just say:

    print(doctest.testmod())
which stands for test module. If you give it no arguments, it tests the current module. When I run this I get the comforting message that there's a test result saying that none of the tests failed, and there were 9 that were attempted. Let's go back and look at the solution. I'm asking, given glasses of capacities 4 and 9, to find the goal 6. This is the shortest solution possible--fill Y, pour from Y into X, empty X, do the same, empty X again, pour Y into X again, fill Y, and pour from Y into X, and then we end up with a 6 in Y. We can solve problems more generally. Here I've defined a function num_actions, which says, given an X and Y capacity and a goal, how long does it take to solve the goal--the total number of steps it's going to take. Then I asked here for all values of X and Y less than 10--for all capacities less than 10--and for all goals smaller than the capacity, what's the longest? What's the hardest? Which combinations of those take the most actions? The answer was that if you're given glasses of size 7 and 9 and asked to pour out 8, that's the hardest problem within that range.
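One practical caveat: doctest compares the printed text of each result, so an expected dict is sensitive to the order in which the interpreter happens to print its keys. A minimal sketch of a more order-tolerant style is shown below; it compares with `==` or sorts the keys. The helper `run_docstring_tests` is a hypothetical convenience written for this example using doctest's `DocTestParser` and `DocTestRunner` classes, so that a single function can be checked; the lecture itself simply calls `doctest.testmod()`.

```python
import doctest

def successors(x, y, X, Y):
    """Return a dict of {state: action} pairs reachable from (x, y).

    >>> successors(0, 0, 4, 9) == {(0, 9): 'fill Y', (0, 0): 'empty Y', (4, 0): 'fill X'}
    True
    >>> sorted(successors(3, 5, 4, 9))
    [(0, 5), (0, 8), (3, 0), (3, 9), (4, 4), (4, 5)]
    """
    assert x <= X and y <= Y
    return {((0, y+x) if y+x <= Y else (x-(Y-y), y+(Y-y))): 'X->Y',
            ((x+y, 0) if x+y <= X else (x+(X-x), y-(X-x))): 'X<-Y',
            (X, y): 'fill X', (x, Y): 'fill Y',
            (0, y): 'empty X', (x, 0): 'empty Y'}

def run_docstring_tests(func):
    """Run the examples in func's docstring and return doctest's TestResults.
    (Hypothetical helper, not part of the lecture code.)"""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(func.__doc__, {func.__name__: func},
                              func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)

results = run_docstring_tests(successors)
print(results.failed, "failed out of", results.attempted)
```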
right I would do that, but then I decided later on that maybe the action should be more than just the arrow. Maybe the action should also tell who went across. I have the option of doing either. If I want to just solve the problem the way it was specified, then I would return just the arrow to represent the action, and I would do the same thing over here. One subtlety of this worked out well in my favor--it's a little bit messy dealing with frozen sets, and I don't like that the name is so long, but I didn't have to consider separately the idea of one person going across and two persons going across. Because we were dealing with sets, the set of people {a, b}, when a is equal to b, is a set of just 1 person. I get the 1 person crossing for free. That's one nice thing about my representation. But notice that everything is in flux here. I'm trying to choose a good representation. I'm changing my mind as I go along. Should the actions be represented by a single arrow or should they be represented by an arrow along with the names of the people that are going? That's all up in flux. I should say that that type of flux is okay as long as it remains contained. If you have uncertainties that are going to cross barriers between lots of different functions, then probably you want to nail them down. If you think that they're contained, then it's okay to have some uncertainty and be able to explore the exact options later.
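The "one person crossing for free" observation can be seen directly in a few lines. This is a small sketch, assuming people are represented by their crossing times and the light by the string 'light'; the function name `crossing_groups` is made up for the illustration.

```python
def crossing_groups(here):
    """Return every group that could cross from `here`: all pairs (a, b)
    of people, where a == b collapses to a single-person group."""
    people = [p for p in here if p != 'light']
    return {frozenset([a, b]) for a in people for b in people}

groups = crossing_groups(frozenset([1, 2, 'light']))
for group in sorted(groups, key=sorted):
    print(sorted(group))
```

Looping over all (a, b) pairs, including a == b, yields the two solo crossings and the one joint crossing without any special case.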
    def bridge_problem(here):
        "Find the fastest (least elapsed time) path to the goal in the bridge problem."
        here = frozenset(here) | frozenset(['light'])
        explored = set() # set of states we have visited
        # State will be a (peoplelight_here, peoplelight_there, time_elapsed) tuple
        # E.g. ({1, 2, 5, 10, 'light'}, {}, 0)
        frontier = [ [(here, frozenset(), 0)] ] # ordered list of paths we have blazed
        while frontier:
            path = frontier.pop(0)
            # States here are 3-tuples, so unpack all three parts and use the
            # three-part successor function.
            here1, there1, t1 = state1 = path[-1]
            if not here1 or here1 == set(['light']):  ## Check for solution when we pull best path
                return path
            for (state, action) in bsuccessors(state1).items():
                if state not in explored:
                    here, there, t = state
                    explored.add(state)
                    path2 = path + [action, state]
                    # Don't check for solution when we extend a path
                    frontier.append(path2)
                    frontier.sort(key = elapsed_time)
        return Fail
Two changes are here and here. We pull the test up to this point, so that we check for a solution when we pull the best path off the frontier. We check for our goal only there, and we don't check for the goal when we're putting something on the frontier.
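For readers who want to run this, here is a self-contained sketch. The helpers `bsuccessors` and `elapsed_time` do not appear in this excerpt, so the versions below are assumptions written to match the state layout in the comments: a state is a `(here, there, t)` tuple, and an action is an `(a, b, arrow)` tuple.

```python
Fail = []

def bsuccessors(state):
    """Assumed helper: return {state: action} for every one- or two-person
    crossing. The light must travel with whoever crosses."""
    here, there, t = state
    if 'light' in here:
        return {(here - frozenset([a, b, 'light']),
                 there | frozenset([a, b, 'light']),
                 t + max(a, b)): (a, b, '->')
                for a in here if a != 'light'
                for b in here if b != 'light'}
    else:
        return {(here | frozenset([a, b, 'light']),
                 there - frozenset([a, b, 'light']),
                 t + max(a, b)): (a, b, '<-')
                for a in there if a != 'light'
                for b in there if b != 'light'}

def elapsed_time(path):
    """Assumed helper: the time stored in the last state of the path."""
    return path[-1][2]

def bridge_problem(here):
    "Find the fastest (least elapsed time) path to the goal in the bridge problem."
    here = frozenset(here) | frozenset(['light'])
    explored = set()
    frontier = [[(here, frozenset(), 0)]]
    while frontier:
        path = frontier.pop(0)                   # cheapest path so far
        here1, there1, t1 = state1 = path[-1]
        if not here1 or here1 == frozenset(['light']):
            return path                          # goal test only when popping
        for (state, action) in bsuccessors(state1).items():
            if state not in explored:
                explored.add(state)
                frontier.append(path + [action, state])
                frontier.sort(key=elapsed_time)
    return Fail

print(elapsed_time(bridge_problem([1, 2, 5, 10])))
```

Testing for the goal only when a path is popped matters because the frontier is sorted by elapsed time: a path that reaches the goal is only known to be the fastest once nothing cheaper remains ahead of it.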
that, and paths are going to look like that. Now, I want you to write the new successor function for the bridge problem. We'll call it bsuccessors2--the "2" just to keep it distinct from the first version. Again it returns a dict of state-action pairs. A state now is just a two-tuple of (here, there), and the here and there are still frozen sets. It's pretty much the same except we dropped out the time t. Go ahead and implement that for me.
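One possible way to approach this exercise is sketched below. It is not presented as the official solution, and the `(a, b, arrow)` shape of the action tuple is an assumption carried over from the earlier discussion of actions that name who crossed.

```python
def bsuccessors2(state):
    """Return a dict of {state: action} pairs. A state is a (here, there)
    tuple, where here and there are frozensets of people (indicated by
    their crossing times) and/or the light."""
    here, there = state
    if 'light' in here:
        # Move one or two people, plus the light, from here to there.
        return {(here - frozenset([a, b, 'light']),
                 there | frozenset([a, b, 'light'])): (a, b, '->')
                for a in here if a != 'light'
                for b in here if b != 'light'}
    else:
        # The light is on the far side, so someone has to bring it back.
        return {(here | frozenset([a, b, 'light']),
                 there - frozenset([a, b, 'light'])): (a, b, '<-')
                for a in there if a != 'light'
                for b in there if b != 'light'}

start = (frozenset([1, 'light']), frozenset())
print(bsuccessors2(start))
```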
        old = None
        for i,p in enumerate(frontier):
            if final_state(p) == final_state(path):
                old = i
                break
        if old is not None and path_cost(frontier[old]) < path_cost(path):
            return # Old path was better; do nothing
        elif old is not None:
            del frontier[old] # Old path was worse; delete it
        ## Now add the new path and re-sort
        frontier.append(path)
        frontier.sort(key=path_cost)
The tricky part is just keeping track of the costs and putting them in the right location. Just like before we're popping paths off the frontier. We're checking to see if we hit a goal. We're keeping track of states that we've already explored. But now we're doing something new. We're computing the cost of the path that we just popped off, and that's just pulling the cost out, because we've already computed it and stored it in the final action. Then for each of the successors, we figure out that the total cost is the cost of the path that we already computed so far plus the bridge cost of the individual action. Total cost so far plus cost for one more action, and then we just throw that into the path. The new path is equal to the old path plus the (action, total cost) tuple plus the state that we end up with. Add that to the frontier and we're done. I just define this simple one-line function here. The final_state of a path is the last element of the path. I use that there. Here is adding to the frontier. Now, it could just be throwing it on there the way we did before, but there's a tricky part here. The complication that I want to deal with here that we haven't dealt with before is that there may be two different paths that end up in the same state. If that's the case, we want to choose the best one. We don't want to get to the state from a path that's more expensive. We look and see--is there a path that gets to the state that is already on the frontier? If there is, then check to see which one has a better path cost and use that.
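The bookkeeping described above can be sketched end to end on a toy problem. The names `final_state`, `path_cost`, and `add_to_frontier` come from this section; the driver function `lowest_cost_search`, the `path_cost` body, and the four-node `GRAPH` are assumptions made so the example runs on its own.

```python
Fail = []

def final_state(path):
    return path[-1]

def path_cost(path):
    "Total cost is stored in the last (action, total_cost) element of the path."
    if len(path) < 3:
        return 0
    action, total_cost = path[-2]
    return total_cost

def add_to_frontier(frontier, path):
    "Add path to frontier, replacing a costlier path to the same state if there is one."
    old = None
    for i, p in enumerate(frontier):
        if final_state(p) == final_state(path):
            old = i
            break
    if old is not None and path_cost(frontier[old]) < path_cost(path):
        return                      # old path was better; do nothing
    elif old is not None:
        del frontier[old]           # old path was worse; delete it
    frontier.append(path)
    frontier.sort(key=path_cost)

def lowest_cost_search(start, successors, is_goal, action_cost):
    explored = set()
    frontier = [[start]]
    while frontier:
        path = frontier.pop(0)
        state1 = final_state(path)
        if is_goal(state1):
            return path
        explored.add(state1)
        pcost = path_cost(path)
        for (state, action) in successors(state1).items():
            if state not in explored:
                total_cost = pcost + action_cost(action)
                add_to_frontier(frontier, path + [(action, total_cost), state])
    return Fail

# Hypothetical weighted graph: the direct A->C edge costs more than A->B->C.
GRAPH = {'A': {'B': ('A->B', 1), 'C': ('A->C', 5)},
         'B': {'C': ('B->C', 1)},
         'C': {'D': ('C->D', 1)},
         'D': {}}

path = lowest_cost_search('A', lambda s: GRAPH[s], lambda s: s == 'D',
                          lambda action: action[1])
print(path[0::2], path_cost(path))
```

In this run the expensive path to C via the direct edge is placed on the frontier first, and `add_to_frontier` later replaces it with the cheaper path through B.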
particular problem wants to deal with. Shortest_path_search doesn't have to know about that. Now, why is that the case? Because shortest_path_search can interface with states through just these things--through the successors function, through the goal function, and through the start state. What do I mean by that? The start state is going to be some atomic state. We don't know anything more about that. Shortest_path_search doesn't know anything about that. When we go to use shortest_path_search for a particular problem, then we have to specify what a state looks like, but shortest_path_search itself doesn't have to know. All it has to know is that you can give the start state to the successor function--so successors will be a function which takes a state as input and returns a dictionary of state-action pairs. Now, given that initial state that we passed in, we can generate new states and new actions. So the actions also are atomic. Shortest_path_search doesn't have to know anything about the representation other than that this is where they come from--from the successor function. Now, what about the goal? Well, we could specify an exact state that we're looking for, but sometimes we're looking for multiple states. We could specify a set of states, but sometimes the set of states is really big. There are lots of states that satisfy the goal. Instead, let's have the goal be a function. When you pass it a state it returns a boolean, True or False: is that the goal? With that, now we're ready to specify shortest_path_search. Shortest_path_search is going to be a function. It's going to take some inputs, and it's going to return a path, or return failure in place of a path if it can't find a solution. Now the question is, out of this inventory, which of these things do we have to pass into shortest_path_search to allow us to solve a problem? Check all those that apply.
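To illustrate how little the search needs to know about states, here is a minimal sketch of a generic breadth-first search that touches states only through a start state, a `successors` function, and an `is_goal` predicate. The parameter names and the toy number puzzle used to exercise it are assumptions for this example.

```python
Fail = []

def shortest_path_search(start, successors, is_goal):
    """Find the shortest path from start to a state s where is_goal(s) is true.
    successors(state) must return a dict of {state: action} pairs."""
    if is_goal(start):
        return [start]
    explored = set()
    frontier = [[start]]
    while frontier:
        path = frontier.pop(0)
        s = path[-1]
        for (state, action) in successors(s).items():
            if state not in explored:
                explored.add(state)
                path2 = path + [action, state]
                if is_goal(state):
                    return path2
                frontier.append(path2)
    return Fail

# Hypothetical toy problem: starting from 1, reach 10 using "+1" and "*2".
def number_successors(n):
    return {n + 1: '+1', n * 2: '*2'}

path = shortest_path_search(1, number_successors, lambda n: n == 10)
print(path[0::2])   # the states visited along the path
```

Nothing in `shortest_path_search` inspects a state; swapping in glass levels, bridge crossings, or any other hashable representation only requires a different `successors` and `is_goal`.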