
CS212 Unit 4

José António Soares Augusto


1 CS212 Unit 4
Contents
1. 01 l Water Pouring Problem
2. 03 q Combinatorial Complexity
3. 03 s Combinatorial Complexity
4. 04 q Exploring the Space
5. 04 Exploring the Space
6. 05 Pouring Solution
7. 06 Doctest
8. 07 Bridge Problem
9. 08 q Representing State
10. 08 s Representing State
11. 09 p Bridge Successors
12. 09 s Bridge Successors
13. 10 q Bridge Solution
14. 10 s Bridge Solution
15. 11 q Debugging
16. 11 s Debugging
17. 12 q Did it work
18. 12 s Did it work
19. 13 q Improving the Solution
20. 13 s Improving the Solution
21. 14 p Modify Code
22. 14 s Modify Code
23. 15 p Adding Tests
24. 16 p Refactoring Paths
25. 16 s Refactoring Paths
26. 17 p Calculating Costs
27. 17 s Calculating Costs
28. 18 l Putting it Together
29. 19 l Generalizing
30. 20 q Missionaries and Cannibals
31. 20 s Missionaries and Cannibals
32. 21 q Generalized State
33. 21 s Generalized State
34. 22 p csuccessors
35. 22 s csuccessors
36. 23 l mc problem
37. 24 q Shortest Path Search
38. 24 s Shortest Path Search
39. 25 p sps function
40. 25 s sps function
41. 26 p Cleaning up MC Problem
42. 26 s Cleaning up MC Problem
43. 27 p Lowest Cost Search
44. 27 s Lowest Cost Search
45. 28 p Back to Bridge Problem
46. 28 s Back to Bridge Problem
47. 29 l Summary

1.1 1. 01 l Water Pouring Problem


I'm going to begin this unit with an old problem known as the "water-pouring problem." Here's what we're given: two glasses and a faucet in a sink, which can be the source of as much water as we want. Now, these glasses are of different sizes. I haven't drawn them that much different, but this one is 4 oz, and this one is 9 oz. For those of you in the rest of the world besides the U.S., an ounce is about 30 mL. Our goal is to measure out a specific amount of water. What we want to have is 6 oz of water measured out. Six ounces won't fit in this glass, so the idea is that at the end we want to have this glass filled with exactly 6 oz of water. There are no graduated markings. It's not like a graduated cylinder or measuring cup where we have the measurements on the glass, and it wouldn't be accurate enough to just eyeball it. What we've got to do is figure out how to do that by measuring out precise amounts into the glasses and pouring them off.

For example, if the goal had been 5 oz, then that would have been easy. We'd just fill the 9 oz glass all the way up to the top, and then pour the 9 oz into the 4 oz until the 4 oz is all the way full. What remains here, because there were 9 altogether, is 5 in this glass. Five ounces is easy. For six ounces it's not as obvious how to get there. The puzzle is to find a sequence of


06/05/12 11:49:22

pouring actions, and the pouring can be from one glass to another. It can go in the other direction. It can go from the faucet into each of the glasses. And it can go from the glasses down the drain. That's six different actions we can take, and we want to find a sequence of actions that arrives at this goal of 6 oz. Of course, we can generalize the problem and use any numbers rather than 9, 4, and 6.

As usual, let's make our inventory of concepts that we're going to be dealing with. We have the glass, and the glass has a capacity and a current level. This glass would have capacity 9, current level 5. We're also going to need a collection of glasses, probably a pair of glasses. I guess we can say that the pair of glasses and their current levels represent a complete state of the world: everything we need to know about where we are in the problem. Then we have a goal that we're trying to reach. We have the pouring actions: 1, 2, 3, 4, 5, 6. That breaks down into emptying, filling, and transferring. The transferring, I think, is a little bit tricky, because there are two ways it can end. When we were transferring from the 9 oz into the 4 oz (so we transfer from x to y), we can do that until y is full. That's what happened here. The 4 oz was full. Or we could do it until x is empty. If we were starting to pour back 4 oz from here into an empty glass, we could do it until it was empty.

Anything else in the inventory? Oh, well, we certainly need a notion of a solution. A solution is going to be a sequence of steps: pour from here to here, then from here to the drain, then fill up, then pour again, and so on. What this unit is really all about is techniques for finding these solutions, which are sequences of steps. Again, we're always talking about managing complexity in this class. The complexity we're trying to manage here is combinatorial complexity.
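The two ways a transfer can end (the destination fills up, or the source runs dry) can be sketched in a few lines. This is only an illustration: the function name `transfer` and its argument order are assumptions made here, not part of the lecture's code.

```python
def transfer(x, y, Y):
    """Pour from a glass holding x into a glass holding y whose capacity is Y.

    Pouring stops when the source is empty or the destination is full,
    whichever comes first. Returns the new (source, destination) levels.
    """
    amount = min(x, Y - y)   # all of x, or only the room that is left in y
    return (x - amount, y + amount)

# A full 9 oz glass poured into an empty 4 oz glass: stops when y is full.
print(transfer(9, 0, 4))   # (5, 4)

# A full 4 oz glass poured into an empty 9 oz glass: stops when x is empty.
print(transfer(4, 0, 9))   # (0, 4)
```

The first call reproduces the easy 5 oz case from the text: after the pour, 5 oz remain in the larger glass.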

1.2 2. 03 q Combinatorial Complexity


There's a complexity that comes from combinatorial problems. We've seen that before. In the cryptarithmetic problems, such as ODD + ODD = EVEN, we had up to 10! different permutations of digits to assign, and it was complex because we had to consider them all. In the zebra puzzle we had (5!)^5 combinations to consider. It was complex because it took a long time to consider them all. We came up with an optimization to consider only a few of them by going one at a time. For our pouring problem, we know there are 6 actions: 2 empties, 2 fills, and 2 pours. The glasses are of size 4 and 9. The goal is 6 oz. I guess my question for you is how many combinations do we need to consider? For cryptarithmetic it was 10!. For zebra it was (5!)^5. For pouring, is it 6^4, 6^(9-4), 6^6, 6^9, can't tell, or none of the above?

1.3 3. 03 s Combinatorial Complexity


The answer is that you can't tell. This is a different type of combinatorial problem than the previous ones. In the previous ones we had a fixed number of variables, and we knew how many combinations we had for each variable. In the zebra problem, there were 25 variables, and that's all there was. We could enumerate all the combinations. For the pouring problem we're not trying to fill in static variables but rather to put together a sequence of actions to go from one state to the next. We don't know how long that sequence is, and of course, at each point we have 6 different options of different ways to go, and from each of those 6 more. We know it's going to be roughly 6 to the something, 6^x, because we branch 6 ways at each point, but we don't know what that x is, because we don't know how long the sequence is. So that makes the problem slightly different. If we want to be formal, we call it a combinatorial optimization problem, but usually we just call it a "search" problem.
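To get a feel for the numbers, here is a small sketch that prints the two fixed counts mentioned above and shows how the number of action sequences grows with the (unknown) sequence length. The helper name `count_sequences` is an assumption for illustration, and the count ignores the fact that many sequences lead to the same state.

```python
import math

def count_sequences(length, branching=6):
    """Number of distinct action sequences of the given length."""
    return branching ** length

print(math.factorial(10))       # cryptarithmetic: 10! = 3628800
print(math.factorial(5) ** 5)   # zebra puzzle: (5!)^5 = 24883200000

# For pouring, the count depends on a length we do not know in advance.
for length in (1, 2, 4, 8):
    print(length, count_sequences(length))
```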

1.4 4. 04 q Exploring the Space


Now it's called search traditionally, but I think "exploration" is a better name for it. We start out at home, and in this case our home is where we have two empty glasses: zero and zero are the values for how full the glasses are. Then we start to explore. One way we could explore is to fill one of the glasses. Then we're at this state, say we're at 0 and 4, but we know that there are other actions with which we could explore in other directions. Now we could take one of the other states and explore from there in other directions. We have lots of choices going forward in this huge space that we're exploring.

Now, somewhere out in this space, and we don't know in which direction, is this goal state, which has 6 in one glass and any amount in the other. We're trying to reach that, and we're distinguishing this part of the state space as a goal. I drew this as one state, but really it's a collection of states, in that every state that has 6 on one side and anything on the other should be considered part of this collection of goals. We're trying to search forwards towards that.

One reason I like to call it an exploration problem is because we can think of going forward, exploring a new land, and part of that exploration is that we've got a frontier: all the states that are the farthest out that we've gone. If we want to make progress towards the goal, then we're probably going to have to step from one of the frontier nodes farther out. We've separated the set of all possible states into the goal states, the frontier states, and the previously explored states. Then you can see that the way to make progress is to take one of the frontier states and expand that. We have the advantage here of being a computer, which an individual explorer doesn't have. An individual explorer has to take one path, and if they decide they've gone in the wrong direction, they have to go all the way back. A computer can store lots of states in memory. Computer exploration is more like a collection of explorers all collectively expanding the frontier. Our next move can be to take one of these explorers, say the one in this state here, and say, now tell me what's next. You've got 6 actions from there. Where do they go to? Maybe some of them explore the world and generate new states that we haven't seen before. Maybe some of them go to a state that we already know is on the frontier. Maybe some of them regress backwards into previously explored territory. But we can keep on going, expanding out our frontier, and when the frontier eventually overlaps the goal, we've got a solution.

Now, in exploration problems like this, there are two problems that we have to worry about. One problem is that there may be no solution at all: the goals are not connected to the start state, so there's no path from here to there. Then what we want to do is do the exploration we need and report back that it's impossible. The other problem is that if there is some path that eventually makes it to the goal, we want to make sure that we find it in a reasonable amount of time. That means we want to be efficient about the way we explore the space. It also means that we don't want to get stuck in an infinite loop. Now, if there is a finite number of states and they are connected, then we should be able to find the path. But if we aren't clever, we may miss the solution even though it's possible to find it. For example, suppose we had a strategy that says first I'm going to explore in this direction, say this is pouring from cup x into cup y, and then I go in this direction, pouring from cup y back into cup x, and then I pour the water back again. I'm continually just taking water and pouring it between two different cups, back and forth. Those are all legal steps to take, but I'm ending up with an infinitely long path and I'm not making any progress.

We'd like to come up with a strategy for exploration, and the strategy corresponds to deciding which path to expand next. Whatever the strategy, there's always some path, let's say this one, that we say is the one we're going to explore from next. To avoid this type of infinite loop, here are some possibilities. One possibility would be: don't reverse an action. If you come from state A to state B, don't allow the action that goes immediately back to state A. Another strategy would be to always take the shortest path first. Out of all the paths that you've built so far, when we go to choose which one we're going to expand next, always choose one of the shortest ones. That way we might start to build up an infinitely long path, but at least we won't continue it; first we'll expand another path before we come back to that one. Then another strategy would be: don't re-explore. That is, if we're on the frontier, let's say we're here on the frontier, and we have a move that moves us back from the frontier into the previously explored zone, then we should not allow that path. My question is: check all the strategies that would eventually lead us to the goal. Don't worry about the efficiency of getting to the goal, but which ones will eventually get us there and won't get stuck in an infinite loop?

1.5 5. 04 Exploring the Space


The answer is that shortest first would work. If there is a path, it'll eventually find it. It will waste some time repeating itself, and may not be the most efficient, but we'll get there. Don't re-explore also works, and seems more efficient, because it cuts off some of these paths. Don't reverse isn't quite good enough: even if we eliminate the steps that go from A to B and then back to A, that doesn't stop us from going from A to B to C to D and then back to A, having that longer loop, and having that be infinite.
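The difference between "don't reverse" and "don't re-explore" can be seen on a tiny made-up graph. Everything here (the graph `GRAPH`, the function names, and the rule of always taking the first allowed successor) is an assumption chosen for illustration; it is a sketch of the idea, not the lecture's code.

```python
from collections import deque

# Hypothetical state graph: A -> B -> C -> A forms a loop; G is the goal.
GRAPH = {'A': ['B'], 'B': ['A', 'C'], 'C': ['A', 'G'], 'G': []}

def walk_no_reverse(graph, start, steps):
    """Follow the first successor that does not undo the previous move."""
    trace, previous, state = [start], None, start
    for _ in range(steps):
        choices = [s for s in graph[state] if s != previous]
        if not choices:
            break
        previous, state = state, choices[0]
        trace.append(state)
    return trace

def search_no_reexplore(graph, start, goal):
    """Shortest-first search that never returns to an explored state."""
    explored = {start}
    frontier = deque([[start]])
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for s in graph[path[-1]]:
            if s not in explored:
                explored.add(s)
                frontier.append(path + [s])
    return None   # frontier exhausted: the goal is unreachable

print(walk_no_reverse(GRAPH, 'A', 6))          # ['A', 'B', 'C', 'A', 'B', 'C', 'A']
print(search_no_reexplore(GRAPH, 'A', 'G'))    # ['A', 'B', 'C', 'G']
```

The walker never reverses a single step, yet it still circles A, B, C forever, while the search with an explored set terminates whether or not the goal is reachable.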

1.6 6. 05 Pouring Solution


Now let's get to solving the problem and coding it up. But before I do that, I want to introduce one more piece of jargon. If I'm at a particular state, and I decide that that's the endpoint of the path that I want to expand, I come up with the states you can get to from there by expanding the path, and the steps that it takes to get to those states. I call those the successors of this state. The successors are a collection of states that you can reach and the steps that it took to get there. Here is my solution. It's a little bit complicated. Let's go through it step-by-step.

    def pour_problem(X, Y, goal, start=(0, 0)):
        """X and Y are the capacity of glasses; (x, y) is current fill levels and
        represents a state. The goal is a level that can be in either glass. Start at
        start state and follow successors until we reach the goal. Keep track of
        frontier and previously explored; fail when no frontier."""
        if goal in start:
            return [start]
        explored = set()      # set of states we have visited
        frontier = [[start]]  # ordered list of paths we have blazed
        while frontier:
            path = frontier.pop(0)
            (x, y) = path[-1]  # last state in the first path of the frontier
            for (state, action) in successors(x, y, X, Y).items():
                if state not in explored:
                    explored.add(state)
                    path2 = path + [action, state]
                    if goal in state:
                        return path2
                    else:
                        frontier.append(path2)
        return Fail

    Fail = []

The inputs to this pour_problem function are X and Y, which are the capacities of the two glasses. Then the goal, which is going to be an integer, like 6, to say that's how much I'm trying to get to. That can be in either one of the glasses. Then the start state, which I'm defaulting to 0 and 0, saying both glasses have current level 0, but if you wanted you could generalize the problem and pass in something else as what we're starting with. I'm using lowercase x and lowercase y to indicate the current levels of the glasses.

Here I check and see: are we done before we even get going? Did you give me a start state and say the goal is to have a glass with zero in it? Then we're done before doing any actions. Go ahead and return that. What I'm going to return is called a "path." The path is an alternation of states and actions: a state, then an arrow, to which we give the name of the action, then the state that it goes to, and we alternate state, action, state, and so on. Here, if there's nothing to do, it's just a state with no actions.

We're going to keep track of the states that we've already explored and that's going to be a set. We're going to keep track of the frontier. Conceptually, that's a set too, but we're going to pull the items off of the frontier one at a time, so I've made it an ordered list rather than a set. That way I know which element of the frontier I want to explore first. So explored is a set of states, and frontier is an ordered list of paths. The only path we have so far is the trivial path that says we're starting at the start, and we haven't gone anywhere else yet. That's what we start our frontier with. While there is frontier left, that is, while there are still frontier paths that we haven't explored from yet, we pop off the first one. pop(0) says take the 0th element of the list, so we're going to pull elements off of the front of the list and push them onto the end of the list.
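Pulling from the front and pushing onto the end is first-in, first-out order. A plain list does this correctly, as in the code above; as an optional variation (not what the lecture uses), `collections.deque` gives the same order while removing from the front in constant time, whereas `list.pop(0)` has to shift every remaining element. The path contents below are placeholders.

```python
from collections import deque

frontier = deque([[(0, 0)]])                       # start with the trivial path
frontier.append([(0, 0), 'fill X', (4, 0)])        # new paths go on the end
frontier.append([(0, 0), 'fill Y', (0, 9)])

first = frontier.popleft()                         # oldest path comes off the front
print(first)            # [(0, 0)]
print(len(frontier))    # 2
```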
Then we say the current state is the last element of the path: the path goes from one state to the next, and the last element of the path is the current state. Let's take x and y from there. Then I've defined a successors function that gives me all the successor states and the actions we used to get there. There should be six of those. Then if that new state is not explored, it's something new. If it was explored, there is nothing left to do; we've already explored from there. If it hasn't been explored yet, then add it to the explored set and make up a new path, which consists of the old path plus the action we follow to get to the new state. If the goal number is somewhere in that state (say the goal is 6 and the state is the two levels of the glasses, say 6 and 3, then yes, 6 is in 6 and 3), then we're done. Return that path as the winner, the path that reached the goal. Otherwise, just add this path onto the frontier, and we'll pull something off the frontier later. If we go all the way through and we run out of frontier to explore from, then we can't reach the goal and we return Fail. You could have Fail be None. I decided to make it the empty list, because all the other things we're returning were lists. Either way, None or the empty list, both are equivalent to False in Python if statements. So probably either one would do fine. Here's my successors function.

    def successors(x, y, X, Y):
        """Return a dict of {state: action} pairs describing what can be reached from
        the (x, y) state and how."""
        assert x <= X and y <= Y  # (x, y) is glass levels; X and Y are glass sizes
        return {((0, y+x) if y+x <= Y else (x-(Y-y), y+(Y-y))): 'X->Y',
                ((x+y, 0) if x+y <= X else (x+(X-x), y-(X-x))): 'X<-Y',
                (X, y): 'fill X',
                (x, Y): 'fill Y',
                (0, y): 'empty X',
                (x, 0): 'empty Y'
                }

It takes the current levels of the glasses and the maximum capacities of the glasses. What it's going to return is a dictionary of state-action pairs. The state is just an (x, y) pair of what the levels of the glasses are going to be, and the action is how you got there. We're just going to use strings to represent those actions, so it's just something that we can print out that is otherwise unimportant in the operation of the program. First I wanted to check that this is a legal state: that the fill level x is no more than its capacity X, and the same for y. Then I said here are the six possibilities. The pouring is complicated, so let's do the filling and emptying first. You can fill X up to its capacity, capital X. You can fill Y up to its capacity, capital Y. You can empty X. That'll become 0. You can empty Y. It will become 0.

Then the pour: there are two cases. If the total amount of water is no more than capital Y, the capacity of the glass you're pouring into, then you can take all the water in the first glass, which is x, and add it into y, so you get y plus x. Same thing in the other direction. But if the total amount of water is more than the capacity of the destination that you're trying to pour it into, then you can only pour as much as will fill up the other glass. We can see that there is conservation of water here. The total amount is x + y, minus this difference, plus this difference. I got the definition of my program pretty much just by following out the implications of this diagram.
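The conservation-of-water claim can be checked mechanically. The sketch below copies the two pour expressions out of the successors function into small helpers (the helper names and the function `pours_are_valid` are assumptions made for this check) and tries every legal state for a given pair of capacities.

```python
def pour_x_into_y(x, y, X, Y):
    """The 'X->Y' expression from the successors function."""
    return (0, y + x) if y + x <= Y else (x - (Y - y), y + (Y - y))

def pour_y_into_x(x, y, X, Y):
    """The 'X<-Y' expression from the successors function."""
    return (x + y, 0) if x + y <= X else (x + (X - x), y - (X - x))

def pours_are_valid(X, Y):
    """True if every pour keeps x + y unchanged and stays within capacity."""
    for x in range(X + 1):
        for y in range(Y + 1):
            for pour in (pour_x_into_y, pour_y_into_x):
                x2, y2 = pour(x, y, X, Y)
                if x2 + y2 != x + y:                 # water created or lost
                    return False
                if not (0 <= x2 <= X and 0 <= y2 <= Y):
                    return False
    return True

print(pours_are_valid(4, 9))   # True
```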

We're going to keep track of an explored set, never try to return there, expand the frontier, pop off one element of the frontier, add in the new elements, and check when we get to the goal. All of that was fairly generic for any exploration problem. Then for the specific water problem, the successors function and the way that was laid out was specific to what we're doing with the glasses.
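One way to make that split visible is to pass the problem-specific parts in as arguments. The following is a sketch under assumptions: the names `explore`, `pour_successors`, and `is_goal` are invented here, the pour rule is written with `min` rather than the conditional expressions used above, and the start state is marked as explored up front.

```python
def pour_successors(state, X, Y):
    """Problem-specific part: {state: action} for glasses of capacity X and Y."""
    x, y = state
    to_y = min(x, Y - y)   # amount that moves when pouring X into Y
    to_x = min(y, X - x)   # amount that moves when pouring Y into X
    return {(x - to_y, y + to_y): 'X->Y',
            (x + to_x, y - to_x): 'X<-Y',
            (X, y): 'fill X', (x, Y): 'fill Y',
            (0, y): 'empty X', (x, 0): 'empty Y'}

def explore(start, successors, is_goal):
    """Generic part: shortest-first exploration; returns a path or None."""
    if is_goal(start):
        return [start]
    explored = {start}
    frontier = [[start]]
    while frontier:
        path = frontier.pop(0)
        for state, action in successors(path[-1]).items():
            if state not in explored:
                explored.add(state)
                path2 = path + [action, state]
                if is_goal(state):
                    return path2
                frontier.append(path2)
    return None

path = explore((0, 0),
               lambda s: pour_successors(s, 4, 9),
               lambda s: 6 in s)
print(path)
```

With capacities 2 and 4 and a goal of 3, the same `explore` call runs out of frontier and returns None, since every reachable level is even.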


1.7 7. 06 Doctest
Now that was a lot of code again, so I'm really going to need some tests to make sure I got this right. Rather than write the types of tests that we had before with the assert statements, I'm going to introduce a new type of test. This comes from the standard Python module called "doctest." It stands for documentation test. The idea is that you can write comments, the sort of documentation strings that go with your classes and with your functions, and then automatically have them run as tests. The tests look just like something that you would type into the Python interpreter. The way doctest knows that you've got a test is that you have the three-arrow prompt; an expression is input, and the following lines are the output that comes back from that expression. It tests to see if what comes back when you run the test is what was expected. Here I've typed in what I've done at an interactive session, what the results should be, and then when I make a change to my program I can run it again and make sure I haven't messed anything up.

    import doctest

    class Test:
        """
        >>> successors(0, 0, 4, 9) == {(0, 9): 'fill Y', (0, 0): 'empty Y', (4, 0): 'fill X'}
        True

        >>> successors(3, 5, 4, 9) == {(4, 5): 'fill X', (4, 4): 'X<-Y', (3, 0): 'empty Y',
        ...                            (3, 9): 'fill Y', (0, 5): 'empty X', (0, 8): 'X->Y'}
        True

        >>> successors(3, 7, 4, 9) == {(4, 7): 'fill X', (4, 6): 'X<-Y', (3, 0): 'empty Y',
        ...                            (0, 7): 'empty X', (3, 9): 'fill Y', (1, 9): 'X->Y'}
        True

        >>> pour_problem(4, 9, 6)
        [(0, 0), 'fill Y', (0, 9), 'X<-Y', (4, 5), 'empty X', (0, 5), 'X<-Y', (4, 1), 'empty X', (0, 1), 'X<-Y', (1, 0), 'fill Y', (1, 9), 'X<-Y', (4, 6)]

        ## What problem, with X, Y, and goal < 10 has the longest solution?
        ## Answer: pour_problem(7, 9, 8) with 14 steps.

        >>> def num_actions(triplet): X, Y, goal = triplet; return len(pour_problem(X, Y, goal)) // 2

        >>> def hardness(triplet): X, Y, goal = triplet; return num_actions((X, Y, goal)) - max(X, Y)
        >>> max([(X, Y, goal) for X in range(1, 10) for Y in range(1, 10)
        ...                   for goal in range(1, max(X, Y))], key=num_actions)
        (7, 9, 8)

        >>> max([(X, Y, goal) for X in range(1, 10) for Y in range(1, 10)
        ...                   for goal in range(1, max(X, Y))], key=hardness)
        (7, 9, 8)

        >>> pour_problem(7, 9, 8)
        [(0, 0), 'fill Y', (0, 9), 'X<-Y', (7, 2), 'empty X', (0, 2), 'X<-Y', (2, 0), 'fill Y', (2, 9), 'X<-Y', (7, 4), 'empty X', (0, 4), 'X<-Y', (4, 0), 'fill Y', (4, 9), 'X<-Y', (7, 6), 'empty X', (0, 6), 'X<-Y', (6, 0), 'fill Y', (6, 9), 'X<-Y', (7, 8)]
        """

    print(doctest.testmod())
    # TestResults(failed=0, attempted=9)


For example, at the start here I just want to test out what are the successors of the start state, with both glasses empty, when one glass has capacity 4 and the other has capacity 9. In general there are six actions, but here a lot of them end up being the same, because if you pour zero into zero either way, or if you empty out either of them, it all comes out the same. We only end up with three states, and they happen to have these labels: (0, 9) is filling Y; (0, 0) we called emptying Y, but of course emptying 0 gives you 0. It could have been labeled a no-op, but that's just the way the successors function works out. Then (4, 0) is filling X. More interestingly, if you have 3 and 5, this is testing a pour when we aren't exceeding the capacity, and this test, with 3 and 7, is when we do exceed the capacity. We can see they work out to the right numbers. Then we solve a problem and come up with a solution and so on. Doctest is a nice capability that allows you to write tests this way. You can sprinkle them throughout your program, and then you can run the tests. Just say:

    print(doctest.testmod())

which stands for test module. If you give it no arguments, it tests the current module. When I run this I get the comforting message that none of the tests failed, and there were 9 that were attempted. Let's go back and look at the solution. I'm asking, given glasses of capacities 4 and 9, to find the goal 6. This is the shortest solution possible: fill Y, pour from Y into X, empty X, do the same, empty X again, pour Y into X again, fill Y, and pour from Y into X, and then we end up with 6 in Y. We can solve problems more generally. Here I've defined a function num_actions, which says, given an X and Y capacity and a goal, how long does it take to solve the goal, that is, the total number of steps it's going to take. Then I asked here, for all values of X and Y less than 10 (all capacities less than 10) and for all goals smaller than the capacity, what's the longest? What's the hardest? Which combination of those takes the most actions? The answer was that if you're given glasses of size 7 and 9 and asked to pour out 8, that's the hardest problem within that range.
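As a stand-alone illustration of the doctest mechanics, here is a minimal sketch with a single small function. The function `count_actions` is hypothetical (it is not the lecture's num_actions, which takes a triplet of X, Y, and goal); it just counts the actions in a path that alternates state, action, state.

```python
import doctest

def count_actions(path):
    """Count the actions in a path of the form [state, action, state, ...].

    >>> count_actions([(0, 0)])
    0
    >>> count_actions([(0, 0), 'fill Y', (0, 9), 'X<-Y', (4, 5)])
    2
    """
    return len(path) // 2   # a path with k actions has 2k + 1 elements

if __name__ == "__main__":
    # With no arguments, testmod() checks every docstring in this module.
    print(doctest.testmod())
```

Each `>>>` line is run as if typed at the interpreter, and the text printed back is compared with the line that follows it in the docstring.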

1.8 8. 07 Bridge Problem


Now let's introduce another problem. We have a cavern here with a rickety bridge spanning it. On this side, which we'll call "here," we have a collection of 4 people who want to get to the other side, which we'll call "there." Part of the problem is that this is nighttime, and it's dark. Fortunately, our team has a flashlight, or a torch. The setup is such that the bridge is so rickety that at most 2 people at a time can cross, so either one or two people can cross. It's so dark that they need the flashlight with them. For everybody to get across, two people are going to have to go across, and one is going to have to come back with the flashlight. They'll shuttle back and forth like that. Now, each of the people has different physical abilities and fear levels, so they each take different times to cross the bridge. This person is speedy and takes 1 minute, the next 2 minutes, the next 5 minutes, and the last 10 minutes. The question is what combination of actions will get everybody across the bridge the fastest.


1.9 9. 08 q Representing State


Let's take our usual approach-- start making an inventory of concepts and figure out how to represent them. We want to represent a person, a collection of people, and probably it looks like we want to have two collections of people. One, the collection of people on the here side, and one, the collection of people on the there side. We also need to represent the light or the torch. From there it seems like that's about it, and the other concepts we need are the concepts we already had of states and paths. Now, how about the representation choices? For person, well, I hate to reduce people to numbers, but in this case that seems like the perfect thing to do. This person, regardless of all his wonderful individual qualities, we can just represent by the number 5. How about a collection of people? We could represent a collection as a tuple--1, 2, 5, 10-- as a list, as a set. There's also this data type in Python called a frozen set. What I want you to tell me is, of these four, which do you think would be okay for representations just in terms of being able to manipulate them and calculate the successors. Which of these are hashable? Hashable is important, because if we're going to use the same type of technique we used before for our search, we had our explored set, which was a set of states, and members of a set have to be hashable. That's a property that we might want to worry about. Now, I should say one more thing: in the description of the problem it was explicitly stated that each of the people has a different speed. That bothered me a little bit, because I could certainly imagine two people having the same speed. But let's just solve what we were asked to solve, where every person has a distinct speed.

1.10 10. 08 s Representing State


The answer is that all four of these representations would be fine. We can generate successors by appending or adding elements to sets, lists, tuples, or frozen sets. None of those is too hard to do. It's a little bit easier with sets than with the other ones. In terms of hashing, the immutable data types--frozen sets and tuples-- are hashable, and the mutable types--list and set--are not hashable.
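A quick way to check the hashability claim for yourself is to call the built-in hash on each candidate. The helper name is_hashable below is made up for this sketch.

```python
people = [1, 2, 5, 10]
candidates = [tuple(people), list(people), set(people), frozenset(people)]

def is_hashable(obj):
    "True if obj can be used as a set member or dict key."
    try:
        hash(obj)
        return True
    except TypeError:
        return False

hashable = {type(c).__name__: is_hashable(c) for c in candidates}
print(hashable)

# Only the hashable collections can go into an explored set.
explored = set()
explored.add(frozenset(people))
explored.add(tuple(people))
```

Note that a tuple is hashable only when everything inside it is hashable too, which is one reason a state made of frozen sets inside a tuple works as a member of the explored set.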

1.11 11. 09 p Bridge Successors


Now, out of those many choices, I made a choice to say I'm going to represent a state as a tuple of (here, there, t), where "here" represents everything that's on this side, "there" represents everything that's on that side, and "t" is the total elapsed time since the start. I'm going to represent here and there with frozen sets, because those are hashable. So this collection here would be the frozen set consisting of {1, 2, 5, 10}, and I'm going to just use the string "light" to represent the flashlight. There would be the empty frozen set. Now, consider this state here representing the start state. What are the successors of that state? Well, any one of the people could go across. They've got to bring the light with them. In the successor state, the light will definitely be there, and it will not be here. It can only be in one place. At least one of the people will be over there and possibly two of the people, so all combinations of sending either one person or two people to the other side, those will each be distinct successor states. Let's see--we've got 4 x 3 is 12, but order doesn't matter, so there's 6 of those. Then 4 more, so it looks like there should be 10 successor states. What I want you to do is write for me the successor function. We're calling it bsuccessors, because we already had a and we're on to b. Or b could stand for "bridge." Remember that the result of the successor function is a dictionary of state-action pairs. A state is this (here, there, t) tuple. Here and there have to be frozen sets. The frozen sets contain people--1, 2, 5, and 10-- and/or this light, indicated by the string "light." Show me the function that will generate all the successors. Here I've given you a hint: here's a way to break up the state into those three variables. Then put your code here. Oh, one more thing I forgot is what are the actions.
Well, let's say that an action will be represented by the character string arrow going to the right if we're moving from here to there and an arrow going to the left if we're moving from there to here.
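The count of 10 successors can be checked with a small sketch. It only enumerates who could cross from the start state; it is not the successor function itself, and the variable names are chosen for illustration.

```python
from itertools import combinations

people = [1, 2, 5, 10]

# The start state in the (here, there, t) representation described above.
start = (frozenset(people) | frozenset(['light']), frozenset(), 0)

# Every group of one or two people that could carry the light across.
crossings = [frozenset(group)
             for size in (1, 2)
             for group in combinations(people, size)]

singles = [c for c in crossings if len(c) == 1]
pairs = [c for c in crossings if len(c) == 2]
print(len(singles), len(pairs), len(crossings))
```

The 6 pairs are 4 x 3 / 2, because order doesn't matter, and the 4 single crossings bring the total to 10.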

1.12 12. 09 s Bridge Successors


Here's my solution. I've got to say that my solution came out a little bit more complicated than I expected it to. I think maybe I made a bad choice for the representation. I threw in the flashlight along with the set of people, because I figured you want one set to represent everything that's on one side. But I'm thinking now, after this came out the way that it did, that maybe I should have had the flashlight be a separate part of the state. In other words, have the state be a 4-tuple, not of things that are here or there but of people that are here or there, then the time, and then a fourth element being the flashlight, saying where the flashlight is. That could either be true or false, saying whether it is here, or it could be a character string, saying it's there or here, or it could be an integer--0 or 1. I think it might've been easier if I'd chosen one of those representations. But it didn't bother me enough to go back and make a change. If you want to, you could spend time refactoring and change that. I'm going to just push ahead. Here's what I did. I said if the light is here, then let's look at all the people in here. We'll look at all the pairs of people--A and B. To make sure that they're people, I have to say that they're not the light. For all pairs of people A and B, we can generate a successor state, which is the set of people that were here minus the two people and the light, because the light is going to move from here to there. The second part of the successor state is everything that was already over on the other side, on there, unioned with the things that are coming over, which are people A and B and the light. Then the time is the time plus the maximum time that it took for A and B to get over. Then I know it says in the specification here that the action is represented just by an arrow. If I want to get the problem right I would do that, but then I decided later on that maybe the action should be more than just the arrow. Maybe the action should also tell who went across. I have the option of doing either. If I want to just solve the problem the way it was specified then I would return just the arrow to represent the action, and I would do the same thing over here.

One subtlety of this that worked out well in my favor-- it's a little bit messy dealing with frozen sets. I don't like that the name is so long, but I didn't have to consider separately the idea of one person going across and two persons going across. Because we were dealing with sets, the set {a, b} when a is equal to b is a set of just 1 person. I get the 1 person crossing for free. That's one nice thing about my representation. But notice that everything is in flux here. I'm trying to choose a good representation. I'm changing my mind as I go along. Should the actions be represented by a single arrow or should they be represented by an arrow along with the names of the people that are going? That's all up in flux. I should say that that type of flux is okay as long as it remains contained. If you have uncertainties that are going to cross barriers between lots of different functions, then probably you want to nail them down. If you think that they're contained, then it's okay to have some uncertainty and be able to explore the exact options later.
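The description above can be turned into code roughly as follows. Treat this as a reconstruction from the spoken explanation rather than the exact code on the slide. It uses the richer (a, b, arrow) action mentioned in the lecture; returning just the arrow would match the original specification.

```python
def bsuccessors(state):
    """Return a dict of {state: action} pairs. A state is a (here, there, t)
    tuple, where here and there are frozensets of people (given by their
    crossing times) and/or the 'light', and t is the elapsed time."""
    here, there, t = state
    if 'light' in here:
        # Move a and b (possibly the same person) plus the light across.
        return dict(((here - frozenset([a, b, 'light']),
                      there | frozenset([a, b, 'light']),
                      t + max(a, b)),
                     (a, b, '->'))
                    for a in here if a != 'light'
                    for b in here if b != 'light')
    else:
        # Same thing in the other direction: iterate over the people *there*.
        return dict(((here | frozenset([a, b, 'light']),
                      there - frozenset([a, b, 'light']),
                      t + max(a, b)),
                     (a, b, '<-'))
                    for a in there if a != 'light'
                    for b in there if b != 'light')

start = (frozenset([1, 2, 5, 10, 'light']), frozenset(), 0)
print(len(bsuccessors(start)))
```

Because frozenset([a, b, 'light']) collapses to two elements when a equals b, the single-person crossings come out of the same double loop as the pairs.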

1.13 13. 10 q Bridge Solution


Now I'm going to show you the solution to the search problem rather than try to make you do it yourself, because there are still a few tricks here that are different from the previous search problem. I'm going to define bridge_problem, which takes a sequence of elements here. If you want, you can pass in a frozen set of {1, 2, 5, 10} or whatever, but if you didn't I'm going to go ahead and do that conversion for you. I'm going to make it into a frozen set, and I'm going to add in the light in case you forgot to specify that. You can just ask bridge_problem of the list 1, 2, 5, 10. I'll take care of it all for you. Like before, the explored set starts off being the empty set. The frontier starts off being the one initial state, which is the frozen set we just made up for everything that's on the here side, and the empty set for everything that's on the there side, and 0 for the elapsed time. The idea is to get everybody away from here onto the other side. If we were given a trivial problem where there was already nobody here, then we're done and we return that initial state. Otherwise, just like before, we start popping things off the frontier. Just like before we're looking at our successors, and the only difference is down here. Before, we put a path on the end, and we were expanding out our frontier and taking off the shortest path first from our frontier, because in the previous problem, in the water-pouring problem, the best solution was defined as the solution that was shortest, with the smallest number of steps. In this problem, the best solution is defined as the one with the smallest elapsed time, where the elapsed time of a path is the element at index 2. That's the t element here of the final state of the path. That would be the total elapsed time of a path. So we sort the frontier by the total elapsed time. Now it is a little bit wasteful here that we're going through this loop, we only added in one new element, and we sorted the whole thing.
Python's actually pretty good at that type of sort. There are other ways to make that more efficient, but just conceptually that's what we're doing. We always want to have the frontier sorted, so that we're taking the fastest time first. I typed that program in, and I ran it for the very first time: bridge_problem([1, 2, 5, 10]). I got an answer back. Remember, the answer is a path, which is an alternation of states and actions. We can pick out just the actions, like this, by asking for the path and then taking a slice of that path, starting at element number 1, going to the end, and giving us every other element. That'll be just the actions. Those are these three actions. That's my proposed solution that my program came up with. My question is: is that correct? Yes or no?
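The slicing and the sort key can be illustrated on a small hand-built path. The path below is made up for the example (two people, one crossing) and is not output from the lecture's program.

```python
# A path alternates states and actions: [state, action, state, ...].
# States are (here, there, t) tuples.
path = [
    (frozenset([1, 2, 'light']), frozenset(), 0),
    (1, 2, '->'),
    (frozenset(), frozenset([1, 2, 'light']), 2),
]

states = path[0::2]    # every other element, starting at index 0
actions = path[1::2]   # every other element, starting at index 1

def elapsed_time(path):
    "Total elapsed time of a path: the t of its final state."
    return path[-1][2]

# Keeping the frontier sorted by elapsed time puts the fastest path first.
frontier = [path, path[:1]]
frontier.sort(key=elapsed_time)
print(actions, elapsed_time(path))
```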

1.14 14. 10 s Bridge Solution


The answer is no, that's not correct at all. I've been cheating a little along the way in that I've been showing you solutions that I got the second or third time, once I'd debugged them and got them right. This time I wanted to show you a little bit of the debugging process. I got something wrong here. I don't always get them right the first time. This is so wrong. Look at what's happening. It said the first move is that the 5 and the 2 go across together. It seems like a perfectly reasonable move. They're going from here to there. The second move was that the 1, by himself or herself, comes back from there to here. But 1 isn't even over there. How could 1 come back? I must have messed up the successor function. Let's take a look.

1.15 15. 11 q Debugging


Here's the problem. I was careful about doing the here case. I made up this nice expression, but then I did a copy and paste, and I edited the expression, and I swapped around the here and the there in this part. When I created the new state, I did that correctly. But down here I'm iterating over the people that were here. I'm trying to have candidates move from there to here, and I'm iterating over people that are here. That doesn't make any sense at all. I've got to fix that. Now the question is: is it going to run correctly this time? I found a bug. I fixed it. Is the program correct now? Yes, no, or not enough information, you can't tell yet?

1.16 16. 11 s Debugging


I think the right answer is that you just can't tell. I'm hopeful that it's going to work, but I know I fixed one bug. I don't know whether there are other bugs lurking in there.


1.17 17. 12 q Did it work


Now I run it again. This is the path I get. These are the actions in the path. Let's see if it makes sense. Now 1 and 2, the two fastest people, go over first. That looks like a pretty good solution. It came up with a total time of 19. The question is: is the program correct now? Yes, it is? No, this example is wrong--there might be a faster solution than this and it didn't find it? Or, no, this example is okay, it is the fastest, but other examples are wrong? Or you still can't tell?

1.18 18. 12 s Did it work


The answer to that is that this example is actually wrong. It does get everybody across, and it gets them across in 19, but there's another solution that's faster than that. So let's look at our program and see what we did wrong and why we missed the fastest solution.

1.19 19. 13 q Improving the Solution


Unfortunately, we got the wrong answer. Yes, we got a path that leads to the goal, but we didn't get the fastest path. Let's see what went wrong. We had our start state, and then we started expanding that and moving out. That defined our frontier. Then we were very careful about sorting the elements on the frontier, and then we pulled off the very best, the one with the least cost. Then we expanded out from there. Let's say the cost of getting to the end of this path is 14, this one 15, this one 16. This is the lowest cost path, so we expand that first. Let's say one of the steps costs 5, so that gets us to this state with a cost of 19. Let's say that is in fact a goal state. Now we just stopped there. We said we took off the least cost path. We expanded it. We found a goal. We're done. When we were looking for the shortest path in terms of the least number of steps, that was the right approach, but when we're looking for the least cost path, that's not the right approach. Because even though we pulled off the cheapest path here--the one with the lowest cost-- here's another path that has a higher cost, but if we expand that there might be a step that only costs 2. We get to this state with cost 17 and that's also a goal. So we made a mistake. We stopped here when we got this result that was 19 when we really wanted this result that was 17. I think the problem was we were prematurely acting. We said just because this was the fastest path so far, we went ahead and took one step away from it and accepted that, when that might not be the best answer overall. How can we fix this? One possibility would be to exhaust the frontier. That is, we've got a frontier here. Even though we find a solution from the first element of the frontier, we keep going until we visit everybody on the frontier and give everybody a chance to find a better solution. Another possibility is to give everybody one more chance.
Once we've found the first solution, now we say, okay, everybody on the frontier gets one more step to see if they can find a solution. The third possibility would be to test later. That is, when we generate this solution, we don't check right here to see if it is a solution. Rather, we just go ahead and throw this onto the frontier and only check to see if it's a solution when we pull the next element off of the frontier. Rather than checking when we generate a new node and are about to add it, we do the check later, once we've pulled it off the frontier. Now tell me which, if any, of these will work to give us the fastest solution.

1.20 20. 13 s Improving the Solution


The answer is exhausting the frontier won't work, because the frontier might be infinite. In this particular problem, there's only a finite number of states, but in some problems there might be an infinite number. If we kept on generating new elements onto the frontier we may never get to the end. Doing one step won't do it either. If, once we found the solution from this 14, we then gave all the other guys one step, it would work in this case. But it might be that it took two steps. Maybe from the 15 there'd be one step that costs 1 and another step that costs 2. It might not be just one step, so that's not going to work. The test later part will work. The reason it works is because now we've guaranteed that everybody on the frontier is sorted, and we're pulling off the cheapest one first. If we put it back onto the frontier rather than recognizing immediately that it's a goal, then since we're pulling them off in order of increasing cost, we know that the first one we pull off the frontier that is a goal must be the cheapest path to the goal.
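The 14/15/16 example can be replayed on a tiny hand-made graph to see the difference between testing when a state is generated and testing when it is pulled off the frontier. The graph, the state names, and the function cheapest_cost are all invented for this sketch; only the step costs come from the example above.

```python
# Step costs: S->A is 14, S->B is 15, S->C is 16, A->G is 5, B->G is 2.
graph = {
    'S': {'A': 14, 'B': 15, 'C': 16},
    'A': {'G': 5},
    'B': {'G': 2},
    'C': {},
    'G': {},
}

def cheapest_cost(graph, start, goal, test_when_generated):
    "Lowest-cost-first search that returns the cost of the path it accepts."
    frontier = [(0, start)]          # (total cost, state), kept sorted
    explored = set()
    while frontier:
        cost, state = frontier.pop(0)
        if state == goal:            # the "test later" check
            return cost
        if state in explored:
            continue
        explored.add(state)
        for nxt, step in graph[state].items():
            if nxt in explored:
                continue
            if test_when_generated and nxt == goal:
                return cost + step   # premature: may not be the cheapest
            frontier.append((cost + step, nxt))
        frontier.sort()
    return None

early = cheapest_cost(graph, 'S', 'G', test_when_generated=True)
late = cheapest_cost(graph, 'S', 'G', test_when_generated=False)
print(early, late)
```

With the early test, the search expands the cost-14 path first and accepts the goal at 14 + 5. With the late test, the cost-19 goal waits on the frontier while the cost-15 path is expanded, and the cost-17 goal is pulled off first.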

1.21 21. 14 p Modify Code


What I want you to do is take this, which is the same version of the bridge problem solver that we saw before, and modify it so that it tests for the goal later, after pulling a state off the frontier, not when we're about to put it on the frontier.

1.22 22. 14 s Modify Code


Here's the solution.


    def bridge_problem(here):
        "Find the fastest (least elapsed time) path to the goal in the bridge problem."
        here = frozenset(here) | frozenset(['light'])
        explored = set()  # set of states we have visited
        # State will be a (peoplelight_here, peoplelight_there, time_elapsed) tuple
        # E.g. ({1, 2, 5, 10, 'light'}, {}, 0)
        frontier = [[(here, frozenset(), 0)]]  # ordered list of paths we have blazed
        while frontier:
            path = frontier.pop(0)
            here1, there1, t1 = state1 = path[-1]
            if not here1 or here1 == set(['light']):  ## Check for solution when we pull best path
                return path
            # States are 3-tuples here, so use the first successor function
            for (state, action) in bsuccessors(state1).items():
                if state not in explored:
                    here, there, t = state
                    explored.add(state)
                    path2 = path + [action, state]
                    # Don't check for solution when we extend a path
                    frontier.append(path2)
                    frontier.sort(key=elapsed_time)
        return Fail  # Fail: the failure value used by the earlier search code

    def elapsed_time(path):
        "Total elapsed time of a path: the t of its final state."
        return path[-1][2]

Two changes are here and here. We pull up the test to this point where we check for solution when we pulled the best path off, and we check for our goal only there, and we don't check for the goal when we're putting something on the frontier.

1.23 23. 15 p Adding Tests


It looks like this is a tricky problem. There are lots of cases that we have to take care of. It seems like a good idea to write some more tests. I've done that here. I've written a few tests. I really should write a lot more. What I want you to do is write at least 3 more tests and run them. I don't have a way of knowing for sure whether you've come up with good ones or not, but go ahead and add at least three more tests to this set of tests.

1.24 24. 16 p Refactoring Paths


Now, mostly we're looking for correct code. If you wrote some more tests, you may start to have some more confidence in the code that we have. We're also considering efficiency to some degree. It seems like there's a big problem with the efficiency of the program we have so far. Let me show you one of the issues. Now we represented states as a (here, there, t) triplet. The problem with this is there can be two states that have identical here and there values but differ in the t, and they're going to be considered different states. Why is that a problem? Consider this problem. We have two people--one who takes 1 unit to cross the bridge, and one who takes 1000. It seems pretty clear there is an easy solution. The two of them go across together. It takes 1000, but look how we're going to explore this space. We're going to start out in the initial state that took time 0, and then we're going to start adding things to the frontier. Out of all the ways we could cross, the one that adds the least is for the 1 to go across by himself. Now he's on the other side, with the 1 on the other side and the 1000 on the original side. That only took 1 time unit. Now what's the fastest thing we can do after that? We could take 1 more step and go back to the original state. Here we had 1 and what we'll call K, for the 1000, on the left-hand side. Here K was left behind and 1 went over to the right. Here we took one more time unit, and we had 1, K on this side. If we continue taking the fastest step we can, we'll get to another distinct state where K is on this side and 1 is on the other side. The flashlight is always going with the 1. We keep on going on like that. We'll go out 1000 different steps. Each of these will be a distinct state, because this will be the state with time t equals 0. Here time t equals 1, t equals 2, t equals 3. But really, although it looks like we're getting different states, in another way of looking at it, we're always getting the same state.
We're just going back and forth from here to there and back to here and back and back. We're going around in circles. In order to recognize that these are in fact the same states, we're going to have to take t out of our state, and we're going to have to deal with the t someplace else. We want our representation of a state to be just (here, there). We've got to figure out someplace else to put the t. I'm not sure what the right way to do it is, but why don't we do it this way? We have a path, which is state, action, state, and so on; it keeps on alternating between states and actions. Let's change that so that the path is a state followed by a tuple of the action and the total time it took after applying that action, then the next state, then the next action and the total time after applying that, and so on. That'll be our new representation. States are going to look like that, and paths are going to look like that. Now, I want you to write the new successor function for the bridge problem. We'll call it bsuccessors2--the "2" just to keep it distinct from the first version. Again it returns a dict of state-action pairs. A state now is just a two-tuple of (here, there), and the here and there are still frozen sets. It's pretty much the same except we dropped out the time t. Go ahead and implement that for me.

1.25 25. 16 s Refactoring Paths


Here it is--pretty straightforward. I just dropped out the time, and I'm just building up these two components.
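The code itself did not survive in this transcript, so here is a sketch of what such a function could look like, reconstructed from the description: the same structure as the first successor function, with the time dropped from the state. Treat the details, including the (a, b, arrow) action format, as assumptions.

```python
def bsuccessors2(state):
    """Return a dict of {state: action} pairs. A state is a (here, there)
    tuple, where here and there are frozensets of people (given by their
    crossing times) and/or the 'light'."""
    here, there = state
    if 'light' in here:
        return dict(((here - frozenset([a, b, 'light']),
                      there | frozenset([a, b, 'light'])),
                     (a, b, '->'))
                    for a in here if a != 'light'
                    for b in here if b != 'light')
    else:
        return dict(((here | frozenset([a, b, 'light']),
                      there - frozenset([a, b, 'light'])),
                     (a, b, '<-'))
                    for a in there if a != 'light'
                    for b in there if b != 'light')

# One person and the light on the here side: exactly one successor.
print(bsuccessors2((frozenset([1, 'light']), frozenset())))
```

With t gone, going back and forth with the fast person returns to a state that is equal to one already seen, so the explored set can recognize the loop.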

1.26 26. 17 p Calculating Costs


Now, we got rid of the times in the successor function, so we've got to put them back in someplace. I'm going to generalize a little bit, and instead of talking about times, I'm going to talk about costs for a path. I'm just thinking that we might want to do some other problems that also have paths in them and that aren't dealing with optimizing time but are dealing with optimizing some type of cost. What I want you to do for me is to define this function path_cost, which takes a path as input and returns the total cost of that path. That's already stored away. We don't have to compute anything new, because we decided that our convention for paths was that it was going to be stored there. That is, we said that a path is equal to a state followed by an action and a total cost, followed by another state, etc. Here I've just said, well, if we don't have any actions there or if it's the empty path, then do one thing. Otherwise do something else. Then I also want you to define the bridge cost--bcost is the abbreviation I'll use. That's the cost of an individual action. An action in this domain is something like 2, 5, arrow to the right. I want you to figure out what's the cost of that action.

1.27 27. 17 s Calculating Costs


Pretty straightforward. If we don't have at least 3 elements in the path, that means we don't have an action there. It's just an individual state. The cost of that should be 0. Otherwise, we look at the second element from the end. There's a final state and then there's a final action. That should be the final action and total cost--this tuple--we just return the total cost. For the bridge cost of an action, it's just the maximum of the two times.
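In code, the two functions described here might look like the following sketch. It assumes the path convention [state, (action, total_cost), state, ...] and actions of the form (a, b, arrow); the placeholder state names in the example are made up.

```python
def path_cost(path):
    """The total cost of a path, which is stored in a tuple with the
    final action. A path is [state, (action, total_cost), state, ...]."""
    if len(path) < 3:
        return 0          # just a start state (or nothing): no actions yet
    action, total_cost = path[-2]
    return total_cost

def bcost(action):
    "The cost (a number) of an action in the bridge problem."
    # An action is an (a, b, arrow) tuple; a and b are the crossing times
    # of the people involved, and the slower one sets the pace.
    a, b, arrow = action
    return max(a, b)

example_path = ['state0', ((2, 5, '->'), 5), 'state1']
print(path_cost(example_path), bcost((2, 5, '->')))
```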

1.28 28. 18 l Putting it Together


Now we've got our new successor function. We know how to deal with costs. Now it's time to put it all together. It's a little bit tricky, so I'm not going to ask you to do this as a quiz. If you want to you can pause the video now and do it on your own. You're certainly welcome to give it a try. I'm going to go ahead and show it to you. Okay, here it is.

    def bridge_problem2(here):
        here = frozenset(here) | frozenset(['light'])
        explored = set()  # set of states we have visited
        # state will be a (peoplelight_here, peoplelight_there) tuple
        # E.g. ({1, 2, 5, 10, 'light'}, {})
        frontier = [[(here, frozenset())]]  # ordered list of paths we have blazed
        while frontier:
            path = frontier.pop(0)
            here1, there1 = state1 = final_state(path)
            if not here1 or (len(here1) == 1 and 'light' in here1):
                return path
            explored.add(state1)
            pcost = path_cost(path)
            for (state, action) in bsuccessors2(state1).items():
                if state not in explored:
                    total_cost = pcost + bcost(action)
                    path2 = path + [(action, total_cost), state]
                    add_to_frontier(frontier, path2)
        return Fail  # Fail: the failure value used by the earlier search code

    def final_state(path): return path[-1]

    def add_to_frontier(frontier, path):
        "Add path to frontier, replacing costlier path if there is one."
        # (This could be done more efficiently.)
        # Find if there is an old path to the final state of this path.
        old = None
        for i, p in enumerate(frontier):
            if final_state(p) == final_state(path):
                old = i
                break
        if old is not None and path_cost(frontier[old]) < path_cost(path):
            return  # Old path was better; do nothing
        elif old is not None:
            del frontier[old]  # Old path was worse; delete it
        ## Now add the new path and re-sort
        frontier.append(path)
        frontier.sort(key=path_cost)

The tricky part is just keeping track of the costs and putting them in the right location. Just like before we're popping paths off the frontier. We're checking to see if we hit a goal. We're keeping track of states that we've already explored. But now we're doing something new. We're computing the cost of the path that we just popped off, and that's just pulling the cost out, because we've already computed it and stored it with the final action. Then for each of the successors, we figure out the total cost, which is the cost of the path that we already computed so far plus the bridge cost of the individual action. Total cost so far plus cost for one more action, and then we just throw that into the path. The new path is equal to the old path plus the (action, total cost) tuple plus the state that we end up with. Add that to the frontier and we're done. I just define this simple one-line function here. The final_state of a path is the last element of the path. I use that there. Here is adding to the frontier. Now, it could just be throwing it on there the way we did before, but there's a tricky part here. The complication that I want to deal with here that we haven't dealt with before is that there may be two different paths that end up in the same state. If that's the case, we want to choose the best one. We don't want to get to the state from a path that's more expensive. We look to see--is there a path that gets to the state that is already on the frontier? If there is, then check to see which one has a better path cost and use that.

1.29 29. 19 l Generalizing


The moral of the story is this is tricky. There are a lot of cases to deal with in getting this kind of search just right, and we made a couple of mistakes along the way. I sort of duplicated the history of the field. There are a couple of tools we can use to avoid mistakes. One tool is to write lots of tests, and I just didn't do enough testing. I wanted to go fast. I wanted to be able to show you some of the interesting ideas. I put in a few tests, but I really need more to have confidence that I've got this right. The second thing is to use, or better yet, reuse existing tools. Every time I do a search, I don't want to be rewriting this search routine from scratch, because it is tricky and I will make mistakes. Rather I want to write it once or have somebody else write it once and then reuse it. In order to do that, we're going to have to figure out how to generalize. I've written a function that's good only for solving the bridge problem through search. I want to write a search function that can solve a wide variety of problems. Then I want to reuse that so that I'm not repeating mistakes, and I'm not introducing new errors.
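One way such a reusable routine might look is sketched below: the problem-specific parts (successor function, goal test, action cost) become parameters. The name lowest_cost_search, its signature, the goal test all_across, and returning None on failure are assumptions for this sketch, not something established at this point in the lecture. The bridge helpers are repeated so the block runs on its own.

```python
def final_state(path): return path[-1]

def path_cost(path):
    "Total cost stored with the final action, or 0 for a bare start state."
    return 0 if len(path) < 3 else path[-2][1]

def add_to_frontier(frontier, path):
    "Add path to frontier, replacing a costlier path to the same state."
    for i, p in enumerate(frontier):
        if final_state(p) == final_state(path):
            if path_cost(p) <= path_cost(path):
                return                 # old path is at least as good
            del frontier[i]            # old path was worse
            break
    frontier.append(path)
    frontier.sort(key=path_cost)

def lowest_cost_search(start, successors, is_goal, action_cost):
    """Return the lowest-cost path [state, (action, total), state, ...]
    from start to a state satisfying is_goal, or None if there is none."""
    explored = set()
    frontier = [[start]]
    while frontier:
        path = frontier.pop(0)
        state1 = final_state(path)
        if is_goal(state1):            # test only after popping
            return path
        explored.add(state1)
        pcost = path_cost(path)
        for state, action in successors(state1).items():
            if state not in explored:
                total = pcost + action_cost(action)
                add_to_frontier(frontier, path + [(action, total), state])
    return None

# --- Bridge-problem pieces plugged into the generic search ---
def bsuccessors2(state):
    here, there = state
    src, arrow = (here, '->') if 'light' in here else (there, '<-')
    result = {}
    for a in src:
        for b in src:
            if a != 'light' and b != 'light':
                group = frozenset([a, b, 'light'])
                new = ((here - group, there | group) if arrow == '->'
                       else (here | group, there - group))
                result[new] = (a, b, arrow)
    return result

def bcost(action):
    a, b, arrow = action
    return max(a, b)

def all_across(state):
    here, there = state
    return not here or here == frozenset(['light'])

start = (frozenset([1, 2, 5, 10, 'light']), frozenset())
solution = lowest_cost_search(start, bsuccessors2, all_across, bcost)
print(path_cost(solution), [action for action, total in solution[1::2]])
```

The search routine never mentions bridges or flashlights, so a different problem only needs its own successor function, goal test, and cost function.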

1.30 30. 20 q Missionaries and Cannibals


Let's do an example to figure out how to do generalization. What do we generalize over? Well, we generalize over problems. So we're going to need another problem. Rather than have a problem dealing with costs, which we saw were complicated, let's just do a problem where we're finding the shortest path. That is, the least number of steps to a solution. I'm going to choose a classic problem called the "missionaries and cannibals" problem. It works like this: there's a river we have to cross, similar to the bridge but this time it's a river. We've got a boat, and on this side of the river, there are 6 people. No flashlight, but a boat and 6 people. Three of these people are missionaries, and three are cannibals. The goal is to get everybody over to the other side. What makes it hard is that there are two rules. One, at most 2 in the boat. One person can go in the boat and cross from one side to the other, but it takes either 1 or 2 people to get the boat from one side and to get it back. The other rule is that we don't want the cannibals eating the missionaries. If we leave more cannibals than missionaries on either side of the river-- either on this side or over on this side-- then the cannibals are going to gang up and eat the missionaries, and we won't be able to accomplish getting everybody across. We have to shuttle them back and forth in such a way that this never occurs. Now, let's try to come up with a good representation for state. One possibility would be to have a set of missionaries, a set of cannibals, and a boat--let's call that a Boolean, yes or no, saying what's on the starting side and leaving out what's on the other side, because we can figure that out. Given that we know we have three missionaries, if there's a set of 2 on one side then on the other side there must be 1. Another possibility is that we have 3 integers: the number of missionaries, the number of cannibals, and the number of boats that are on the starting side. These are all integers.
Then the third possibility is that we have 6 numbers: the number of missionaries, cannibals, and boats on the first side, and the number of each of those on the other side. It may be subjective which of these is best, but I want you to tell me which of these would be sufficient for representing the state.


1.31 31. 20 s Missionaries and Cannibals


The answer is that all of them would work. All of them have everything you need to know to solve this specific problem of three missionaries, three cannibals and the boat.

1.32 32. 21 q Generalized State


Now the next question is what representation for states should we use if we want to generalize this problem, so that we're given an initial state where there can be any number of missionaries, cannibals, and boats on one side of the river and any number on the other. Which of these representations is sufficient under those conditions?

1.33 33. 21 s Generalized State


In this case, since we don't know that there are only three missionaries, we need to have both sets of numbers. We can't just say there are two missionaries on the left; therefore, there's one on the right. We don't know how many are going to be on the right. So this six-element tuple would do the job where these two wouldn't.

1.34 34. 22 p csuccessors


Now I want you to define the successor function for this problem. We'll give you a hint that a state is of that form. Return all the successors. The successors should be a dictionary as before. We want to include successor states that result in cannibals being able to eat, but such a state should have no successors itself. In other words, we're free to generate a successor state that has, say, two cannibals and one missionary in one location, but if we're given such a state then we should return the empty dictionary of successors.

1.35 35. 22 s csuccessors


Here's my solution. The key to my solution is a list of deltas, of differences in the states that correspond to these moves. What do I mean by that? One thing we can do is send two missionaries from a side with the boat to the other side. That would be a difference of 2 in the missionaries. We would add 2 to one side and subtract 2 from the other side, not change the number of cannibals at all, and change the number of boats by 1. Or we could send 2 cannibals, or we could send one of each, or we could send only 1 missionary or only 1 cannibal. There are 5 possible moves, basically, depending on where the boat is. That's what csuccessors says. First we check for states with no successors. If there are more cannibals than missionaries but there are some missionaries, then they're going to get eaten, and so we return the empty dictionary as a result. Otherwise, we're going to collect up the items for our dictionary, and we're going to do that by going through these deltas and subtracting the deltas from the side where the boat is and adding them in to the other side. We have two directions we can go: from left to right, start to the other side, or from the other side back to the original side. I made use here of vector addition and subtraction. I take the current state, which is 6 numbers, and I add or subtract these deltas. That's what these definitions say. Now, it would be nice if this type of vector arithmetic was built into Python, and there are versions called "numeric Python" where you can do that, but here I had to write these functions myself.
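The description above can be sketched roughly as follows. This is a reconstruction from the spoken explanation, not a copy of the code shown in the video: the action labels such as `'MM->'` and the name `DELTAS` are assumptions, and the final filter that drops moves needing more people than are present is an added safeguard.

```python
# Each delta is subtracted from the state when the boat travels from the
# start side to the other side, and added when it travels back.
# State layout: (M1, C1, B1, M2, C2, B2).
DELTAS = {
    (2, 0, 1, -2, 0, -1): 'MM',
    (0, 2, 1, 0, -2, -1): 'CC',
    (1, 1, 1, -1, -1, -1): 'MC',
    (1, 0, 1, -1, 0, -1): 'M',
    (0, 1, 1, 0, -1, -1): 'C',
}

def add(X, Y):
    "Add two vectors, X and Y."
    return tuple(x + y for x, y in zip(X, Y))

def sub(X, Y):
    "Subtract vector Y from vector X."
    return tuple(x - y for x, y in zip(X, Y))

def csuccessors(state):
    """Return a dict of {state: action} pairs. A state in which the
    cannibals can eat is a legal successor, but has no successors itself."""
    M1, C1, B1, M2, C2, B2 = state
    if C1 > M1 > 0 or C2 > M2 > 0:
        return {}
    items = []
    if B1 > 0:
        items += [(sub(state, d), a + '->') for d, a in DELTAS.items()]
    if B2 > 0:
        items += [(add(state, d), '<-' + a) for d, a in DELTAS.items()]
    # Assumption: discard moves that would leave a negative count.
    return {s: a for s, a in items if min(s) >= 0}
```

For example, `csuccessors((2, 2, 1, 0, 0, 0))` yields five successors, one per delta, all with the boat ending up on the other side.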

1.36 36. 23 l mc problem


Now let's write a function to solve the missionaries and cannibals problem. It takes a start state. Here's the normal problem: 3 missionaries, 3 cannibals, and 1 boat on the start side, and nothing on the other side. It also takes a goal state. If the goal state is not specified, it's just the opposite of that--3, 3, 1 on the other side and nothing on the original side. The state is this 6-tuple, and we're trying to find a path from the initial state to the goal state. In fact, we're trying to find the path with the least number of steps. I'm not going to ask you to do this as a quiz. If you're enthusiastic, you can stop the video now and go ahead and solve it on your own, but now I'm going to go ahead and show it to you. Here's a solution that looks pretty much like the pouring water problem. We check to see if the goal is None, and if so we fix up a nice goal. We check to see if we've accidentally already reached the goal at the start. Then we just search for the shortest path.

1.37 37. 24 q Shortest Path Search


Now let's generalize. Let's take the specific solver--we had a specific one for the pouring problem and one for the missionaries and cannibals. Let's generalize them. I'm going to call the generalization "shortest_path_search." That's a search for the shortest path that reaches a goal. Let's take our inventory. The concepts we have to deal with--we've got paths, states, actions, successors. We have a start state. We have a goal. Now let's figure out how we're going to represent each of these concepts. Paths we already had. I don't see any reason to change. We have [state, action, state...]. Notice we're just doing shortest_path_search. We're not doing best_cost_search. We don't need to put in the total cost in here. We can just have the action by itself.

We have states, and here the states can be atomic. We don't have to know anything about the states. In other words, a state can be anything that a particular problem wants to deal with. Shortest_path_search doesn't have to know about that. Now, why is that the case? Because shortest_path_search can interface with states through the start state and through two functions--the successors function and the goal function. What do I mean by that? The start state is going to be some atomic state. We don't know anything more about that. Shortest_path_search doesn't know anything about that. When we go to use shortest_path_search for a particular problem, then we have to specify what a state looks like, but shortest_path_search itself doesn't have to know. All it has to know is that you can give the start state to the successor function. So successors will be a function which takes a state as input and returns a dictionary of state-action pairs. Now, given that initial state that we passed in, we can generate new states and new actions. So the actions also are atomic. Shortest_path_search doesn't have to know anything about the representation other than that this is where they come from--from the successor function.

Now, what about the goal? Well, we could specify an exact state that we're looking for, but sometimes we're looking for multiple states. We could specify a set of states, but sometimes the set of states is really big. There are lots of states that satisfy the goal. Instead, let's have the goal be a function. When you pass it a state, it returns a boolean, True or False, saying whether that state is a goal. With that, now we're ready to specify shortest_path_search. Shortest_path_search is going to be a function. It's going to take some inputs, and it's going to return a path, and return failure as a path if it can't find a solution. Now the question is, out of this inventory, which of these things do we have to pass into shortest_path_search to allow us to solve a problem? Check all those that apply.

1.38 38. 24 s Shortest Path Search


The answer is what we have to pass in is the start state-- you've got to know where you're starting from, a successor function-- you have to know where you can get to from the start state, and a goal function--you have to know when you're done applying successors. That's it. We don't need to pass in any other actions or states or paths, because those can all be generated from these three.

1.39 39. 25 p sps function


Let's see if you can write that function. I've left you with the missionaries and cannibals problem as sort of a template, but I want you to generalize that to write shortest_path_search, which takes a start state, a successor function, and an is_goal function and returns the shortest path.

1.40 40. 25 s sps function


It's pretty easy. We just took the template that we had for missionaries and cannibals and put in the general functions--is_goal and successors--rather than the specific functions for the missionaries and cannibals.
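A minimal sketch of such a generalized search is shown below. It follows the interface described in the lecture (a start state, a successors function returning a dict of state-action pairs, and an is_goal function), but the body is a reconstruction, and using an empty list named `Fail` as the failure value is an assumption. The toy problem at the end is hypothetical and only there to exercise the function.

```python
Fail = []  # Assumption: an empty path signals that no solution was found.

def shortest_path_search(start, successors, is_goal):
    """Find the shortest path from start to a state where is_goal is true.
    A path has the form [state, action, state, ...]."""
    if is_goal(start):
        return [start]
    explored = {start}
    frontier = [[start]]  # paths in order of length, shortest first
    while frontier:
        path = frontier.pop(0)
        s = path[-1]
        for state, action in successors(s).items():
            if state not in explored:
                explored.add(state)
                path2 = path + [action, state]
                if is_goal(state):
                    return path2
                frontier.append(path2)
    return Fail

# Hypothetical toy problem: get from 1 to 10 by adding 1 or doubling.
def toy_successors(n):
    return {n + 1: '+1', n * 2: '*2'} if n < 10 else {}

path = shortest_path_search(1, toy_successors, lambda n: n == 10)
```

Note that the search never inspects a state or an action itself. It only stores them, passes states to `successors` and `is_goal`, and uses states as set members, so states must be hashable.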

1.41 41. 26 p Cleaning up MC Problem


Now let's complete the generalization. I'm going to define the missionaries and cannibals problem again, and we'll give it a 2 just so we can tell the two versions apart. It takes the same arguments as before. You may need some initialization code to get going. Then I want the body of the function, the main part, to just be a call to shortest_path_search with the appropriate arguments inserted. If you need to, you can define other functions outside of here.

1.42 42. 26 s Cleaning up MC Problem


Here's my solution. I had to write some code to fix up the goal if it wasn't specified. Then it's just a single call. We call shortest_path_search with the start state we were given, with the csuccessors function that we've already defined, and then with a goal test. The goal test is that everybody is gone from the start side of the river. That we define this way.
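Put together, the cleaned-up version might look something like the sketch below. The name `mc_problem2` follows the "give it a 2" remark; everything else is a reconstruction. In particular, `csuccessors` here is a compact stand-in for the successor function described earlier, the action labels are assumptions, and the empty list as a failure value is an assumption too.

```python
def csuccessors(state):
    """Compact stand-in: {state: action} for a 6-tuple state
    (M1, C1, B1, M2, C2, B2)."""
    M1, C1, B1, M2, C2, B2 = state
    if C1 > M1 > 0 or C2 > M2 > 0:
        return {}  # the cannibals eat, so this state is a dead end
    result = {}
    for m, c, name in [(2, 0, 'MM'), (0, 2, 'CC'), (1, 1, 'MC'),
                       (1, 0, 'M'), (0, 1, 'C')]:
        if B1 > 0 and M1 >= m and C1 >= c:
            result[(M1 - m, C1 - c, B1 - 1, M2 + m, C2 + c, B2 + 1)] = name + '->'
        if B2 > 0 and M2 >= m and C2 >= c:
            result[(M1 + m, C1 + c, B1 + 1, M2 - m, C2 - c, B2 - 1)] = '<-' + name
    return result

def shortest_path_search(start, successors, is_goal):
    """Breadth-first search; returns [] (assumed failure value) if no path."""
    if is_goal(start):
        return [start]
    explored = {start}
    frontier = [[start]]
    while frontier:
        path = frontier.pop(0)
        for state, action in successors(path[-1]).items():
            if state not in explored:
                explored.add(state)
                path2 = path + [action, state]
                if is_goal(state):
                    return path2
                frontier.append(path2)
    return []

def mc_problem2(start=(3, 3, 1, 0, 0, 0), goal=None):
    """Fix up the goal test, then make a single call to the general search."""
    if goal is None:
        def goal_fn(state):
            return state[:3] == (0, 0, 0)  # nobody left on the start side
    else:
        def goal_fn(state):
            return state == goal
    return shortest_path_search(start, csuccessors, goal_fn)
```

Calling `mc_problem2()` with the default start returns a path of alternating states and actions that ends with everyone on the other side.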

1.43 43. 27 p Lowest Cost Search


Once again, generalize. This time I want to go back to the bridge problem and generalize that. What we're going to come up with is lowest_cost_search, and that'll take some arguments and again return a path, but let's figure out what we need. Yes, we're going to need the start state just like before. We're going to need a successor function, and we're going to need a goal function. In addition, we're going to need one more thing. We're going to need to know the cost of an action. That's going to be necessary. It's going to have to be a parameter to the function. We'll have the start, the successors, the goal, and the action cost, and return from that a path. There's a notion of action_cost, and as part of our inventory of concepts, there's also the notion of path cost, but that won't have to be passed in as a parameter. Let's see if you can define for me lowest_cost_search, which takes these four parameters and should perform the same type of search as we saw previously with the bridge problem.


1.44 44. 27 s Lowest Cost Search


Here is my solution, and I got it by copying the code from the bridge problem and just generalizing it: replacing bsuccessors with the successors parameter, adding action_cost, and so on.
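For illustration, a lowest-cost search with the four parameters named in the lecture could be sketched as below. This is not the code from the video: it swaps in Python's `heapq` priority queue for the frontier, the empty-list failure value is an assumption, and the small weighted graph used to try it out is hypothetical. Paths have the form [state, (action, total_cost), state, ...].

```python
import heapq
from itertools import count

Fail = []  # Assumption: an empty path signals that no solution was found.

def lowest_cost_search(start, successors, is_goal, action_cost):
    """Return the lowest-cost path from start to a goal state."""
    tie = count()  # tie-breaker so the heap never has to compare paths
    frontier = [(0, next(tie), [start])]
    explored = set()
    while frontier:
        cost, _, path = heapq.heappop(frontier)
        state1 = path[-1]
        if is_goal(state1):
            return path  # goal is tested when a path is removed, not added
        if state1 in explored:
            continue
        explored.add(state1)
        for state, action in successors(state1).items():
            if state not in explored:
                total = cost + action_cost(action)
                path2 = path + [(action, total), state]
                heapq.heappush(frontier, (total, next(tie), path2))
    return Fail

# Hypothetical weighted graph: the direct edge A->C costs more than A->B->C.
GRAPH = {'A': {'B': 1, 'C': 5}, 'B': {'C': 1}, 'C': {}}

def graph_successors(s):
    # An action here is a (source, destination, weight) triple.
    return {t: (s, t, w) for t, w in GRAPH[s].items()}

path = lowest_cost_search('A', graph_successors,
                          lambda s: s == 'C', lambda action: action[2])
```

The goal test is applied only when a path comes off the frontier. Testing it earlier, when a path is first generated, could return the direct but more expensive route.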

1.45 45. 28 p Back to Bridge Problem


Now let's go ahead and redefine the bridge problem in terms of lowest_cost_search, thereby generalizing it. Put in the initialization code you might need here, a single call to lowest_cost_search, and any other functions you need to define.

1.46 46. 28 s Back to Bridge Problem


Here's my solution. I have to define the start state given a set of people that are on the here side. I have to define the here side and just make sure that we throw in the flashlight there. Then on the other side there's nobody. Lowest_cost_search--starting from the start state, we've already defined the successor function. I'm defining a new function to test for a goal. We already defined the cost function. The new function to test for the goal is right here. It says if not here--in other words, if there's nothing here, if there's nobody here at all, it's the empty set, or if here is only the set of the flashlight. That normally wouldn't happen, but I guess it could happen if the initial problem was there's no people and just a flashlight. Then you've got a solution with doing nothing at all. I just wanted to make sure I covered that trivial case.
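The start state and goal test described above could be sketched like this. The representation of a state as a `(here, there)` pair of frozensets, the `'light'` marker, and the names `make_start` and `all_over` are assumptions made for this example; the search call itself is omitted because it depends on the bridge successor and cost functions defined earlier in the unit.

```python
def make_start(people):
    """Everyone starts on the 'here' side, together with the flashlight."""
    here = frozenset(people) | frozenset(['light'])
    return (here, frozenset())

def all_over(state):
    """Goal test: nobody is left here. A flashlight left alone still counts
    as done, which covers the trivial case of a problem with no people."""
    here, there = state
    return not here or here == frozenset(['light'])

start = make_start([1, 2, 5, 10])
```

With these in place, the body of the problem function reduces to building `start` and passing it, the successor function, `all_over`, and the cost function to the general search.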

1.47 47. 29 l Summary


Congratulations. You made it to the end of the unit. What have we learned? Well, first of all, some problems require search. What I mean by search is you need to put together a sequence of steps, starting from a start and keep going. You don't know how many steps it's going to take, and you're trying to optimize some factor. Second, there are different kinds of search. We just scratched the surface, believe me. It's a gigantic field with all sorts of different algorithms and different types of applicability for these different algorithms. There are many kinds we didn't cover, but we covered two--the shortest_path search and the lowest_cost search. These are two of the most useful. Third, search is really subtle. There are lots of possible problems lurking in there and many that we didn't even cover yet. What that means is where there is subtlety, there are likely to be bugs, and there are even some bugs where there is no subtlety. That means we have to be careful. We have these two tools for combating bugs. One is lots of tests, and the second is standardized tools. That is, we work really hard to make a tool that we know works and has got all the bugs out of it, and then we reuse that tool. Part of that reuse is generalization--to look at a specific problem and say, "Here we solved this specific problem this way," and to generalize it, to say here's the part of that that I think we're going to use over and over again. Let's break that out, and now we'll have two parts to the solution. We want to be thinking about this specific problem, and we want to be thinking about the more general problem. We want to be allocating our work to one or the other appropriately. Congratulations again. You learned a lot of important concepts. You did a great job in writing some very complex programs.
