
CS 4231 Solutions to HW2

Problems 1, 2 graded by Varun Jalan


Problem 3 graded by Neetha Sebastian
Problem 4 graded by Huzaifa Neralwala
Problems 5, 6 graded by Manish Vasani

1 Problem 1
1.1 a
T(n) = 4T(n/2) + n^2
We can use the master method for solving the above recurrence since a = 4 ≥ 1, b = 2 > 1,
and f(n) = n^2 is an asymptotically positive function. We have that f(n) = Θ(n^{log_2 4}). Thus
the second case of the master method is applicable. This gives us T(n) = Θ(n^{log_b a} lg n) =
Θ(n^2 lg n).
Alternately, we can expand the series and get:
T(n) = n^2 + 4(n/2)^2 + 16(n/4)^2 + ... + 4^k (n/2^k)^2
where 2^k is the largest power of two which is at most n. Each of the terms in the
summation equals n^2, and there are k = Θ(lg n) terms. Thus we get T(n) = Θ(n^2 lg n).
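
As a quick numerical sanity check (a minimal sketch, not part of the solution; the function name T and the base case are illustrative), the recurrence can be evaluated directly and compared against n^2 lg n; the ratio settles to a constant:

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 4 T(n/2) + n^2, restricted to powers of two, with T(1) = 1 as the base case
    if n <= 1:
        return 1
    return 4 * T(n // 2) + n * n

for k in range(4, 16):
    n = 2 ** k
    print(n, T(n) / (n * n * math.log2(n)))   # the ratio approaches a constant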
1.2 b
T(n) = 3T(n/4) + n
We can expand the series and get:
T(n) = n + (3/4)n + (3/4)^2 n + ... + (3/4)^k n
where k = Θ(log_4 n) is the depth of the recursion. We can write the summation as:
T(n) = n (1 + (3/4) + (3/4)^2 + ... + (3/4)^k)
The sum in parentheses is bounded by the geometric series 1/(1 − 3/4) = 4, so the right-hand
side is at most 4n. Thus we get T(n) = Θ(n).
Alternately, case 3 of the master method is applicable as well, which gives us T(n) =
Θ(f(n)) = Θ(n).
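To see why case 3 applies, its two conditions can be checked explicitly: f(n) = n = Ω(n^{log_4 3 + ε}) for some ε > 0, since log_4 3 ≈ 0.79 < 1; and the regularity condition a·f(n/b) ≤ c·f(n) holds with c = 3/4 < 1, because 3·f(n/4) = 3n/4 = (3/4)·f(n).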
1.3 c
T(n) = 3T(n/2) + n lg^2 n
We have a = 3, b = 2 and f(n) = n lg^2 n. Since log_2 3 ≈ 1.58 > 1 and lg^2 n grows more slowly
than any positive power of n, we have n lg^2 n = O(n^{log_2 3 − ε}) for some ε > 0. Thus
case 1 of the master method is applicable. This gives us T(n) = Θ(n^{log_2 3}) = Θ(n^{1.58}).
1.4 d
T(n) = T(√n) + 1
Substituting m = lg n gives:
T(2^m) = T(2^{m/2}) + 1
Let S(m) = T(2^m). This gives the recurrence:
S(m) = S(m/2) + 1
By expansion or by the Master Theorem, it is easy to see that the solution of the above
recurrence is S(m) = Θ(lg m). Changing back to n, we get:
T(n) = S(m) = Θ(lg m) = Θ(lg lg n)
Thus, T(n) = Θ(lg lg n).
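Writing the expansion out, S(m) = S(m/2) + 1 = S(m/4) + 2 = ... = S(m/2^i) + i; the recursion bottoms out when m/2^i ≤ 1, i.e., after i = lg m halvings, so S(m) = S(1) + lg m = Θ(lg m).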
1.5 e
T(n) = T(n − 1) + 1/n
By expansion, we have:
T(n) = 1/n + 1/(n − 1) + ... + 1/1
The summation on the right is the nth Harmonic number H_n, and H_n = Θ(lg n). Thus,
T(n) = Θ(lg n)
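Concretely, comparing the sum with the integral of 1/x gives ln(n + 1) ≤ H_n ≤ 1 + ln n for all n ≥ 1, which is where the bound H_n = Θ(lg n) comes from.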
1.6 f
T(n) = 4T(n − 2) + 1
By expansion, we have:
T(n) = 1 + 4 + 4^2 + ... + 4^{n/2} = (4^{n/2 + 1} − 1)/(4 − 1)
Thus, we have T(n) = Θ(4^{n/2}) = Θ(2^n).
2 Problem 2
2.1 a
Let us assume that there are x good chips, where 2x ≤ n. We divide the chips into three
groups. The first group A consists of all the x good chips, the second group B consists of
any x bad chips, and the third group C consists of the remaining n − 2x bad chips. We
assume that the chips of group B mimic the corresponding chips in group A. More precisely,
any test carried out will have one of the following results:
- A chip of group A is tested against another chip of group A. In this case, both chips
  mark the other one as good.
- A chip of group A is tested against a chip of group B. In this case, both chips
  mark the other one as bad.
- A chip of group B is tested against another chip of group B. In this case, both chips
  mark each other as good.
- A chip from either group A or B is tested against a chip of group C. In this case,
  both chips mark each other as bad.
Clearly, a chip in group A (a good chip) is indistinguishable from a chip in group B (a
bad chip). Given the results of the tests, it might as well be possible that the chips in group
B were good, and those in A were bad. Hence, it is not possible for an algorithm to surely
pinpoint a good chip.
Note that this argument fails when 2x > n, since the good group cannot be mimicked
by a bad group of the same size.
2.2 b
The idea behind the algorithm is to test pairs of chips. Whenever we are sure that at least
one chip of a tested pair is bad, we discard both chips of the pair; when it might be the case
that both chips are good, we discard one chip of the pair and retain the other. More precisely,
the algorithm is as follows:
TESTCHIP()
1  while n > 1
2      make ⌊n/2⌋ pairs of chips for comparison with each other.
3      Leave the single chip (if n is odd) alone
4      for each pair {chip x, chip y}
5          if x says y is good, and y says x is good, then discard one of the chips and
6              retain the other
7          else discard both chips
8      if n is odd and the number of chips retained is odd, discard the unpaired chip,
9          and retain it otherwise
10     n = number of retained chips
11 return the remaining chip
We prove that the algorithm correctly returns a good chip if there are more than n/2
good chips at the start.
Proof. We use the loop invariant that the number of good chips at the start of any iteration
in line 1 is always more than n/2.
Initialization: By our assumption, we know that the invariant holds at the start.
Maintenance: It is easy to see that whenever we discard both the chips in a tested pair
in line 7, at least one of the chips is bad. Hence, after discarding both chips in all such
pairs, the number of good chips is still more than half the number of remaining chips. Out
of the remaining pairs of chips where both chips test the other one as good, let the number
of pairs where both chips are actually good be denoted by X, and those where both chips
are actually bad be denoted by Y .
Now we consider the following cases:
- n is even. Clearly, X > Y, since otherwise the number of bad chips would be at least
  half the total number of chips. In this case, the number of good chips retained is X
  whereas the number of bad chips retained is Y. Hence the invariant is maintained.
- n is odd and (X + Y) is odd. If Y > X, then even if the unpaired chip is good we
  have at least half bad chips, which is a contradiction. Moreover X ≠ Y because X + Y
  is odd, so X > Y. Since we discard the unpaired chip, the invariant is maintained.
- n is odd and (X + Y) is even. Consider the following cases:
  - X < Y. This is not possible, since in this case the number of bad chips would be at
    least half the total number of chips.
  - X = Y. Here, the unpaired chip must be good. Thus, the invariant will be
    maintained since we include it.
  - X = Y + 1. This is not possible since X + Y is even.
  - X > Y + 1. Irrespective of the type of the unpaired chip, the majority of
    good chips is maintained.
Termination: On termination, there is clearly only one chip remaining, and it is good.
The other good chips can be identified by testing all other chips against this good chip.
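
The following Python sketch simulates TESTCHIP under one possible adversary model; it is illustrative only: chips are modeled as booleans (True = good), and a bad chip is assumed to answer arbitrarily, here at random. The names says and testchip are not part of the original solution.

import random

def says(tester_good, other_good):
    # A good chip reports the truth about the other chip; a bad chip may say anything
    # (modeled here as a random answer).
    return other_good if tester_good else random.choice([True, False])

def testchip(chips):
    # chips: list of booleans, True = good; assumes strictly more than half are good.
    while len(chips) > 1:
        retained = []
        for i in range(0, len(chips) - 1, 2):          # pair the chips up
            x, y = chips[i], chips[i + 1]
            if says(x, y) and says(y, x):
                retained.append(x)                      # keep one chip of a mutually "good" pair
            # otherwise both chips of the pair are discarded
        if len(chips) % 2 == 1 and len(retained) % 2 == 0:
            retained.append(chips[-1])                  # retain the unpaired chip only when the retained count is even
        chips = retained
    return chips[0]

For example, testchip([True] * 6 + [False] * 5) always returns True (a good chip), in line with the invariant argument above.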
There is an alternative solution. The idea behind the algorithm is to maintain groups of
chips, where we are guaranteed that all chips in the same group are of the same type (all
good, or all bad), and that the total number of good chips is strictly more than half the total
number of chips. Since all chips within a group are of the same type, a more precise measure
of the size of the problem state is the number of groups, rather than the actual number of
total chips. We will show that with at most ⌊n/2⌋ pairwise tests, we can halve the problem size.
The algorithm is given below:
TESTCHIP2()
1  Initialize n groups, each containing 1 chip
2  groups = n
3  while groups > 1
4      make ⌊groups/2⌋ pairs of groups for comparison with each other.
5      Leave the single group (if groups is odd) alone
6      for each pair {group X, group Y}
7          pick one chip x from X and another chip y from Y and compare them
8          if x says y is good, and y says x is good, then merge the two groups together
9          else if x says y is bad and y says x is good (or vice versa), discard the group whose chip said good
10         else if x says y is bad and y says x is bad, discard as many chips from the bigger group
11             as there are in the smaller group, and discard the smaller group entirely. Also discard
12             the bigger group if it is now empty
13     groups = number of remaining groups
14 return any chip from the only remaining group
We prove that the algorithm correctly returns a good chip if there are more than n/2
good chips at the start.
Proof. We use the following loop invariants that we mentioned above:
- Each group consists of chips of the same type
- The total number of good chips is strictly more than half the total number of chips
Initialization:
- Initially each group contains a single chip, either a good chip or a bad chip
- We know that the number of good chips initially is more than half the total number
  of chips
Maintenance:
- A group is merged only in line 8. We know that both chips call each other good if and
  only if both are good or both are bad. Thus the merged group contains chips of the same
  type.
- In line 9, we will surely discard only a bad group: the discarded group marks the
  other group as good, while the other group marks the discarded group as bad, which
  yields a contradiction if the discarded group were good. In line 10, an equal number
  of chips from both groups are discarded, thus maintaining the majority for the good
  chips. This is because at least one of the two groups concerned has to be bad,
  since each group is marked as bad by the other.
Termination: On termination, there is clearly only one group remaining which has chips
of the same type. These have to be good chips, since good chips are always present in
majority.
The above algorithm clearly makes at most ⌊groups/2⌋ ≤ n/2 comparisons per iteration of the loop in line
3, and the number of groups roughly halves after each iteration. As mentioned above, the number of
groups is a more precise measure of the size of the problem.
2.3 c
The above algorithm finds a good chip using O(n) tests.
Proof. We have the following recurrence:
T(n) = T(n/2) + n/2
By expansion, we get T(n) = n/2 + n/4 + ... + n/2^k, where 2^k is the biggest power of 2
which is at most n. Thus we get T(n) = n(1/2 + 1/4 + ... + 1/2^k). The right-hand side is
clearly at most n, and thus T(n) = O(n).
Once we find this good chip, we can find all the good chips by testing the good chip
against every other chip. This needs another n − 1 tests, and the total number of tests used
to identify all good chips is O(n).
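
Continuing the boolean model of the earlier sketch (illustrative only), once a single good chip is found, its verdicts can be trusted for all remaining chips:

def find_all_good(chips):
    # One call to testchip, then n - 1 further tests against the known good chip.
    good = testchip(list(chips))
    return [i for i, c in enumerate(chips) if says(good, c)]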
Problem 3:
(i) What is the probability that the guess in the ith step is correct?
(ii) Show that the expected total number of points that you gain using the random strategy is Θ(log n).
Solution:
Probability that the guess in the ith step is correct: 1/(n − i + 1).
At the first step, any ordering is possible, so the probability that the guess is correct on the first step is
1/n. At the second step, we know that the element cannot be the one that was picked in the earlier step since all
the cards are distinct, so there are only n − 1 elements to choose from. Hence the probability of making a
correct guess in this step is 1/(n − 1). Similarly, the probability of making a correct guess on the third step is
1/(n − 2), on the fourth step 1/(n − 3), and so on. Thus the probability of making a correct guess on the ith
step is 1/(n − i + 1).
We can calculate the expected total number of points that one can gain using the random strategy by
setting up an indicator random variable X_i for the ith guess:
X_i = 1 if the ith guess is correct, and X_i = 0 if the ith guess is incorrect.
The expected value of the indicator random variable is given by
E[X_i] = 1 · Pr(guess is correct) + 0 · Pr(guess is incorrect) = 1/(n − i + 1)
Using this, we can calculate the expected number of points E[X] by calculating the expected
value of the sum of the points for each of the trials:
E[X] = E[ Σ_{i=1}^{n} X_i ]
     = Σ_{i=1}^{n} E[X_i]
     = Σ_{i=1}^{n} 1/(n − i + 1)
     = 1 + 1/2 + 1/3 + ... + 1/(n − 2) + 1/(n − 1) + 1/n
     = Θ(log n)
The final summation is the harmonic series, which is known to be Θ(log n).
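
A small simulation (illustrative only; simulate_score is not part of the original solution) estimates the expected score of the random guessing strategy and compares it with the harmonic number H_n:

import random

def simulate_score(n):
    # Shuffle n distinct cards; at each step guess uniformly among the values not seen so far.
    deck = list(range(n))
    random.shuffle(deck)
    remaining = set(range(n))
    score = 0
    for card in deck:
        if random.choice(list(remaining)) == card:
            score += 1
        remaining.remove(card)
    return score

n, trials = 50, 20000
average = sum(simulate_score(n) for _ in range(trials)) / trials
harmonic = sum(1.0 / k for k in range(1, n + 1))
print(average, harmonic)   # the two numbers should be close (H_50 is about 4.5)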
Problem 4

1. The Randomized QuickSort algorithm as given in the textbook is:

PARTITION(A, p, r)
1  x ← A[r]
2  i ← p − 1
3  for j ← p to r − 1
4      do if A[j] ≤ x
5          then i ← i + 1
6              exchange A[i] ↔ A[j]
7  exchange A[i + 1] ↔ A[r]
8  return i + 1

RANDOMIZED-PARTITION(A, p, r)
1  i ← RANDOM(p, r)
2  exchange A[r] ↔ A[i]
3  return PARTITION(A, p, r)

RANDOMIZED-QUICKSORT(A, p, r)
1  if p < r
2      then q ← RANDOMIZED-PARTITION(A, p, r)
3          RANDOMIZED-QUICKSORT(A, p, q − 1)
4          RANDOMIZED-QUICKSORT(A, q + 1, r)

When all elements in the array A are equal, we can observe that in any call to PARTITION,
A[j] = x for all p ≤ j < r, and hence the pivot is always placed at A[r]. Thus, the right partition is
always empty and the left partition contains r − p elements. The running time for any input is
therefore independent of the random choices made and is given by:
T(n) = T(n − 1) + Θ(n) = Θ(n^2)
And the expected running time is:
E[T(n)] = T(n) = Θ(n^2)

2. The required PARTITION routine is as follows:

PARTITION3(A, p, r)
1   x ← A[r]
2   q1 ← p − 1
3   q2 ← p − 1
4   for j ← p to r − 1
5       do if A[j] = x
6           then q2 ← q2 + 1
7               exchange A[q2] ↔ A[j]
8       else if A[j] < x
9           then q2 ← q2 + 1
10              q1 ← q1 + 1
11              exchange A[q2] ↔ A[j]
12              exchange A[q1] ↔ A[q2]
13  exchange A[q2 + 1] ↔ A[r]
14  return (q1, q2 + 1)

Proof of correctness:
We will use the loop invariant: at the beginning of each iteration of the loop on line 4, the following
conditions hold:
1. ∀ p ≤ k ≤ q1, A[k] < x
2. ∀ q1 < k ≤ q2, A[k] = x
3. ∀ q2 < k < j, A[k] > x
4. A[r] = x

Initialization:
At initialization, q1 = p − 1, q2 = p − 1, j = p and the conditions are trivially true.

Maintenance:
- If A[j] > x:
  Then ∀ q2 < k < j + 1, A[k] > x.
  Since in this case we simply increment j by 1, the conditions hold at the beginning of the
  next iteration.
- If A[j] = x:
  We know that either A[q2 + 1] > x or q2 + 1 = j.
  Since, in this case, we exchange A[q2 + 1] and A[j] and increment q2 by 1, we get
  ∀ q1 < k ≤ q2, A[k] = x and ∀ q2 < k < j + 1, A[k] > x.
  Hence, the conditions hold at the beginning of the next iteration.
- If A[j] < x:
  We know that exactly one of the following conditions holds:
  - A[q1 + 1] = x AND (either A[q2 + 1] > x or q2 + 1 = j)
  - A[q1 + 1] > x AND q2 = q1
  - q1 = q2 = j − 1
  In any case we see that on performing the operations on lines 9 to 12 (note that we
  increment both q1 and q2), we get ∀ p ≤ k ≤ q1, A[k] < x, ∀ q1 < k ≤ q2, A[k] = x, and
  ∀ q2 < k < j + 1, A[k] > x.
  Hence, the conditions hold at the beginning of the next iteration.

Termination:
At termination (j = r) we get
1. ∀ p ≤ k ≤ q1, A[k] < x
2. ∀ q1 < k ≤ q2, A[k] = x
3. ∀ q2 < k < r, A[k] > x
4. A[r] = x
Since A[r] = x and either A[q2 + 1] > x or q2 + 1 = r, in either case, after performing step 13,
we get
1. ∀ p ≤ k ≤ q1, A[k] < x
2. ∀ q1 < k ≤ q2 + 1, A[k] = x
3. ∀ q2 + 1 < k ≤ r, A[k] > x
which is as required.
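
A direct Python transcription of PARTITION3 above (a sketch only, using 0-based indices and inclusive bounds; the name partition3 is illustrative):

def partition3(a, p, r):
    # Partition a[p..r] (inclusive) around the pivot x = a[r].
    # Returns (q1, q2) with a[p..q1] < x, a[q1+1..q2] == x, a[q2+1..r] > x.
    x = a[r]
    q1 = p - 1          # last index of the "< x" region
    q2 = p - 1          # last index of the "== x" region
    for j in range(p, r):
        if a[j] == x:
            q2 += 1
            a[q2], a[j] = a[j], a[q2]
        elif a[j] < x:
            q2 += 1
            q1 += 1
            a[q2], a[j] = a[j], a[q2]      # pull the small element next to the "== x" block
            a[q1], a[q2] = a[q2], a[q1]    # then shift it into the "< x" block
    a[q2 + 1], a[r] = a[r], a[q2 + 1]      # place the pivot just after the "== x" block
    return q1, q2 + 1

For example, partition3([2, 5, 7, 5, 1, 5], 0, 5) rearranges the list to [2, 1, 5, 5, 5, 7] and returns (1, 4).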
3. The modified algorithms are:

RANDOMIZED-PARTITION3(A, p, r)
1  i ← RANDOM(p, r)
2  exchange A[r] ↔ A[i]
3  return PARTITION3(A, p, r)

RANDOMIZED-QUICKSORT3(A, p, r)
1  if p < r
2      then (q1, q2) ← RANDOMIZED-PARTITION3(A, p, r)
3          RANDOMIZED-QUICKSORT3(A, p, q1)
4          RANDOMIZED-QUICKSORT3(A, q2 + 1, r)
Expected Running Time:
- For inputs where all elements in the array are distinct, if q = PARTITION(A, p, r) and
  (q1, q2) = PARTITION3(A, p, r), then q1 = q − 1 and q2 = q. Hence, for a particular
  input and a particular sequence of random choices, the recursive calls made by
  RANDOMIZED-QUICKSORT3 will be the same as the recursive calls made by
  RANDOMIZED-QUICKSORT, and the running time will be the same for both algorithms.
  The expected running time for these inputs will thus be the same as that of
  RANDOMIZED-QUICKSORT, i.e., O(n log n).
- For inputs where some elements are equal, we observe that once any sub-array A[p..r] is
  partitioned into A[p..q1], A[q1+1..q2] and A[q2+1..r], any two elements from
  different partitions will never be compared. Also, no two elements from A[q1+1..q2] will
  be compared at any subsequent time. Hence, writing z_i for the ith smallest element and
  Z_ij = {z_i, z_{i+1}, ..., z_j} as in the textbook's analysis, let
  X'_ij = I { z_i is compared to z_j during an execution of RANDOMIZED-QUICKSORT3 }
  and
  X_ij = I { z_i is compared to z_j during an execution of RANDOMIZED-QUICKSORT }.
  Then
  E[X'_ij] = Pr{ z_i is the first pivot chosen from Z_ij and no other pivot previously
                 chosen has the same value as z_i }
           + Pr{ z_j is the first pivot chosen from Z_ij and no other pivot previously
                 chosen has the same value as z_j }
           ≤ Pr{ z_i is the first pivot chosen from Z_ij } + Pr{ z_j is the first pivot chosen from Z_ij }
           = E[X_ij]
  Let
  X' = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} X'_ij
  Hence,
  E[X'] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} E[X'_ij] ≤ Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} E[X_ij] = E[X]
  Following the analysis of RANDOMIZED-QUICKSORT given in the textbook, E[X] = O(n log n), and
  therefore E[X'] = O(n log n). Hence, the expected running time for these inputs is O(n log n).
Hence, for every input array, the expected running time of RANDOMIZED-QUICKSORT3 is
O(n log n).
Problem 5
Exercise 9.3.6 The kth quantiles of an n-element set are the k − 1 order statistics that divide the sorted set
into k equal-sized sets (to within 1). Give an O(n lg k) algorithm to list the kth quantiles of
a set.
Answer: The kth quantiles of an array are the elements of the corresponding sorted
array at indices r · (n/k), where 1 ≤ r < k. We will use a divide and conquer approach to
find the kth quantiles of the given unsorted array.
MAIN(A, n, k)
    Q = empty list
    COMPUTE-QUANTILES(A, 1, n, k, Q)
end

COMPUTE-QUANTILES(A, p, q, k, Q)
    if k > 1 then
        r = ⌊k/2⌋
        s = r · ⌊(q − p + 1)/k⌋
        t = SELECT(A, p, q, s)
        Q.add(t)
        COMPUTE-QUANTILES(A, p, p + s − 1, ⌊k/2⌋, Q)
        COMPUTE-QUANTILES(A, p + s, q, ⌈k/2⌉, Q)
    end-if
end
Each call to COMPUTE-QUANTILES computes the ⌊k/2⌋th of the k − 1 order statistics for
the sub-array A[p..q]. It does so by using the SELECT routine to partition the elements of
the sub-array around the order statistic t (see section 9.3 of CLRS for details on SELECT).
It then recursively computes the ⌊k/2⌋ quantiles of the left sub-array and the ⌈k/2⌉
quantiles of the right sub-array.
To analyze the running time of the routine COMPUTE-QUANTILES, we note that
the value of k is approximately halved for the recursive calls. Hence the recursion tree of
COMPUTE-QUANTILES has O(lg k) levels. At any given level i of this tree, we have 2^i
instances of COMPUTE-QUANTILES. Each instance of COMPUTE-QUANTILES makes
one call to the SELECT routine, and each of these instances operates on approximately n/2^i
elements of the array. Hence the combined running time of the SELECT calls at level i is
O(2^i · (n/2^i)) = O(n), and the total runtime of the algorithm is T = O(n · depth) = O(n lg k).
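
A Python sketch of this scheme (illustrative; a randomized quickselect stands in for the worst-case linear-time SELECT of CLRS section 9.3, and the helper names select, compute_quantiles and kth_quantiles are assumptions of the sketch):

import random

def select(a, p, q, s):
    # Rearrange a[p..q] (inclusive) so that its s-th smallest element (1-based within
    # the sub-array) ends up at index p + s - 1, and return it.  Randomized quickselect.
    while True:
        if p == q:
            return a[p]
        i = random.randint(p, q)
        a[i], a[q] = a[q], a[i]
        x, store = a[q], p
        for j in range(p, q):              # Lomuto partition around the pivot x
            if a[j] <= x:
                a[store], a[j] = a[j], a[store]
                store += 1
        a[store], a[q] = a[q], a[store]
        rank = store - p + 1               # rank of the pivot within a[p..q]
        if s == rank:
            return a[store]
        elif s < rank:
            q = store - 1
        else:
            p, s = store + 1, s - rank

def compute_quantiles(a, p, q, k, out):
    # Append the k-th quantiles of the sub-array a[p..q] to the list out.
    if k > 1:
        r = k // 2
        s = r * ((q - p + 1) // k)         # rank of the middle quantile within a[p..q]
        out.append(select(a, p, q, s))
        compute_quantiles(a, p, p + s - 1, k // 2, out)
        compute_quantiles(a, p + s, q, (k + 1) // 2, out)

def kth_quantiles(values, k):
    # Assumes 1 <= k <= len(values).
    a = list(values)
    out = []
    compute_quantiles(a, 0, len(a) - 1, k, out)
    return sorted(out)

For example, kth_quantiles(range(1, 9), 4) returns [2, 4, 6].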
Problem 6
Exercise 9.3.9 Optimal placement of main pipeline to minimize the total length of the
spur pipelines.
Answer: Professor Olay should compute the median of the y-coordinates of all the
oil wells and place the pipeline at this y-coordinate to ensure an optimal placement. This
algorithm can be implemented in linear time by computing the ⌈n/2⌉th order statistic of the
set of y-coordinates using the SELECT routine (section 9.3 in CLRS). Let us prove that this
strategy is optimal by contradiction.
Let z represent the sum of lengths of all the vertical spur pipelines under our suggested
strategy S, and let y_m be the y-coordinate at which the main pipeline is placed. Let us assume
that there exists a better strategy S' which places the main pipeline at y-coordinate y'. Let z'
be the sum of lengths of all vertical spur pipelines for S', such that z' < z. Let
w = |y_m − y'| represent the vertical distance between the main pipelines in S and S'. Let us
consider all the possible relations between y_m and y':
1. y' > y_m: We know from our strategy S that there are at least k wells, k ≥ n/2, with
   y-coordinates y_i ≤ y_m. All these oil wells are distance w farther away from the main
   pipeline in S' than in S. We also know that there are at most k' wells, k' ≤ n/2,
   with y-coordinates y_i > y_m. These wells are at most distance w nearer to the main
   pipeline in S' than in S. Hence we get z' − z ≥ w(k − k'). As k ≥ k', we get z' − z ≥ 0,
   i.e., z' ≥ z, a contradiction.
2. y' < y_m: We know from our strategy S that there are at least k wells, k ≥ n/2, with
   y-coordinates y_i ≥ y_m. All these oil wells are distance w farther away from the main
   pipeline in S' than in S. We also know that there are at most k' wells, k' ≤ n/2,
   with y-coordinates y_i < y_m. These wells are at most distance w nearer to the main
   pipeline in S' than in S. We again get z' − z ≥ w(k − k') ≥ 0, i.e., z' ≥ z, a contradiction.
3. y' = y_m: Then z' = z, again a contradiction.
Hence our strategy S is optimal.
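
A short Python sketch of the strategy (illustrative only; statistics.median_low stands in for the linear-time SELECT used above):

from statistics import median_low

def optimal_pipeline(wells):
    # wells: list of (x, y) well coordinates.  Returns (y_main, total_spur_length),
    # placing the east-west main pipeline at the median y-coordinate.
    ys = [y for _, y in wells]
    y_main = median_low(ys)
    total = sum(abs(y - y_main) for y in ys)   # each spur runs vertically to the main line
    return y_main, total

print(optimal_pipeline([(0, 1), (3, 2), (5, 7)]))   # -> (2, 6)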