Theory of Computation(contd)
Theory of Computation | Pumping Lemma
There are two Pumping Lemmas, which are defined for
1. Regular Languages, and
2. Context Free Languages
Pumping Lemma for Regular Languages
For any regular language L, there exists an integer n such that for all x ∈ L with |x| ≥ n, there
exist u, v, w ∈ Σ* such that x = uvw, and
(1) |uv| ≤ n
(2) |v| ≥ 1
(3) for all i ≥ 0: uv^i w ∈ L
In simple terms, this means that if the substring v is pumped, i.e., repeated any number of
times (including zero times), the resulting string still remains in L.
The Pumping Lemma is used to prove that a language is not regular. Every regular language
satisfies the pumping lemma. So if, for every n, there is a string x ∈ L with |x| ≥ n such that
every split x = uvw satisfying (1) and (2) has some i for which uv^i w is not in L, then L is
surely not regular.
The converse is not always true. That is, if the Pumping Lemma holds, it does not mean
that the language is regular.
For above example, 0^n 1^n is a CFL, as any string can be the result of pumping at two places, one for
0 and the other for 1.
Let us prove that L012 = {0^n 1^n 2^n | n ≥ 0} is not context-free.
Assume that L012 is context-free. Then, by the Pumping Lemma for context-free languages, there is
an integer n such that every z ∈ L012 with |z| ≥ n can be written as z = uvwxy with
(1) |vwx| ≤ n
(2) |vx| ≥ 1
(3) for all i ≥ 0: uv^i w x^i y ∈ L012
Take z = 0^n 1^n 2^n, so z ∈ L012 and |z| ≥ n. We show that no choice of u, v, w, x, y satisfies
(1) to (3) together.
If (1) and (2) hold, then z = 0^n 1^n 2^n = uvwxy with |vwx| ≤ n and |vx| ≥ 1.
(1) tells us that vwx cannot contain both a 0 and a 2, because n 1s lie between the last 0 and the
first 2. Thus either vwx has no 0s, or vwx has no 2s, and we have two cases to consider.
Suppose vwx has no 0s. By (2), vx contains a 1 or a 2. Thus uwy has n 0s, and uwy has either
fewer than n 1s or fewer than n 2s.
But (3) with i = 0 tells us that uwy = uv^0 w x^0 y ∈ L012.
So uwy must have an equal number of 0s, 1s and 2s, which is a contradiction. The case where vwx
has no 2s is similar and also gives a contradiction. Thus L012 is not context-free.
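The argument above can be checked by brute force for a small pumping length. The sketch below is only an illustration, not part of the proof: the function names are made up for this example, and it tries every split of 0^n 1^n 2^n that satisfies conditions (1) and (2) and confirms that removing v and x (the i = 0 case) always leaves the language.

```python
def in_L012(s: str) -> bool:
    """True when s has the form 0^k 1^k 2^k for some k >= 0."""
    k = len(s) // 3
    return s == "0" * k + "1" * k + "2" * k


def pump_down_breaks_every_split(n: int) -> bool:
    """Check that uwy is outside L012 for every legal split z = uvwxy."""
    z = "0" * n + "1" * n + "2" * n
    for start in range(len(z)):                       # u = z[:start]
        last = min(start + n, len(z))                 # enforces |vwx| <= n
        for end in range(start + 1, last + 1):        # vwx = z[start:end]
            for a in range(start, end + 1):           # v = z[start:a]
                for b in range(a, end + 1):           # w = z[a:b], x = z[b:end]
                    if (a - start) + (end - b) == 0:
                        continue                      # condition (2): |vx| >= 1
                    uwy = z[:start] + z[a:b] + z[end:]
                    if in_L012(uwy):
                        return False                  # a split that survives i = 0
    return True


print(pump_down_breaks_every_split(3))  # True
```

A finite check like this cannot replace the proof, since the lemma quantifies over every n, but it makes the case analysis concrete.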
Source : John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman (2003). Introduction to Automata
Theory, Languages, and Computation.
Even number of a's : The regular expression for an even number of a's is (b|ab*ab*)*. We can
construct a finite automaton as shown in Figure 1.
The above automaton accepts all strings which have an even number of a's. For zero a's, it
stays in q0, which is the final state. For one a, it goes from q0 to q1 and the string is
not accepted. For two a's at any positions, it goes from q0 to q1 on the first a and from q1
back to q0 on the second a. So, it accepts all strings with an even number of a's.
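The behaviour described above can be simulated with a small table-driven sketch. The transition table is an assumption reconstructed from the prose (q0 is both the start state and the final state, a toggles between q0 and q1, b leaves the state unchanged), since the figure itself is not reproduced here.

```python
# Transition table assumed from the description of Figure 1.
EVEN_A = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q0", ("q1", "b"): "q1",
}


def run_dfa(delta, start, finals, word):
    """Feed word to the DFA and report whether it ends in a final state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals


for word in ["", "b", "ab", "abab", "baab"]:
    print(repr(word), run_dfa(EVEN_A, "q0", {"q0"}, word))
```

Changing the final-state set to {"q1"} in the call gives the automaton for an odd number of a's without touching the table.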
String with 'ab' as substring : The regular expression for strings with 'ab' as a substring is
(a|b)*ab(a|b)*. We can construct a finite automaton as shown in Figure 2.
The above automaton accepts all strings which have 'ab' as a substring. The automaton
remains in the initial state q0 for b's. It moves to q1 after reading an a and remains in the same
state for all a's afterwards. It then moves to q2 when b is read. That means the string
has read 'ab' as a substring once it reaches q2.
String with count of a divisible by 3 : The language of strings over {a} whose count of a's is divisible
by 3 is {a^(3n) | n >= 0}, with regular expression (aaa)*. We can construct an automaton as shown in Figure 3.
The above automaton accepts all strings of the form a^(3n). The automaton remains in the initial
state q0 for ε, which is accepted. For the string aaa, it moves from q0 to q1, then q1
to q2 and then q2 to q0. After every set of three a's it comes back to q0, hence the string is accepted.
Otherwise, it ends in q1 or q2, hence the string is rejected.
Note : If we want to design a finite automaton for strings whose number of a's is of the form 3n+1,
the same automaton can be used with final state q1 instead of q0.
If we want to design a finite automaton for the language {a^(kn) | n >= 0}, k states are required.
We have used k = 3 in our example.
Binary numbers divisible by 3 : The regular expression for binary numbers which are divisible by
three is (0|1(01*0)*1)*. Examples of binary numbers divisible by 3 are 0, 011, 110, 1001,
1100, 1111, 10010 etc. The DFA corresponding to binary numbers divisible by 3 is shown in
Figure 4.
The above automaton accepts all binary numbers divisible by 3. For 1001, the automaton
goes from q0 to q1, then q1 to q2, then q2 to q1 and finally q1 to q0, hence accepted.
For 0111, the automaton goes from q0 to q0, then q0 to q1, then q1 to q0 and finally q0
to q1, hence rejected.
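The two traces above can be reproduced with a short sketch. It assumes, as is usual for this DFA, that state qr means "the bits read so far have remainder r modulo 3", so reading bit b from remainder r leads to (2r + b) mod 3. The function names are illustrative only.

```python
def trace(bits: str) -> list:
    """Return the sequence of remainders (state indices) visited while reading bits."""
    states = [0]                                  # start in q0
    for bit in bits:
        states.append((2 * states[-1] + int(bit)) % 3)
    return states


def divisible_by_3(bits: str) -> bool:
    """Accept when the run ends in q0, i.e. the remainder is 0."""
    return trace(bits)[-1] == 0


print(trace("1001"), divisible_by_3("1001"))  # [0, 1, 2, 1, 0] True
print(trace("0111"), divisible_by_3("0111"))  # [0, 0, 1, 0, 1] False
```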
String with regular expression (111 + 11111)* : The strings accepted by this regular expression
have 0 (the empty string), 3, 5, 6 (111 twice), 8 (11111 once and 111 once), 9 (111 thrice),
10 (11111 twice) and every larger count of 1's. The DFA corresponding to the given regular
expression is given in Figure 5.
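The list of accepted lengths can be confirmed with Python's re module. This is only a quick check: the notes write alternation as +, which Python spells |, and the helper name accepted_lengths is made up for the example.

```python
import re

# (111 + 11111)* in the notation of the notes; "|" is Python's alternation.
PATTERN = re.compile(r"(111|11111)*")


def accepted_lengths(limit: int) -> list:
    """Lengths n <= limit for which the string of n ones matches the expression."""
    return [n for n in range(limit + 1) if PATTERN.fullmatch("1" * n)]


print(accepted_lengths(12))  # [0, 3, 5, 6, 8, 9, 10, 11, 12]
```

The lengths 1, 2, 4 and 7 are the only ones that cannot be written as a sum of 3s and 5s.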
Question : What will be the minimum number of states for strings with an odd number of a's?
Solution : The regular expression for an odd number of a's is b*ab*(ab*ab*)*. The corresponding
automaton is given in Figure 6, and the minimum number of states is 2.
L1 = (a ∪ b)
L1* = (a ∪ b)*
Complement : If L(G) is a regular language, its complement L'(G) will also be regular.
The complement of a language can be found by subtracting the strings which are in L(G) from all
possible strings. For example,
L(G) = {a^n | n > 3}
L'(G) = {a^n | n <= 3}
Note : Two regular expressions are equivalent if languages generated by them are same.
For example, (a+b*)* and (a+b)* generate same language. Every string which is
generated by (a+b*)* is also generated by (a+b)* and vice versa.
How to solve problems on regular expression and regular languages?
Question 1 : Which one of the following languages over the alphabet {0,1} is described
by the regular expression?
(0+1)*0(0+1)*0(0+1)*
(A) The set of all strings containing the substring 00.
(B) The set of all strings containing at most two 0s.
(C) The set of all strings containing at least two 0s.
(D) The set of all strings that begin and end with either 0 or 1.
Solution : Option A says that the string must contain the substring 00. But 10101 is in the
language and does not contain 00 as a substring. So it is not the correct option.
Option B says that the string can have at most two 0s, but 00000 is also in the language. So
it is not the correct option.
Option C says that the string must contain at least two 0s. The regular expression forces two
0s and allows anything before, between and after them. So this is the correct option.
Option D describes all strings that begin and end with either 0 or 1, which is every non-empty
string over {0,1}. But a string such as 11 contains no 0 and is not in the language. So it is not
correct.
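The reasoning for option C can be sanity-checked by comparing the regular expression with the "at least two 0s" condition on every short string. The sketch below is a bounded check under that assumption, with illustrative function names, and again uses | where the notes write +.

```python
import re
from itertools import product

# (0+1)*0(0+1)*0(0+1)* in the notation of the question.
PATTERN = re.compile(r"(0|1)*0(0|1)*0(0|1)*")


def matches(word: str) -> bool:
    return PATTERN.fullmatch(word) is not None


def agrees_with_option_c(max_len: int) -> bool:
    """Compare the expression with 'contains at least two 0s' on all strings up to max_len."""
    for length in range(max_len + 1):
        for bits in product("01", repeat=length):
            word = "".join(bits)
            if matches(word) != (word.count("0") >= 2):
                return False
    return True


print(matches("10101"), "00" in "10101")  # True False -> rules out option A
print(matches("00000"))                   # True       -> rules out option B
print(matches("11"))                      # False      -> rules out option D
print(agrees_with_option_c(8))            # True
```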
Question 2 : Which of the following languages is generated by given grammar?
S -> aS | bS | ε
(A) {a^n b^m | n, m ≥ 0}
(B) {w ∈ {a,b}* | w has equal number of a's and b's}
(C) {a^n | n ≥ 0} ∪ {b^n | n ≥ 0} ∪ {a^n b^n | n ≥ 0}
(D) {a,b}*
Solution : Option (A) says that the string will have 0 or more a's followed by 0 or more b's. But
S => bS => baS => ba is also in the language. So (A) is not correct.
Option (B) says that the string will have an equal number of a's and b's. But S => bS => b is also
in the language. So (B) is not correct.
Option (C) says the string will have either 0 or more a's, or 0 or more b's, or a's followed by b's.
But as shown for option (A), ba is also in the language. So (C) is not correct.
Option (D) says the string can have any number of a's and any number of b's in any order. So
(D) is correct.
Explanation : Initially, the state of the automaton is q0, the symbol on the stack is Z and the input is
aaabbb, as shown in row 1. On reading a (shown in bold in row 2), the state remains q0 and
the automaton pushes symbol A on the stack. On the next a (shown in row 3), it pushes another symbol A on
the stack. After reading 3 a's, the stack is AAAZ with A on the top. After reading b (as shown
in row 5), it pops A and moves to state q1, and the stack is AAZ. When all b's are read, the
state is q1 and the stack is Z. In row 8, on input symbol ε and Z on the stack, it pops Z
and the stack becomes empty. This type of acceptance is known as acceptance by empty stack.
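The run described above can be replayed with a small simulation. The moves are an assumption reconstructed from the explanation (push A on every a in q0, pop A on every b and stay in q1, pop Z on ε in q1), because the transition table itself is not reproduced here. The function name is illustrative.

```python
def accepts_by_empty_stack(word: str) -> bool:
    """Simulate the deterministic PDA described above; the list end is the stack top."""
    state, stack = "q0", ["Z"]
    for symbol in word:
        top = stack[-1]
        if state == "q0" and symbol == "a":
            stack.append("A")            # push A for every a
        elif symbol == "b" and top == "A":
            stack.pop()                  # pop one A for every b
            state = "q1"
        else:
            return False                 # no move defined: reject
    if state == "q1" and stack == ["Z"]:
        stack.pop()                      # ε-move that empties the stack
    return not stack


for word in ["aaabbb", "ab", "aab", "abb", "abab"]:
    print(word, accepts_by_empty_stack(word))
```

With these assumed moves the empty string is rejected, because the ε-move that pops Z is only available in q1.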
Note :
The above pushdown automaton is deterministic in nature because there is only one move
from a state on an input symbol and stack symbol.
The non-deterministic pushdown automata can have more than one move from a state on
an input symbol and stack symbol.
It is not always possible to convert non-deterministic pushdown automata to deterministic
pushdown automata.
The non-deterministic PDA is more powerful than the deterministic PDA, as some
languages are accepted by an NPDA but not by any deterministic PDA. This will be
discussed in the next article.
Acceptance by a pushdown automaton can be defined either by empty stack or by final state,
and one form can be converted to the other.
L2 = { a^n b^n | n >= 0 }
L3 = { a^n b^n c^n | n >= 0 }
Which of the following statements is NOT TRUE?
A. Push Down Automata (PDA) can be used to recognize L1 and L2
B. L1 is a regular language
C. All the three languages are context free
D. Turing machine can be used to recognize all the three languages
Solution : Option (A) says a PDA can be used to recognize L1 and L2. L1 contains all strings with
any number of a's followed by any number of b's. So, it can be accepted by a PDA. L2 contains strings
with n a's followed by n b's. It can also be accepted by a PDA. So, statement (A) is true.
Option (B) says that L1 is regular. It is true, as the regular expression for L1 is a*b*.
Option (C) says L1, L2 and L3 are context free. L3 contains all strings with n
a's followed by n b's followed by n c's. It can't be accepted by a PDA. So statement
(C) is not true, and (C) is the answer.
Question : The language L = { 0^i 1 2^i | i ≥ 0 } over the alphabet {0, 1, 2} is :
A. Not recursive
B. Is recursive and deterministic CFL
C. Is regular
D. Is CFL but not deterministic CFL.
Solution : The above language is a deterministic CFL: for 0s we can push 0 on the stack, and for
2s we can pop the corresponding 0s. As there is no ambiguity about which move to take, it is
deterministic. As every CFL is recursive, it is recursive as well. So, the correct option is (B).
Question : Consider the following languages:
L1 = { 0^n 1^n | n ≥ 0 }
L2 = { w c w^r | w ∈ {a,b}* }
L3 = { w w^r | w ∈ {a,b}* }
Which of these languages are deterministic context-free languages?
A. None of the languages
B. Only L1
C. Only L1 and L2
D. All three languages
Solution : Language L1 contains all strings in which n 0s are followed by n 1s. A deterministic
PDA can be constructed to accept L1. For 0s we push onto the stack, and for 1s we pop
from the stack. Hence, it is a DCFL.
L2 contains all strings of the form w c w^r, where w is a string of a's and b's and w^r is the reverse of w.
For example, aabbcbbaa. To accept this language, we can construct a PDA which pushes all
symbols onto the stack before c. After c, if the symbol on the input matches the symbol on the stack, it is
popped. So, L2 can also be accepted by a deterministic PDA, hence it is also a DCFL.
L3 contains all strings of the form w w^r, where w is a string of a's and b's and w^r is the reverse of w. But
we don't know where w ends and w^r starts. For example, aabbaa is a string in L3. For the first
a, we push it onto the stack. The next a can be either part of w, or part of w^r where w = a. So, there can be
multiple moves from a state on an input symbol. So, only a non-deterministic PDA can be used to
accept this type of language. Hence, it is an NCFL, not a DCFL.
So, the correct option is (C). Only L1 and L2 are DCFL.
Question : Which one of the following grammars generates the language L = { a^i b^j | i ≠ j } ?
(A) S -> AC | CB, C -> aCb | a | b, A -> aA | ε, B -> Bb | ε
(B) S -> aS | Sb | a | b
(C) S -> AC | CB, C -> aCb | ε, A -> aA | ε, B -> Bb | ε
(D) S -> AC | CB, C -> aCb | ε, A -> aA | a, B -> Bb | b
Solution : The best way to solve this type of question is to eliminate the options which do not
satisfy the conditions. The condition for language L is that the number of a's and the number of b's
must be unequal.
In option (B), S => aS => ab. It can generate strings with equal a's and b's. So, this option is
incorrect.
In option (C), S => AC => C => ε. In ε, the numbers of a's and b's are equal (0), so it is not the
correct option.
In option (A), S will be replaced by either AC or CB. C will generate either one more a than
b's or one more b than a's. But one extra a or one extra b can be
compensated by B -> Bb | ε or A -> aA | ε respectively. So it may give strings with an equal number of
a's and b's. So, it is not a correct option.
In option (D), S will be replaced by either AC or CB. C will always generate an equal number of a's and
b's. If we replace S by AC, A will add at least one extra a, and if we replace S by CB, B will add
at least one extra b. So this grammar will never generate an equal number of a's and b's. So, option (D)
is correct.
This article has been contributed by Sonal Tuteja.
Please write comments if you find anything incorrect, or you want to share more information
about the topic discussed above
See GATE Corner for all information about GATE CS and Quiz Corner for all Quizzes on GeeksQuiz
Step 1: Q = ∅
Step 2: Q = {q0}
Step 3: For each state in Q, find the states for each input symbol.
Currently, state in Q is q0, find moves from q0 on input symbol a and b using transition function
of NFA and update the transition table of DFA.
(Transition Function of DFA)
Now { q0, q1 } will be considered as a single state. As its entry is not in Q, add it to Q.
So Q = { q0, { q0, q1 } }
Now, moves from state { q0, q1 } on different input symbols are not present in the transition table of
the DFA, so we calculate them as:
δ( { q0, q1 }, a ) = δ( q0, a ) ∪ δ( q1, a ) = { q0, q1 }
δ( { q0, q1 }, b ) = δ( q0, b ) ∪ δ( q1, b ) = { q0, q2 }
Now we will update the transition table of DFA.
Now { q0, q2 } will be considered as a single state. As its entry is not in Q, add it to Q.
So Q = { q0, { q0, q1 }, { q0, q2 } }
Now, moves from state { q0, q2 } on different input symbols are not present in the transition table of
the DFA, so we calculate them as:
δ( { q0, q2 }, a ) = δ( q0, a ) ∪ δ( q2, a ) = { q0, q1 }
δ( { q0, q2 }, b ) = δ( q0, b ) ∪ δ( q2, b ) = { q0 }
Now we will update the transition table of DFA.
(Transition Function of DFA)
As there is no new state generated, we are done with the conversion. Final state of DFA will be
state which has q2 as its component i.e., { q0, q2 }
Following are the various parameters for DFA.
Q = { q0, { q0, q1 }, { q0, q2 } }
Σ = { a, b }
F = { { q0, q2 } } and transition function as shown above. The final DFA for above NFA has
been shown in Figure 2.
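The conversion steps above can be sketched as a worklist version of the subset construction. The NFA moves used here are an assumption inferred from the δ computations in the walkthrough (the NFA figure is not reproduced), and the function and variable names are illustrative.

```python
from collections import deque

# NFA moves assumed from the computations above; missing entries mean "no move".
NFA = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}


def subset_construction(nfa, start, finals, alphabet):
    """Build the reachable DFA states, its transition table and its final states."""
    start_set = frozenset([start])
    table, seen, queue = {}, {start_set}, deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # Union of the NFA moves of every component state.
            target = frozenset(s for q in current for s in nfa.get((q, symbol), ()))
            table[(current, symbol)] = target
            if target not in seen:       # new DFA state: add it to Q
                seen.add(target)
                queue.append(target)
    dfa_finals = {state for state in seen if state & finals}
    return seen, table, dfa_finals


states, table, dfa_finals = subset_construction(NFA, "q0", {"q2"}, "ab")
for (state, symbol), target in sorted(table.items(), key=lambda kv: (sorted(kv[0][0]), kv[0][1])):
    print(sorted(state), symbol, "->", sorted(target))
print("final:", [sorted(s) for s in dfa_finals])
```

Under these assumptions the construction reaches exactly the three DFA states listed above, with { q0, q2 } as the only final state.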
Note : Sometimes, it is not easy to convert regular expression to DFA. First you can convert
regular expression to NFA and then NFA to DFA.
Question : The number of states in the minimal deterministic finite automaton corresponding to
the regular expression (0 + 1)* (10) is ____________.
Solution : First, we will make an NFA for the above expression. To make an NFA for (0 + 1)*,
NFA will be in same state q0 on input symbol 0 or 1. Then for concatenation, we will add two
moves (q0 to q1 for 1 and q1 to q2 for 0) as shown in Figure 3.
Step 1. P0 will have two sets of states. One set will contain q1, q2, q4 which are final states of
DFA and another set will contain remaining states. So P0 = { { q1, q2, q4 }, { q0, q3, q5 } }.
Step 2. To calculate P1, we will check whether sets of partition P0 can be partitioned or not:
For set { q1, q2, q4 } :
δ( q1, 0 ) = δ( q2, 0 ) = q2 and δ( q1, 1 ) = δ( q2, 1 ) = q5, so q1 and q2 are not
distinguishable.
Similarly, δ( q1, 0 ) = δ( q4, 0 ) = q2 and δ( q1, 1 ) = δ( q4, 1 ) = q5, so q1 and q4 are not
distinguishable.
Since q1 and q2 are not distinguishable and q1 and q4 are also not distinguishable, q2 and q4
are not distinguishable. So, the set { q1, q2, q4 } will not be partitioned in P1.
For set { q0, q3, q5 } :
δ( q0, 0 ) = q3 and δ( q3, 0 ) = q0
δ( q0, 1 ) = q1 and δ( q3, 1 ) = q4
Moves of q0 and q3 on input symbol 0 are q3 and q0 respectively, which are in the same set in
partition P0. Similarly, moves of q0 and q3 on input symbol 1 are q1 and q4, which are in the same
set in partition P0. So, q0 and q3 are not distinguishable.
δ( q0, 0 ) = q3 and δ( q5, 0 ) = q5 and δ( q0, 1 ) = q1 and δ( q5, 1 ) = q5
Moves of q0 and q5 on input symbol 1 are q1 and q5 respectively, which are in different sets in
partition P0. So, q0 and q5 are distinguishable. So, the set { q0, q3, q5 } will be partitioned into { q0,
q3 } and { q5 }. So,
P1 = { { q1, q2, q4 }, { q0, q3}, { q5 } }
To calculate P2, we will check whether sets of partition P1 can be partitioned or not:
For set { q1, q2, q4 } :
δ( q1, 0 ) = δ( q2, 0 ) = q2 and δ( q1, 1 ) = δ( q2, 1 ) = q5, so q1 and q2 are not
distinguishable.
Similarly, δ( q1, 0 ) = δ( q4, 0 ) = q2 and δ( q1, 1 ) = δ( q4, 1 ) = q5, so q1 and q4 are not
distinguishable.
Since q1 and q2 are not distinguishable and q1 and q4 are also not distinguishable, q2 and q4
are not distinguishable. So, the set { q1, q2, q4 } will not be partitioned in P2.
For set { q0, q3 } :
δ( q0, 0 ) = q3 and δ( q3, 0 ) = q0
δ( q0, 1 ) = q1 and δ( q3, 1 ) = q4
Moves of q0 and q3 on input symbol 0 are q3 and q0 respectively, which are in the same set in
partition P1. Similarly, moves of q0 and q3 on input symbol 1 are q1 and q4, which are in the same
set in partition P1. So, q0 and q3 are not distinguishable.
For set { q5 }:
Since we have only one state in this set, it can't be further partitioned. So,
P2 = { { q1, q2, q4 }, { q0, q3 }, { q5 } }
Since P1 = P2, this is the final partition. Partition P2 means that states q1, q2 and q4 are
merged into one. Similarly, q0 and q3 are merged into one. The minimized DFA therefore has three
states: { q1, q2, q4 }, { q0, q3 } and { q5 }.
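The partitioning procedure above can be sketched as repeated refinement. The transition table below is an assumption collected from the δ values quoted in the steps (the DFA figure is not reproduced), and the function names are illustrative.

```python
# Transitions assumed from the δ values used in the steps above.
DELTA = {
    "q0": {"0": "q3", "1": "q1"},
    "q1": {"0": "q2", "1": "q5"},
    "q2": {"0": "q2", "1": "q5"},
    "q3": {"0": "q0", "1": "q4"},
    "q4": {"0": "q2", "1": "q5"},
    "q5": {"0": "q5", "1": "q5"},
}
FINALS = {"q1", "q2", "q4"}


def refine(partition, delta, alphabet):
    """One step P_k -> P_(k+1): split each block by the blocks its states move to."""
    block_of = {q: i for i, block in enumerate(partition) for q in block}
    refined = []
    for block in partition:
        groups = {}
        for q in sorted(block):
            signature = tuple(block_of[delta[q][a]] for a in alphabet)
            groups.setdefault(signature, set()).add(q)
        refined.extend(groups.values())
    return refined


def minimize(delta, finals, alphabet):
    """Start from P0 = {final, non-final} and refine until nothing splits."""
    partition = [b for b in (set(finals), set(delta) - set(finals)) if b]
    while True:
        refined = refine(partition, delta, alphabet)
        if len(refined) == len(partition):
            return partition
        partition = refined


print(sorted(sorted(block) for block in minimize(DELTA, FINALS, "01")))
# [['q0', 'q3'], ['q1', 'q2', 'q4'], ['q5']]
```

Because refinement only ever splits blocks, an unchanged block count means the partition itself is unchanged, which is the P1 = P2 stopping test used above.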
A. 1 and 3 only
B. 2 and 4 only
C. 2 and 3 only
D. 3 and 4 only
Solution : Statement 4 says the DFA accepts all strings of length at least 2. But it accepts 0, which is
of length 1. So, 4 is false.
Statement 3 says that the DFA is minimal. We will check using the algorithm discussed above.
P0 = { { q2 }, { q0, q1 } }
P1 = { { q2 }, { q0, q1 } }. Since P0 = P1, P1 is the final partition. q0 and q1 can be merged. So the
minimal DFA will have two states. Therefore, statement 3 is also false.
So the correct option is (D).
This article has been contributed by Sonal Tuteja.
Finite Automata: It is used to recognize patterns of specific type input. It is the most restricted type
of automata which can accept only regular languages (languages which can be expressed by regular
expression using OR (+), Concatenation (.), Kleene Closure(*) like a*b*, (a+b) etc.)
Deterministic FA and Non-Deterministic FA: In deterministic FA, there is only one move from
every state on every input symbol but in Non-Deterministic FA, there can be zero or more than one
move from one state for an input symbol.
Push Down Automata: Pushdown Automata has extra memory called stack which gives more
power than Finite automata. It is used to recognize context free languages.
Deterministic and Non-Deterministic PDA: In a deterministic PDA, there is at most one move from
a state on an input symbol and stack symbol, but in a non-deterministic PDA, there can be more than
one move from a state on an input symbol and stack symbol.
Linear Bounded Automata: A Linear Bounded Automaton has a finite amount of memory, a tape whose
length is bounded by the input length, and it is used to recognize Context Sensitive Languages.
Turing Machine: A Turing machine has an infinite tape and it is used to accept Recursively
Enumerable Languages.
| Grammar Type | Production Rules | Language Accepted | Automata | Closed Under |
|---|---|---|---|---|
| Type-3 (Regular Grammar) | A → a or A → aB, where A, B ∈ N (non-terminal) and a ∈ T (terminal) | Regular | Finite Automata | Union, Intersection, Complementation, Concatenation, Kleene Closure |
| Type-2 (Context Free Grammar) | A → ρ, where A ∈ N and ρ ∈ (T ∪ N)* | Context Free | Push Down Automata | Union, Concatenation, Kleene Closure |
| Type-1 (Context Sensitive Grammar) | α → β, where α, β ∈ (T ∪ N)*, len(α) <= len(β) and α should contain at least 1 non-terminal | Context Sensitive | Linear Bound Automata | Union, Intersection, Complementation, Concatenation, Kleene Closure |
| Type-0 (Recursive Enumerable) | α → β, where α, β ∈ (T ∪ N)* and α contains at least 1 non-terminal | Recursive Enumerable | Turing Machine | Union, Intersection, Concatenation, Kleene Closure |
In a pipelined processor, a pipeline has two ends, the input end and the output end. Between
these ends, there are multiple stages/segments such that output of one stage is connected to
input of next stage and each stage performs a specific operation.
Interface registers are used to hold the intermediate output between two stages. These
interface registers are also called latch or buffer.
All the stages in the pipeline along with the interface registers are controlled by a common
clock.
Pipeline Stages
RISC processor has 5 stage instruction pipeline to execute all the instructions in the RISC
instruction set. Following are the 5 stages of RISC pipeline with their respective operations:
In the same case, for a non-pipelined processor, the execution time of n instructions will be:
ET_non-pipeline = n * k * Tp
So, the speedup (S) of the pipelined processor over the non-pipelined processor, when n tasks are
executed on the same processor, is:
S = Performance of pipelined processor / Performance of non-pipelined processor
When the number of tasks n is significantly larger than k, that is, n >> k:
S ≈ n * k / n
S ≈ k
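The limit S ≈ k can be seen numerically. The sketch below assumes the pipelined execution time ET_pipeline = (k + n - 1) * Tp, the expression these notes use in their worked pipeline example, so that Tp cancels in the ratio. The function name is illustrative.

```python
def speedup(n: int, k: int) -> float:
    """S = ET_non-pipeline / ET_pipeline = (n * k * Tp) / ((k + n - 1) * Tp)."""
    return (n * k) / (k + n - 1)


# For a 5-stage pipeline the speedup climbs towards k = 5 as n grows.
for n in (1, 10, 100, 10_000):
    print(n, round(speedup(n, 5), 3))
```

For a single task the ratio is exactly 1, which matches the intuition that pipelining only helps when instructions overlap.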
Sources : goo.gl/J9KVNt
https://en.wikipedia.org/wiki/Hazard_(computer_architecture)
https://en.wikipedia.org/wiki/Data_dependency
This article has been contributed by Saurabh Sharma.
This dependency arises due to the resource conflict in the pipeline. A resource conflict is a
situation when more than one instruction tries to access the same resource in the same cycle. A
resource can be a register, memory, or ALU.
Example:
In the above scenario, in cycle 4, instructions I1 and I4 are trying to access same resource
(Memory) which introduces a resource conflict.
To avoid this problem, we have to keep the instruction on wait until the required resource
(memory in our case) becomes available. This wait will introduce stalls in the pipeline as shown
below:
Flow (data) dependence: O(S1) ∩ I(S2) ≠ ∅, S1 → S2 and S1 writes something that is later read by S2
Anti-dependence: I(S1) ∩ O(S2) ≠ ∅, S1 → S2 and S1 reads something before S2 overwrites it
Output dependence: O(S1) ∩ O(S2) ≠ ∅, S1 → S2 and both write the same memory location.
Data Hazards
Data hazards occur when instructions that exhibit data dependence modify data in different
stages of a pipeline. Hazards cause delays in the pipeline. There are mainly three types of data
hazards:
1) RAW (Read after Write) [Flow dependency]
2) WAR (Write after Read) [Anti-Data dependency]
3) WAW (Write after Write) [Output dependency]
Let there be two instructions I and J, such that J follows I. Then,
RAW hazard occurs when instruction J tries to read data before instruction I writes it.
Eg:
I: R2 <- R1 + R3
J: R4 <- R2 + R3
WAR hazard occurs when instruction J tries to write data before instruction I reads it.
Eg:
I: R2 <- R1 + R3
J: R3 <- R4 + R5
WAW hazard occurs when instruction J tries to write output before instruction I writes it.
Eg:
I: R2 <- R1 + R3
J: R2 <- R4 + R5
WAR and WAW hazards occur during the out-of-order execution of the instructions.
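The three definitions can be turned into a small register-level check. This is a sketch under a simplifying assumption: each instruction is modelled as a hypothetical (destination, sources) pair, and the function reports the dependences that could become hazards. Whether a hazard actually occurs depends on the pipeline timing.

```python
def classify_hazards(i_instr, j_instr):
    """Each instruction is (destination register, set of source registers); J follows I."""
    i_dest, i_srcs = i_instr
    j_dest, j_srcs = j_instr
    hazards = []
    if i_dest in j_srcs:
        hazards.append("RAW")   # J reads what I writes
    if j_dest in i_srcs:
        hazards.append("WAR")   # J writes what I reads
    if i_dest == j_dest:
        hazards.append("WAW")   # both write the same register
    return hazards


I = ("R2", {"R1", "R3"})                              # I: R2 <- R1 + R3
print(classify_hazards(I, ("R4", {"R2", "R3"})))      # ['RAW']
print(classify_hazards(I, ("R3", {"R4", "R5"})))      # ['WAR']
print(classify_hazards(I, ("R2", {"R4", "R5"})))      # ['WAW']
```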
This article has been contributed by Saurabh Sharma.
If buffers are included between the stages then, Cycle Time (Tp) = Stage Delay +
Buffer Delay
For example, if there are 4 stages with delays, 1 ns, 2 ns, 3 ns, and 4 ns, then
Tp = Maximum(1 ns, 2 ns, 3 ns, 4 ns) = 4 ns
If buffers are included between the stages,
Tp = Maximum(Stage delay + Buffer delay)
Example : Consider a 4 segment pipeline with stage delays (2 ns, 8 ns, 3 ns, 10 ns). Find
the time taken to execute 100 tasks in the above pipeline.
Solution : As the stage delays of the above pipeline are not uniform, the cycle time is set by the
slowest stage:
Tp = max(2, 8, 3, 10) = 10 ns
We know that ET_pipeline = (k + n − 1) × Tp = (4 + 100 − 1) × 10 ns = 1030 ns
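The calculation above can be written as a short helper. It is a sketch that assumes every stage has the same buffer delay (0 ns by default), so that the maximum of (stage delay + buffer delay) equals the largest stage delay plus the buffer delay. The function name is illustrative.

```python
def pipeline_time_ns(stage_delays_ns, n_tasks, buffer_delay_ns=0):
    """ET_pipeline = (k + n - 1) * Tp, with Tp set by the slowest stage plus its buffer."""
    k = len(stage_delays_ns)
    cycle_time = max(stage_delays_ns) + buffer_delay_ns
    return (k + n_tasks - 1) * cycle_time


print(pipeline_time_ns([2, 8, 3, 10], 100))  # 1030
```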
NOTE: MIPS = Million instructions per second
Ideal CPI of the pipelined processor is 1. But due to stalls, it becomes greater than 1.
=> S = (CPI_non-pipeline * Cycle Time_non-pipeline) / ((1 + Number of stalls per instruction) * Cycle Time_pipeline)
As Cycle Time_non-pipeline = Cycle Time_pipeline,
Speed Up (S) = CPI_non-pipeline / (1 + Number of stalls per instruction)
This article has been contributed by Saurabh Sharma.
If a CFG generates a finite number of strings, then the CFG is non-recursive (the grammar is said
to be a non-recursive grammar).
If a CFG can generate an infinite number of strings, then the grammar is said to be a recursive
grammar.
During Compilation, the parser uses the grammar of the language to make a parse tree(or
derivation tree) out of the source code. The grammar used must be unambiguous. An ambiguous
grammar must not be used for parsing.
2) Based on number of derivation trees.
The language (set of strings) generated by the above grammar is {b, bab, babab, ...}, which is
infinite.
2) S -> Aa
A -> Ab | c
The language generated by the above grammar is {ca, cba, cbba, ...}, which is infinite.
Non-Recursive Grammars
S->Aa
A->b|c
The language generated by the above grammar is :{ba, ca}, which is finite.
Ambiguous grammars
Unambiguous grammars
Ambiguous grammar:
A CFG is said to be ambiguous if there exists more than one derivation tree for the given input
string, i.e., more than one LeftMost Derivation Tree (LMDT) or RightMost Derivation Tree
(RMDT).
Definition: A CFG G = (V, T, P, S) is said to be ambiguous if and only if there exists a string in
T* that has more than one parse tree,
where V is a finite set of variables,
T is a finite set of terminals,
P is a finite set of productions of the form A -> α, where A is a variable and α ∈ (V ∪ T)*, and
S is a designated variable called the start symbol.
For Example:
1. Let us consider this grammar : E -> E+E|id
We can create 2 parse trees from this grammar to obtain the string id+id+id :
The following are the 2 parse trees generated by left most derivation:
Both the above parse trees are derived from same grammar rules but both parse trees are
different. Hence the grammar is ambiguous.
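The number of parse trees can be counted mechanically for this grammar. The sketch below is specific to E -> E+E | id and is only an illustration, with a made-up function name: a token span derives from E either as a single id, or by choosing one + as the root and counting the trees on each side.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count the parse trees of a token sequence under E -> E+E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of parse trees deriving tokens[lo:hi] from E.
        if hi - lo == 1:
            return 1 if tokens[lo] == "id" else 0
        total = 0
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] == "+":           # use this + as the root E -> E+E
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id"]))               # 1
print(count_parse_trees(["id", "+", "id", "+", "id"]))    # 2
```

A count greater than 1 for any string is enough to show that the grammar is ambiguous.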
2. Let us now consider the following grammar:
Set of alphabets Σ = {0, ..., 9, +, *, (, )}
E -> I
E -> E + E
E -> E * E
E -> (E)
I -> ε | 0 | 1 | ... | 9
S -> aS | Sa | ε
E-> E +E | E*E| id
A -> AA | (A) | a
S -> SS|AB , A -> Aa|a , B -> Bb|b
Discrete Mathematics(Contd)
Discrete Mathematics | The Pigeonhole Principle
Theorem:
I) If A is the average number of pigeons per hole, where A is not an integer, then
At least one pigeonhole contains ceil[A] (smallest integer greater than or equal to A) pigeons
The remaining pigeonholes contain at most floor[A] (largest integer less than or equal to A)
pigeons
Or
II) We can say that if n + 1 objects are put into n boxes, then at least one box contains two or
more objects.
The abstract formulation of the principle: Let X and Y be finite sets and let f: X → Y be a
function.
The pigeonhole principle is one of the simplest but most useful ideas in mathematics. We will see
more applications than proofs of this theorem.
Example 1: If (Kn+1) pigeons are kept in n pigeonholes, where K is a positive integer, what
is the average no. of pigeons per pigeonhole?
Solution: average number of pigeons per hole = (Kn+1)/n
= K + 1/n
Therefore at least one pigeonhole contains (K+1) pigeons, i.e., ceil[K + 1/n], and the remaining
contain at most K, i.e., floor[K + 1/n], pigeons.
i.e., the minimum number of pigeons required to ensure that at least one pigeonhole contains
(K+1) pigeons is (Kn+1).
Example 2: A bag contains 10 red marbles, 10 white marbles, and 10 blue marbles. What is
the minimum no. of marbles you have to choose randomly from the bag to ensure that we
get 4 marbles of same color?
Solution: Apply the pigeonhole principle.
No. of colors (pigeonholes) n = 3
No. of marbles (pigeons) K+1 = 4, so K = 3
Therefore the minimum no. of marbles required = Kn+1
By simplifying we get Kn+1 = 3 × 3 + 1 = 10.
Verification: ceil[Average] is ceil[(Kn+1)/n] = 4
ceil[(Kn+1)/3] = 4
Kn+1 = 10
i.e., 3 red + 3 white + 3 blue + 1 (red or white or blue) = 10
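The Kn+1 rule can be captured in two small helpers. This is a sketch that assumes every hole can hold at least the target number of items, as in the marble example where each color has 10 marbles. The function names are illustrative.

```python
def min_picks(holes: int, target: int) -> int:
    """Fewest pigeons that force some hole to hold `target` pigeons: K*n + 1 with K = target - 1."""
    return (target - 1) * holes + 1


def guaranteed_in_some_hole(pigeons: int, holes: int) -> int:
    """ceil(pigeons / holes): the count that at least one hole must reach."""
    return -(-pigeons // holes)


print(min_picks(holes=3, target=4))        # 10
print(guaranteed_in_some_hole(10, 3))      # 4
print(guaranteed_in_some_hole(9, 3))       # 3
```

Nine marbles can still be split 3 + 3 + 3, which is why the tenth pick is the first one that guarantees four of a color.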
Pigeonhole principle strong form.
Theorem: Let q1, q2, . . . , qn be positive integers.
If q1 + q2 + . . . + qn − n + 1 objects are put into n boxes, then either the 1st box contains at least q1
objects, or the 2nd box contains at least q2 objects, . . ., or the nth box contains at least qn objects.
Application of this theorem is more important, so let us see how we apply this theorem in
problem solving.
Example 1: In a computer science department, a student club can be formed with either 10
members from first year or 8 members from second year or 6 from third year or 4 from
final year. What is the minimum no. of students we have to choose randomly from
department to ensure that a student club is formed?
Solution: We can directly apply the above formula, where
q1 = 10, q2 = 8, q3 = 6, q4 = 4 and n = 4
Therefore the minimum number of students required to ensure that a department club is formed is
10 + 8 + 6 + 4 − 4 + 1 = 25
Example 2: A box contains 6 red, 8 green, 10 blue, 12 yellow and 15 white balls. What is the
minimum no. of balls we have to choose randomly from the box to ensure that we get 9
balls of same color?
Solution: Here we cannot blindly apply the pigeonhole principle. First we will see what happens
if we apply the above formula directly.
From the above formula we get the answer 47, because 6 + 8 + 10 + 12 + 15 − 5 + 1 = 47
But it is not correct. In order to get the correct answer we need to include only the blue, yellow and
white balls, because there are fewer than 9 red and 9 green balls. But we are picking randomly, so we
add the red and green balls back after we apply the pigeonhole principle.
i.e., 9 blue + 9 yellow + 9 white − 3 + 1 = 25
Since we are picking randomly, we can get all the red and green balls before the above 25
balls. Therefore we add 6 red + 8 green + 25 = 39
We can conclude that in order to be sure of picking 9 balls of the same color, one has to pick 39 balls
from the box.
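Both calculations in this section can be expressed as worst-case counts. The sketch below uses illustrative function names: strong_form applies the q1 + ... + qn − n + 1 formula directly, while min_picks_with_limits handles the ball example by capping each color at the most balls one can hold without reaching the target.

```python
def strong_form(quotas):
    """q1 + q2 + ... + qn - n + 1 objects force some box i to reach its quota qi."""
    return sum(quotas) - len(quotas) + 1


def min_picks_with_limits(counts, target):
    """Fewest picks that guarantee `target` balls of one color, given the balls per color."""
    if all(c < target for c in counts):
        raise ValueError("no color has enough balls to reach the target")
    # Worst case: take every ball of a scarce color and target - 1 of every other color.
    worst_case = sum(min(c, target - 1) for c in counts)
    return worst_case + 1


print(strong_form([10, 8, 6, 4]))                       # 25
print(min_picks_with_limits([6, 8, 10, 12, 15], 9))     # 39
```

The second function gives 6 + 8 + 8 + 8 + 8 + 1 = 39, the same total as adding the red and green balls to the 25 obtained from the three large colors.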
This article is contributed by Saikiran Goud Burra.
| Circuit Switching | Packet Switching |
|---|---|
| Delay between data units in circuit switching is uniform. | Delay between data units in packet switching is not uniform. |
| Resource reservation is the feature of circuit switching because path is fixed for data transmission. | There is no resource reservation because bandwidth is shared among users. |
| Circuit switching is more reliable. | |
Like DHCP, ARP is a discovery protocol, but unlike DHCP there is no server here.
The modulus operator (%) isn't defined for real types; however, the library functions fmod(),
fmodf() and fmodl() from math.h can be used for double, float and long double respectively.
In C, any of the 3 expressions of for loop i.e. for(exp1 ; exp2 ; exp3) can be empty. Even
all of the three can be empty.
The controlling expression of a switch statement i.e. switch(exp) shall have integer type
(i.e. int, char etc.) only.
When continue statement is hit in while or do-while loops, the next executed statement is
controlling expression of while or do-while loops. But when continue statement is hit in
for loop, the next executed statement is expression3 which is called increment expression
as well.
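A minimal sketch of the difference described above, with hypothetical function names. In the for version, continue still reaches i++; in the while version the increment has to happen before continue, otherwise the loop would never advance.

```c
// Sum of the odd numbers below n using a for loop.
// After continue, control goes to i++ (expression 3), then to the condition.
int sum_odd_for(int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        if (i % 2 == 0)
            continue;
        sum += i;
    }
    return sum;
}

// Same sum using a while loop.
// continue jumps straight to the condition, so i is advanced first.
int sum_odd_while(int n)
{
    int sum = 0;
    int i = 0;
    while (i < n) {
        int cur = i++;
        if (cur % 2 == 0)
            continue;
        sum += cur;
    }
    return sum;
}
```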
As per C standard, continue can be used in loop body only. And break can be used in
loop body and switch body only.
In C, goto statement can be used inside functions only and its label can point to anywhere
in the same function only.
In a switch body, two case labels can't have the same value, though having only one case or only
default is okay. In fact, a switch body can be empty as well.
As per C standard, a jump statement causes an unconditional jump to another place and
all goto, continue, break, return are jump statements.
In C, typedef is used to create alias of any other type. It can be used to create alias for
array and function pointer as well.
Multiple aliases of different types can be created using one typedef only. For example,
typedef int INT, *INTPTR, ONEDARR[10]; is completely valid.
The only storage-class specifier that shall occur in a parameter declaration is register.
That's why even fun(auto int arg) is incorrect.
In C, signed, unsigned, short and long are Type specifiers and when they are used, int is
implicitly assumed in all of these. So signed i; unsigned j; short k; long l; is valid.
Though const and volatile look opposite to each other, a variable can be both const
and volatile.
const is a type qualifier; a variable qualified with const has a value that
isn't modifiable by the program.
volatile is a type qualifier; a variable qualified with volatile has a value that
is subject to sudden change (possibly from outside the program).
A function can't have an explicit array as its return type, i.e. int [5] func(int arg1) is invalid.
However, indirect methods can be used if array-like data needs to be returned from a
function (e.g. using pointers).
Though sizeof() looks like a function, it's actually an operator in C. Also, sizeof() is a
compile-time operator. That's why the output of printf("%zu", sizeof(printf("GQ")));
would be the same as that of printf("%zu", sizeof(int));. Basically, the operand of the sizeof() operator isn't
evaluated at run time. Variable length arrays (introduced in C99) are the exception to this.
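The unevaluated operand can be observed with a small sketch. The names bump, side_effects and size_of_call are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

int side_effects = 0;   // counts how many times bump() actually runs

int bump(void)
{
    return ++side_effects;
}

// sizeof only inspects the type of the expression bump(), which is int;
// the call itself is never executed.
size_t size_of_call(void)
{
    return sizeof(bump());
}
```

After calling size_of_call(), side_effects is still 0 and the returned value equals sizeof(int).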
While assigning any function to a function pointer, & (address of) is optional. Same way,
while calling a function via function pointer, * (value at address) is optional.
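A short sketch of both optional forms, using a hypothetical function square:

```c
int square(int x)
{
    return x * x;
}

// Assigns and calls through function pointers in both spellings.
int call_all_forms(int x)
{
    int (*f1)(int) = square;    // without &
    int (*f2)(int) = &square;   // with &
    return f1(x) + (*f2)(x);    // call without * and with *
}
```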
In C, when defining a macro with arguments, there can't be any space between the macro name and the opening
parenthesis.
The C language doesn't provide any true support for 2D or multidimensional arrays. For
example, a 2D array is simulated via a 1D array of arrays.
An important point is that the array size can be derived from its initialization, but that's
applicable to the first dimension only. For example, int arr[][2] = {1,2,3,4} is valid but int
arr[2][] = {1,2,3,4} is not valid.
Dereferencing a void pointer isn't allowed because void is an incomplete data type.
In C, initialization of array can be done for selected elements as well. Specific elements
in array can be initialized using []. For example, int arr[10] = {100, [5]=100,[9]=100};
is legal in C. This initializes arr[0], arr[5] and arr[9] to 100. All the remaining elements
would be 0.
As per the C standard, if an array's size is defined using a variable, the array can't be initialized at
its definition. For example, int size = 2, arr[size]; is valid but int size = 2, arr[size] =
{1,2}; is invalid. Also, an array whose size is specified using a variable can't be defined
outside a function, i.e. such an array can't be global.
In C, struct members can be initialized even out of order, using the field name with the dot
operator. For example, struct {int i; char c;} myVar = {.c = 'A', .i = 100}; is valid.
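The two designated-initializer forms mentioned in the notes above can be tried out as follows. This is a small sketch assuming a C99 or later compiler; struct pair and both function names are hypothetical.

```c
struct pair {
    int i;
    char c;
};

// Array with designated initializers: arr[0], arr[5] and arr[9] are 100,
// every other element is implicitly 0.
int sum_designated_array(void)
{
    int arr[10] = {100, [5] = 100, [9] = 100};
    int sum = 0;
    for (int k = 0; k < 10; k++)
        sum += arr[k];
    return sum;
}

// Struct members initialized out of declaration order.
struct pair make_pair_out_of_order(void)
{
    struct pair p = {.c = 'A', .i = 100};
    return p;
}
```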
LMNs- Algorithms
Analyze an algorithm
1) Worst Case Analysis (Usually Done): In the worst case analysis, we calculate an upper bound
on the running time of an algorithm by considering the worst case (a situation where the algorithm takes
maximum time).
2) Average Case Analysis (Sometimes done): In average case analysis, we
take all possible inputs and calculate the computing time for all of the inputs.
3) Best Case Analysis (Bogus): In the best case analysis, we calculate a lower bound on the running time of an algorithm.
Asymptotic Notations
Θ Notation: The theta notation bounds a function from above and below, so it defines exact
asymptotic behavior.
Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that
0 <= c1*g(n) <= f(n) <= c2*g(n) for all n >= n0}
Big O Notation: The Big O notation defines an upper bound of an algorithm; it bounds a function
only from above.
O(g(n)) = { f(n): there exist positive constants c and n0 such that
0 <= f(n) <= cg(n) for all n >= n0}
Solving recurrences
Substitution Method: We make a guess for the solution and then we use mathematical
induction to prove the guess is correct or incorrect.
Recurrence Tree Method: We draw a recurrence tree and calculate the time taken by every
level of tree. Finally, we sum the work done at all levels.
Master Method: Applies only to recurrences of the following type, or to recurrences that can be
transformed into it.
T(n) = aT(n/b) + f(n) where a >= 1 and b > 1
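Sorting
Algorithm | Worst Case | Average Case | Best Case | Min. swaps | Max. swaps
Bubble | Θ(n^2) | Θ(n^2) | Θ(n^2) | | Θ(n^2)
Selection | Θ(n^2) | Θ(n^2) | Θ(n^2) | | Θ(n)
Insertion | Θ(n^2) | Θ(n^2) | Θ(n) | | Θ(n^2)
Quick | Θ(n^2) | Θ(n lg n) | Θ(n lg n) | | Θ(n^2)
Merge | Θ(n lg n) | Θ(n lg n) | Θ(n lg n) | Is not in-place sorting | Is not in-place sorting
Heap | Θ(n lg n) | Θ(n lg n) | Θ(n lg n) | O(n lg n) | Θ(n lg n)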
Searching
Algorithm | Worst Case | Average Case | Best Case
Linear Search | Θ(n) | Θ(n) | Θ(1)
Binary Search | O(log n) | O(log n) | O(1)
Trees
Trees: Unlike arrays, linked lists, stacks and queues, which are linear data structures, trees are
hierarchical data structures.
Depth First Traversals: (a) Inorder (b) Preorder (c) Postorder
Important Tree Properties and Formulas
Binary Search Tree
A Binary Search Tree is a node-based binary tree data structure which has the following
properties:
The left subtree of a node contains only nodes with keys less than the node's key.
The right subtree of a node contains only nodes with keys greater than the node's key.
The left and right subtrees must each also be a binary search tree.
There must be no duplicate nodes.
1. Insertion
2. Deletion
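The BST properties listed above translate directly into insertion and lookup. The following is a minimal sketch rather than a reference implementation; struct bst_node, bst_insert and bst_contains are hypothetical names, and deletion is left out.

```c
#include <stdlib.h>

struct bst_node {
    int key;
    struct bst_node *left;
    struct bst_node *right;
};

// Inserts key and returns the (possibly new) root of the subtree.
// Duplicate keys are ignored, matching the "no duplicate nodes" property.
// If malloc fails, the tree is left unchanged.
struct bst_node *bst_insert(struct bst_node *root, int key)
{
    if (root == NULL) {
        struct bst_node *n = malloc(sizeof *n);
        if (n == NULL)
            return NULL;
        n->key = key;
        n->left = NULL;
        n->right = NULL;
        return n;
    }
    if (key < root->key)
        root->left = bst_insert(root->left, key);
    else if (key > root->key)
        root->right = bst_insert(root->right, key);
    return root;
}

// Returns 1 if key is present, 0 otherwise.
int bst_contains(const struct bst_node *root, int key)
{
    while (root != NULL) {
        if (key == root->key)
            return 1;
        root = (key < root->key) ? root->left : root->right;
    }
    return 0;
}
```

Each comparison discards one subtree, so both operations take time proportional to the height of the tree.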
AVL Tree
AVL tree is a self-balancing Binary Search Tree (BST) where the difference between heights of
left and right subtrees cannot be more than one for all nodes.
1. Insertion
2. Deletion
B-Tree
B-Tree is a self-balancing search tree. In most of the other self-balancing search trees
(like AVL and Red Black Trees), it is assumed that everything is in main memory. To
understand the use of B-Trees, we must think of a huge amount of data that cannot fit in main
memory. When the number of keys is high, the data is read from disk in the form of blocks. Disk
access time is very high compared to main memory access time. The main idea of using B-Trees
is to reduce the number of disk accesses.
Properties of B-Tree
1. B-Tree Insertion
2. B-Tree Deletion
Graph
Graph is a data structure that consists of the following two components:
1. A finite set of vertices, also called nodes.
2. A finite set of ordered pairs of the form (u, v), called edges. The pair is
ordered because (u, v) is not the same as (v, u) in a directed graph (di-graph). A pair of the form
(u, v) indicates that there is an edge from vertex u to vertex v. The edges may carry a
weight/value/cost.
The following two are the most commonly used representations of a graph:
1. Adjacency Matrix: Adjacency Matrix is a 2D array of size V x V where V is the number of
vertices in a graph.
2. Adjacency List: An array of linked lists is used. The size of the array is equal to the number of vertices.
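As an illustration of the adjacency matrix representation, here is a minimal sketch for a directed, unweighted graph. The fixed capacity MAX_V and all names are assumptions made for this example.

```c
#include <string.h>

#define MAX_V 8   // hypothetical upper limit on the number of vertices

struct graph {
    int v;                    // number of vertices in use
    int adj[MAX_V][MAX_V];    // adj[u][w] == 1 means there is an edge u -> w
};

void graph_init(struct graph *g, int v)
{
    g->v = v;
    memset(g->adj, 0, sizeof g->adj);
}

// Adds the directed edge (u, w); (w, u) is a different edge.
void graph_add_edge(struct graph *g, int u, int w)
{
    g->adj[u][w] = 1;
}

int graph_has_edge(const struct graph *g, int u, int w)
{
    return g->adj[u][w];
}
```

The matrix always needs V x V cells, whereas an adjacency list grows with the number of edges, which is why lists are usually preferred for sparse graphs.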
Graph Algorithms
Algorithm
Time Complexity
O(V+E)
Example: Prim's Minimum Spanning Tree Algorithm, Kruskal's Minimum Spanning Tree
Algorithm
Divide and Conquer
1. Divide: Break the given problem into subproblems of same type.
2. Conquer: Recursively solve these subproblems
3. Combine: Appropriately combine the answers
Following are some standard algorithms that are Divide and Conquer algorithms.
1) Binary Search is a searching algorithm. In each step, the algorithm compares the input element x with
the value of the middle element in the array. If the values match, it returns the index of the middle element.
Otherwise, if x is less than the middle element, the algorithm recurs for the left side of the middle
element, else it recurs for the right side of the middle element.
2) Quicksort is a sorting algorithm. The
algorithm picks a pivot element and rearranges the array elements in such a way that all elements
smaller than the picked pivot element move to the left side of the pivot, and all greater elements move to the
right side. Finally, the algorithm recursively sorts the subarrays on the left and right of the pivot
element.
3) Merge Sort is also a sorting algorithm. The algorithm divides the array into two
halves, recursively sorts them and finally merges the two sorted halves.
4) Closest Pair of Points: The problem is to find the closest pair of points in a set of points in the x-y plane. The
problem can be solved in O(n^2) time by calculating the distances of every pair of points and
comparing the distances to find the minimum. The Divide and Conquer algorithm solves the
problem in O(n log n) time.
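The binary search step described above can be sketched recursively as follows. The signature (inclusive bounds lo and hi, return value -1 for "not found") is an assumption of this example, and the array must already be sorted in ascending order.

```c
// Searches the sorted range a[lo..hi] (both inclusive) for x.
// Returns the index of x, or -1 if x is not present.
int binary_search(const int *a, int lo, int hi, int x)
{
    if (lo > hi)
        return -1;                      // empty range: not found
    int mid = lo + (hi - lo) / 2;       // avoids overflow of lo + hi
    if (a[mid] == x)
        return mid;
    if (x < a[mid])
        return binary_search(a, lo, mid - 1, x);   // recur on the left side
    return binary_search(a, mid + 1, hi, x);       // recur on the right side
}
```

For an array of n elements the initial call is binary_search(a, 0, n - 1, x); every call halves the range, which gives the O(log n) bound.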
Greedy Approach
Greedy is an algorithmic paradigm that builds up a solution piece by piece, always choosing the
next piece that offers the most obvious and immediate benefit. Greedy algorithms are used for
optimization problems. An optimization problem can be solved using Greedy if the problem has
the following property: at every step, we can make the choice that looks best at the moment, and
we still get the optimal solution of the complete problem.
Following are some standard algorithms that are Greedy algorithms.
1) Kruskal's Minimum Spanning Tree (MST): In Kruskal's
algorithm, we create an MST by picking edges one by one. The Greedy Choice is to pick the
smallest weight edge that doesn't cause a cycle in the MST constructed so far.
2) Prim's Minimum Spanning Tree: In Prim's algorithm also, we create an MST by picking edges one by
one. We maintain two sets: the set of vertices already included in the MST and the set of vertices
not yet included. The Greedy Choice is to pick the smallest weight edge that connects the two
sets.
3) Dijkstra's Shortest Path: Dijkstra's algorithm is very similar to Prim's algorithm.
The shortest path tree is built up, edge by edge. We maintain two sets: the set of vertices already
included in the tree and the set of vertices not yet included. The Greedy Choice is to pick the
edge that connects the two sets and is on the smallest weight path from the source to the set that
contains the not yet included vertices.
4) Huffman Coding: Huffman Coding is a lossless
compression technique. It assigns variable length bit codes to different characters. The Greedy
Choice is to assign the shortest code to the most frequent character.
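To make the greedy choice concrete, here is a minimal sketch of Kruskal's algorithm that returns only the total MST weight. The cycle check uses a simple union-find structure, which is a common choice but an assumption of this example; struct edge and the function names are hypothetical, and the input graph is assumed to be connected.

```c
#include <stdlib.h>

struct edge {
    int u, v, w;   // endpoints and weight
};

static int cmp_edge(const void *a, const void *b)
{
    const struct edge *x = a;
    const struct edge *y = b;
    return (x->w > y->w) - (x->w < y->w);
}

// Finds the representative of x's set, shortening the path along the way.
static int find_root(int *parent, int x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Returns the total weight of a minimum spanning tree, or -1 if memory
// allocation fails. The edges array is sorted in place.
int kruskal_mst_weight(struct edge *edges, int n_edges, int n_vertices)
{
    int *parent = malloc((size_t)n_vertices * sizeof *parent);
    if (parent == NULL)
        return -1;
    for (int i = 0; i < n_vertices; i++)
        parent[i] = i;

    // Greedy order: consider the smallest weight edge first.
    qsort(edges, (size_t)n_edges, sizeof *edges, cmp_edge);

    int total = 0;
    for (int i = 0; i < n_edges; i++) {
        int ru = find_root(parent, edges[i].u);
        int rv = find_root(parent, edges[i].v);
        if (ru != rv) {          // different components: no cycle is created
            parent[ru] = rv;
            total += edges[i].w;
        }
    }
    free(parent);
    return total;
}
```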
Dynamic Programming
Dynamic Programming is an algorithmic paradigm that solves a given complex problem by
breaking it into subproblems and storing the results of the subproblems to avoid computing the same
results again. Properties:
1. Overlapping Subproblems: Dynamic Programming is mainly used when solutions of the same
subproblems are needed again and again. In dynamic programming, computed solutions to
subproblems are stored in a table so that they don't have to be recomputed.
2. Optimal Substructure: A given problem has the Optimal Substructure Property if an optimal solution
of the given problem can be obtained by using optimal solutions of its subproblems.
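Overlapping subproblems are easiest to see on Fibonacci numbers, which are used here purely as an illustrative assumption; the notes do not name a specific problem. A plain recursive fib(n) recomputes the same values exponentially often, while the table below computes each value once. FIB_MAX, fib_table and fib are hypothetical names.

```c
#include <assert.h>

#define FIB_MAX 90   // hypothetical limit; fib(90) still fits in long long

static long long fib_table[FIB_MAX + 1];   // 0 means "not computed yet"

// Memoized Fibonacci: each subproblem is solved once and then looked up.
long long fib(int n)
{
    assert(n >= 0 && n <= FIB_MAX);
    if (n <= 1)
        return n;
    if (fib_table[n] != 0)
        return fib_table[n];                  // reuse the stored result
    fib_table[n] = fib(n - 1) + fib(n - 2);   // store before returning
    return fib_table[n];
}
```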
LMN-DATA STRUCTURES
Arrays
An array is a collection of items stored at contiguous memory locations. The idea is to declare
multiple items of the same type together. Array declaration: In C, we can declare an array by
specifying its size, by initializing it, or by both.
// Array declaration by specifying size
int arr1[10];
// Array declaration by initializing elements
int arr2[] = {10, 20, 30, 40};
// Array declaration by specifying size and
// initializing elements (the remaining two elements are set to 0)
int arr3[6] = {10, 20, 30, 40};
Formulas:
Length of Array = UB - LB + 1, where UB is the upper bound (largest index) and LB is the lower bound (smallest index)
Given the address of the first element, the address of any other element is calculated using the formula:
Loc(arr[k]) = base(arr) + w * k
w = number of bytes per storage location for one element
k = index of the array element whose address we want to calculate
Column major order: Elements are stored column by column, i.e. all elements of the first column are
stored, then all elements of the second column, and so on.
Loc(arr[i][j]) = base(arr) + w (m * j + i), where m is the number of rows
Row major order: Elements are stored row by row, i.e. all elements of the first row are stored,
then all elements of the second row, and so on.
Loc(arr[i][j]) = base(arr) + w (n * i + j), where n is the number of columns
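The address formulas can be written as small helper functions that return byte offsets from base(arr). This is a sketch with hypothetical names, assuming zero-based indices, m rows and n columns. The row-major expression w * (n * i + j) is the standard counterpart of the column-major formula, and it is the layout C itself uses for 2D arrays.

```c
#include <stddef.h>

// Byte offset of arr[k] in a 1D array with elements of w bytes.
size_t offset_1d(size_t w, size_t k)
{
    return w * k;
}

// Byte offset of arr[i][j] in column-major order; m is the number of rows.
size_t offset_col_major(size_t w, size_t m, size_t i, size_t j)
{
    return w * (m * j + i);
}

// Byte offset of arr[i][j] in row-major order; n is the number of columns.
size_t offset_row_major(size_t w, size_t n, size_t i, size_t j)
{
    return w * (n * i + j);
}
```

For an int a[3][4], the distance in bytes between &a[1][2] and &a[0][0] equals offset_row_major(sizeof(int), 4, 1, 2).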
Stacks
Stack is a linear data structure which follows a particular order in which the operations are
performed. The order is LIFO (Last In First Out), also described as FILO (First In Last Out).
Basic operations:
Push: Adds an item to the stack. If the stack is full, it is said to be an Overflow condition.
(Top=Top+1)
Pop: Removes an item from the stack. Items are popped in the reverse of the order
in which they were pushed. If the stack is empty, it is said to be an Underflow
condition. (Top=Top-1)
Peek: Gets the topmost item.
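A minimal array-based sketch of these operations follows. The capacity STACK_CAP, the convention that top is -1 for an empty stack, and the function names are assumptions of this example; overflow and underflow are reported through the return value.

```c
#include <stdbool.h>

#define STACK_CAP 4   // hypothetical fixed capacity

struct stack {
    int items[STACK_CAP];
    int top;          // index of the topmost item, -1 when empty
};

void stack_init(struct stack *s)
{
    s->top = -1;
}

// Returns false on Overflow (stack already full).
bool stack_push(struct stack *s, int x)
{
    if (s->top == STACK_CAP - 1)
        return false;
    s->items[++s->top] = x;   // Top = Top + 1
    return true;
}

// Returns false on Underflow (stack empty).
bool stack_pop(struct stack *s, int *out)
{
    if (s->top == -1)
        return false;
    *out = s->items[s->top--];   // Top = Top - 1
    return true;
}

// Reads the topmost item without removing it; false if the stack is empty.
bool stack_peek(const struct stack *s, int *out)
{
    if (s->top == -1)
        return false;
    *out = s->items[s->top];
    return true;
}
```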
Infix, prefix, Postfix notations
Infix notation: X + Y. Operators are written in between their operands. This is the usual way
we write expressions, for example:
A * ( B + C ) / D
Postfix notation (also known as Reverse Polish notation): X Y +. Operators are written after
their operands. The infix expression given above is equivalent to
A B C + * D /
Prefix notation (also known as Polish notation): + X Y. Operators are written before their
operands. The expressions given above are equivalent to
/ * A + B C D
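Postfix expressions are convenient because they can be evaluated with a single stack and no parentheses. The sketch below is an illustration under simplifying assumptions: operands are single decimal digits, only + - * / are supported, and division is integer division. The name eval_postfix is hypothetical.

```c
#include <ctype.h>

// Evaluates a postfix expression such as "6 1 2 + * 3 /".
// Returns 0 and stores the value in *result on success,
// or -1 if the expression is malformed.
int eval_postfix(const char *expr, int *result)
{
    int stack[64];
    int top = -1;
    for (const char *p = expr; *p != '\0'; p++) {
        if (isspace((unsigned char)*p))
            continue;
        if (isdigit((unsigned char)*p)) {
            if (top == 63)
                return -1;              // too many pending operands
            stack[++top] = *p - '0';    // operand: push it
            continue;
        }
        if (top < 1)
            return -1;                  // operator needs two operands
        int rhs = stack[top--];
        int lhs = stack[top--];
        switch (*p) {
        case '+': stack[++top] = lhs + rhs; break;
        case '-': stack[++top] = lhs - rhs; break;
        case '*': stack[++top] = lhs * rhs; break;
        case '/':
            if (rhs == 0)
                return -1;
            stack[++top] = lhs / rhs;
            break;
        default:
            return -1;                  // unknown symbol
        }
    }
    if (top != 0)
        return -1;                      // leftover operands or empty input
    *result = stack[0];
    return 0;
}
```

Taking A = 6, B = 1, C = 2 and D = 3, the postfix form A B C + * D / becomes "6 1 2 + * 3 /", which evaluates to the same value as 6 * (1 + 2) / 3.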
Queues
Queue is a linear structure which follows a particular order in which the operations are
performed. The order is First In First Out (FIFO). A good example of a queue is any queue of
consumers for a resource where the consumer that came first is served first.
Stack: Remove the item most recently added.
Queue: Remove the item least recently added.
Operations on Queue:
Enqueue: Adds an item to the queue. If the queue is full, it is said to be an Overflow
condition.
Dequeue: Removes an item from the queue. Items are removed in the same order in which
they were added. If the queue is empty, it is said to be an Underflow condition.
Front: Gets the front item from the queue.
Rear: Gets the last item from the queue.
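The queue operations can be sketched with a circular array, so that neither enqueue nor dequeue has to shift elements. QUEUE_CAP and all names are assumptions of this example; overflow and underflow are reported through the return value.

```c
#include <stdbool.h>

#define QUEUE_CAP 4   // hypothetical fixed capacity

struct queue {
    int items[QUEUE_CAP];
    int front;        // index of the front item
    int count;        // number of items currently stored
};

void queue_init(struct queue *q)
{
    q->front = 0;
    q->count = 0;
}

// Returns false on Overflow (queue already full).
bool queue_enqueue(struct queue *q, int x)
{
    if (q->count == QUEUE_CAP)
        return false;
    q->items[(q->front + q->count) % QUEUE_CAP] = x;   // slot after the rear
    q->count++;
    return true;
}

// Returns false on Underflow (queue empty).
bool queue_dequeue(struct queue *q, int *out)
{
    if (q->count == 0)
        return false;
    *out = q->items[q->front];
    q->front = (q->front + 1) % QUEUE_CAP;   // wrap around the array end
    q->count--;
    return true;
}
```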
Linked Lists
Linked List is a linear data structure. Unlike arrays, linked list elements are not stored at
contiguous locations; the elements are linked using pointers.
Advantages over arrays: 1) Dynamic size 2) Ease of insertion/deletion
Drawbacks: 1) Random
access is not allowed. We have to access elements sequentially starting from the first node. So
we cannot do binary search with linked lists. 2) Extra memory space for a pointer is required
with each element of the list.
Representation in C: A linked list is represented by a pointer to the
first node of the linked list. The first node is called the head. If the linked list is empty, then the value of
head is NULL. Each node in a list consists of at least two parts: 1) data 2) pointer to the next
node. In C, we can represent a node using structures. Below is an example of a linked list node
with integer data.
// A linked list node
struct node
{
    int data;
    struct node *next;
};
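Building on the node structure above, the following sketch shows insertion at the head and a sequential traversal. The struct is repeated so that the block compiles on its own; push_front, list_length and list_free are hypothetical helper names.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

// Inserts a new node at the front and returns the new head.
// If malloc fails, the list is returned unchanged.
struct node *push_front(struct node *head, int data)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->data = data;
    n->next = head;
    return n;
}

// Walks the list sequentially from the head; an empty list has head == NULL.
int list_length(const struct node *head)
{
    int len = 0;
    for (; head != NULL; head = head->next)
        len++;
    return len;
}

// Releases every node of the list.
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Insertion at the head takes constant time, while reaching the k-th element requires following k pointers, which reflects the lack of random access mentioned above.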