
COMPUTER AND NETWORK SECURITY NOTES

BCA 6TH SEMESTER

UNIT 1

The meaning of computer security

The meaning of the term computer security has evolved in recent years. Before the problem
of data security became widely publicized in the media, most people’s idea of computer
security focused on the physical machine. Traditionally, computer facilities have been
physically protected for three reasons:

To prevent theft of or damage to the hardware

To prevent theft of or damage to the information

To prevent disruption of service

The field covers all the processes and mechanisms by which digital equipment, information
and services are protected from unintended or unauthorized access, change or destruction,
and are of growing importance in line with the increasing reliance on computer systems of
most societies worldwide. It includes physical security to prevent theft of equipment, and
information security to protect the data on that equipment. It is sometimes referred to as
"cyber security" or "IT security", though these terms generally do not refer to physical
security (locks and such).

Some important terms used in computer security are:

Vulnerability

Vulnerability is a weakness which allows an attacker to reduce a system's information assurance. Vulnerability is the intersection of three elements: a system susceptibility or flaw, attacker access to the flaw, and attacker capability to exploit the flaw. To exploit a vulnerability, an attacker must have at least one applicable tool or technique that can connect to a system weakness. In this frame, vulnerability is also known as the attack surface.

Vulnerability management is the cyclical practice of identifying, classifying, remediating, and mitigating vulnerabilities. This practice generally refers to software vulnerabilities in computing systems.

Backdoors

A backdoor in a computer system is a method of bypassing normal authentication, securing remote access to a computer, obtaining access to plaintext, and so on, while attempting to remain undetected.
The backdoor may take the form of an installed program (e.g., Back Orifice), or could be a modification to an existing program or hardware device. It may also fake information about disk and memory usage.

Denial-of-service attack

Unlike other exploits, denial-of-service attacks are not used to gain unauthorized access or control of a system. They are instead designed to render it unusable. Attackers can deny service to individual victims, such as by deliberately entering a wrong password enough consecutive times to cause the victim's account to be locked, or they may overload the capabilities of a machine or network and block all users at once. These types of attack are, in practice, very hard to prevent, because the behaviour of whole networks needs to be analyzed, not only the behaviour of small pieces of code. Distributed denial-of-service (DDoS) attacks are common, where a large number of compromised hosts (commonly referred to as "zombie computers", used as part of a botnet controlled with, for example, a worm, trojan horse, or backdoor exploit) are used to flood a target system with network requests, thus attempting to render it unusable through resource exhaustion.
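The account-lockout denial of service described above can be sketched as a toy model; the policy, names, and limits here are hypothetical, not drawn from any real system:

```python
class Account:
    """Toy model of an account-lockout policy (limits are illustrative only)."""
    MAX_ATTEMPTS = 3

    def __init__(self, password):
        self.password = password
        self.failed = 0
        self.locked = False

    def login(self, attempt):
        if self.locked:
            return "locked"
        if attempt == self.password:
            self.failed = 0
            return "ok"
        self.failed += 1
        if self.failed >= self.MAX_ATTEMPTS:
            self.locked = True   # the lockout itself becomes the denial of service
        return "denied"

victim = Account("s3cret")
for _ in range(Account.MAX_ATTEMPTS):
    victim.login("deliberately-wrong")   # attacker triggers the lockout

assert victim.locked
assert victim.login("s3cret") == "locked"   # the legitimate user is now shut out
```

The sketch shows the trade-off the text alludes to: a policy meant to stop password guessing hands an attacker a trivial way to lock out any user whose name is known.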

Direct-access attacks

An unauthorized user gaining physical access to a computer (or part thereof) can perform many functions and install different types of devices to compromise security, including operating system modifications, software worms, key loggers, and covert listening devices. The attacker can also easily download large quantities of data onto backup media such as CD-R/DVD-R or tape, or onto portable devices such as key drives, digital cameras, or digital audio players. Another common technique is to boot an operating system contained on a CD-ROM or other bootable media and read the data from the hard drive(s) this way. The only way to defeat this is to encrypt the storage media and store the key separately from the system. Direct-access attacks are, in most cases, the only type of threat to standalone computers (which never connect to the Internet).

Eavesdropping

Eavesdropping is the act of surreptitiously listening to a private conversation, typically between hosts on a network. For instance, programs such as Carnivore and NarusInsight have been used by the FBI and NSA to eavesdrop on the systems of internet service providers.

Spoofing

Spoofing of user identity describes a situation in which one person or program successfully
masquerades as another by falsifying data and thereby gaining an illegitimate advantage.

Tampering

Tampering describes an intentional modification of products in a way that would make them
harmful to the consumer.
Principles of security

The principles of security are as follows:

Confidentiality:

The principle of confidentiality specifies that only the sender and the intended recipient should be able to access the contents of the message. Confidentiality is lost if, for example, confidential information sent by A to B is also accessed by C without the permission or knowledge of A and B.

Integrity:

The principle of integrity specifies that the contents of a message must not be modified between sender and recipient; if C alters a message sent by A before it reaches B, integrity is lost.
Authentication:
Authentication mechanism helps in establishing proof of identification.

Non-repudiation:
Non-repudiation ensures that the sender of a message cannot later deny having sent it.

Access control:
Access control specifies and controls who can access what.

Availability:
It means that assets are accessible to authorized parties at appropriate times.

Attacks
We want our security system to ensure that:

No data is disclosed to unauthorized parties.

Data is not modified in illegitimate ways.

Legitimate users can access the data.

Types of attacks

Attacks are grouped into two types:

Passive attacks: do not involve any modification to the contents of the original message.

Active attacks: the contents of the original message are modified in some way.
Repudiation

Repudiation describes a situation where the authenticity of a signature is being challenged.


Teardrop Attack

Teardrop is a program that sends IP fragments to a machine connected to the Internet or a network. Teardrop exploits an overlapping IP fragment bug present in Windows 95, Windows NT, and Windows 3.1 machines. The bug causes the TCP/IP fragmentation reassembly code to improperly handle overlapping IP fragments. This attack has not been shown to cause any significant damage to systems, and a simple reboot is the preferred remedy. It should be noted, though, that while this attack is considered non-destructive, it could cause problems if there is unsaved data in open applications at the time the machine is attacked; the primary problem is then the loss of that data.
Symptoms of Attack
When a Teardrop attack is run against a machine, it will crash (on Windows machines, a user
will likely experience the Blue Screen of Death), or reboot. If you have protected yourself
from the winnuke and ssping DoS attacks and you still crash, then the mode of attack is
probably teardrop or land. If you are using IRC, and your machine becomes disconnected
from the network or Internet, but does not crash, the mode of attack is probably

Information disclosure

Information disclosure (privacy breach or data leak) describes a situation where information thought to be secure is released into an untrusted environment.

Elevation of privilege

Elevation of privilege describes a situation where a person or a program wants to gain elevated privileges or access to resources that are normally restricted to it.

Exploits

An exploit is a piece of software, a chunk of data, or a sequence of commands that takes advantage of a software "bug" or "glitch" in order to cause unintended or unanticipated behaviour to occur on computer software, hardware, or something electronic (usually computerized). This frequently includes such things as gaining control of a computer system or allowing privilege escalation or a denial-of-service attack. The term "exploit" generally refers to small programs designed to take advantage of a software flaw that has been discovered, either remote or local. The code from the exploit program is frequently reused in Trojan horses and computer viruses.

Indirect attacks

An indirect attack is an attack launched by a third-party computer. By using someone else's computer to launch an attack, it becomes far more difficult to track down the actual attacker. There have also been cases where attackers took advantage of public anonymizing systems, such as the Tor onion router.

Computer crime: Computer crime refers to any crime that involves a computer and a
network.
Threats, Vulnerabilities, and Attacks

Now that we have reviewed some of the TCP/IP basics, we can proceed in our discussion of
threats, vulnerabilities, and attacks. It is important to understand the difference between a
threat, vulnerability, or an attack in the context of network security.
The terms threat and attack are commonly used to mean more or less the same thing. The actual definitions are:

Threat: A potential for violation of security, which exists when there is a circumstance, capability, action, or event that could breach security and cause harm. That is, a threat is a possible danger that might exploit a vulnerability.

Threats

A threat is anything that can disrupt the operation, functioning, integrity, or availability of a
network or system. This can take any form and can be malevolent, accidental, or simply an
act of nature.

Attack: An assault on system security that derives from an intelligent threat; that is, an intelligent act that is a deliberate attempt (especially in the sense of a method or technique) to evade security services and violate the security policy of a system.

Security Attacks:
Security attacks, a term used in both X.800 and RFC 2828, are classified as passive attacks and
active attacks.
A passive attack attempts to learn or make use of information from the system but does not
affect system resources.
An active attack attempts to alter system resources or affect their operation.

Passive Attacks:
Passive attacks are in the nature of eavesdropping on, or monitoring of, transmissions. The
goal of the opponent is to obtain information that is being transmitted. Two types of passive
attacks are release of message contents and traffic analysis.
The release of message contents is easily understood. A telephone conversation, an electronic mail message, or a transferred file may contain sensitive or confidential information. We would like to prevent an opponent from learning the contents of these transmissions.

A second type of passive attack, traffic analysis, is subtler (Figure 1.3b). Suppose that we had
a way of masking the contents of messages or other information traffic so that opponents,
even if they captured the message, could not extract the information from the message. The
common technique for masking contents is encryption. If we had encryption protection in
place, an opponent might still be able to observe the pattern of these messages. The opponent could determine the location and identity of communicating hosts and could observe the frequency and length of messages being exchanged. This information might be useful in guessing the nature of the communication that was taking place.
Passive attacks are very difficult to detect because they do not involve any alteration of the
data. Typically, the message traffic is sent and received in an apparently normal fashion and
neither the sender nor receiver is aware that a third party has read the messages or observed
the traffic pattern. However, it is feasible to prevent the success of these attacks, usually by
means of encryption. Thus, the emphasis in dealing with passive attacks is on prevention
rather than detection.
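The role of encryption here can be illustrated with a toy one-time-pad cipher built from Python's standard library (a teaching sketch, not the document's own example): the ciphertext reveals nothing about the contents, yet its length still leaks the length of the message, which is exactly the kind of pattern traffic analysis exploits.

```python
import secrets

def encrypt(plaintext):
    """One-time pad: XOR the message with a fresh random key of equal length."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return ciphertext, key

def decrypt(ciphertext, key):
    """XOR again with the same key to recover the original bytes."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))

msg = b"Allow John Smith to read confidential file accounts"
ct, key = encrypt(msg)

assert decrypt(ct, key) == msg   # the intended recipient recovers the contents
assert len(ct) == len(msg)       # but the message LENGTH is still observable
```

This is why the text says encryption prevents release of message contents but only mitigates, not eliminates, traffic analysis.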
Active Attacks:
Active attacks involve some modification of the data stream or the creation of a false stream
and can be subdivided into four categories: Masquerade, Replay, Modification of messages,
and Denial of service

A masquerade takes place when one entity pretends to be a different entity (Figure 1.4a). A
masquerade attack usually includes one of the other forms of active attack. For example,
authentication sequences can be captured and replayed after a valid authentication sequence
has taken place, thus enabling an authorized entity with few privileges to obtain extra
privileges by impersonating an entity that has those privileges.
Replay involves the passive capture of a data unit and its subsequent retransmission to
produce an unauthorized effect (Figure 1.4b).
Modification of messages simply means that some portion of a legitimate message is altered, or that messages are delayed or reordered, to produce an unauthorized effect. For example, a message meaning "Allow John Smith to read confidential file accounts" is modified to mean "Allow Fred Brown to read confidential file accounts."
The denial of service prevents or inhibits the normal use or management of communications
facilities (Figure 1.4d). This attack may have a specific target; for example, an entity may
suppress all messages directed to a particular destination (e.g., the security audit service).
Another form of service denial is the disruption of an entire network, either by disabling the
network or by overloading it with messages so as to degrade performance.

The differences between passive and active attacks are summarized as follows:

Passive Attacks:

1. Very difficult to detect, but measures are available to prevent their success.

2. The attacker merely needs to be able to observe transmissions.

3. The entity is unaware of the attack.

4. Do not involve any modification of the contents of the original message.

Active Attacks:

1. Very easy to detect, but very difficult to prevent.

2. The attacker needs to gain physical control of a portion of the link and be able to insert and capture transmissions.

3. The entity becomes aware of the attack when attacked.

4. Involve modification of the contents of the original message. The attacks may be masquerade, modification, replay, or DoS.

Security Services:
1. AUTHENTICATION: The assurance that the communicating entity is the one that it
claims to be.
_ Peer Entity Authentication: Used in association with a logical connection to provide
confidence in the identity of the entities connected.
_ Data Origin Authentication: In a connectionless transfer, provides assurance that the source
of received data is as claimed.
2. ACCESS CONTROL: The prevention of unauthorized use of a resource (i.e., this service
controls who can have access to a resource, under what conditions access can occur, and what
those accessing the resource are allowed to do).
3. DATA CONFIDENTIALITY: The protection of data from unauthorized disclosure.
_ Connection Confidentiality: The protection of all user data on a connection.
_ Connectionless Confidentiality: The protection of all user data in a single data block.
_ Selective-Field Confidentiality: The confidentiality of selected fields within the user data on a connection or in a single data block.
_ Traffic Flow Confidentiality: The protection of the information that might be derived from observation of traffic flows.
4. DATA INTEGRITY: The assurance that data received are exactly as sent by an authorized
entity (i.e., contain no modification, insertion, deletion, or replay).
_ Connection Integrity with Recovery: Provides for the integrity of all user data on a
connection and detects any modification, insertion, deletion, or replay of any data within an
entire data sequence, with recovery attempted.
_ Connection Integrity without Recovery: As above, but provides only detection without
recovery.
_ Selective-Field Connection Integrity: Provides for the integrity of selected fields within the
user data of a data block transferred over a connection and takes the form of determination of
whether the selected fields have been modified, inserted, deleted, or replayed.
_ Connectionless Integrity: Provides for the integrity of a single connectionless data
block and may take the form of detection of data modification. Additionally, a limited form of
replay detection may be provided.
_ Selective-Field Connectionless Integrity: Provides for the integrity of selected fields within
a single connectionless data block; takes the form of determination of whether the selected
fields have been modified.
5. NONREPUDIATION: Provides protection against denial by one of the entities involved
in a communication of having participated in all or part of the communication.
_ Nonrepudiation, Origin: Proof that the message was sent by the specified party.
_ Nonrepudiation, Destination: Proof that the message was received by the specified party.
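As a small illustration of the data integrity service above, a message authentication code (MAC) such as HMAC lets a receiver who shares a key with the sender detect any modification of a message. This is a sketch, not the mechanism X.800 mandates; note also that a MAC alone does not provide nonrepudiation, since either keyholder could have produced the tag — that requires digital signatures.

```python
import hmac
import hashlib

key = b"shared-secret"  # known only to sender and receiver (illustrative value)

def tag(message):
    """Compute an integrity tag (HMAC-SHA256) over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

message = b"Allow John Smith to read confidential file accounts"
mac = tag(message)

# The receiver recomputes the tag; any modification, insertion, or deletion
# changes the digest and is detected.
assert hmac.compare_digest(tag(message), mac)

tampered = b"Allow Fred Brown to read confidential file accounts"
assert not hmac.compare_digest(tag(tampered), mac)
```

The tampered message here is deliberately the example used earlier for "modification of messages": the MAC turns that active attack from undetectable to detectable.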

Vulnerabilities

A vulnerability is an inherent weakness in the design, configuration, implementation, or management of a network or system that renders it susceptible to a threat. Vulnerabilities are what make networks susceptible to information loss and downtime. Every network and system has some kind of vulnerability.

Attacks

An attack is a specific technique used to exploit a vulnerability. For example, the threat could be denial of service, the vulnerability could be a flaw in the design of the operating system, and the attack could be a "ping of death." There are two general categories of attacks, passive and active.
Passive attacks are very difficult to detect, because there is no overt activity that can be
monitored or detected. Examples of passive attacks would be packet sniffing or traffic
analysis. These types of attacks are designed to monitor and record traffic on the network.
They are usually employed for gathering information that can be used later in active attacks.

Active attacks, as the name implies, employ more overt actions on the network or system. As
a result, they can be easier to detect, but at the same time they can be much more devastating
to a network. Examples of this type of attack would be a denial-of-service attack or active
probing of systems and networks.

Networks and systems face many types of threats. There are viruses, worms, Trojan horses,
trap doors, spoofs, masquerades, replays, password cracking, social engineering, scanning,
sniffing, war dialing, denial-of-service attacks, and other protocol-based attacks. It seems new
types of threats are being developed every month. The following sections review the general
types of threats that network administrators face every day, including specific descriptions of
a few of the more widely known attacks.

Viruses

According to Computer Economics, Inc., a computer research and analysis group, over $12 billion was spent worldwide in 1999 as a result of computer viruses. A virus, a parasitic program that cannot function independently, is a program or code fragment that is self-propagating. It is called a virus because, like its biological counterpart, it requires a "host" to function. In the case of a computer virus the host is some other program to which the virus attaches itself. A virus is usually spread by executing an infected program or by sending an infected file to someone else, usually in the form of an e-mail attachment.
There are several virus scanning programs available on the market. Most are effective against
known viruses. Unfortunately, however, they are incapable of recognizing and adapting to
new viruses.
In general, virus scanning programs rely on recognizing the "signature" of known viruses,
turning to a database of known virus signatures that they use to compare against scanning
results. The program detects a virus when a match is found. If the database is not regularly
updated the virus scanner can become obsolete quickly. As one would expect, there is usually
some lag time between the introduction of a new virus and a vendor updating its database.
Invariably, someone always has the dubious distinction of being one of the early victims of a newly released virus.
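The signature-matching idea can be sketched as follows. The "signatures" below are invented for illustration and do not correspond to any real virus database; real scanners use far more sophisticated matching, but the core lookup works like this:

```python
# Hypothetical signature database: virus name -> known byte pattern.
SIGNATURE_DB = {
    "DemoVirus-A": b"\xde\xad\xbe\xef",
    "DemoVirus-B": b"EVIL_PAYLOAD",
}

def scan(data):
    """Return the names of all known signatures found in the data."""
    return [name for name, sig in SIGNATURE_DB.items() if sig in data]

assert scan(b"a perfectly ordinary file") == []          # no match: reported clean
assert scan(b"junk\xde\xad\xbe\xefjunk") == ["DemoVirus-A"]
```

The sketch also makes the text's limitation concrete: a brand-new virus whose pattern is not yet in SIGNATURE_DB is reported clean, which is why the database must be updated regularly.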

Worm

A worm is a self-contained and independent program that is usually designed to propagate or spawn itself on infected systems and to seek other systems via available networks. The main difference between a virus and a worm is that a virus is not an independent program.
However, there are new breeds of computer bugs that are blurring the difference between
viruses and worms. The Melissa virus is an example of this new hybrid. In 1999 the Melissa
virus attacked many users of Microsoft products. It was spread as an attachment, but the virus
spread as an active process initiated by the virus. It was not a passive virus passed along by
unsuspecting users.
One of the first and perhaps the most famous worms was the Internet Worm created and
released by Robert Morris. In 1988, Morris wrote his worm program and released it onto the
Internet. The worm's functioning was relatively benign, but it still had a devastating effect on
the Internet. The worm was designed to simply reproduce and infect other systems. Once
released, the program would spawn another process. The other process was simply another
running copy of the program. Then the program would search out other systems connected to
the infected system and propagate itself onto the other systems on the network. The number of
processes running grew geometrically. An example below illustrates how the Internet worm
grew and spread: One process spawned to become two processes. Two processes spawned to
become four processes. Four processes spawned to become eight. It didn't take very long for
the spawning processes to consume all the CPU and memory resources until the system
crashed. In addition, each time the processes spawned another, the processes would seek
outside connections. The worm was designed to propagate, seek out other systems to infect
them, and then repeat the process. Stopping the processes from growing was a simple matter
of rebooting the system. However, system administrators found that they would reboot their
systems and get them functioning again only to find them being reinfected by another system
on the Internet. To stop the worm from reinfecting systems on the network, all of the systems
had to be shut down at the same time or taken off-line. The cost to clean up the Internet worm
was estimated to be in the tens of millions of dollars. Morris was arrested, prosecuted, and
convicted for his vandalism.
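The geometric growth described above is easy to quantify. This small sketch (the resource limit is an arbitrary illustrative number, not a property of any real system) counts how many spawning rounds it takes for the process count to exceed it:

```python
def generations_to_exhaust(start=1, limit=100_000):
    """Count doubling rounds until the process count exceeds the limit."""
    count, rounds = start, 0
    while count <= limit:
        count *= 2          # each running process spawns one copy of itself
        rounds += 1
    return rounds, count

rounds, count = generations_to_exhaust()
assert rounds == 17 and count == 131072   # 2**17 = 131,072 > 100,000
```

Seventeen doublings from a single process is enough to pass 100,000 processes, which is why, as the text notes, it "didn't take very long" for spawning to consume all CPU and memory.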
Trojan Horses

A Trojan horse is a program or code fragment that hides inside a program and performs a
disguised function. This type of threat gets its name from Greek mythology and the story of
the siege of Troy. The story tells of how Odysseus and his men conquered Troy by hiding
within a giant wooden horse. A Trojan horse program hides within another program or
disguises itself as a legitimate program. This can be accomplished by modifying the existing
program or by simply replacing the existing program with a new one. The Trojan horse
program functions much the same way as the legitimate program, but usually it also performs some
other function, such as recording sensitive information or providing a trap door.

An example would be a password grabber program. A password grabber is a program designed to look and function like the normal login prompt that a user sees when first accessing a system. For example, in the screen depicted in the figure, the user has entered the username john and the correct password. However, the system tells the user that the login is incorrect. When the user tries again it works and he or she is able to log on.

Figure: Trojan horse login.
In this example a Trojan horse designed to steal passwords is actually controlling the
interaction. The standard login.exe has been replaced with a Trojan horse program. It looks
like the standard login prompt, but what is actually occurring is that the first login prompt is the Trojan horse. When the username and password are entered, that information is recorded and stored. Then the Trojan horse program displays the "login incorrect" message and passes the user off to the real login program, so that he or she can actually log on to the system. The user simply assumes that he or she mistyped the password the first time, never knowing that his or her username and password have just been stolen.
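One standard defense against this kind of binary replacement is file integrity checking: record a cryptographic hash of each trusted program while the system is known-good, then recheck periodically. A minimal sketch follows; the baseline-recording workflow is an assumption for illustration, not something the text prescribes:

```python
import hashlib

def file_fingerprint(path):
    """SHA-256 hash of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def is_modified(path, known_good):
    """Compare a file against a hash recorded when the system was trusted."""
    return file_fingerprint(path) != known_good
```

In practice the baseline hashes (e.g., for the login program) must be stored off the system being checked, so that an attacker who replaces the binary cannot also rewrite the baseline.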

Trap Doors

A trap door or back door is an undocumented way of gaining access to a system that is built
into the system by its designer(s). It can also be a program that has been altered to allow
someone to gain privileged access to a system or process.

There have been numerous stories of vendors utilizing trap doors in disputes with customers.
One example is the story of a consultant who was contracted to build a system for a company.
The consultant designed a trap door into the delivered system. When the consultant and the
company got into a dispute over payment, the consultant used the trap door to gain access to
the system and disable the system. The company was forced to pay the consultant to get its
system turned back on again.
Logic Bombs

A logic bomb is a program or subsection of a program designed with malevolent intent. It is referred to as a logic bomb because the program is triggered when certain logical conditions are met. This type of attack is almost always perpetrated by an insider with privileged access to the network. The perpetrator could be a programmer or a vendor that supplies software.

As an example, I once heard a story about a programmer at a large corporation who engineered this type of attack. Apparently, the programmer had been having some trouble at
the company at which he worked and was on probation. Fearing that he might be fired and
with vengeance in mind, he added a subroutine to another program. The subroutine was
added to a program that ran once a month and was designed to scan the company's human
resources employee database to determine if a termination date had been loaded for his
employee record. If the subroutine found that a termination date had been loaded, then it was
designed to wipe out the entire system by deleting all files on the disk drives. The program
ran every month and so long as his employee record did not have a termination date then
nothing would happen. In other words, if he were not fired the program would do no damage.

Sure enough this stellar employee was fired, and the next time the logic bomb that he created
ran it found a termination date in his employee record and wiped out the system. This is an
example of how simple it can be, for one with privileged access to a system, to set up this
type of attack.

Port Scanning

Like a burglar casing a target to plan a break-in, a hacker will often case a system to gather information that can later be used to attack the system. One of the tools that hackers often use for this type of reconnaissance is a port scanner. A port scanner is a program that probes well-known port numbers on a system to detect services running there that can be exploited to break into the system.

There are several port-scanning programs available on the Internet at various sites. They are
not difficult to find. Organizations can monitor their system log files to detect port scanning
as a prelude to an attack. Most intrusion detection software monitors for port scanning. If you
find that your system is being scanned you can trace the scan back to its origination point and
perhaps take some pre-emptive action. However, some scanning programs take a more
stealthy approach to scanning that is very difficult to detect. For example, some programs use
a SYN scan, which employs a SYN packet to create a half-open connection that doesn't get
logged. SYN packets and half-open connections will be detailed later in this chapter.
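A minimal "connect" scanner — the noisy kind contrasted above with stealthy SYN scans, which need raw sockets and privileges — can be sketched with Python's standard library. Host and ports are parameters; nothing here is specific to any real target:

```python
import socket

def scan_ports(host, ports, timeout=0.5):
    """Attempt a full TCP connection to each port; open ports accept it."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:  # 0 means the connect succeeded
                open_ports.append(port)
    return open_ports
```

Because each probe completes the full TCP handshake, the target's services log the connection — which is exactly why this technique is easy to detect and why attackers prefer half-open SYN scans.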

Spoofs

Spoofs cover a broad category of threats. In general terms, a spoof entails falsifying one's identity or masquerading as some other individual or entity to gain access to a system or network or to gain information for some other unauthorized purpose. There are many different kinds of spoofs, including, among many others, IP address spoofing, session hijacking, domain name service (DNS) spoofing, sequence number spoofing, and replay attacks.

IP Address Spoofing
Every device on a TCP/IP network has a unique IP address. The IP address is a unique
identification of the device, and no two devices on the network can have the same IP address.
IP addresses are formatted as four decimal numbers separated by dots (e.g., 147.34.28.103).
IP address spoofing takes advantage of systems and networks that rely on the IP address of
the connecting system or device for authentication. For example, packet-filtering routers are
sometimes used to protect an internal network from an external untrusted network. These
routers will only allow specified IP addresses to pass from the external network to the internal
network. If a hacker is able to determine an IP address that is permitted access through the
router, he or she can spoof the address on the external network to gain access to the internal
network. The hacker in effect masquerades as someone else.
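Why address-based authentication is weak can be shown in miniature: the filter below trusts whatever source address a packet claims, and the claimed source is exactly the field an attacker forges. The addresses are illustrative (one follows the text's example format; the other is from the RFC 5737 documentation range):

```python
# Addresses the packet-filtering router permits through (illustrative).
ALLOWED_SOURCES = {"147.34.28.103"}

def permit(claimed_source_ip):
    """IP-based authentication: trust the source address the packet claims."""
    return claimed_source_ip in ALLOWED_SOURCES

assert not permit("198.51.100.7")    # an outsider's real address is blocked...
assert permit("147.34.28.103")       # ...but a forged packet CLAIMING a trusted
                                     # address passes the same check
```

The filter cannot distinguish a genuine trusted host from a spoofer, because the only evidence it examines is attacker-controlled.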

Sequence Number Spoofing

TCP/IP network connections use sequence numbers. The sequence numbers are part of each
transmission and are exchanged with each transaction. The sequence number is based upon
each computer's internal clock, and the number is predictable because it is based on a set
algorithm.

By monitoring a network connection, a hacker can record the exchange of sequence numbers
and predict the next set of sequence numbers. With this information, a hacker can insert
himself or herself into the network connection and, effectively, take over the connection or
insert misinformation.
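A sketch of why clock-driven sequence numbers are predictable: the 4-microsecond increment follows the suggestion in RFC 793, but the rest is a simplified model for illustration, not the behaviour of any real TCP stack.

```python
def weak_isn(clock_us):
    """Hypothetical weak initial sequence number: a deterministic function of
    time, incrementing once every 4 microseconds (per the RFC 793 suggestion)."""
    return (clock_us // 4) % 2**32

# An observer who samples one ISN can predict a later one exactly, because
# the generator depends only on elapsed time.
t0 = 1_000_000                        # microseconds (arbitrary starting clock)
isn_now = weak_isn(t0)
isn_later = weak_isn(t0 + 400)        # a connection opened 400 microseconds later

predicted = (isn_now + 400 // 4) % 2**32
assert predicted == isn_later
```

Modern stacks randomize initial sequence numbers precisely to defeat this prediction; encrypting the connection, as the text recommends, also hides the numbers from the observer in the first place.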

The best defense against sequence number spoofing is to encrypt a connection. Encrypting a
connection prevents anyone who may be monitoring the network from being able to
determine the sequence numbers or any other useful information.

Session Hijacking

Session hijacking is similar to sequence number spoofing. In this process, a hacker takes over a connection session, usually between a client user and a server. This is generally done by gaining access to a router or some other network device acting as a gateway between the legitimate user and the server and utilizing IP spoofing. Since session hijacking usually requires the hacker to gain privileged access to a network device, the best defense is to properly secure all devices on the network.

DNS

Domain Name Service (DNS) is a hierarchical name service used with TCP/IP hosts that is distributed and replicated on servers across the Internet. It is used on the Internet and on intranets for translating host names into IP addresses. The host names can be used in URLs.
DNS can be thought of as a lookup table that allows users to specify remote computers by
host names rather than their IP addresses. The advantage of DNS is that you don't have to
know the IP addresses for all the Internet sites to access the sites. DNS can be configured to
use a sequence of name servers, based on the domains in the name being sought, until a match
is found. The most commonly deployed DNS server software on the Internet is BIND. DNS is
subject to several different spoofs. Two common ones are the man in the middle (MIM) and
DNS poisoning. Redirects, another less common attack, rely on the manipulation of the
domain name registry itself to redirect a URL.
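The lookup-table view of DNS corresponds directly to the standard resolver call; a minimal example using Python's standard library (the system's configured resolver does the work):

```python
import socket

def resolve(hostname):
    """Ask the system's configured resolver for the host's IPv4 address."""
    return socket.gethostbyname(hostname)

# "localhost" is defined on essentially every host, so this works without
# any network access; typically it resolves to 127.0.0.1.
print(resolve("localhost"))
```

Every spoof described below works by corrupting some step of this lookup, so the caller receives an attacker-chosen address while believing it asked a trustworthy table.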
Man in the Middle Attack (MIM)

In a MIM attack, a hacker inserts himself or herself between a client program and a server on
a network. By doing so the hacker can intercept information entered by the client, such as
credit card numbers, passwords, and account information. Under one execution of this
scheme, a hacker would place himself or herself between a browser and a Web server. The
MIM attack, which is also sometimes called Web spoofing, is usually achieved by DNS or
hyperlink spoofing.

There are several ways a hacker can launch a MIM attack. One way is to register a URL that is very similar to an existing URL, for example a common misspelling of www.microsoft.com. When someone who wants to go to the Microsoft Web site mistypes the address, they are brought to a Web site set up by the hacker to look like the Microsoft Web site. Figure 2.5 illustrates how the process works.

To Web surfers everything would look normal. They would interact with the counterfeit Web
site just as they would with the real site. As the Web surfer enters in choices and information
the hacker's Web site can even pass it onto the real site and pass back to the Web surfer the
screens that the real site returns.
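That pass-through behaviour can be sketched as a relay function that records whatever the client submits before forwarding it to the genuine server (the server here is a stand-in function, not a real web site):

```python
# Sketch of a man-in-the-middle relay: the hacker's site forwards each
# request to the real server and returns the real response, while
# quietly logging the client's input (all names are hypothetical).
captured = []

def real_server(request):
    # Stand-in for the legitimate site.
    return "welcome page for " + request

def mim_relay(request):
    captured.append(request)      # attacker records the input
    return real_server(request)   # client sees the genuine response

response = mim_relay("card=4111-1111-1111-1111")
```

Because the client receives exactly what the real site would have sent, nothing looks wrong from the browser's side.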

DNS Poisoning

Another method that can be used to launch this attack is to compromise a DNS server. One
method for doing so is known as DNS poisoning. DNS poisoning exploits a vulnerability in
early versions of the Berkeley Internet Name Daemon (BIND). BIND, the most commonly
deployed DNS software on the Internet, was developed for BSD UNIX. A network of Internet
BIND servers translates native Internet IP addresses to the commonly used names such as
www.ggu.edu for Golden Gate University. Prior to version 8.1 of BIND, it was possible to
"poison" the table entries of a DNS server with false information.

The information could include a false IP address for a DNS entry in the server's table. The
result could be that when someone used that DNS server to "resolve" the URL name, he or
she would be directed to the incorrect IP address.

By compromising a DNS server, a hacker can make a legitimate URL point to the hacker's
Web site. The Web surfer might enter in www.amazon.com expecting to go to the
Amazon.com Web site to purchase a book. The URL www.amazon.com normally points to
xxx.xxx.xxx.xxx, but the hacker has compromised a DNS server to point that URL to his or
her server. As a result, the Web surfer is brought to the hacker's site and not to Amazon.com.
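A toy model of the poisoning itself, assuming a resolver whose cache is a simple table (both addresses below are invented for illustration):

```python
# Sketch of DNS cache poisoning: the resolver's table is overwritten so
# a legitimate name points at the attacker's address.
dns_table = {"www.amazon.com": "205.251.242.103"}   # invented "real" IP

def resolve(name):
    return dns_table.get(name)

legit = resolve("www.amazon.com")
dns_table["www.amazon.com"] = "198.51.100.66"       # poisoned entry
poisoned = resolve("www.amazon.com")
```

Every client that trusts this resolver is now silently directed to the attacker's server.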
Redirects
Under another method of DNS attack, hackers compromise a link on someone else's page or
set up their own page with false links. In either case, the link could state that it is for a
legitimate site, but in reality the link brings the Web surfer to a site set up and controlled by
the hacker that looks like the site the Web surfer was expecting.

If all other attempts fail, a hacker can try manipulating the domain name registry system
originally maintained by the InterNIC. In 1999, on at least three occasions, hackers were able
to transfer domain names or redirect Internet surfers to sites other than the ones they were
attempting to access. In one case Network Solutions' own DNS entry was altered, so that
when users entered in the Network Solutions URL they were redirected to another site.

In at least three other cases hackers were able to transfer ownership of domain names to other
IP addresses. Once the ownership was transferred and the NSI database altered, anyone
attempting to access those domains would be redirected to the new sites. In one case the
domain for excite.com was transferred to an unsuspecting site that found itself inundated with
the millions of hits that excite.com normally receives. In other cases the ownership of the
domains for the Ku Klux Klan and another site opposed to homosexuality called
godhatesfags.com were transferred. Ownership of the Ku Klux Klan site was transferred to a
site dedicated to fighting bigotry. Ironically, the godhatesfags.com domain was transferred to
a site with the domain godlovesfags.com, a site that went on-line to appeal for tolerance. No
individuals from the sites to which the domain were redirected were involved with the
manipulation of the domain name registry system.

When employing the MIM attack, a hacker's false or counterfeit site can actually pass the
client's requests onto the real site and return to the client the requested pages from the real
site. All the while the hacker is monitoring and recording the interaction between the client
and the server.

There is really no effective countermeasure to MIM. This attack can even be successful when
encryption, such as SSL, is being employed; it only requires the hacker to obtain a valid
digital certificate to load on his or her server so that SSL can be enabled. Web surfers need
only be careful about where they browse, confirming links and trusting only links from a
secure and trusted site.

Note that there are other methods to execute a redirect or MIM attack. For example, certain
operating systems such as Microsoft's Windows 95, 98, and 2000 and Sun's Solaris have an
inherent vulnerability in their implementation of the ICMP Router Discovery Protocol
(IRDP); the Internet Control Message Protocol (ICMP) is an integral part of the TCP/IP
protocol suite. Hackers can exploit this vulnerability by rerouting or modifying outbound
traffic as they choose. A key limitation on an attack using this vulnerability is that the
attacker must be on the same network as the targeted system.

Replay Attack

A hacker executes a replay attack by intercepting and storing a legitimate transmission
between two systems and retransmitting it at a later time. Theoretically, this attack can even
be successful against encrypted transmissions. The best defense against this attack is to use
session keys, check the time stamp on all transmissions, and employ time-dependent message
digests.
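Two of those defenses, time stamps and one-time values (nonces), can be sketched as follows; the 30-second freshness window and the message fields are illustrative choices, not a standard:

```python
# Sketch of replay detection: reject a message whose timestamp is too
# old or whose nonce has already been seen (values are illustrative).
import time

seen_nonces = set()
MAX_AGE = 30  # seconds a message is considered fresh

def accept(message, now=None):
    now = time.time() if now is None else now
    if now - message["timestamp"] > MAX_AGE:
        return False                  # stale: possible replay
    if message["nonce"] in seen_nonces:
        return False                  # duplicate nonce: definite replay
    seen_nonces.add(message["nonce"])
    return True

msg = {"timestamp": 1000.0, "nonce": "a1b2", "body": "transfer $100"}
first = accept(msg, now=1005.0)     # fresh and unseen: accepted
replayed = accept(msg, now=1010.0)  # same nonce: rejected
```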

Password Cracking

Password cracking is sometimes called a dictionary-based attack. Password crackers are
programs that decipher password files. Password-cracking programs are available for most
network and computer operating systems. They are able to decipher password files by
utilizing the same algorithm used to create the encrypted password. They generally employ a
dictionary of known words or phrases, which are also encrypted with the password algorithm.
The password crackers compare each record in the password file against each record in the
dictionary file to find a match. When a match is found, a password is found.
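The comparison loop described above can be sketched as follows; SHA-256 stands in for whatever hashing scheme the target password file actually uses, and real systems complicate this with per-user salts:

```python
# Sketch of a dictionary attack: hash each candidate word with the same
# algorithm used for the password file and look for matching records.
import hashlib

def digest(word):
    return hashlib.sha256(word.encode()).hexdigest()

# Invented password file: user -> stored hash.
password_file = {"alice": digest("sunshine"), "bob": digest("x9!kQ@7z")}
dictionary = ["password", "letmein", "sunshine", "dragon"]

def crack(password_file, dictionary):
    found = {}
    for word in dictionary:
        h = digest(word)
        for user, stored in password_file.items():
            if stored == h:
                found[user] = word
    return found

print(crack(password_file, dictionary))
```

Note that bob's strong password is never recovered: the attack can only find passwords that appear in the dictionary.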

The source code for password-cracking programs for most computer and network operating
systems (NOSs) is easily available on the Web at sites such as http://www.L0pht.com. Some
of the programs available on the Web include Brute, CrackerJack, John The Ripper, and
NewHack.

Sniffing

Network sniffing or packet sniffing is the process of monitoring a network in an attempt to
gather information that may be useful in an attack. With the proper tools a hacker can monitor
the network packets to obtain passwords or IP addresses. Many vendors manufacture
hardware and software for legitimate purposes that can be abused by hackers. The only
comforting fact about these products is that hackers usually can't afford them. They can,
however, steal them. There are also some common utilities and programs that can be
downloaded from hacker sites, such as tcpmon, tcpdump, or Gobbler. Network Associates'
Sniffer Pro is an example of a commercially available product.

Password sniffing is a particular threat for users who log into Unix systems over a network.
Telnet or rlogin is usually employed when logging onto a Unix system over a network.
Telnet and rlogin do not encrypt passwords. As a result, when a user enters his or her
password, it is transmitted in the clear, meaning anyone monitoring the network can read it. In
contrast, both Novell and Windows NT workstations encrypt passwords for transmission.

Denial of Service

Denial-of-service attacks are designed to shut down or render inoperable a system or network.
The goal of the denial-of-service attack is not to gain access or information but to make a
network or system unavailable for use by other users. It is called a denial-of-service attack,
because the end result is to deny legitimate users access to network services. Such attacks are
often used to exact revenge or to punish some individual or entity for some perceived slight
or injustice. Unlike real hacking, denial-of-service attacks do not require a great deal of
experience, skill, or intelligence to succeed. As a result, they are usually launched by nerdy,
young programmers who fancy themselves to be master hackers.

There are many different types of denial-of-service attacks. The following sections present
four examples: ping of death, "synchronize sequence number" (SYN) flooding, spamming,
and smurfing. These are examples only and are not necessarily the most frequently used
forms of denial-of-service attacks.
SYN Flooding

SYN flooding is a denial-of-service attack that exploits the three-way handshake that TCP/IP
uses to establish a connection. Basically, SYN flooding disables a targeted system by creating
many half-open connections. Figure illustrates how a typical TCP/IP connection is
established.

Figure Normal TCP/IP handshake.

In Figure the client transmits to the server a packet with the SYN bit set. This tells the server
that the client wishes to establish a connection and what the starting sequence number will be
for the client. The server sends back to the client an acknowledgment (SYN-ACK) and
confirms its own starting sequence number. The client acknowledges (ACK) receipt of the
server's transmission and begins the transfer of data.

With SYN flooding a hacker creates many half-open connections by initiating connections to
a server with SYN packets. However, the return address associated with each SYN is not a
valid address, so the server sends a SYN-ACK back to an address that does not exist or does
not respond. Using available programs, the hacker transmits many SYN packets with false
return addresses to the server. The server responds to each SYN with an acknowledgment and
then sits there with the connection half-open, waiting for the final acknowledgment to come
back. Figure illustrates how SYN flooding works.

Figure SYN flooding exchange.

The result from this type of attack can be that the system under attack may not be able to
accept legitimate incoming network connections so that users cannot log onto the system.
Each operating system has a limit on the number of connections it can accept. In addition, the
SYN flood may exhaust system memory, resulting in a system crash. The net result is that the
system is unavailable or nonfunctional.

One countermeasure for this form of attack is to set the SYN relevant timers low so that the
system closes half-open connections after a relatively short period of time. With the timers set
low, the server will close the connections even while the SYN flood attack opens more.
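The backlog table and the low-timer countermeasure can be sketched like this; the backlog size and timeout values are illustrative, not real OS defaults:

```python
# Sketch of the half-open connection table: SYNs that never complete
# the handshake are evicted after SYN_TIMEOUT seconds.
SYN_TIMEOUT = 10   # seconds before a half-open connection is dropped
MAX_BACKLOG = 3    # how many half-open connections are allowed

half_open = {}     # source address -> time the SYN arrived

def on_syn(src, now):
    # Evict half-open connections whose timer has expired.
    for addr in [a for a, t in half_open.items() if now - t > SYN_TIMEOUT]:
        del half_open[addr]
    if len(half_open) >= MAX_BACKLOG:
        return False               # backlog full: SYN dropped
    half_open[src] = now
    return True

# A flood of spoofed SYNs fills the backlog...
for i, t in enumerate([0, 1, 2]):
    on_syn(f"spoofed-{i}", t)
blocked = on_syn("legitimate", 3)      # refused while the backlog is full
recovered = on_syn("legitimate", 20)   # accepted once the timers expire
```

With a shorter timeout the legitimate client is locked out for less time, which is exactly the trade-off the countermeasure exploits.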

SPAM

SPAM is unwanted e-mail. Anyone who has an e-mail account has received SPAM. Usually it
takes the form of a marketing solicitation from some company trying to sell something we
don't want or need. To most of us it is just an annoyance, but to a server it can also be used as
a denial-of-service attack. By inundating a targeted system with thousands of e-mail
messages, SPAM can eat available network bandwidth, overload CPUs, cause log files to
grow very large, and consume all available disk space on a system. Ultimately, it can cause a
system to crash.

SPAM can be used as a means to launch an indirect attack on a third party. SPAM messages
can contain a falsified return address, which may be the legitimate address of some innocent
unsuspecting person. As a result, an innocent person, whose address was used as the return
address, may be spammed by all the individuals targeted in the original SPAM.

E-mail filtering can prevent much unwanted e-mail from getting through. Unfortunately, it
frequently filters out legitimate e-mail as well.

Smurf Attack
The smurf attack is named after the source code employed to launch the attack (smurf.c). The
smurf attack employs forged ICMP echo request packets directed to IP network broadcast
addresses. The attack issues the ICMP ECHO_REQUEST to the broadcast address of another
network, spoofing as the source address the IP address of the system it wishes to target.
Figure illustrates how a smurf attack works.

When the systems on the network to whose broadcast address the ECHO_REQUEST is sent
receive the packet with the falsified source address (i.e., the return address), they respond,
flooding the targeted victim with the echo replies. This flood can overwhelm the targeted
victim's network. Both the intermediate and victim's networks will see degraded performance.
The attack can eventually result in the inoperability of both networks.
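A toy simulation of the amplification, with an invented /24 intermediate network of 50 hosts:

```python
# Toy smurf simulation: one spoofed ECHO_REQUEST to a broadcast address
# triggers a reply from every host on the intermediate network, all
# aimed at the victim whose address was forged as the source.
intermediate_network = [f"192.0.2.{i}" for i in range(1, 51)]  # 50 hosts
victim_inbox = []

def broadcast_echo_request(spoofed_source):
    # Every host replies to the forged source address, i.e. the victim.
    for host in intermediate_network:
        victim_inbox.append(("echo-reply", host, spoofed_source))

broadcast_echo_request("203.0.113.9")   # victim's address as the source
```

One forged packet thus produces 50 replies at the victim: the larger the intermediate network, the greater the amplification.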

There are steps that the intermediate network can take to prevent its network from being used
in this way. These include configuring network devices not to respond to ICMP
ECHO_REQUESTs and preventing IP directed broadcasts from passing through the network
routers. There are really no steps that the targeted victim can take to prevent this kind of
attack; the only defense is contacting the intermediate network to stop the
ECHO_REQUESTs from being relayed, once an organization determines that it is the victim
of an attack.

Denial-of-service attacks are the most difficult to defend against, and, of the possible attacks,
they require the least amount of expertise to launch. In general, organization should monitor
for anomalous traffic patterns, such as SYN-ACK but no return ACKs. Since most routers
filter incoming and outgoing packets, router-based filtering is the best defense against
denial-of-service attacks. Organizations should use packet filters that filter based on
destination and sender address. In addition, they should always use SPAM/sendmail filters.

Keep in mind there is a tradeoff with packet and mail filtering. The filtering that is performed
to detect denial-of-service attacks will slow network performance, which may frustrate an
organization's end users and slow its applications. In addition, mail filtering will bounce some
e-mails that really should be allowed through, which may also aggravate end users.

Buffer Overflow Attack

The buffer overflow attack was discovered in hacking circles. It uses input to a poorly
implemented, but (in intention) completely harmless application, typically one running with
root/administrator privileges. The buffer overflow attack results from input that is longer than
the implementer intended. The buffer overflow has long been a feature of the computer
security landscape. In fact, the first self-propagating Internet worm, 1988's Morris Worm,
used a buffer overflow in the Unix finger daemon to spread from machine to machine.
Twenty-seven years later, buffer overflows remain a source of problems. Microsoft
infamously revamped Windows' security focus after two buffer overflow-driven exploits in
the early 2000s. And just this May, a buffer overflow found in a Linux driver left (potentially)
millions of home and small office routers vulnerable to attack.

At its core, the buffer overflow is an astonishingly simple bug that results from a common
practice. Computer programs frequently operate on chunks of data that are read from a file,
from the network, or even from the keyboard. Programs allocate finite-sized blocks of
memory—buffers—to store this data as they work on it. A buffer overflow happens when
more data is written to or read from a buffer than the buffer can hold.

On the face of it, this sounds like a pretty foolish error. After all, the program knows how big
the buffer is, so it should be simple to make sure that the program never tries to cram more
into the buffer than it knows will fit. You'd be right to think that. Yet buffer overflows
continue to happen, and the results are frequently a security catastrophe.
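The mechanism can be modelled in a few lines by treating memory as a byte array in which a fixed-size buffer sits next to data the program depends on (the layout here is invented; real exploits typically overwrite a saved return address on the stack):

```python
# Toy model of a buffer overflow: "memory" holds an 8-byte buffer
# followed by an adjacent 4-byte value the program depends on. Writing
# more than 8 bytes into the buffer silently clobbers the neighbour.
memory = bytearray(b"\x00" * 8 + b"SAFE")   # buffer + adjacent data

def unchecked_write(data):
    # No length check against the 8-byte buffer: this is the bug.
    memory[0:len(data)] = data

unchecked_write(b"A" * 8)            # fits: neighbour untouched
intact = bytes(memory[8:12])
unchecked_write(b"B" * 12)           # overflow: neighbour overwritten
clobbered = bytes(memory[8:12])
```

In a real program the clobbered bytes might be a return address, which is how an attacker turns this simple bug into code execution.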

Definition - What does Network Scanning mean?

Network scanning is a procedure for identifying active hosts on a network, either for the
purpose of attacking them or for network security assessment. Scanning procedures, such as
ping sweeps and port scans, return information about which IP addresses map to live hosts
that are active on the Internet and what services they offer. Another scanning method, inverse
mapping, returns information about what IP addresses do not map to live hosts; this enables
an attacker to make assumptions about viable addresses. Network scanning refers to the use
of a computer network to gather information regarding computing systems. Network
scanning is mainly used for security assessment, system maintenance, and also for
performing attacks by hackers.

The purpose of network scanning is as follows:

 Recognize available UDP and TCP network services running on the targeted hosts
 Recognize filtering systems between the user and the targeted hosts
 Determine the operating systems (OSs) in use by assessing IP responses
 Evaluate the target host's TCP sequence number predictability to determine its exposure
to sequence prediction attacks and TCP spoofing
Network scanning consists of network port scanning as well as vulnerability scanning.

Network port scanning refers to the method of sending data packets via the network to a
computing system's specified service port numbers (for example, port 23 for Telnet, port 80
for HTTP and so on). This is to identify the available network services on that particular
system. This procedure is effective for troubleshooting system issues or for tightening the
system's security.
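A minimal TCP connect scan can be sketched with the standard socket module. To keep the example self-contained and harmless, it scans only 127.0.0.1 against a listener it opens itself; real scanners such as nmap are far more capable:

```python
# Minimal TCP connect scan against localhost: probe a port and classify
# it as open (connection accepted) or closed (connection refused).
import socket

def port_is_open(host, port, timeout=0.5):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        return s.connect_ex((host, port)) == 0

# Open a listener on an ephemeral port so the scan has a known target.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
open_port = listener.getsockname()[1]

found_open = port_is_open("127.0.0.1", open_port)
listener.close()
found_closed = port_is_open("127.0.0.1", open_port)
```

Scanning a full host is just this probe repeated across the port range, which is why scans are easy to spot in connection logs.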

Vulnerability scanning is a method used to discover known vulnerabilities of computing
systems available on a network. It helps to detect specific weak spots in application
software or the operating system (OS), which could be used to crash the system or
compromise it for undesired purposes.

Network port scanning as well as vulnerability scanning is an information-gathering
technique, but when carried out by anonymous individuals, these are viewed as a prelude to
an attack.

Network scanning processes, like port scans and ping sweeps, return details about which IP
addresses map to active live hosts and the type of services they provide. Another network
scanning method known as inverse mapping gathers details about IP addresses that do not
map to live hosts, which helps an attacker to focus on feasible addresses.

Network scanning is one of three important methods used by an attacker to gather
information. During the footprinting stage, the attacker makes a profile of the targeted
organization. This includes data such as the organization's domain name system (DNS) and e-
mail servers, in addition to its IP address range. During the scanning stage, the attacker
discovers details about the specified IP addresses that could be accessed online, their system
architecture, their OSs and the services running on every computer. During the enumeration
stage, the attacker collects data, including routing tables, network user and group names,
Simple Network Management Protocol (SNMP) data and so on.

Definition - What does Port Scanning mean?

Port scanning refers to the surveillance of computer ports, most often by hackers for
malicious purposes. Hackers conduct port-scanning techniques in order to locate holes within
specific computer ports. For an intruder, these weaknesses represent opportunities to gain
access for an attack.
There are 65,535 ports in each IP address, and hackers may scan each and every one to find
any that are not secure.
While port scanning can be conducted for legitimate computer security reasons, it is also
considered an open-door hacking technique, which can easily be performed for malicious
reasons when a specific computer or operating system is the target. Whether conducted in
stealth or strobe mode, malicious port scanning is typically directed at ports above the 1,024
mark, because the ports below that mark are usually affiliated with more standard port
services. The ports above that mark are more susceptible to malicious port scanning due to
their availability for probes.

ping sweep (ICMP sweep) definition

A ping sweep (also known as an ICMP sweep) is a basic network scanning technique used to
determine which of a range of IP addresses map to live hosts (computers). Whereas a
single ping will tell you whether one specified host computer exists on the network, a ping
sweep consists of ICMP (Internet Control Message Protocol) ECHO requests sent to multiple
hosts. If a given address is live, it will return an ICMP ECHO reply. Ping sweeps are among
the older and slower methods used to scan a network.

There are a number of tools that can be used to do a ping sweep, such as fping, gping,
and nmap for UNIX systems, and the Pinger software from Rhino9 and Ping Sweep from
SolarWinds for Windows systems. Both Pinger and Ping Sweep send multiple packets at the
same time and allow the user to resolve host names and save output to a file.
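The sweep logic itself is simple; in the sketch below a stand-in echo() function substitutes for a real ICMP ECHO exchange (which requires raw-socket privileges), and the set of live hosts is invented:

```python
# Sketch of a ping sweep over a /24 range: send an "echo" to every
# address and collect those that reply.
live_hosts = {"192.0.2.1", "192.0.2.17", "192.0.2.44"}   # hypothetical

def echo(address):
    """Stand-in for one ICMP ECHO request/reply exchange."""
    return address in live_hosts

def ping_sweep(prefix, start=1, end=254):
    return [f"{prefix}.{i}" for i in range(start, end + 1)
            if echo(f"{prefix}.{i}")]

print(ping_sweep("192.0.2"))
```

Tools such as fping and nmap do the same thing but send the requests in parallel, which is why they are much faster than a naive sequential sweep.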

To disable ping sweeps on a network, administrators can block ICMP ECHO requests from
outside sources. However, ICMP TIMESTAMP and Address Mask Requests can be used in a
similar manner.

TEARDROP ATTACK

Teardrop attack is a denial-of-service (DoS) attack that involves sending fragmented packets
to a target machine. Since the machine receiving such packets cannot reassemble them due to
a bug in TCP/IP fragmentation reassembly, the packets overlap one another, crashing the
target network device. This generally happens on older operating systems such as Windows
3.1x, Windows 95, Windows NT and versions of the Linux kernel prior to 2.1.63.

One of the fields in an IP header is the “fragment offset” field, indicating the starting position,
or offset, of the data contained in a fragmented packet relative to the data in the original
packet. If the sum of the offset and size of one fragmented packet differs from that of the next
fragmented packet, the packets overlap. When this happens, a server vulnerable to teardrop
attacks is unable to reassemble the packets, resulting in a denial-of-service condition.

Implementations of TCP/IP differ slightly from platform to platform. With some operating
systems there is a weakness in the handling of IP packets that can be exploited using a
teardrop attack. In this attack, the client sends a packet of information that is intentionally
malformed in a specific way to exploit the error that occurs when the packet is reassembled.
The result could be a fatal crash in the operating system or application that handles the
packet.
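The overlap condition described above reduces to a simple check over (offset, size) pairs; the fragment values below are illustrative:

```python
# Sketch of the teardrop overlap condition: for fragments sorted by
# offset, offset + size running past the next fragment's offset means
# the fragments overlap, which a vulnerable stack may mishandle.
def fragments_overlap(fragments):
    """fragments: list of (offset, size) tuples, sorted by offset."""
    for (off, size), (next_off, _) in zip(fragments, fragments[1:]):
        if off + size > next_off:
            return True
    return False

normal = [(0, 8), (8, 8), (16, 8)]   # contiguous, well-formed
teardrop = [(0, 8), (4, 8)]          # second fragment overlaps the first
```

A patched reassembly routine simply rejects such malformed fragment sets instead of trying to reassemble them.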

By default the F5 BIG-IP system handles these attacks correctly by precisely checking the
incoming packet's frame alignment and discarding improperly formatted packets. In this way
teardrop packets are dropped and the attack is mitigated before the packets can pass into the
protected network.


Figure Teardrop Attacks


NETWORK AND INFORMATION SECURITY

BCA 6TH SEM NOTES UNIT 2

WHAT IS CRYPTOGRAPHY?

The word cryptography comes from the Greek words κρυπτο (hidden or secret) and γραφη
(writing). Oddly enough, cryptography is the art of secret writing. More generally, people
think of cryptography as the art of mangling information into apparent unintelligibility in a
manner allowing a secret method of unmangling. The basic service provided by
cryptography is the ability to send information between participants in a way that prevents
others from reading it. Here we will concentrate on the kind of cryptography that is
based on representing information as numbers and mathematically manipulating those
numbers. This kind of cryptography can provide other services, such as
• integrity checking—reassuring the recipient of a message that the message has not been
altered since it was generated by a legitimate source
• authentication—verifying someone’s (or something’s) identity
But back to the traditional use of cryptography. A message in its original form is known
as plaintext or cleartext. The mangled information is known as ciphertext. The process for
producing ciphertext from plaintext is known as encryption. The reverse of encryption is
called decryption.

plaintext --encryption--> ciphertext --decryption--> plaintext

While cryptographers invent clever secret codes, cryptanalysts attempt to break these
codes. These two disciplines constantly try to keep ahead of each other.

Cryptographic systems tend to involve both an algorithm and a secret value. The secret value
is known as the key. The reason for having a key in addition to an algorithm is that it is
difficult to keep devising new algorithms that will allow reversible scrambling of information,
and it is difficult to quickly explain a newly devised algorithm to the person with whom
you’d like to start communicating securely. With a good cryptographic scheme it is perfectly
OK to have everyone, including the bad guys (and the cryptanalysts), know the algorithm,
because knowledge of the algorithm without the key does not help unmangle the
information.
The concept of a key is analogous to the combination for a combination lock. Although
the concept of a combination lock is well known (you dial in the secret numbers in the correct
sequence and the lock opens), you can’t open a combination lock easily without knowing the
combination.
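The algorithm/key split can be illustrated with a toy XOR cipher: the algorithm below is completely public, and only the short key plays the role of the lock's combination. (Repeating-key XOR is not secure; it is purely a teaching sketch.)

```python
# Toy cipher: XOR each byte of the data with a repeating secret key.
# The algorithm is public; only the key must be kept secret.
from itertools import cycle

def xor_cipher(data, key):
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

key = b"\x5f\x23\x91"                    # the secret "combination"
ciphertext = xor_cipher(b"attack at dawn", key)
recovered = xor_cipher(ciphertext, key)  # the same operation reverses it
```

Knowing that XOR was used is no help without the key, which is precisely the property the combination-lock analogy describes.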

MODERN CRYPTOGRAPHY
Modern cryptography is the cornerstone of computer and communications security. Its
foundation is based on various concepts of mathematics such as number theory,
computational-complexity theory, and probability theory.

Characteristics of Modern Cryptography


There are three major characteristics that separate modern cryptography from the classical
approach.

Classic Cryptography:
• It manipulates traditional characters, i.e., letters and digits, directly.
• It is mainly based on “security through obscurity”: the techniques employed for coding
were kept secret and only the parties involved in communication knew about them.
• It requires the entire cryptosystem for communicating confidentially.

Modern Cryptography:
• It operates on binary bit sequences.
• It relies on publicly known mathematical algorithms for coding the information. Secrecy
is obtained through a secret key which is used as the seed for the algorithms. The
computational difficulty of the algorithms, the absence of the secret key, etc., make it
impossible for an attacker to obtain the original information even if he knows the
algorithm used for coding.
• It requires the parties interested in secure communication to possess only the secret key.

Context of Cryptography
Cryptology, the study of cryptosystems, can be subdivided into two branches:
• Cryptography
• Cryptanalysis
What is Cryptography?
Cryptography is the art and science of making a cryptosystem that is capable of providing
information security.

Cryptography deals with the actual securing of digital data. It refers to the design of
mechanisms based on mathematical algorithms that provide fundamental information security
services. You can think of cryptography as the establishment of a large toolkit containing
different techniques in security applications.

What is Cryptanalysis?
The art and science of breaking the cipher text is known as cryptanalysis.

Cryptanalysis is the sister branch of cryptography and they both co-exist. The cryptographic
process results in the cipher text for transmission or storage. It involves the study of
cryptographic mechanism with the intention to break them. Cryptanalysis is also used during
the design of the new cryptographic techniques to test their security strengths.

Note: Cryptography concerns itself with the design of cryptosystems, while cryptanalysis
studies the breaking of cryptosystems.

Security Services of Cryptography

The primary objective of using cryptography is to provide the following four fundamental
information security services. Let us now see the possible goals intended to be fulfilled by
cryptography.

Confidentiality
Confidentiality is the fundamental security service provided by cryptography. It is a security
service that keeps the information from an unauthorized person. It is sometimes referred to as
privacy or secrecy.

Confidentiality can be achieved through numerous means starting from physical securing to
the use of mathematical algorithms for data encryption.

Data Integrity
It is a security service that deals with identifying any alteration to the data. The data may get
modified by an unauthorized entity intentionally or accidentally. The integrity service
confirms whether data is intact or not since it was last created, transmitted, or stored by an
authorized user.

Data integrity cannot prevent the alteration of data, but provides a means for detecting
whether data has been manipulated in an unauthorized manner.

Authentication
Authentication provides the identification of the originator. It confirms to the receiver that the
data received has been sent only by an identified and verified sender.
Authentication service has two variants:
1. Message authentication identifies the originator of the message without regard to the
router or system that has sent the message.

2. Entity authentication is assurance that data has been received from a specific entity,
say a particular website.

Apart from the originator, authentication may also provide assurance about other parameters
related to data such as the date and time of creation/transmission.

Non-repudiation
It is a security service that ensures that an entity cannot refuse the ownership of a previous
commitment or an action. It is an assurance that the original creator of the data cannot deny
the creation or transmission of the said data to a recipient or third party.

Non-repudiation is a property that is most desirable in situations where there are chances of a
dispute over the exchange of data. For example, once an order is placed electronically, a
purchaser cannot deny the purchase order, if non-repudiation service was enabled in this
transaction.

Cryptography Primitives

Cryptography primitives are the tools and techniques in cryptography that can be
selectively used to provide a set of desired security services:
• Encryption
• Hash functions
• Message Authentication Codes (MAC)
• Digital Signatures

The following table shows the primitives that can achieve a particular security service on
their own.
Service            Encryption   Hash Function   MAC         Digital Signature

Confidentiality    Yes          No              No          No

Integrity          No           Sometimes       Yes         Yes

Authentication     No           No              Yes         Yes

Non-repudiation    No           No              Sometimes   Yes

Note: Cryptographic primitives are intricately related and they are often combined to achieve
a set of desired security services from a cryptosystem.
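For example, the table's distinction between a plain hash and a MAC can be demonstrated with the standard hashlib and hmac modules: only someone holding the shared key can produce a tag that verifies, which is why a MAC provides integrity and authentication together:

```python
# Sketch of a MAC: an HMAC over the message with a shared secret key.
# Anyone can hash a message, but only a key holder can produce a tag
# that verifies, and any tampering invalidates the tag.
import hashlib
import hmac

key = b"shared-secret-key"           # known only to sender and receiver
message = b"pay 100 to alice"

tag = hmac.new(key, message, hashlib.sha256).digest()

def verify(key, message, tag):
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

ok = verify(key, message, tag)                    # genuine message
tampered = verify(key, b"pay 999 to alice", tag)  # altered message
```

Note that a MAC still gives only "sometimes" non-repudiation: because both parties share the key, either of them could have produced the tag.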

CRYPTOSYSTEM

A cryptosystem is an implementation of cryptographic techniques and their accompanying
infrastructure to provide information security services. A cryptosystem is also referred to as a
cipher system.
Components of a Cryptosystem
The various components of a basic cryptosystem are as follows:
Plaintext. It is the data to be protected during transmission.

Encryption Algorithm. It is a mathematical process that produces a ciphertext for any given
plaintext and encryption key. It is a cryptographic algorithm that takes plaintext and an
encryption key as input and produces a ciphertext.
Ciphertext. It is the scrambled version of the plaintext produced by the encryption algorithm
using a specific encryption key. The ciphertext is not guarded; it flows on the public channel
and can be intercepted or compromised by anyone who has access to the communication
channel.
Decryption Algorithm. It is a mathematical process that produces a unique plaintext for any
given ciphertext and decryption key. It is a cryptographic algorithm that takes a ciphertext
and a decryption key as input, and outputs a plaintext. The decryption algorithm essentially
reverses the encryption algorithm and is thus closely related to it.
Encryption Key. It is a value that is known to the sender. The sender inputs the encryption
key into the encryption algorithm along with the plaintext in order to compute the ciphertext.
Decryption Key. It is a value that is known to the receiver. The decryption key is related to
the encryption key, but is not always identical to it. The receiver inputs the decryption key
into the decryption algorithm along with the cipher text in order to compute the plaintext.

For a given cryptosystem, a collection of all possible decryption keys is called a key space.
An interceptor (an attacker) is an unauthorized entity who attempts to determine the
plaintext. He can see the cipher text and may know the decryption algorithm. He, however,
must never know the decryption key.

Types of Cryptosystems
Fundamentally, there are two types of cryptosystems based on the manner in which
encryption-decryption is carried out in the system:
Symmetric Key Encryption
Asymmetric Key Encryption

The main difference between these cryptosystems is the relationship between the encryption
and the decryption key. Logically, in any cryptosystem, both the keys are closely associated.
It is practically impossible to decrypt the cipher text with the key that is unrelated to the
encryption key.

Symmetric Key Encryption


The encryption process where the same key is used for encrypting and decrypting the
information is known as Symmetric Key Encryption. The study of symmetric cryptosystems
is referred to as symmetric cryptography. Symmetric cryptosystems are also sometimes
referred to as secret key cryptosystems.


In symmetric-key cryptography, the same key is used by both parties. The sender uses this
key and an encryption algorithm to encrypt data; the receiver uses the same key and the
corresponding decryption algorithm to decrypt the data.
Symmetric-key Cryptography

A few well-known examples of symmetric key encryption methods are: Data Encryption
Standard (DES), Triple-DES (3DES), IDEA, and BLOWFISH.

Prior to 1970, all cryptosystems employed symmetric key encryption. Even today, its
relevance is very high and it is being used extensively in many cryptosystems. It is very
unlikely that this encryption will fade away, as it has certain advantages over asymmetric key
encryption.
The salient features of cryptosystem based on symmetric key encryption are:

Persons using symmetric key encryption must share a common key prior to exchange
of information.

Keys are recommended to be changed regularly to prevent any attack on the system.

A robust mechanism needs to exist to exchange the key between the communicating
parties. As keys are required to be changed regularly, this mechanism becomes
expensive and cumbersome.
In a group of n people, to enable two-party communication between any two persons,
the number of keys required for the group is n × (n − 1)/2.

Length of Key (number of bits) in this encryption is smaller and hence, process of
encryption-decryption is faster than asymmetric key encryption.

Processing power of computer system required to run symmetric algorithm is less.
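The quadratic key-count formula above can be checked with a short illustrative snippet (the function name here is my own, not from the text):

```python
def symmetric_keys_needed(n):
    """Distinct pairwise secret keys so that any two of n people
    can communicate privately: n * (n - 1) / 2."""
    return n * (n - 1) // 2

# The count grows quadratically, which is what makes key exchange
# expensive and cumbersome for large groups.
print(symmetric_keys_needed(10))    # 45
print(symmetric_keys_needed(1000))  # 499500
```

For 1000 users, almost half a million keys must be generated, exchanged, and regularly replaced, which illustrates why the key exchange mechanism becomes cumbersome.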

Challenge of Symmetric Key Cryptosystem


There are two restrictive challenges of employing symmetric key cryptography.

Key establishment – Before any communication, both the sender and the receiver
need to agree on a secret symmetric key. It requires a secure key establishment
mechanism in place.

Trust Issue – Since the sender and the receiver use the same symmetric key, there is
an implicit requirement that the sender and the receiver ‘trust’ each other. For
example, it may happen that the receiver has lost the key to an attacker and the sender
is not informed.

These two challenges are highly restraining for modern-day communication. Today, people
need to exchange information with unfamiliar and non-trusted parties, for example in a
communication between an online seller and a customer. These limitations of symmetric key
encryption gave rise to asymmetric key encryption schemes.

Asymmetric Key Encryption


In asymmetric or public-key cryptography, there are two keys: a private key and a public key.
The private key is kept by the receiver. The public key is announced to the public. The
encryption process where different keys are used for encrypting and decrypting the
information is known as Asymmetric Key Encryption. Though the keys are different, they
are mathematically related and hence retrieving the plaintext by decrypting the ciphertext is
feasible. The process is depicted in the following illustration:
Asymmetric Key Encryption was invented in the 20th century to overcome the necessity of a
pre-shared secret key between communicating persons. The salient features of this encryption
scheme are as follows:

Every user in this system needs to have a pair of dissimilar keys, private key and
public key. These keys are mathematically related – when one key is used for
encryption, the other can decrypt the ciphertext back to the original plaintext.

It requires putting the public key in a public repository and keeping the private key as
a well-guarded secret. Hence, this scheme of encryption is also called
Public Key Encryption.

Though public and private keys of the user are related, it is computationally not
feasible to find one from another. This is a strength of this scheme.

When Host1 needs to send data to Host2, he obtains the public key of Host2 from
repository, encrypts the data, and transmits.

Host2 uses his private key to extract the plaintext.

Length of Keys (number of bits) in this encryption is large and hence, the process of
encryption-decryption is slower than symmetric key encryption.

Processing power of computer system required to run asymmetric algorithm is higher.

Symmetric cryptosystems are a natural concept. In contrast, public-key cryptosystems are
quite difficult to comprehend.

You may wonder how the encryption key and the decryption key can be ‘related’ while it
remains impossible to determine the decryption key from the encryption key.
The answer lies in the mathematical concepts. It is possible to design a cryptosystem whose
keys have this property. The concept of public-key cryptography is relatively new. There are
fewer public-key algorithms known than symmetric algorithms.

Challenge of Public Key Cryptosystem


Public-key cryptosystems have one significant challenge: the user needs to trust that the
public key that he is using in communications with a person really is the public key of that
person and has not been spoofed by a malicious third party.

This is usually accomplished through a Public Key Infrastructure (PKI) consisting of a trusted
third party. The third party securely manages and attests to the authenticity of public keys.
When the third party is requested to provide the public key for any communicating person X,
they are trusted to provide the correct public key.

The third party satisfies itself about user identity by the process of attestation, notarization, or
some other process - that X is the one and only, or globally unique, X. The most common
method of making the verified public keys available is to embed them in a certificate which is
digitally signed by the trusted third party.

Relation between Encryption Schemes


A summary of the basic key properties of the two types of cryptosystems is given below:

                         Symmetric Cryptosystems    Public Key Cryptosystems

Relation between keys    Same                       Different, but mathematically related

Encryption key           Symmetric                  Public

Decryption key           Symmetric                  Private

Due to the advantages and disadvantages of both systems, symmetric-key and public-key
cryptosystems are often used together in practical information security systems.

In cryptography, the following three assumptions are made about the security environment
and attacker’s capabilities.

Details of the Encryption Scheme


The design of a cryptosystem is based on the following two cryptography algorithms:

Public Algorithms: With this option, all the details of the algorithm are in the public
domain, known to everyone.

Proprietary algorithms: The details of the algorithm are only known by the system
designers and users.

In the case of proprietary algorithms, security is ensured through obscurity. Private
algorithms may not be the strongest algorithms, as they are developed in-house and may not
be extensively investigated for weaknesses.

Secondly, they allow communication among a closed group only. Hence they are not suitable
for modern communication, where people communicate with a large number of known or
unknown entities. Also, according to Kerckhoff’s principle, the algorithm is preferred to be
public, with the strength of encryption lying in the key.

Thus, the first assumption about security environment is that the encryption algorithm is
known to the attacker.

Availability of Ciphertext
We know that once the plaintext is encrypted into ciphertext, it is put on unsecure public
channel (say email) for transmission. Thus, the attacker can obviously assume that it has
access to the ciphertext generated by the cryptosystem.

Availability of Plaintext and Ciphertext


This assumption is not as obvious as the others. However, there may be situations where an
attacker can have access to plaintext and corresponding ciphertext. Some such possible
circumstances are:
The attacker influences the sender to convert plaintext of his choice and obtains the
ciphertext.

The receiver may divulge the plaintext to the attacker inadvertently. The attacker has
access to corresponding ciphertext gathered from open channel.

In a public-key cryptosystem, the encryption key is in the open domain and is known to
any potential attacker. Using this key, he can generate pairs of corresponding
plaintexts and ciphertexts.

Earlier Cryptographic Systems


Before proceeding further, you need to know some facts about historical cryptosystems:
All of these systems are based on symmetric key encryption scheme.

The only security service these systems provide is confidentiality of information.

Unlike modern systems which are digital and treat data as binary numbers, the
earlier systems worked on alphabets as basic element.

These earlier cryptographic systems are also referred to as ciphers. In general, a cipher is
simply a set of steps (an algorithm) for performing both an encryption and the
corresponding decryption.

Caesar Cipher

It is a mono-alphabetic cipher wherein each letter of the plaintext is substituted by another
letter to form the ciphertext. It is the simplest form of substitution cipher scheme.

This cryptosystem is generally referred to as the Shift Cipher. The concept is to replace
each alphabet by another alphabet which is ‘shifted’ by some fixed number between 0 and
25.

For this type of scheme, both sender and receiver agree on a ‘secret shift number’ for
shifting the alphabet. This number which is between 0 and 25 becomes the key of
encryption.

The name ‘Caesar Cipher’ is occasionally used to describe the Shift Cipher when the ‘shift
of three’ is used.

Process of Shift Cipher


In order to encrypt a plaintext letter, the sender positions the sliding ruler underneath
the first set of plaintext letters and slides it to LEFT by the number of positions of the
secret shift.

The plaintext letter is then encrypted to the ciphertext letter on the sliding ruler
underneath. The result of this process is depicted in the following illustration for an
agreed shift of three positions. In this case, the plaintext
‘tutorial’ is encrypted to the ciphertext ‘WXWRULDO’. Here is the ciphertext
alphabet for a Shift of 3:
On receiving the ciphertext, the receiver who also knows the secret shift, positions
his sliding ruler underneath the ciphertext alphabet and slides it to RIGHT by the
agreed shift number, 3 in this case.

He then replaces the ciphertext letter by the plaintext letter on the sliding ruler
underneath. Hence the ciphertext ‘WXWRULDO’ is decrypted to ‘tutorial’. To
decrypt a message encoded with a Shift of 3, generate the plaintext alphabet using
a shift of ‘-3’ as shown below:
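The sliding-ruler procedure above can be sketched in a few lines of Python (a toy illustration only; this cipher is trivially breakable and never used in practice):

```python
def shift_encrypt(plaintext, shift):
    """Shift each letter 'shift' places forward; ciphertext in uppercase,
    matching the document's convention. Non-letters are dropped."""
    out = []
    for ch in plaintext.lower():
        if ch.isalpha():
            out.append(chr((ord(ch) - ord('a') + shift) % 26 + ord('A')))
    return ''.join(out)

def shift_decrypt(ciphertext, shift):
    """Shift each letter back to recover the plaintext (in lowercase)."""
    return ''.join(chr((ord(ch) - ord('A') - shift) % 26 + ord('a'))
                   for ch in ciphertext)

print(shift_encrypt('tutorial', 3))   # WXWRULDO
print(shift_decrypt('WXWRULDO', 3))   # tutorial
```

With the agreed shift of 3, ‘tutorial’ encrypts to ‘WXWRULDO’ exactly as in the illustration, and applying the opposite shift recovers the plaintext.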

MODERN SYMMETRIC KEY ENCRYPTION

Digital data is represented in strings of binary digits (bits), unlike alphabets. Modern
cryptosystems need to process these binary strings to convert them into other binary strings.
Based on how the binary strings are processed, symmetric encryption schemes can be
classified into:

Block Ciphers

In this scheme, the plain binary text is processed in blocks (groups) of bits at a time; i.e. a
block of plaintext bits is selected, a series of operations is performed on this block to generate
a block of ciphertext bits. The number of bits in a block is fixed. For example, the schemes
DES and AES have block sizes of 64 and 128, respectively.

Stream Ciphers

In this scheme, the plaintext is processed one bit at a time i.e. one bit of plaintext is taken,
and a series of operations is performed on it to generate one bit of cipher text. Technically,
stream ciphers are block ciphers with a block size of one bit.
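A minimal sketch of the stream idea, XORing plaintext against a keystream. The repeating keystream here is a hypothetical toy and is insecure; real stream ciphers (e.g. RC4 or ChaCha20) derive a pseudorandom keystream from a secret key:

```python
import itertools

def xor_stream(data: bytes, keystream) -> bytes:
    """Combine each plaintext byte with the next keystream byte via XOR.
    XOR is its own inverse, so the same routine both encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, keystream))

# Toy keystream: a short repeating byte pattern (NOT secure).
ciphertext = xor_stream(b'attack at dawn', itertools.cycle(b'\x5a\xc3\x0f'))
recovered = xor_stream(ciphertext, itertools.cycle(b'\x5a\xc3\x0f'))
print(recovered)  # b'attack at dawn'
```

Because each symbol is transformed independently as it arrives, an error in one ciphertext byte corrupts only that byte on decryption, matching the low-error-propagation property discussed later.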

The basic scheme of a block cipher is depicted as follows:


A block cipher takes a block of plaintext bits and generates a block of ciphertext bits,
generally of the same size. The block size is fixed in the given scheme. The choice of block
size does not directly affect the strength of the encryption scheme; the strength of the cipher
depends upon the key length.

Block Size
Though any size of block is acceptable, the following aspects are borne in mind while
selecting the size of a block.

Avoid very small block size: Say a block size is m bits. Then the possible plaintext
bit combinations are 2^m. If the attacker discovers the plaintext blocks
corresponding to some previously sent ciphertext blocks, then the attacker can launch
a type of ‘dictionary attack’ by building up a dictionary of plaintext/ciphertext pairs
sent using that encryption key. A larger block size makes the attack harder, as the
dictionary needs to be larger.

Do not have very large block size: With very large block size, the cipher becomes
inefficient to operate. Such plaintexts will need to be padded before being encrypted.

Multiples of 8 bits: A preferred block size is a multiple of 8, as it is easy to
implement because most computer processors handle data in multiples of 8 bits.

Padding in Block Cipher

Block ciphers process blocks of fixed sizes (say 64 bits). The length of plaintexts is mostly
not a multiple of the block size. For example, a 150-bit plaintext provides two blocks of 64
bits each with third block of balance 22 bits. The last block of bits needs to be padded up with
redundant information so that the length of the final block equal to block size of the scheme.
In our example, the remaining 22 bits need to
have additional 42 redundant bits added to provide a complete block. The process of adding
bits to the last block is referred to as padding.
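The arithmetic of the 150-bit example can be reproduced directly (the helper name is my own):

```python
def padding_bits(message_bits: int, block_size: int = 64) -> int:
    """Redundant bits required to fill out the final block."""
    remainder = message_bits % block_size
    return 0 if remainder == 0 else block_size - remainder

# 150 bits = two full 64-bit blocks + 22 leftover bits -> 42 padding bits.
print(padding_bits(150))  # 42
print(padding_bits(128))  # 0
```

Note the edge cases: a message that already fills its blocks exactly needs no padding, while a 1-bit message would need 63 padding bits, which is the inefficiency the text warns about for very large blocks.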

Too much padding makes the system inefficient. Also, padding may render the system
insecure at times, if the padding is done with same bits always.

Block Cipher Schemes


A vast number of block cipher schemes are in use. Many of them are publicly
known. The most popular and prominent block ciphers are listed below.
Data Encryption Standard (DES): The popular block cipher of the
1990s. It is now considered a ‘broken’ block cipher, due primarily to its small key
size.

Triple DES: It is a variant scheme based on repeated DES applications. It is still a
respected block cipher, but inefficient compared to the newer, faster block ciphers
available.

Advanced Encryption Standard (AES): It is a relatively new block cipher based on
the encryption algorithm Rijndael that won the AES design competition.

IDEA: It is a sufficiently strong block cipher with a block size of 64 and a key size of
128 bits. A number of applications use IDEA encryption, including early versions of
Pretty Good Privacy (PGP) protocol. The use of IDEA scheme has a restricted
adoption due to patent issues.

Twofish: This scheme of block cipher uses block size of 128 bits and a key of
variable length. It was one of the AES finalists. It is based on the earlier block cipher
Blowfish with a block size of 64 bits.

Serpent: A block cipher with a block size of 128 bits and key lengths of 128, 192, or
256 bits, which was also an AES competition finalist. It is slower, but has a more
secure design than the other block ciphers.

An important distinction in symmetric cryptographic algorithms is between stream and block
ciphers.
Stream cipher: Stream ciphers convert one symbol of plaintext directly into a symbol of
ciphertext.
Advantages:
Speed of transformation: algorithms are linear in time and constant in space.
Low error propagation: an error in encrypting one symbol likely will not affect subsequent
symbols.
Disadvantages:
Low diffusion: all information of a plaintext symbol is contained in a single ciphertext
symbol.
Susceptibility to insertions/ modifications: an active interceptor who breaks the algorithm
might insert spurious text that looks authentic.
Block ciphers: They encrypt a group of plaintext symbols as one block.
Advantages:
High diffusion: information from one plaintext symbol is diffused into several ciphertext
symbols.
Immunity to tampering: difficult to insert symbols without detection.
Disadvantages:
Slowness of encryption: an entire block must be accumulated before encryption / decryption
can begin.
Error propagation: An error in one symbol may corrupt the entire block.
Simple substitution is an example of a stream cipher. Columnar transposition is a block
cipher.

The Data Encryption Standard (DES)


The Data Encryption Standard (DES), a system developed for the U.S. government, was
intended for use by the general public. It has been officially accepted as a cryptographic
standard both in the United States and abroad.
The DES algorithm is a careful and complex combination of two fundamental building blocks
of encryption: substitution and transposition. The algorithm derives its strength from repeated
application of these two techniques, one on top of the other, for a total of 16 cycles. The sheer
complexity of tracing a single bit through 16 iterations of substitutions and transpositions has
so far stopped researchers in the public from identifying more than a handful of general
properties of the algorithm. The algorithm encrypts the plaintext as blocks of 64
bits. The key is 64 bits long, but in effect it is a 56-bit number. (The extra 8 bits are
often used as check digits and do not affect encryption in normal implementations.) The user
can change the key at will any time there is uncertainty about the security of the old key.

The Data Encryption Standard (DES) is a symmetric-key block cipher published by the
National Institute of Standards and Technology (NIST).

DES is an implementation of a Feistel Cipher. It uses a 16-round Feistel structure. The block
size is 64 bits. Though the key length is 64 bits, DES has an effective key length of 56 bits,
since 8 of the 64 bits of the key are not used by the encryption algorithm (they function as
check bits only). Since DES is based on the Feistel Cipher, all that is required to specify DES is:
Round function
Key schedule
Any additional processing – Initial and final permutation

Initial and Final Permutation

The initial and final permutations are straight permutation boxes (P-boxes) that are inverses
of each other. They have no cryptographic significance in DES. The initial and final
permutations are shown as follows:

Round Function
The heart of this cipher is the DES function, f. The DES function applies a 48-bit key to the
rightmost 32 bits to produce a 32-bit output.
Expansion Permutation Box – Since right input is 32-bit and round key is a 48-bit,
we first need to expand right input to 48 bits. Permutation logic is graphically depicted
in the following illustration:

XOR (Whitener). After the expansion permutation, DES does XOR operation on the
expanded right section and the round key. The round key is used only in this operation.
Substitution Boxes. The S-boxes carry out the real mixing (confusion). DES uses 8 S-
boxes, each with a 6-bit input and a 4-bit output. Refer the following illustration:

Straight Permutation – The 32 bit output of S-boxes is then subjected to the straight
permutation with rule shown in the following illustration:
Key Generation
The round-key generator creates sixteen 48-bit keys out of a 56-bit cipher key. The process of
key generation is depicted in the following illustration:

The logic for Parity drop, shifting, and Compression P-box is given in the DES description.

DES Analysis
DES satisfies both of the desired properties of a block cipher. These two properties make the
cipher very strong.

Avalanche effect: A small change in the plaintext results in a very great change in the
ciphertext.

Completeness: Each bit of ciphertext depends on many bits of plaintext.
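The avalanche effect is easy to observe empirically. DES itself is not in the Python standard library, so this sketch uses SHA-256 as a stand-in primitive that exhibits the same property: flipping a single input bit changes roughly half of the output bits.

```python
import hashlib

def bit_diff(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count('1') for x, y in zip(a, b))

m1 = b'avalanche test message'
m2 = bytes([m1[0] ^ 0x01]) + m1[1:]   # same message with one bit flipped

d1 = hashlib.sha256(m1).digest()
d2 = hashlib.sha256(m2).digest()

# Roughly half of the 256 digest bits change -- the avalanche effect.
print(bit_diff(d1, d2))
```

On average about 128 of the 256 output bits differ, even though the two inputs differ in only one bit.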

During the last few years, cryptanalysts have found some weaknesses in DES when the keys
selected are weak keys. Such keys shall be avoided.

DES has proved to be a very well designed block cipher. There have been no significant
cryptanalytic attacks on DES other than exhaustive key search.
Public Key Cryptography
Unlike symmetric key cryptography, we do not find historical use of public-key cryptography.
It is a relatively new concept.

Symmetric cryptography was well suited for organizations such as governments, the military,
and big financial corporations that were involved in classified communication.

With the spread of more unsecure computer networks in the last few decades, a genuine need
was felt to use cryptography on a larger scale. The symmetric key was found to be
non-practical due to the challenges it faced in key management. This gave rise to the public
key cryptosystems.
The process of encryption and decryption is depicted in the following illustration:

The most important properties of public key encryption scheme are:

Different keys are used for encryption and decryption. This is a property which sets
this scheme apart from the symmetric encryption scheme.

Each receiver possesses a unique decryption key, generally referred to as his private
key.
Receiver needs to publish an encryption key, referred to as his public key.

Some assurance of the authenticity of a public key is needed in this scheme to avoid
spoofing by adversary as the receiver. Generally, this type of cryptosystem involves
trusted third party which certifies that a particular public key belongs to a specific
person or entity only.

Encryption algorithm is complex enough to prohibit an attacker from deducing the
plaintext from the ciphertext and the encryption (public) key.

Though private and public keys are related mathematically, it is not feasible to
calculate the private key from the public key. In fact, the intelligent part of any public-key
cryptosystem is in designing a relationship between the two keys.

Public-key cryptography is a radical departure from all that has gone before. Right up to
modern times all cryptographic systems have been based on the elementary tools of
substitution and permutation. However, public-key algorithms are based on mathematical
functions and are asymmetric in nature, involving the use of two keys, as opposed to
conventional single-key encryption. Several misconceptions are held about p-k:

1. That p-k encryption is more secure from cryptanalysis than conventional encryption.
In fact the security of any system depends on key length and the computational work
involved in breaking the cipher.

2. That p-k encryption has superseded single key encryption. This is unlikely due
to the increased processing power required.

3. That key management is trivial with public key cryptography, this is not correct.

Principles of Public-Key Cryptosystems

The concept of P-K evolved from an attempt to solve two problems, key distribution and the
development of digital signatures. In 1976 Whitfield Diffie and Martin Hellman achieved
great success in developing the conceptual framework. For conventional encryption the same
key is used for encryption and decryption. This is not a necessary condition. Instead it is
possible to develop a cryptographic system that relies on one key for encryption and a
different but related key for decryption. Furthermore these algorithms have the following
important characteristic:

It is computationally infeasible to determine the decryption key given only knowledge
of the algorithm and the encryption key.
In addition, some algorithms, such as RSA, also exhibit the following characteristic:
Either of the two related keys can be used for encryption, with the other used for
decryption.

The steps are:

1. Each system generates a pair of keys.


2. Each system publishes its encryption key (public key) keeping its companion
key private.
3. If A wishes to send a message to B it encrypts the message using B’s public key.
4. When B receives the message, it decrypts the message using its private key. No
one else can decrypt the message because only B knows its private key.

There are several types of public key encryption schemes. The most widely used one is
described below.

RSA Cryptosystem

This cryptosystem was one of the initial systems. It remains the most widely employed
cryptosystem even today. The system was invented by three scholars, Ron Rivest, Adi
Shamir, and Len Adleman, and hence it is termed the RSA cryptosystem.

We will see two aspects of the RSA cryptosystem, firstly generation of key pair and secondly
encryption-decryption algorithms.

Generation of RSA Key Pair


Each person or party who desires to participate in communication using encryption needs to
generate a pair of keys, namely a public key and a private key. The process followed in the
generation of keys is described below:

Generate the RSA modulus (n)

o Select two large primes, p and q.

o Calculate n = p * q. For strong unbreakable encryption, let n be a large number,
typically a minimum of 512 bits.

Find Derived Number (e)

Number e must be greater than 1 and less than (p − 1)(q − 1).

There must be no common factor for e and (p − 1)(q − 1) except for 1.
In other words, the two numbers e and (p − 1)(q − 1) are coprime.

Form the public key

o The pair of numbers (n, e) form the RSA public key and is made public.

o Interestingly, though n is part of the public key, the difficulty of factoring a large
number ensures that an attacker cannot find in finite time the two primes (p &
q) used to obtain n. This is the strength of RSA.

Generate the private key

o Private Key d is calculated from p, q, and e. For given n and e, there is a unique
number d.

o Number d is the inverse of e modulo (p − 1)(q – 1). This means that d is the
number less than (p − 1)(q − 1) such that when multiplied by e, it is equal to 1
modulo (p − 1)(q − 1).

o This relationship is written mathematically as follows:


ed = 1 mod (p − 1)(q − 1)

The Extended Euclidean Algorithm takes p, q, and e as input and gives d as output.

Example
An example of generating RSA Key pair is given below. (For ease of understanding, the
primes p & q taken here are small values. Practically, these values are very high).
Let two primes be p = 7 and q = 13. Thus, modulus n = pq = 7 x 13 = 91.

Select e = 5, which is a valid choice since there is no number that is a common factor
of 5 and (p − 1)(q − 1) = 6 × 12 = 72, except for 1.

The pair of numbers (n, e) = (91, 5) forms the public key and can be made available to
anyone whom we wish to be able to send us encrypted messages.

Input p = 7, q = 13, and e = 5 to the Extended Euclidean Algorithm. The output will be
d = 29.

Check that the d calculated is correct by computing:

de = 29 × 5 = 145 = 1 mod 72

Hence, the public key is (91, 5) and the private key is (91, 29).
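The key-pair derivation above can be replayed in Python; `pow(e, -1, phi)` (available from Python 3.8) computes the modular inverse that the Extended Euclidean Algorithm produces. The function name is my own; the toy primes come from the worked example.

```python
def rsa_keypair(p: int, q: int, e: int):
    """Derive an RSA key pair from primes p, q and public exponent e.
    Toy sizes only -- real deployments use a modulus of 2048+ bits."""
    n = p * q
    phi = (p - 1) * (q - 1)   # (p - 1)(q - 1)
    d = pow(e, -1, phi)       # d such that e*d = 1 mod (p - 1)(q - 1)
    return (n, e), (n, d)

public, private = rsa_keypair(7, 13, 5)
print(public)   # (91, 5)
print(private)  # (91, 29)
```

The check from the text holds: 29 × 5 = 145 = 1 mod 72.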

Encryption and Decryption


Once the key pair has been generated, the process of encryption and decryption are relatively
straightforward and computationally easy.

Interestingly, RSA does not directly operate on strings of bits as in the case of symmetric key
encryption. It operates on numbers modulo n. Hence, it is necessary to represent the plaintext
as a series of numbers less than n.

RSA Encryption
Suppose the sender wishes to send some text message to someone whose public key is
(n, e).

The sender then represents the plaintext as a series of numbers less than n.

To encrypt the first plaintext P, which is a number modulo n, the encryption process
is a simple mathematical step:

C = P^e mod n

In other words, the ciphertext C is equal to the plaintext P multiplied by itself e times
and then reduced modulo n. This means that C is also a number less than n.

Returning to our Key Generation example with plaintext P = 10, we get ciphertext C:

C = 10^5 mod 91 = 82

RSA Decryption
The decryption process for RSA is also very straightforward. Suppose that the receiver
of public-key pair (n, e) has received a ciphertext C.

Receiver raises C to the power of his private key d. The result modulo n will be the
plaintext P.

Plaintext = C^d mod n

Returning again to our numerical example, the ciphertext C = 82 would get decrypted
to number 10 using private key 29:

Plaintext = 82^29 mod 91 = 10
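The full encrypt/decrypt round trip of the running example can be verified with Python's built-in three-argument `pow`, which performs fast modular exponentiation. This is textbook RSA with no padding, so it must never be used for real data:

```python
n, e, d = 91, 5, 29  # keys from the running example

def rsa_encrypt(p_num: int) -> int:
    """C = P^e mod n."""
    return pow(p_num, e, n)

def rsa_decrypt(c_num: int) -> int:
    """P = C^d mod n."""
    return pow(c_num, d, n)

c = rsa_encrypt(10)
print(c)               # 82
print(rsa_decrypt(c))  # 10
```

Encrypting plaintext 10 gives ciphertext 82, and raising 82 to the private exponent 29 modulo 91 recovers 10, exactly as in the example.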

RSA Analysis
The security of RSA depends on the strengths of two separate functions. The RSA
cryptosystem is the most popular public-key cryptosystem; its strength is based on the
practical difficulty of factoring very large numbers.

Encryption Function: It is considered a one-way function of converting plaintext
into ciphertext, and it can be reversed only with the knowledge of the private key d.

Key Generation: The difficulty of determining a private key from an RSA public key
is equivalent to factoring the modulus n. An attacker thus cannot use knowledge of an
RSA public key to determine an RSA private key unless he can factor n. It is also a
one-way function: going from the p & q values to modulus n is easy, but the reverse
is not possible.

If either of these two functions is proved not to be one-way, then RSA will be broken. In
fact, if a technique for factoring efficiently is developed, then RSA will no longer be safe.

The strength of RSA encryption goes down drastically against attacks if the numbers p and q
are not large primes and/or the chosen public key e is a small number.

TYPES OF CRYPTOGRAPHIC FUNCTIONS

Hash functions

Hash functions are extremely useful and appear in almost all information security
applications.

A hash function is a mathematical function that converts a numerical input value into another
compressed numerical value. The input to the hash function is of arbitrary length but output is
always of fixed length.

Values returned by a hash function are called message digests or simply hash values. The
following picture illustrates a hash function:
Features of Hash Functions
The typical features of hash functions are:
Fixed Length Output (Hash Value)
A hash function converts data of arbitrary length to a fixed length. This process is often
referred to as hashing the data.

In general, the hash is much smaller than the input data, hence hash functions are sometimes
called compression functions.

Since a hash is a smaller representation of a larger data, it is also referred to as a digest.

o A hash function with an n-bit output is referred to as an n-bit hash function. Popular
hash functions generate values between 160 and 512 bits.

Efficiency of Operation
o Generally for any hash function h with input x, computation of h(x) is a fast
operation.

o Computationally, hash functions are much faster than symmetric encryption.
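Both features, fixed-length output and speed of computation, can be seen with Python's `hashlib` (SHA-256 is used here as one common example):

```python
import hashlib

# Inputs of wildly different lengths all hash to a 256-bit digest
# (64 hexadecimal characters).
digests = {len(m): hashlib.sha256(m).hexdigest()
           for m in (b'a', b'a longer message', b'x' * 10_000)}

for length, digest in digests.items():
    print(f'{length:>6}-byte input -> {len(digest) * 4}-bit digest')
```

Whether the input is one byte or ten thousand, the digest length never changes, which is why the hash is described as a compressed representation, or digest, of the data.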

Properties of Hash Functions


In order to be an effective cryptographic tool, the hash function is desired to possess
following properties:
Pre-Image Resistance
o This property means that it should be computationally hard to reverse a hash
function.

o In other words, if a hash function h produced a hash value z, then it should be a
difficult process to find any input value x that hashes to z.

o This property protects against an attacker who only has a hash value and is trying to
find the input.
Second Pre-Image Resistance
o This property means given an input and its hash, it should be hard to find a different
input with the same hash.

o In other words, if a hash function h for an input x produces hash value h(x), then it
should be difficult to find any other input value y such that h(y) = h(x).

o This property of hash functions protects against an attacker who has an input value
and its hash, and wants to substitute a different value as the legitimate value in place of
the original input value.
Collision Resistance
o This property means it should be hard to find two different inputs of any length that
result in the same hash. A hash function with this property is also referred to as a
collision-free hash function.

o In other words, for a hash function h, it is hard to find any two different inputs x and
y such that h(x) = h(y).

o Since a hash function is a compressing function with fixed hash length, it is
impossible for a hash function not to have collisions. The collision-free property
only confirms that these collisions should be hard to find.

o This property makes it very difficult for an attacker to find two input values with
the same hash.

o Also, if a hash function is collision-resistant then it is second pre-image resistant.

Design of Hashing Algorithms

At the heart of hashing is a mathematical function that operates on two fixed-size blocks of
data to create a hash code. This hash function forms part of the hashing algorithm.

The size of each data block varies depending on the algorithm. Typically the block sizes are
from 128 bits to 512 bits. The following illustration demonstrates the hash function:

The hashing algorithm involves rounds of the above hash function, like a block cipher. Each round
takes an input of a fixed size, typically a combination of the most recent message block and
the output of the last round.

This process is repeated for as many rounds as are required to hash the entire message. A
schematic of the hashing algorithm is depicted in the following illustration:

The hash value of the first message block becomes an input to the second hash operation,
whose output alters the result of the third operation, and so on. This effect is known as the
avalanche effect of hashing.

The avalanche effect results in substantially different hash values for two messages that differ by
even a single bit of data.
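The avalanche effect can be demonstrated with a short sketch (SHA-256 is used here as a stand-in for any modern hash; note that 'x' and 'y' differ in exactly one bit of their ASCII codes):

```python
import hashlib

m1 = b"The quick brown fox"
m2 = b"The quick brown foy"   # 'x' (0x78) vs 'y' (0x79): a single-bit difference

h1 = hashlib.sha256(m1).hexdigest()
h2 = hashlib.sha256(m2).hexdigest()

# Count how many of the 256 output bits differ between the two digests.
diff = bin(int(h1, 16) ^ int(h2, 16)).count("1")
print(h1 != h2, diff)   # the digests differ in roughly half of their bits
```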
Note the difference between a hash function and a hashing algorithm. The hash function
generates a hash code by operating on two blocks of fixed-length binary data.

The hashing algorithm is the process that uses the hash function, specifying how the message will
be broken up and how the results from previous message blocks are chained together.

Popular Hash Functions


Let us briefly see some popular hash functions:

Message Digest (MD)


MD5 was the most popular and widely used hash function for quite a few years.

The MD family comprises the hash functions MD2, MD4, MD5 and MD6. MD5 was
adopted as Internet Standard RFC 1321. It is a 128-bit hash function.

MD5 digests have been widely used in the software world to provide assurance about the
integrity of a transferred file. For example, file servers often provide a pre-computed
MD5 checksum for their files, so that a user can compare the checksum of the
downloaded file against it.

In 2004, collisions were found in MD5. An analytical attack was reported to succeed in
only an hour using a computer cluster. This collision attack compromised MD5, and
hence it is no longer recommended for use.
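The file-checksum workflow described above can be sketched with Python's standard hashlib (the chunked reading is one common pattern for large files; the sample text and its MD5 value are a well-known published test vector):

```python
import hashlib
import os
import tempfile

def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 digest of a file, reading it in chunks so that
    even very large files never have to fit in memory at once."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo: write a small file and check it against its known MD5 checksum.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"The quick brown fox jumps over the lazy dog")
    path = f.name

published = "9e107d9d372bb6826bd81d3542a419d6"  # well-known MD5 test vector
print(md5_of_file(path) == published)           # True: checksum matches
os.unlink(path)
```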

MD5
The MD5 function is a cryptographic algorithm that takes an input of arbitrary length and
produces a message digest that is 128 bits long. The digest is sometimes also called the
"hash" or "fingerprint" of the input. MD5 is used in many situations where a potentially long
message needs to be processed and/or compared quickly. The most common application is the
creation and verification of digital signatures.

MD5 was designed by the well-known cryptographer Ronald Rivest in 1991. In 2004, some
serious flaws were found in MD5. The complete implications of these flaws have yet to be
determined.
How MD5 works
Preparing the input
The MD5 algorithm first pads the input so that its length falls 64 bits short of a multiple
of 512: a '1' bit followed by '0' bits is appended to the end of the last block. Then 64 bits
recording the length of the original input are appended, and the padded input is divided
into blocks of 512 bits each.

Next, each block is divided into 16 words of 32 bits each. These are denoted as M0 ...
M15.

MD5 helper functions

The buffer
MD5 uses a buffer that is made up of four words that are each 32 bits long. These
words are called A, B, C and D. They are initialized as

word A: 01 23 45 67

word B: 89 ab cd ef

word C: fe dc ba 98

word D: 76 54 32 10
The table
MD5 further uses a table K that has 64 elements. Element number i is denoted Ki.
The table is computed beforehand to speed up the computations. The elements are
derived from the mathematical sine function (taken in radians):

Ki = floor(abs(sin(i + 1)) × 2^32)
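The table can be reproduced in a few lines of Python (truncation via int() plays the role of floor here, since the values are non-negative); the first element matches the constant 0xd76aa478 listed in RFC 1321:

```python
import math

# Build the 64-entry MD5 constant table K[i] = floor(abs(sin(i + 1)) * 2**32),
# with i running from 0 to 63 and sin taken in radians.
K = [int(abs(math.sin(i + 1)) * 2**32) for i in range(64)]

print(hex(K[0]))  # 0xd76aa478, the first constant listed in RFC 1321
```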

Four auxiliary functions


In addition, MD5 uses four auxiliary functions that each take as input three 32-bit words and
produce as output one 32-bit word. They apply the logical operators and, or, not and xor to the
input bits.

F(X,Y,Z) = (X and Y) or (not(X) and Z)

G(X,Y,Z) = (X and Z) or (Y and not(Z))

H(X,Y,Z) = X xor Y xor Z

I(X,Y,Z) = Y xor (X or not(Z))
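These four definitions translate directly into Python bitwise operators; the 32-bit mask is an implementation detail needed because Python integers are unbounded:

```python
MASK = 0xFFFFFFFF  # keep every result within 32 bits

def F(x, y, z):            # round 1: where x has a 1-bit, pick y's bit, else z's
    return (x & y) | (~x & MASK & z)

def G(x, y, z):            # round 2: where z has a 1-bit, pick x's bit, else y's
    return (x & z) | (y & ~z & MASK)

def H(x, y, z):            # round 3: bitwise parity of the three words
    return x ^ y ^ z

def I(x, y, z):            # round 4
    return y ^ (x | (~z & MASK))

# F as a selector: an all-ones x returns y unchanged.
print(hex(F(0xFFFFFFFF, 0x12345678, 0x9ABCDEF0)))  # 0x12345678
```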

Processing the blocks


The contents of the four buffers (A, B, C and D) are now mixed with the words of the input,
using the four auxiliary functions (F, G, H and I). There are four rounds, each involving 16
basic operations. One operation is illustrated in the figure below.

The figure shows how the auxiliary function F is applied to the four buffers (A, B, C and D),
using message word Mi and constant Ki. The item "<<<s" denotes a circular left shift (rotation) by s bits.

The output
After all rounds have been performed, the buffers A, B, C and D contain the MD5 digest of the
original input.

Secure Hash Function (SHA)


The SHA family comprises four algorithms: SHA-0, SHA-1, SHA-2, and SHA-3.
Though from the same family, they are structurally different.

The original version, SHA-0, a 160-bit hash function, was published by the National
Institute of Standards and Technology (NIST) in 1993. It had weaknesses and did
not become very popular. Later, in 1995, SHA-1 was designed to correct the alleged
weaknesses of SHA-0.

SHA-1 is the most widely used of the existing SHA hash functions. It is employed in
several widely used applications and protocols, including Secure Socket Layer (SSL)
security.
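Python's standard hashlib exposes several members of the SHA family, making their differing digest lengths easy to compare (the SHA-1 digest of "abc" shown is a published test vector; SHA-3 variants such as sha3_256 are also available in Python 3.6+):

```python
import hashlib

msg = b"abc"
print(hashlib.sha1(msg).hexdigest())     # 160-bit digest (40 hex characters)
print(hashlib.sha256(msg).hexdigest())   # SHA-2 member, 256-bit digest
print(hashlib.sha512(msg).hexdigest())   # SHA-2 member, 512-bit digest
```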
Digital signatures

Digital signatures are the public-key primitives of message authentication. In the physical world,
it is common to use handwritten signatures on handwritten or typed messages. They are used to
bind the signatory to the message.

Similarly, a digital signature is a technique that binds a person/entity to digital data. This
binding can be independently verified by the receiver as well as any third party.

A digital signature is a cryptographic value that is calculated from the data and a secret key known
only to the signer.

In the real world, the receiver of a message needs assurance that the message belongs to the sender,
and the sender should not be able to repudiate the origination of that message. This requirement is
very crucial in business applications, since the likelihood of a dispute over exchanged data is high.

Model of Digital Signature

As mentioned earlier, the digital signature scheme is based on public key cryptography. The
model of digital signature scheme is depicted in the following illustration:

The following points explain the entire process in detail:


Each person adopting this scheme has a public-private key pair.

Generally, the key pairs used for encryption/decryption and signing/verifying are
different. The private key used for signing is referred to as the signature key and the
public key as the verification key.
The signer feeds data to the hash function and generates a hash of the data.

The hash value and signature key are then fed to the signature algorithm, which produces the
digital signature on the given hash. The signature is appended to the data and then both are sent
to the verifier.

The verifier feeds the digital signature and the verification key into the verification algorithm.
The verification algorithm gives some value as output.

The verifier also runs the same hash function on the received data to generate a hash value.

For verification, this hash value and the output of the verification algorithm are compared. Based
on the comparison result, the verifier decides whether the digital signature is valid.

Since the digital signature is created with the ‘private’ key of the signer and no one else can have this
key, the signer cannot repudiate signing the data in the future.

It should be noted that instead of signing the data directly with the signing algorithm, usually a hash of
the data is created. Since the hash of the data is a unique representation of the data, it is sufficient to sign
the hash in place of the data. The most important reason for using the hash instead of the data directly
for signing is the efficiency of the scheme.

Let us assume RSA is used as the signing algorithm. As discussed in public key encryption
chapter, the encryption/signing process using RSA involves modular exponentiation.

Signing large data through modular exponentiation is computationally expensive and time
consuming. The hash of the data is a relatively small digest of the data, hence signing a hash is
more efficient than signing the entire data.
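The hash-then-sign model can be sketched with textbook RSA. The tiny key below (the classic p = 61, q = 53 textbook example) and the reduction of the digest mod n are illustrative assumptions only; real systems use 2048-bit or larger keys with a padding scheme such as RSA-PSS:

```python
import hashlib

# Toy RSA key -- NOT secure, illustration only.
p, q = 61, 53
n = p * q                 # public modulus (3233)
e = 17                    # public "verification" exponent
d = 2753                  # private "signature" exponent: (e*d) % ((p-1)*(q-1)) == 1

def sign(message: bytes) -> int:
    # Hash first, then sign the digest (reduced mod n to fit this toy key size).
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(h, d, n)   # s = h^d mod n

def verify(message: bytes, signature: int) -> bool:
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(signature, e, n) == h   # check s^e mod n == h

msg = b"transfer 100 rupees to Alice"
sig = sign(msg)
print(verify(msg, sig))   # True: the signature matches the data
# A tampered message fails verification (with overwhelming probability).
print(verify(b"transfer 900 rupees to Alice", sig))
```

Because signing is a single modular exponentiation on the small fixed-size hash, its cost is independent of how long the message is, which is exactly the efficiency argument made above.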

Importance of Digital Signature

Out of all cryptographic primitives, the digital signature using public key cryptography is
considered a very important and useful tool to achieve information security.

Apart from the ability to provide non-repudiation of a message, the digital signature also provides
message authentication and data integrity. Let us briefly see how this is achieved by the digital
signature:

Message authentication – When the verifier validates the digital signature using the public
key of the sender, he is assured that the signature has been created only by the sender, who
possesses the corresponding secret private key, and no one else.

Data Integrity – In case an attacker has access to the data and modifies it, the digital
signature verification at the receiver end fails: the hash of the modified data and the output
provided by the verification algorithm will not match. Hence, the receiver can safely reject
the message, assuming that data integrity has been breached.

Non-repudiation – Since it is assumed that only the signer has knowledge of the
signature key, only the signer can create a unique signature on given data. Thus the receiver can
present the data and the digital signature to a third party as evidence if any dispute arises in
the future.

By adding public-key encryption to the digital signature scheme, we can create a cryptosystem that
provides the four essential elements of security, namely privacy, authentication, integrity,
and non-repudiation.

Encryption with Digital Signature

In many digital communications, it is desirable to exchange encrypted messages rather than
plaintext to achieve confidentiality. In a public key encryption scheme, the public (encryption) key
of the receiver is available in the open domain, and hence anyone can spoof the sender's identity
and send an encrypted message to the receiver.

This makes it essential for users employing PKC for encryption to seek digital signatures along
with encrypted data, to be assured of message authentication and non-repudiation.

This can be achieved by combining digital signatures with the encryption scheme. Let us briefly discuss
how to achieve this requirement. There are two possibilities: sign-then-encrypt and encrypt-
then-sign.

However, a cryptosystem based on sign-then-encrypt can be exploited by the receiver to spoof
the identity of the sender and send that data to a third party. Hence, this method is not preferred. The
process of encrypt-then-sign is more reliable and widely adopted. This is depicted in the
following illustration:

The receiver, after receiving the encrypted data and the signature on it, first verifies the signature
using the sender’s public key. After ensuring the validity of the signature, he then retrieves the data
through decryption using his private key.

PUBLIC KEY INFRASTRUCTURE

The most distinct feature of public key cryptography (PKC) is that it uses a pair of keys to
achieve the underlying security service. The key pair comprises a private key and a public key.

Since the public keys are in open domain, they are likely to be abused. It is, thus, necessary to
establish and maintain some kind of trusted infrastructure to manage these keys.

Key Management

It goes without saying that the security of any cryptosystem depends upon how securely its keys
are managed. Without secure procedures for the handling of cryptographic keys, the benefits of
the use of strong cryptographic schemes are potentially lost.

It is observed that cryptographic schemes are rarely compromised through weaknesses in their
design. However, they are often compromised through poor key management.
There are some important aspects of key management which are as follows:

Cryptographic keys are nothing but special pieces of data. Key management refers to the
secure administration of cryptographic keys.

Key management deals with entire key lifecycle as depicted in the following illustration:

There are two specific requirements of key management for public key cryptography.

o Secrecy of private keys. Throughout the key lifecycle, secret keys must remain secret
from all parties except the owner and those authorized to use them.

o Assurance of public keys. In public key cryptography, the public keys are in open
domain and seen as public pieces of data. By default there are no assurances of
whether a public key is correct, with whom it can be associated, or what it can be
used for. Thus key management of public keys needs to focus much more explicitly
on assurance of purpose of public keys.

The most crucial requirement, ‘assurance of public key’, can be achieved through the public-
key infrastructure (PKI), a key management system for supporting public-key cryptography.

Public Key Infrastructure (PKI)

PKI provides assurance of public keys. It provides identification of public keys and their
distribution. The anatomy of PKI comprises the following components:
Public Key Certificate, commonly referred to as ‘digital certificate’.
Private Key tokens.
Certification Authority.
Registration Authority.
Certificate Management System.
Digital Certificate

For analogy, a certificate can be considered as the ID card issued to a person. People use ID
cards such as a driver's license or passport to prove their identity. A digital certificate does the same
basic thing in the electronic world, but with one difference.

Digital certificates are not only issued to people; they can be issued to computers, software
packages or anything else that needs to prove its identity in the electronic world.

Digital certificates are based on the ITU standard X.509, which defines a standard certificate
format for public key certificates and certificate validation. Hence digital certificates are
sometimes also referred to as X.509 certificates.

The public key pertaining to the user client is stored in the digital certificate by the Certification
Authority (CA), along with other relevant information such as client information,
expiration date, usage, issuer, etc.

The CA digitally signs this entire information and includes the digital signature in the certificate.

Anyone who needs assurance about the public key and associated information of a
client carries out the signature validation process using the CA’s public key. Successful
validation assures that the public key given in the certificate belongs to the person whose
details are given in the certificate.

The process of obtaining a digital certificate by a person/entity is depicted in the following
illustration.

As shown in the illustration, the CA accepts the application from a client to certify his public
key. The CA, after duly verifying the identity of the client, issues a digital certificate to that client.
