
CHAPTER 3: DATA STORAGE

Unit Structure
3.1 Overview
3.2 Learning Objectives
3.3 Data Representation
    3.3.1 Binary Representation
    3.3.2 Binary Representation within the computer
    3.3.3 Data vs. Information
    3.3.4 Memory Units
    3.3.5 Coding Scheme
3.4 Primary Memory
    3.4.1 Random Access Memory (RAM)
        3.4.1.1 Types of RAM
    3.4.2 Read-Only Memory (ROM)
        3.4.2.1 Types of ROM
    3.4.3 Cache Memory
    3.4.4 Virtual memory
3.5 Data Organisation on Secondary Storage
    3.5.1 Data Organisation on Magnetic Disk
    3.5.2 Data Organisation on Magnetic Tape
    3.5.3 Data Organisation on Optical drives
3.6 Summary

3.1 OVERVIEW
Section Outline:
- Data Representation
- Types of Primary Memory
- Data organization on secondary storage

3.2 LEARNING OBJECTIVES


Upon completion of this section students should be able to:
- List and describe the four data coding schemes
- Explain the RAM and ROM technologies
- Illustrate the difference between Cache Memory and Virtual Memory
- Explain data organization on magnetic disk, magnetic tape and optical drive

3.3 DATA REPRESENTATION


Data representation is mainly concerned with the way data is represented within the computer. A computer uses the binary system to represent data.

3.3.1 Binary Representation


The binary system uses the base, or radix, 2, which means that only 2 symbols or digits can be used to represent data. A digit can be thought of as a box that holds a number. In the binary system, this number can be either a 0 or a 1. Data is thus represented in a computer as a sequence of 0s and 1s.

0111100001110101011111111000000111100..111111100001110101011111110000

A binary digit is called a bit, a contraction of the term "binary digit". A bit is the smallest unit of data a computer can recognize. A collection of 8 bits forms a byte. Any English character or symbol can be represented in a byte of data (8 bits).

For Example:

The letter A represented as a byte (8 bits) is as follows: 0 1 0 0 0 0 0 1
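The byte shown above can be reproduced with a short Python sketch. The helper names `char_to_bits` and `bits_to_char` are hypothetical, introduced only for this illustration, and the sketch assumes a character whose code fits in one byte.

```python
def char_to_bits(ch):
    """Return the 8-bit binary string for one character (code below 256)."""
    return format(ord(ch), "08b")


def bits_to_char(bits):
    """Turn a string of 0s and 1s back into the character it encodes."""
    return chr(int(bits, 2))


print(char_to_bits("A"))         # 01000001
print(bits_to_char("01000001"))  # A
```

Reading the bits from left to right, the pattern 01000001 is the number 65, which is the code of the character A.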

3.3.2 Binary Representation within the computer


Computers can only understand binary language. The Central Processing Unit (CPU) is a chip made up of transistors. A transistor is simply a tiny switch that can be on or off. On is equal to binary 1 and Off is equal to binary 0. A CPU consists of millions, and in modern chips billions, of transistors.

3.3.3 Data vs. Information


Data in the computer is represented as 0 and 1. For example, switch off is 0 and switch on is 1. Data is raw numbers and text. Information is processed data which is meaningful.

3.3.4 Memory Units


1 byte (octet) = 8 bits
1 Kilobyte (KB) = 1024 bytes
1 Megabyte (MB) = 1024 x 1024 bytes
1 Gigabyte (GB) = 1024 x 1024 x 1024 bytes
1 Terabyte (TB) = 1024 x 1024 x 1024 x 1024 bytes
1 Petabyte (PB) = 1024 TBs
1 Exabyte (EB) = 1024 PBs
1 Zettabyte (ZB) = 1024 EBs
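Since each unit is 1024 times the previous one, the number of bytes in any unit is a power of 1024. The following is a minimal Python sketch of that arithmetic; the names `UNITS`, `bytes_in` and `to_bytes` are hypothetical and chosen only for this example.

```python
# Units in increasing order; each one is 1024 times the one before it.
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def bytes_in(unit):
    """Number of bytes in one unit, using the 1024-based definitions."""
    return 1024 ** (UNITS.index(unit) + 1)


def to_bytes(amount, unit):
    """Convert an amount expressed in a unit into bytes."""
    return amount * bytes_in(unit)


print(bytes_in("KB"))       # 1024
print(bytes_in("GB"))       # 1073741824
print(to_bytes(512, "MB"))  # 536870912
```

For instance, 512 MB works out to 536,870,912 bytes, slightly more than 512 million.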

3.3.5 Coding Scheme


A text code is an agreed-upon system which allows computers and programmers to represent letters of the alphabet, punctuation marks and other symbols. Text codes allow the same combinations of (binary) numbers to represent the same individual pieces of data, which makes the exchange of data between computers possible.

Standard text (alphanumeric) code systems are as follows:
- BCD
- EBCDIC
- ASCII
- Unicode

3.3.5.1 BCD
Binary Coded Decimal (BCD) provides a method for coding decimal numbers in which each digit is represented by its own binary sequence.

For example: The decimal number 2345 in BCD is as follows:

Decimal digit:   2     3     4     5
BCD:             0010  0011  0100  0101
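The digit-by-digit encoding described for BCD can be sketched in Python as follows. The function names `to_bcd` and `from_bcd` are hypothetical, and the sketch assumes a non-negative whole number with each digit stored in its own 4-bit group.

```python
def to_bcd(number):
    """Encode each decimal digit of a non-negative integer as 4 bits."""
    return " ".join(format(int(digit), "04b") for digit in str(number))


def from_bcd(bcd):
    """Decode space-separated 4-bit groups back into the decimal number."""
    return int("".join(str(int(group, 2)) for group in bcd.split()))


print(to_bcd(2345))                     # 0010 0011 0100 0101
print(from_bcd("0010 0011 0100 0101"))  # 2345
```

Note that this is different from converting 2345 to binary as a whole: in BCD every decimal digit keeps its own binary sequence.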

3.3.5.2 EBCDIC
Extended Binary Coded Decimal Interchange Code (EBCDIC), pronounced as ebb-se-dick, was created to extend BCD. It was designed by IBM for early computers and is still used in IBM mainframe and midrange systems. EBCDIC uses an 8-bit code to define 256 symbols.

Example of EBCDIC Scheme

Representation of A in different notations

Notation       Representation of A
Denary         193
Hexadecimal    C1
Octal          301

The Hexadecimal notation is base 16 and uses the following symbols: 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F

The Octal notation is base 8 and uses the following symbols: 0,1,2,3,4,5,6,7
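The EBCDIC code of A can be checked in Python, which ships with codecs for several EBCDIC code pages. The sketch below assumes code page 037 (`cp037`), one of several EBCDIC variants; that choice and the function name `ebcdic_notations` are assumptions made for this illustration.

```python
def ebcdic_notations(ch, codec="cp037"):
    """Return the EBCDIC code of a character in denary, hexadecimal and octal.

    cp037 is assumed here; other EBCDIC code pages may differ for
    some symbols.
    """
    code = ch.encode(codec)[0]
    return code, format(code, "X"), format(code, "o")


print(ebcdic_notations("A"))  # (193, 'C1', '301')
```

The three values are the same number written in base 10, base 16 and base 8.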

3.3.5.3 ASCII
American Standard Code for Information Interchange (ASCII), pronounced as ass-kee, is by far the most common code used on computers of all types. It is a character encoding based on the English alphabet. Standard ASCII is a 7-bit code that defines 128 characters; the 8-bit extended versions commonly stored in a byte can define up to 256 characters, which cover most Western European languages.

Example of ASCII scheme:

Letter   ASCII   Binary
A        65      01000001
a        97      01100001

Note that the letter a and the letter A in the table above are different symbols with different ASCII codes.

3.3.5.4 Unicode
Unicode is a worldwide character set standard, designed to allow text and symbols from all the writing systems of the world to be consistently represented and manipulated by computers. Unicode was originally designed as a 2 byte (16 bit) coding scheme, which can represent 65,536 characters. The current standard extends far beyond both ASCII and this original limit, with code points running up to 10FFFF in hexadecimal, stored using encodings such as UTF-8 and UTF-16.

Example of Unicode scheme:

The Euro sign is represented by 20AC in hexadecimal (base 16). The hexadecimal digit A represents 10 and the digit C represents 12. Hexadecimal notation is also generally used for memory addresses.
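The hexadecimal value of the Euro sign can be worked out and verified with a few lines of Python. This is only an illustrative sketch; the variable names are arbitrary.

```python
euro = "\u20ac"  # the Euro sign, written by its Unicode code point

# Hexadecimal digits A and C stand for 10 and 12.
print(int("A", 16), int("C", 16))  # 10 12

# Positional expansion of 20AC in base 16.
value = 2 * 16**3 + 0 * 16**2 + 10 * 16 + 12
print(value)                       # 8364
print(format(ord(euro), "04X"))    # 20AC

# Stored as 2 bytes in the 16-bit (UTF-16, big-endian) encoding.
print(euro.encode("utf-16-be").hex())  # 20ac
```

The code point 20AC therefore corresponds to the denary number 8364 and fits in 2 bytes.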

3.4 PRIMARY MEMORY


Memory, also known as Primary memory, Primary storage, Main memory or Internal Storage, is the part of the computer that holds data and instructions for processing. Memory is separate from the CPU, although closely associated with it. Memory has the advantage of faster access compared to backing storage (for example, a hard disk).

Primary Memory is basically of two types:
- Random Access Memory (RAM): volatile or non-permanent
- Read Only Memory (ROM): non-volatile or permanent

There are different RAM and ROM technologies. They are classified according to their data access (read & write) speed and storage capacity.

RAM can be classified as:
- SRAM
- DRAM
- SDRAM
- DDR SDRAM

ROM can be classified as:
- ROM
- PROM
- EPROM
- EEPROM
- Flash ROM (BIOS)

3.4.1 Random Access Memory (RAM)


Whenever software is installed on a computer, it is placed on the hard disk of the computer. However, when that software is run (e.g. when you double-click on its icon), it has to be transferred to the memory of the computer for the CPU to be able to execute it. RAM can be purchased in the form of RAM sticks and is normally measured in terms of its size (256 MB, 512 MB, 1 GB, 2 GB...). The physical components of memory are called memory chips. One important feature of a memory chip is the amount of data it can hold. For example, a capacity of 512 MB means that the memory chip can hold roughly 512 million characters of data or instructions (exactly 512 x 1024 x 1024 = 536,870,912 bytes).

RAM requires current to retain values. It is said to be volatile, meaning that information stays in RAM as long as the power supply is on, but as soon as the power is turned off, the information inside the RAM disappears. Data and instructions can be read and modified in RAM. Since the data stored in memory is volatile, it can be lost during a power failure or when the computer is switched off. It is therefore good practice to save your work to a hard disk or pen drive regularly, for example every 10 minutes.

Generally, the more RAM a computer has, the more capacity the computer has to hold and process large programs and files. The amount and type of memory in the system can make a big difference in the system performance.

RAM is used to hold the following:
i) Operating System
ii) Program currently running
iii) Data needed by the program
iv) Intermediate results waiting to be output

3.4.1.1 Types of RAM


SRAM
Static Random Access Memory (SRAM) is a very fast, relatively expensive RAM. SRAM uses more power than other types of RAM. The word static indicates that the memory retains its contents as long as power remains applied, unlike dynamic RAM (DRAM), which needs to be periodically refreshed. However, data is lost when the circuit is powered down, which makes SRAM a volatile memory, as opposed to read-only memory and flash memory. SRAM is currently used in digital cameras and cell phones, in onboard cache in computers and in data buffers in hard disks.

DRAM
Dynamic RAM (DRAM) is the most common form of RAM used for main memory storage. It requires frequent refreshes, as the data stored in it deteriorates over time (typically within a few milliseconds). DRAM is rated in nanoseconds (the time needed to read or write one word of data), with ratings ranging from 30 to more than 100 ns. A higher number means slower DRAM.

SDRAM
Synchronous Dynamic RAM (SDRAM) is DRAM that is synchronized with the system bus, i.e. it runs at the same speed as the motherboard Front Side Bus. It requires higher tolerances and a faster architecture. Its speed is measured in MHz, from 66 MHz to current maximums of 400 MHz.

DDR SDRAM
Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM) is a type of memory currently used in computers. It achieves greater bandwidth than ordinary SDRAM by transferring data on both the rising and falling edges of the clock signal. At the time of writing, the types of RAM used in computer systems are referred to as DDR2 and DDR3 RAM. They have replaced the older SDRAM and offer improved performance. DDR4 is the next RAM technology, very likely to take over from DDR3 RAM, and is expected on the market from the beginning of 2014.
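The effect of transferring data on both clock edges can be illustrated with a little arithmetic. The sketch below is an assumption-laden illustration: the function name `peak_transfer_rate` is hypothetical, a 64-bit (8-byte) wide memory bus is assumed, the 100 MHz clock is an example value, and MB/s here means millions of bytes per second.

```python
def peak_transfer_rate(clock_mhz, transfers_per_clock, bus_width_bits=64):
    """Theoretical peak transfer rate in MB/s (millions of bytes per second).

    bus_width_bits=64 is an assumed bus width, not a value from the text.
    """
    return clock_mhz * transfers_per_clock * bus_width_bits // 8


# Ordinary SDRAM: one transfer per clock cycle.
print(peak_transfer_rate(100, 1))  # 800

# DDR SDRAM: two transfers per clock cycle (rising and falling edges).
print(peak_transfer_rate(100, 2))  # 1600
```

At the same clock speed, the double data rate memory moves twice as much data per second. These are theoretical peaks; real transfer rates are lower.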

Figure 2-8 SD, DDR and DDR2 RAM

3.4.2 Read-Only Memory (ROM)


ROM chips permanently hold programs and instructions, typically recorded at the factory, for booting the computer. Unlike RAM, the data and instructions in ROM chips can be read, but not modified (read-only). The chips retain their contents even when the computer is powered down (non-volatile memory). On a PC, ROM contains essential information for computer start-up, such as the BIOS (Basic Input/Output System). The BIOS consists of the instructions and data in the ROM chip that control the boot process and the computer hardware. After testing the hardware (the Power-On Self-Test, or POST), it fetches the boot sector from the hard disk. The instructions inside a ROM are generally called firmware. The BIOS chip, which sits on the Motherboard alongside the chipset (the set of chips that works jointly with the CPU), is an example of ROM memory.

3.4.2.1 Types of ROM


ROM
Read Only Memory (ROM) is used for the BIOS and other applications where code or data is fixed by the manufacturer and must not be overwritten during normal operation.

PROM
Programmable ROM (PROM) chips are standardised blank chips, like Programmable Logic Arrays (PLA), where data or code can be stored once but never erased afterwards (write-once). They are permanent memory chips programmed by the customer rather than by the chip manufacturer. They differ from a ROM chip, which is created and coded at the time of manufacture by the manufacturer.

EPROM
Erasable PROM (EPROM) is a special type of programmable read-only memory (PROM) that can be erased using a specific frequency of UV light. It usually has a glass window on top of the chip to facilitate erasing.

EEPROM
Electrically Erasable PROM (EEPROM) chips are erased using a voltage higher than the standard operating voltage instead of ultraviolet light. The chip must usually be removed from its socket and transferred to an EEPROM programming device.

FLASH ROM
Flash ROM is EEPROM that can be quickly erased and rewritten using the standard operating voltage. It is used in smaller devices as non-volatile read-write secondary storage. Flash ROM can be found in the following:
- MMC cards
- Compact Flash cards for cellphones and digital cameras
- Sony Memory Stick
- PCMCIA Flash cards for laptops and handhelds
- USB pocket drives

3.4.3 Cache Memory


Caching is a method used to improve performance when transferring data between a fast device and a relatively slow one. The purpose of cache memory is to speed up a computer while keeping its price low. Cache exists in different forms in a computer, but it primarily refers to the small memory locations found on the processor. The fastest cache RAM is found inside the CPU. Cache can also be found as small SRAM chips on the Motherboard. For example, a quantity of memory is installed between the CPU and RAM, and frequently accessed data from RAM is temporarily stored in this cache. The next time the CPU tries to access the same data, the data is retrieved from cache instead of the relatively slower RAM.

There are several levels of cache in the processor of a computer: Level 1 cache (L1), Level 2 cache (L2) and Level 3 cache (L3).

When a program is being run, it takes data from the hard disk and places it in RAM. RAM is closer to the CPU and faster to access than the hard disk, yet it is still slower than the processor. To speed things up even more, the data that is going to be worked with is transferred from RAM to the cache. This means that when a program is being executed, the CPU looks for information in the cache first and not in RAM.

The problem is that the cache is very small (there is not enough space on the processor for a large amount of cache) and so cannot store a lot of information. Therefore, before trying to access data from the cache, we must first check whether it is there. If it is, we have a cache hit. Otherwise, we have a cache miss and we need to retrieve the data from RAM.
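The hit and miss behaviour described above can be sketched with a small simulation. This is a simplified model, not how hardware caches are built: the class name `SmallCache` is hypothetical, RAM is modelled as a dictionary, and discarding the least recently used entry when the cache is full is an assumed policy that the text does not specify.

```python
from collections import OrderedDict


class SmallCache:
    """A tiny cache in front of a slower 'RAM' dictionary (illustrative only)."""

    def __init__(self, ram, capacity):
        self.ram = ram              # the slower, larger memory
        self.capacity = capacity    # how many entries the cache can hold
        self.lines = OrderedDict()  # cached address -> value, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:
            # Cache hit: the data is already in the cache.
            self.hits += 1
            self.lines.move_to_end(address)
            return self.lines[address]
        # Cache miss: fetch from RAM and keep a copy in the cache.
        self.misses += 1
        value = self.ram[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            # Assumed policy: drop the least recently used entry.
            self.lines.popitem(last=False)
        return value


ram = {address: address * 10 for address in range(100)}
cache = SmallCache(ram, capacity=4)
for address in [1, 2, 1, 3, 1, 2, 50, 1]:
    cache.read(address)
print(cache.hits, cache.misses)  # 4 4
```

In this run the first access to each address is a miss, while repeated accesses to the same address are hits served from the cache.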

Cache Size and processor performance
The size of the cache is an important factor in the overall performance of a processor. The cache size usually quoted is that of the Level 2 and Level 3 caches. For example, on an Intel Core i7, the size of the L2 cache is 1 MB and that of the L3 cache is 8 MB.

Cache memory is arranged in a hierarchy as shown below:

Figure 2-9 Cache Hierarchy

The table below shows the approximate size of each component.

Memory Type    Size
L1 Cache       ~ 16 - 64 KB
L2 Cache       ~ 512 KB - 2 MB
L3 Cache       ~ 8 MB
RAM            256 MB - 4 GB
Hard Disk      50 GB - 10 TB

Table 2-2 Memory Type and Speed

Another domain where the term cache is often used is the web. Connecting to the Internet (especially on a 56 Kbps modem) is very slow, and the data that we are trying to access might be, for example, in Hawaii, making data retrieval an even slower process. To improve the speed of accessing a page, the page can be stored on a web server in Mauritius the first time it is accessed. The next time another user wants access to the same page, that user's browser will not have to fetch the page from Hawaii but can instead get it directly from the web server in Mauritius, thus speeding up data access.

3.4.4 Virtual Memory


In practice a CPU executes multiple processes concurrently. Virtual Memory (VM) provides the basis for multi-process operation. VM supports the illusion that multiple programs are running concurrently. Virtual Memory is not a physical memory like cache memory. However, the algorithm used by virtual memory for swapping data into and out of main memory requires some space on secondary storage, i.e. the hard disk.
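The swapping idea can be sketched with a toy model in which main memory holds only a few pages and the rest are kept in swap space on disk. Everything in this sketch is an assumption made for illustration: the class name `TinyVirtualMemory`, the use of fixed-size "pages", and the first-in, first-out choice of which page to swap out. Real operating systems use more elaborate algorithms.

```python
from collections import deque


class TinyVirtualMemory:
    """Toy model: a few pages fit in RAM, the others live in swap space."""

    def __init__(self, frames):
        self.frames = frames   # how many pages fit in main memory
        self.in_ram = {}       # page number -> contents held in RAM
        self.on_disk = {}      # swap space on secondary storage
        self.order = deque()   # pages in the order they were loaded
        self.page_faults = 0

    def access(self, page):
        if page not in self.in_ram:
            # The page is not in main memory: bring it in.
            self.page_faults += 1
            if len(self.in_ram) >= self.frames:
                # Assumed policy: swap out the page loaded earliest (FIFO).
                victim = self.order.popleft()
                self.on_disk[victim] = self.in_ram.pop(victim)
            fresh = "contents of page {}".format(page)
            self.in_ram[page] = self.on_disk.pop(page, fresh)
            self.order.append(page)
        return self.in_ram[page]


vm = TinyVirtualMemory(frames=2)
for page in [0, 1, 2, 0, 1]:
    vm.access(page)
print(vm.page_faults)                         # 5
print(sorted(vm.in_ram), sorted(vm.on_disk))  # [0, 1] [2]
```

Although only 2 pages fit in main memory at a time, the program can work with 3 pages because the page not currently needed is parked on disk.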

3.5 DATA ORGANISATION ON SECONDARY STORAGE


We should differentiate between the way data is stored and organized on a storage medium and the way it is accessed. Data is organized in a similar manner on magnetic disk and optical disk, but differently on magnetic tape.

There are 2 methods for accessing data:

Random Access
Random access refers to reading and writing of data in any order.

Sequential Access
Sequential access refers to reading or writing data records in sequential order, that is, one record after the other. For example, to read record 10, records 1 through 9 must be read.
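The difference between the two access methods can be sketched by counting how many records must be read to reach record 10. In this illustration the storage is modelled as an in-memory byte stream of fixed-size records; the record size, record contents and function names are all hypothetical.

```python
import io

RECORD_SIZE = 8  # every record is exactly 8 bytes in this sketch


def make_storage(count):
    """Build a storage area holding records rec00001, rec00002, ..."""
    records = ["rec{:05d}".format(n).encode("ascii") for n in range(1, count + 1)]
    return io.BytesIO(b"".join(records))


def read_sequential(storage, target):
    """Read records one after the other from the start until the target."""
    storage.seek(0)
    reads = 0
    record = b""
    for _ in range(target):
        record = storage.read(RECORD_SIZE)
        reads += 1
    return record, reads


def read_random(storage, target):
    """Jump straight to the target record and read only that one."""
    storage.seek((target - 1) * RECORD_SIZE)
    return storage.read(RECORD_SIZE), 1


storage = make_storage(20)
print(read_sequential(storage, 10))  # (b'rec00010', 10)
print(read_random(storage, 10))      # (b'rec00010', 1)
```

Both methods return the same record, but sequential access has to pass through records 1 to 9 first, while random access goes directly to record 10.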

3.5.1 Data organization on Magnetic Disk

Data is organized and stored in concentric circular tracks on the disk. Each track is divided into sectors. Each track has the same number of sectors and each sector stores the same number of bits. Formatting involves the creation of tracks and sectors on a disk. Formatting wipes the disk. Devices can also be classified as sequential access or random access.
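Because every track has the same number of sectors and every sector holds the same amount of data, the capacity of a disk in this model is a simple product. The sketch below uses the geometry of a 3.5-inch high-density floppy disk as example values; those figures and the function name `disk_capacity_bytes` are not from the text and are used only for illustration.

```python
def disk_capacity_bytes(sides, tracks_per_side, sectors_per_track, bytes_per_sector):
    """Capacity of a disk where all tracks have the same number of sectors."""
    return sides * tracks_per_side * sectors_per_track * bytes_per_sector


# Example values: 2 sides, 80 tracks per side, 18 sectors per track,
# 512 bytes per sector (a 3.5-inch high-density floppy disk).
capacity = disk_capacity_bytes(2, 80, 18, 512)
print(capacity)         # 1474560
print(capacity / 1024)  # 1440.0 (KB)
```

The result, 1440 KB, is the familiar "1.44 MB" floppy disk capacity.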

3.5.2 Data organization on Magnetic Tape

Each bit is stored by magnetizing a small region of the tape surface. It is reliable, cheap, and provides high capacity (many GB). Access time is relatively long since the tape must be read sequentially. Magnetic tape is commonly used for backup of data files for the simple reason that past updates in master files can be easily traced.

3.5.3 Data organization on Optical drives

On CD-R and DVD discs, data is organized in a single spiral track which is read from the centre outwards. The bit density along the track is constant. The track is divided into sectors of approximately 2 KB, and the total capacity ranges from about 700 MB on a CD to several GB on a DVD. Each bit is stored as a pit or a land (flat area) on the surface, and is read using laser light. When an optical drive shines light into a pit, the light cannot be reflected back. This represents a bit value of 0 (off). A land reflects light back to its source, representing a bit value of 1 (on).

3.6 SUMMARY

Data used in computers is represented using binary notation, symbolizing an on state with a binary digit 1 and an off state with a binary digit 0.

To encode alphanumeric characters there are commonly 4 coding schemes: BCD, EBCDIC, ASCII and Unicode. There are other schemes to encode multimedia files.

Cache memory as well as virtual memory improve the performance of computers. Cache is physical and real, whereas virtual memory makes use of a swapping algorithm and part of the hard disk to enable multitasking.

Magnetic disk is a random access device whereas a magnetic tape is a sequential access device.
