Sei sulla pagina 1di 32

Data Formats

Introduction
Examples Data
Real World Computer Input device Data

Dear Mom:

Keyboard

10110010

Digital camera

10110010

Format must be appropriate


The

internal representation must be appropriate for the type of processing to take place (e.g., text, images, sound)

Examples of Standards
Type of Data Alphanumeric
Image Motion picture Sound Outline graphics/fonts

Standards ASCII, EBCDIC, Unicode


JPEG, GIF, PCX, TIFF MPEG-2, Quick Time Sound Blaster, WAV, AU PostScript, TrueType, PDF

Why Standards?

Standard

are arbitrary They exist because they are

Convenient Efficient Flexible Appropriate Etc.

Alphanumeric Data
Problem:

Distinguishing between the number 123 (one hundred and twentythree) and the characters 123 (one, two, three) Four standards for representing letters (alpha) and numbers

BCD Binary-coded decimal ASCII American standard code for information interchange EBCDIC Extended binary-coded decimal interchange code Unicode

Standard Alphanumeric Formats


BCD ASCII EBCDIC Unicode

Next 2 slides

Binary-Coded Decimal (BCD)


Digit 0
Note: the following bit Four bits per digit patterns are not used: 1010 1011 1100 1101 1110 1111

Bit pattern 0000 0001 0010

1 2

3
4 5 6 7

0011
0100 0101 0110 0111

8
9

1000
1001

Example
709310

= ? (in BCD)
7 0 9 3

0111

0000

1001

0011

Standard Alphanumeric Formats


BCD ASCII EBCDIC Unicode
Next 22 slides

The Problem
Representing

text strings, such as Hello, world, in a computer

Codes and Characters


Each

character is coded as a byte Most common coding system is ASCII (Pronounced ass-key) ASCII = American National Standard Code for Information Interchange Defined in ANSI document X3.4-1977

ASCII Features
7-bit

code 8th bit is unused (or used for a parity bit) 27 = 128 codes Two general types of codes:

95 are Graphic codes (displayable on a console) 33 are Control codes (control features of the console or communications channel)

ASCII Chart
0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111 000 NULL SOH STX ETX EDT ENQ ACK BEL BS HT LF VT FF CR SO SI 001 DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US 010 ! " # $ % & ' ( ) * + , . / 011 0 1 2 3 4 5 6 7 8 9 : ; < = > ? 100 @ A B C D E F G H I J K L M N O 101 P Q R S T U V W X Y Z [ \ ] ^ _ 110 ` a b c d e f g h i j k l m n o 111 p q r s t u v w x y z { | } ~ DEL

0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111

000 NULL SOH STX ETX EDT ENQ ACK BEL BS HT LF VT FF CR SO SI

001 DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US

010 ! " # $ % & ' ( ) * + , . /

011 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

100 @ A B C D E F G H I J K L M N O

101 P Q R S T U V W X Y Z [ \ ] ^ _

110 ` a b c d e f g h i j k l m n o

111 p q r s t u v w x y z { | } ~ DEL

000 001 0000 NULL DLE 0001 SOH DC1 0010 STX DC2 0011 ETX DC3 0100 EDT DC4 0101 ENQ NAK 0110 ACK SYN 0111 BEL ETB 1000 BS CAN 1001 HT EM 1010 LF SUB 1011 VT ESC Least significant bit 1100 FF FS 1101 CR GS 1110 SO RS 1111 SI US

011 100 0 @ ! 1 A " 2 B # 3 C Most significant bit D $ 4 % 5 E & 6 F ' 7 G ( 8 H ) 9 I * : J + ; K , < L = M . > N / ? O

010

101 P Q R S T U V W X Y Z [ \ ] ^ _

110 ` a b c d e f g h i j k l m n o

111 p q r s t u v w x y z { | } ~ DEL

e.g., a = 1100001

0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111

000 NULL SOH STX ETX EDT ENQ ACK BEL BS HT LF VT FF CR SO SI

001 DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US

010 ! " # $ % & ' ( ) * + , . /

011 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

100 @ A B C D E F G H I J K L M N O

101 P Q R S T U V W X Y Z [ \ ] ^ _

110 ` a b c d e f g h i j k l m n o

111 p q r s t u v w x y z { | } ~ DEL

95 Graphic codes

0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111

000 NULL SOH STX ETX EDT ENQ ACK BEL BS HT LF VT FF CR SO SI

001 DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US

010 ! " # $ % & ' ( ) * + , . /

011 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

100 @ A B C D E F G H I J K L M N O

101 P Q R S T U V W X Y Z [ \ ] ^ _

110 ` a b c d e f g h i j k l m n o

111 p q r s t u v w x y z { | } ~ DEL

33 Control codes

0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111

000 NULL SOH STX ETX EDT ENQ ACK BEL BS HT LF VT FF CR SO SI

001 DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US

010 ! " # $ % & ' ( ) * + , . /

011 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

100 @ A B C D E F G H I J K L M N O

101 P Q R S T U V W X Y Z [ \ ] ^ _

110 ` a b c d e f g h i j k l m n o

111 p q r s t u v w x y z { | } ~ DEL

Alphabetic codes

0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111

000 NULL SOH STX ETX EDT ENQ ACK BEL BS HT LF VT FF CR SO SI

001 DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US

010 ! " # $ % & ' ( ) * + , . /

011 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

100 @ A B C D E F G H I J K L M N O

101 P Q R S T U V W X Y Z [ \ ] ^ _

110 ` a b c d e f g h i j k l m n o

111 p q r s t u v w x y z { | } ~ DEL

Numeric codes

0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111

000 NULL SOH STX ETX EDT ENQ ACK BEL BS HT LF VT FF CR SO SI

001 DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US

010 ! " # $ % & ' ( ) * + , . /

011 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

100 @ A B C D E F G H I J K L M N O

101 P Q R S T U V W X Y Z [ \ ] ^ _

110 ` a b c d e f g h i j k l m n o

111 p q r s t u v w x y z { | } ~ DEL

Punctuation, etc.

0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111

000 NULL SOH STX ETX EDT ENQ ACK BEL BS HT LF VT FF CR SO SI

001 DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US

010 ! " # $ % & ' ( ) * + , . /

011 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

100 @ A B C D E F G H I J K L M N O

101 P Q R S T U V W X Y Z [ \ ] ^ _

110 ` a b c d e f g h i j k l m n o

111 p q r s t u v w x y z { | } ~ DEL

Hello, world Example


H e l l o , w o r l d = = = = = = = = = = = = Binary 01001000 01100101 01101100 01101100 01101111 00101100 00100000 01110111 01100111 01110010 01101100 01100100 = = = = = = = = = = = = Hexadecimal 48 65 6C 6C 6F 2C 20 77 67 72 6C 64 = = = = = = = = = = = = Decimal 72 101 108 108 111 44 32 119 103 114 108 100

Common Control Codes


CR LF HT DEL NULL

0D 0A 09 7F 00

carriage return line feed horizontal tab delete null

Hexadecimal code

0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111

000 NULL SOH STX ETX EDT ENQ ACK BEL BS HT LF VT FF CR SO SI

001 DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US

010 ! " # $ % & ' ( ) * + , . /

011 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

100 @ A B C D E F G H I J K L M N O

101 P Q R S T U V W X Y Z [ \ ] ^ _

110 ` a b c d e f g h i j k l m n o

111 p q r s t u v w x y z { | } ~ DEL

Terminology
Learn

the names of the special symbols


brackets braces parentheses commercial at sign ampersand tilde

[] {} () @ & ~

0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111

000 NULL SOH STX ETX EDT ENQ ACK BEL BS HT LF VT FF CR SO SI

001 DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US

010 ! " # $ % & ' ( ) * + , . /

011 0 1 2 3 4 5 6 7 8 9 : ; < = > ?

100 @ A B C D E F G H I J K L M N O

101 P Q R S T U V W X Y Z [ \ ] ^ _

110 ` a b c d e f g h i j k l m n o

111 p q r s t u v w x y z { | } ~ DEL

Standard Alphanumeric Formats

BCD ASCII EBCDIC Unicode


Next 1 slides

EBCDIC
Extended

BCD Interchange Code (pronounced ebb-se-dick) 8-bit code Developed by IBM Rarely used today IBM mainframes only

Standard Alphanumeric Formats

BCD ASCII EBCDIC Unicode


Next 2 slides

Unicode
16-bit

standard Developed by a consortia Intended to supercede older 7- and 8-bit codes

Unicode Version 2.1


1998 Improves

on version 2.0 Includes the Euro sign (20AC16 = From the standard:

contains 38,887 distinct coded characters derived from the supported scripts. These characters cover the principal written languages of the Americas, Europe, the Middle East, Africa, India, Asia, and Pacifica.

Potrebbero piacerti anche