Sei sulla pagina 1di 22

CS 105

“Tour of the Black Holes of Computing!”

Bits and
Topics Bytes
 Why bits? Tell me
 Representing information as bits
 Binary/Hexadecimal
 Byte representations
» numbers
» characters and strings
» Instructions
 Bit-level manipulations – READ BOOK
 Boolean algebra
 Expressing in C

bits.ppt CS 105
Binary Representations
Base 2 Number Representation
 Represent 1521310 as 111011011011012
 Represent 1.2010 as 1.0011001100110011[0011]…2
 Represent 1.5213 X 104 as 1.11011011011012 X 213

Electronic Implementation
 Easy to store with bistable elements - relays
 Reliably transmitted on noisy and inaccurate wires (media)

0 1 0

3.3
V
2.8
V
0.5
V
0.0
Straightforward implementation of arithmetic functions
V
–2– CS 105
Byte-Oriented Mem
Organization
0 •F
•• • ••
0 0 FF
•••
Programs Refer to Virtual Addresses
 Conceptually very large array of bytes
 Actually implemented with hierarchy of different memory
types – SRAM, DRAM, Disk
 System provides address space private to particular
“process”
 Program being executed
 Program can clobber its own data, but not that of others

Compiler + Run-Time System Control Allocation


 Where different program objects should be stored
 Multiple mechanisms: static, stack, heap, etc.
 All allocation within single virtual address space
–3– CS 105
Encoding Byte Values
al y
Byte = 8 bits x cim ar
e n
 Binary 000000002 to 111111112 H De Bi
0 0 0000
 Decimal: 010 to 25510 1 1 0001
2 2 0010
 Hexadecimal 0016 to FF16 3 3 0011
 Base 16 number representation 4 4 0100
 Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’
5 5 0101
6 6 0110
 Write FA1D37B16 in C as 0xFA1D37B 7 7 0111
8 8 1000
» Or 0xfa1d37b 9 9 1001
 Internet used term ‘Octet’ Why? A 10 1010
B 11 1011
C 12 1100
D 13 1101
E 14 1110
F 15 1111

–4– CS 105
Word-Oriented Memory
Organization
64-bit 32-bit Bytes Addr.
Words Words
0000
Add
0001
Addresses Specify Byte r
=
0000 0002
Locations ?? Add
0003
r
 Address of first byte in =
0000 0004
Add
word r
??
0005
 Addresses of =
0004 0006
??
successive words differ 0007
by 4 (32-bit) or 8 (64-bit) 0008
Add
r 0009
=
0008
Add
0010
??
r 0011
=
0008 0012
Add ??
r 0013
=
0012 0014
??
0015
–5– CS 105
Data Representations
Sizes of C Objects (in Bytes)
 C Data TypeTypical 32-bit Intel IA32
x86-64
 char 1 1 1
 short 2 2 2
 int 4 4 4
 long 4 4 8
 long long 8 8 8
 float 4 4 4
 double 8 8 8
 long double 8 10/12 10/16
 char * 4 4 8
» Or any other pointer

–6– CS 105
Examining Data
Representations
Code to Print Byte Representation of Data
 Casting pointer to unsigned char * creates byte
array
typedef unsigned char *pointer;

void show_bytes(pointer start, int len)


{
int i;
for (i = 0; i < len; i++)
printf("0x%p\t0x%.2x\n",
start+i, start[i]);
printf("\n");
}

Printf directives:
%p: Print pointer
%x: Print Hexadecimal
–7– CS 105
show_bytes Execution
Example

int a = 15213;
printf("int a = 15213;\n");
show_bytes((pointer) &a, sizeof(int));

Result (Linux):
int a = 15213;
0x11ffffcb8 0x6d
0x11ffffcb9 0x3b
0x11ffffcba 0x00
0x11ffffcbb 0x00

–8– CS 105
Representing Integers
Decimal: 15213
int A = 15213;
int B = -15213; Binary: 0011 1011 0110 1101
long int C = 15213; Hex: 3 B 6 D

IA32, x86-64 A Sun A IA32 C x86-64 C Sun C


6D 00 6D 6D 00
3B 00 3B 3B 00
00 3B 00 00 3B
00 6D 00 00 6D
00
IA32, x86-64 B Sun B 00
00
93 FF
00
C4 FF
FF C4
FF 93 Two’s complement representation
(Covered later)
–9– CS 105
Byte Ordering
How should bytes within multi-byte word be
ordered in memory? Religious Issue
Conventions
 Suns, PowerPCs are “Big Endian” machines
 Least significant data byte has highest address
 Most significant data byte has lowest address
 Alphas, PCs are “Little Endian” machines
 Least significant byte has lowest address

– 10 – CS 105
Byte Ordering Example
Big Endian
 Least significant byte has highest address

Little Endian
 Least significant byte has lowest address

Example
 Variable x has 4-byte representation 0x01234567
 Address given by &x is 0x100

Big Endian 0x100 0x101 0x102 0x103


01 23 45 67

Little Endian 0x100 0x101 0x102 0x103


67 45 23 01

– 11 – CS 105
Reading Byte-Reversed
Listings
Disassembly
 Text representation of binary machine code
 Generated by program that reads the machine code

Example Fragment
Address Instruction Code Assembly Rendition
8048365: 5b pop %ebx
8048366: 81 c3 ab 12 00 00 add $0x12ab,%ebx
804836c: 83 bb 28 00 00 00 00 cmpl $0x0,0x28(%ebx)

Deciphering Numbers
 Value: 0x12ab
 Pad to 4 bytes: 0x000012ab
 Split into bytes: 00 00 12 ab
 Reverse: ab 12 00 00
– 12 – CS 105
Representing Pointers
int B = -15213;
int *P = &B; // p is pointer, address values are
different on different machines.
Sun P IA32 P x86-64 P
EF D4 0C
FF F8 89
FB FF EC
2C BF FF
FF
7F
00
00

Different compilers & machines assign different locations to objects

– 13 – CS 105
Machine-Level Code
Representation
Encode Program as Sequence of Instructions
 Each simple operation
 Arithmetic operation
 Read or write memory
 Conditional branch
 Instructions encoded as bytes
 Alpha’s, Sun’s, Mac’s use 4 byte instructions
» Reduced Instruction Set Computer (RISC)
 PC’s use variable length instructions
» Complex Instruction Set Computer (CISC)
 Different instruction types and encodings for different
machines
 Most code not binary compatible

Programs are Byte Sequences Too - von


Neumman!
– 14 – CS 105
Representing Instructions
int sum(int x, int y)
{ Alpha sum Sun sum PC sum
return x+y; 00 81 55
} 00 C3 89
30 E0 E5
 For this example, Alpha &
42 08 8B
Sun use two 4-byte
01 90 45
instructions
80 02 0C
 Use differing numbers of
FA 00 03
instructions in other
6B 09 45
cases
08
 PC uses 7 instructions 89
with lengths 1, 2, and 3 EC
bytes 5D
 Same for NT and for C3
Linux
Different
 NT / machines use totally
Linux not fully binary different instructions and encodings
– 15 – compatible CS 105
Boolean Algebra – a review
Developed by George Boole in 19th Century
 Algebraic representation of logic
 Encode “True” as 1 and “False” as 0

And Or
 A&B = 1 when both A=1 and  A|B = 1 when either A=1 or B=1
B=1

Not Exclusive-Or (Xor)


 ~A = 1 when A=0  A^B = 1 when either A=1 or
B=1, but not both

– 16 – CS 105
Bit-Level Operations in C
follow Boolean Algebra
Operations &, |, ~, ^ Available in C
 Apply to any “integral” data type
 long, int, short, char
 View arguments as bit vectors
 Arguments applied bit-wise

Examples (Char data type)


 ~0x41 --> 0xBE
~010000012 --> 101111102
 ~0x00 --> 0xFF
~000000002 --> 111111112
 0x69 & 0x55 --> 0x41
011010012 & 010101012 --> 010000012
 0x69 | 0x55 --> 0x7D
011010012 | 010101012 --> 011111012
– 17 – CS 105
Contrast: Logic Operations
in C
Contrast to Logical Operators
 &&, ||, !
 View 0 as “False”
 Anything nonzero as “True” -- key point
 Always return 0 or 1
 Early termination

Examples (char data type)


 !0x41 --> 0x00
 !0x00 --> 0x01
 !!0x41 --> 0x01

 0x69 && 0x55 --> 0x01


 0x69 || 0x55 --> 0x01
 p && *p (avoids null pointer access) ??
– 18 – CS 105
Shift Operations
Left Shift: x << y Argument x 01100010
 Shift bit-vector x left y
positions << 3 00010000
 Throw away extra bits on left Log. >> 2 00011000
 Fill with 0’s on right
Arith. >> 2 00011000
Right Shift: x >> y
 Shift bit-vector x right y
positions Argument x 10100010
 Throw away extra bits on right << 3 00010000
 Logical shift
Log. >> 2 00101000
 Fill with 0’s on left
 Arithmetic shift Arith. >> 2 11101000
 Replicate most significant bit
 Useful with two’s complement
integer representation
– 19 – CS 105
Cool Stuff with
Xor
 Bitwise XOR is void funny(int *x, int *y)
form of addition {
 With extra property *x = *x ^ *y; /* #1 */
that every value is *y = *x ^ *y; /* #2 */
its own additive *x = *x ^ *y; /* #3 */
inverse }
A^A=0 // x, y not equal

*x *y
Begin A B
1 A^B B
2 A^B (A^B)^B = A
3 (A^B)^A = B A
End B A
– 20 – CS 105
Main Points
It’s All About Bits & Bytes
 Numbers
 Programs
 Text

Different Machines Follow Different Conventions


 Word size
 Byte ordering
 Representations

Boolean Algebra is Mathematical Basis


 Basic form encodes “false” as 0, “true” as 1
 General form like bit-level operations in C
 Good for representing & manipulating sets

– 21 – CS 105
Practice Problems
• 2.5
• 2.8
• 2.14
• 2.23

– 22 – CS 105

Potrebbero piacerti anche