0% found this document useful (0 votes)
10 views88 pages

Lecture 3

This document discusses the representation of integer numbers in computers, focusing on bits, bytes, and various numeric types such as signed and unsigned integers. It highlights the importance of understanding computer arithmetic limitations, overflow issues, and the conversion between binary, decimal, and hexadecimal systems. Additionally, it introduces concepts like two's complement for representing negative numbers and the practical implications of these representations in programming.

Uploaded by

fanj631
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views88 pages

Lecture 3

This document discusses the representation of integer numbers in computers, focusing on bits, bytes, and various numeric types such as signed and unsigned integers. It highlights the importance of understanding computer arithmetic limitations, overflow issues, and the conversion between binary, decimal, and hexadecimal systems. Additionally, it introduces concepts like two's complement for representing negative numbers and the practical implications of these representations in programming.

Uploaded by

fanj631
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 88

CS107 Lecture 3

Bits and Bytes; Integer Representations

reading:
Bryant & O’Hallaron, Ch. 2.2-2.3

This document is copyright (C) Stanford Computer Science and Nick Troccoli, licensed under
Creative Commons Attribution 2.5 License. All rights reserved.
Based on slides created by Cynthia Lee, Chris Gregg, Jerry Cain, Lisa Yan and others.
NOTICE RE UPLOADING TO WEBSITES: This content is protected and may not be shared, 1
uploaded, or distributed. (without expressed written permission)
CS107 Topic 1: How can a
computer represent integer
numbers?

2
CS107 Topic 1
How can a computer represent integer numbers?

Why is answering this question important?


• Helps us understand the limitations of computer arithmetic (today)
• Shows us how to more efficiently perform arithmetic (next time)
• Shows us how we can encode data more compactly and efficiently (next time)

assign1: implement 3 programs that manipulate binary representations to (1) work


around the limitations of arithmetic with addition, (2) simulate an evolving colony of
cells, and (3) print Unicode text to the terminal.
3
Learning Goals
• Understand the limitations of computer arithmetic and how that can impact
our programs, such as with overflow
• Understand how positive and negative numbers stored in ints, longs, etc. are
represented in binary
• Learn about the binary and hexadecimal number systems and how to convert
between number systems

4
Delta/Comair Airline Holiday Chaos
Case study: Comair/Delta airline had to cancel thousands of flights days before
Christmas due to a system malfunction. An unusually high number of crew
reassignments caused a bug in the system. What happened?

5
Demo: Unexpected
Behavior

cp -r /afs/ir/class/cs107/lecture-code/lect3 . 6
Lecture Plan
• Integer Representations
• Bits and Bytes
• Hexadecimal
• Unsigned Integers
• Signed Integers
• Overflow

7
Lecture Plan
• Integer Representations
• Bits and Bytes
• Hexadecimal
• Unsigned Integers
• Signed Integers
• Overflow

8
Number Representations
• Numeric types are generally a fixed size (e.g. int is 4 bytes). This means there
is a limit to the range of numbers they can store.
C Declaration Size (Bytes)
int 4
double 8
float 4
char 1
short 2
long 8
• Overflow occurs when we exceed the maximum value or go below the
minimum value of a numeric type. It can cause unintended bugs!
9
The sizeof Operator
long sizeof(type);

// Example
long int_size_bytes = sizeof(int); // 4
long short_size_bytes = sizeof(short); // 2
long char_size_bytes = sizeof(char); // 1

sizeof takes a variable type (or a variable itself) as a parameter and returns
the size of that type, in bytes.

10
Number Representations
• Unsigned Integers: positive and 0 integers. (e.g. 0, 1, 2, … 99999…
• Signed Integers: negative, positive and 0 integers. (e.g. …-2, -1, 0, 1,… 9999…)

• Floating Point Numbers: real numbers. (e,g. 0.1, -12.2, 1.5x1012)


Look up IEEE floating point if you’re interested!

11
Lecture Plan
• Integer Representations
• Bits and Bytes
• Hexadecimal
• Unsigned Integers
• Signed Integers
• Overflow

12
One Bit At a Time
• A bit is 0 or 1
• Computers are built around the idea of two states: “on” and “off”. Bits
represent this idea in software! (transistors represent this in hardware).
• We can combine bits, like with base-10 numbers, to represent more data. 8
bits = 1 byte.
• Computer memory is just a large array of bytes! It is byte-addressable; you
can’t address (store location of) a bit; only a byte.
• Computers fundamentally operate on bits; but we creatively represent
different data as bits!
• Images
• Video
• Text
• And more… 13
Base 10

5934
Digits 0-9 (0 to base-1)

14
Base 10

5934
tens ones

= 5*1000 + 9*100 + 3*10 + 4*1


15
Base 10

5934
103 102 101 100

16
Base 10

10X:
5934
3 2 1 0

17
Base 2

2X:
1011
3 2 1 0

Digits 0-1 (0 to base-1)

18
Base 2

1011
23 22 21 20

19
Base 2

Most significant bit (MSB) Least significant bit (LSB)

1011
eights fours twos ones

= 1*8 + 0*4 + 1*2 + 1*1 = 1110


20
Practice: Base 2 to Base 10
What is the base-2 value 1010 in base-10?
a) 20
b) 101
c) 10
d) 5
e) Other

21
Base 10 to Base 2
Question: What is 6 in base 2?
• Strategy:
• What is the largest power of 2 ≤ 6? 22=4
• Now, what is the largest power of 2 ≤ 6 – 22? 21=2
• 6 – 22 – 21 = 0!

0 1
_ _ 1
_ 0
_
23 22 21 20
= 0*8 + 1*4 + 1*2 + 0*1 = 6 22
23
24
25
Byte Values
What is the minimum and maximum unsigned (>=0) base-10 value a single byte
(8 bits) can store? minimum = 0 maximum = ?255

2x:
11111111
7 6 5 4 3 2 1 0

• Strategy 1: 1*27 + 1*26 + 1*25 + 1*24 + 1*23+ 1*22 + 1*21 + 1*20 = 255
• Strategy 2: 28 – 1 = 255
26
Multiplying by Base

1450 x 10 = 14500
11002 x 2 = 11000
Key Idea: inserting 0 at the end multiplies by the base!

27
Dividing by Base

1450 / 10 = 145
11002 / 2 = 110
Key Idea: removing 0 at the end divides by the base!

28
Lecture Plan
• Integer Representations
• Bits and Bytes
• Hexadecimal
• Unsigned Integers
• Signed Integers
• Overflow

29
Hexadecimal
• When working with bits, oftentimes we have large numbers with 32 or 64 bits.
• Instead, we’ll represent bits in base-16 instead; this is called hexadecimal.

0110 1010 0011


0-15 0-15 0-15

30
Hexadecimal
• When working with bits, oftentimes we have large numbers with 32 or 64 bits.
• Instead, we’ll represent bits in base-16 instead; this is called hexadecimal.

0-15 0-15 0-15

Each is a base-16 digit!

31
Hexadecimal
Hexadecimal is base-16, so we need digits for 1-15. How do we do this?

0 1 2 3 4 5 6 7 8 9 a b c d e f
10 11 12 13 14 15

32
Hexadecimal

Hex digit 0 1 2 3 4 5 6 7
Decimal value 0 1 2 3 4 5 6 7
Binary value 0000 0001 0010 0011 0100 0101 0110 0111

Hex digit 8 9 A B C D E F
Decimal value 8 9 10 11 12 13 14 15
Binary value 1000 1001 1010 1011 1100 1101 1110 1111

33
Hexadecimal
• We distinguish hexadecimal numbers by prefixing them with 0x, and binary
numbers with 0b.
• E.g. 0xf5 is 0b11110101

0x f 5
1111 0101

34
Practice: Hexadecimal to Binary
What is 0x173A in binary?

Hexadecimal 1 7 3 A
Binary 0001 0111 0011 1010

35
Practice: Binary to Hexadecimal
What is 0b1111001010 in hexadecimal? (Hint: start from the right)

Binary 11 1100 1010


Hexadecimal 3 C A

36
Hexadecimal: It’s funky but concise
Let’s take a byte (8 bits):
Base-10: Human-readable,
165 but cannot easily interpret on/off bits

Base-2: Yes, computers use this,


0b10100101 but not human-readable

Base-16: Easy to convert to Base-2,


0xa5 More “portable” as a human-readable format
(fun fact: a half-byte is called a nibble or nybble)

37
Lecture Plan
• Integer Representations
• Bits and Bytes
• Hexadecimal
• Unsigned Integers
• Signed Integers
• Overflow

38
Unsigned Integers
• An unsigned integer is 0 or a positive integer (no negatives).
• We have already discussed converting between decimal and binary, which is a
nice 1:1 relationship. Examples:
0b0001 = 1
0b0101 = 5
0b1011 = 11
0b1111 = 15
• The range of an unsigned number is 0 → 2w - 1, where w is the number of bits.
E.g. a 32-bit integer can represent 0 to 232 – 1 (4,294,967,295).

39
Example: 4-bit Unsigned Integer

40
Unsigned Integer Representations
C Declaration Size (Bytes)
unsigned int 4
unsigned char 1
unsigned short 2
unsigned long 8

• We commonly omit leading zeroes, but make sure to note how many total bits
you are working with! E.g. if an unsigned short, 0b1101 has 12 implicit
leading 0s.

41
From Unsigned to Signed

A signed integer is a negative, 0, or positive


integer. How can we represent both negative
and positive numbers in binary?

42
Lecture Plan
• Integer Representations
• Bits and Bytes
• Hexadecimal
• Unsigned Integers
• Signed Integers
• Overflow

43
Signed Integers
A signed integer is a negative integer, 0, or a positive integer.
• Problem: How can we represent negative and positive numbers in binary?

One primary goal: want to make


arithmetic easy; “just works”.

44
A Better Idea
• Ideally, binary addition would just work regardless of whether the number is
positive or negative.

0101
+????
0000
45
A Better Idea
• Ideally, binary addition would just work regardless of whether the number is
positive or negative.

0101
+1011
10000
46
A Better Idea
• Ideally, binary addition would just work regardless of whether the number is
positive or negative.

0101
+1011
0000
47
A Better Idea
• Ideally, binary addition would just work regardless of whether the number is
positive or negative.

0011
+????
0000
48
A Better Idea
• Ideally, binary addition would just work regardless of whether the number is
positive or negative.

0011
+1101
0000
49
A Better Idea
• Ideally, binary addition would just work regardless of whether the number is
positive or negative.

0000
+????
0000
50
A Better Idea
• Ideally, binary addition would just work regardless of whether the number is
positive or negative.

0000
+0000
0000
51
A Better Idea
Decimal Positive Negative Decimal Positive Negative

0 0000 0000 8 1000 1000

1 0001 1111 9 1001 (same as -7!) NA

2 0010 1110 10 1010 (same as -6!) NA

3 0011 1101 11 1011 (same as -5!) NA

4 0100 1100 12 1100 (same as -4!) NA

5 0101 1011 13 1101 (same as -3!) NA

6 0110 1010 14 1110 (same as -2!) NA

7 0111 1001 15 1111 (same as -1!) NA


52
There Seems Like a Pattern Here…

0101 0011 0000


+1011 +1101 +0000
0000 0000 0000
The negative number is the positive number inverted, plus one!
53
There Seems Like a Pattern Here…

A binary number plus its inverse is all 1s. Add 1 to this to carry over all 1s and get 0!

0101 1111
+1010 +0001
1111 0000
54
Two’s Complement

55
Two’s Complement
• In two’s complement, we represent a
positive number as itself, and its
negative equivalent as the two’s
complement of itself.
• The two’s complement of a number is
the binary digits inverted, plus 1.
• This works to convert from positive to
negative, and back from negative to
positive!

56
Two’s Complement
• Pro: addition works for any combination
of positive and negative!
• Con: takes some steps to convert
between positive and negative, can’t
eyeball a negative number and
immediately know what value it is.
• Pro: the most significant bit indicates the
sign of a number.

57
Two’s Complement
Adding two numbers is just…adding! There is no special case needed for
negatives. E.g. what is 2 + -5?

0010 2

+1011 -5

1101 -3

58
Lecture Plan
• Integer Representations
• Bits and Bytes
• Hexadecimal
• Unsigned Integers
• Signed Integers
• Overflow

59
Overflow
If you exceed the maximum value of your bit representation, you wrap around
or overflow back to the smallest bit representation. E.g. with 4 bits:

0b1111 + 0b1 = 0b0000


0b1111 + 0b10 = 0b0001

If you go below the minimum value of your bit representation, you wrap around
or overflow back to the largest bit representation. E.g. with 4 bits:

0b0000 - 0b1 = 0b1111


0b0000 - 0b10 = 0b1110
60
Overflow
Overflow occurs because we don’t have enough bits to store a value.
E.g. if we have unsigned short x = 65535 and add 2, we get 1!

1111 1111 1111 1111


+
0000 0000 0000 0010
10000 0000 0000 0001
61
Overflow

62
Overflow
Overflow occurs because we don’t have enough bits to store a value.
E.g. if we have unsigned short x = 65535 and add 2, we get 1!
Overflow means discontinuity in values (i.e. not what we expect).

1111 1111 1111 1111


+
0000 0000 0000 0010
10000 0000 0000 0001
63
Min and Max Integer Values
Size
Type Minimum Maximum
(Bytes)
char 1 -128 127
unsigned char 1 0 255

short 2 -32768 32767


unsigned short 2 0 65535

int 4 -2147483648 2147483647


unsigned int 4 0 4294967295

long 8 -9223372036854775808 9223372036854775807


unsigned long 8 0 18446744073709551615

Provided. Constants in C: INT_MIN, INT_MAX, UINT_MAX, LONG_MIN,


LONG_MAX, ULONG_MAX, … 64
Unsigned Integers
≈+4billion 0
111…111 000…000
111…110 000…001

More increasing positive numbers


000…010

Increasing positive numbers


111…101
111…100 Discontinuity 000…011
means overflow
possible here

… …

100…010 011…101
100…001 011…110
100…000 011…111

65
Signed Numbers
-1 0 +1
Negative numbers becoming less negative
111…111 000…000
111…110 000…001
000…010

Increasing positive numbers


111…101
111…100 000…011

… …

Discontinuity
(i.e. increasing)

means overflow
possible here
100…010 011…101
100…001 011…110
100…000 011…111

≈+2billion
≈-2billion 66
Overflow In Practice: PSY

YouTube: “We never thought a video would be watched in numbers


greater than a 32-bit integer (=2,147,483,647 views), but that was before
we met PSY. "Gangnam Style" has been viewed so many times we had to
upgrade to a 64-bit integer (9,223,372,036,854,775,808)!” [link]

“We saw this coming a couple months ago and updated our systems to
prepare for it” [link] 67
Overflow In Practice: Games

Super Mario Bros (NES):


Impossible Pacman Level 256 losing all extra lives if you exceed 127 68
Overflow In Practice
Many systems store timestamps as the number of seconds since Jan. 1, 1970 in
a signed 32-bit integer.
• Problem: the latest timestamp that can be represented this way is 3:14:07 UTC
on Jan. 13 2038!

• Casino erroneous slot machine payout ($42,949,672.76) due to overflow


• Reported vulnerability CVE-2019-3857 in libssh2 may allow a hacker to
remotely execute code
• Apple CoreGraphics overflow bug exploited via iMessage, used in known
spyware

69
Demo Revisited: Unexpected Behavior
Comair/Delta airline had to cancel thousands of flights days before Christmas
because of integer overflow – they exceeded 32,767 crew changes (limit of
short).

int main(int argc, char *argv[]) {


short airlineCrewChangesThisMonth = 0;

for (int i = 0; i < 31; i++) {


airlineCrewChangesThisMonth += 1200;
printf(...);
}
}
70
Recap
• Integer Representations Lecture 3 takeaway: computers
• Bits and Bytes represent everything in binary.
• Hexadecimal We must determine how to
• Unsigned Integers represent our data (e.g., base-10
• Signed Integers numbers) in a binary format so a
• Overflow computer can manipulate it.
There may be limitations to these
representations! (overflow)

Next time: How can we manipulate individual bits and bytes?


71
Extra Slides

72
Truncating Bit Representation
If we want to reduce the bit size of a number, C truncates the representation
and discards the more significant bits.

int x = 53191;
short sx = x; // -12345!

x = 0000 0000 0000 0000 1100 1111 1100 0111


sx = 1100 1111 1100 0111

73
Truncating Bit Representation
If we want to reduce the bit size of a number, C truncates the representation
and discards the more significant bits.

int x = -3;
short sx = x; // still -3

x = 1111 1111 1111 1111 1111 1111 1111 1101


sx = 1111 1111 1111 1101

74
Expanding Bit Representations
Sometimes, we want to carry over a value to a larger variable (e.g. make an int
and set it equal to a short).
• For unsigned values, C adds leading zeros to the representation (“zero
extension”)
• For signed values, C repeats the sign of the value for new digits (“sign
extension”

75
Expanding Bit Representation
If we want to expand the bit size of an unsigned number, C adds leading zeros.

unsigned short s = 4;
unsigned int i = s; // still 4

s = 0000 0000 0000 0100


i = 0000 0000 0000 0000 0000 0000 0000 0100

76
Expanding Bit Representation
If we want to expand the bit size of an signed number, C adds repeats the sign.

short s = -4;
int i = s; // still -4

s = 1111 1111 1111 1100


i = 1111 1111 1111 1111 1111 1111 1111 1100

77
Expanding Bit Representation
If we want to expand the bit size of an signed number, C adds repeats the sign.

short s = 4;
int i = s; // still 4

s = 0000 0000 0000 0100


i = 0000 0000 0000 0000 0000 0000 0000 0100

78
Practice: Two’s Complement
What are the negative or positive equivalents of the numbers below?
a) -4 (1100)
b) 7 (0111)
c) 3 (0011)

79
Practice: Two’s Complement
Fill in the below table: It’s easier to compute
base-10 for positive
numbers, so use two’s
complement first if
char x = ____; char y = -x; negative.
decimal binary decimal binary
1. 0b1111 1100
2. 0b0001 1000
3. 0b0010 0100
4. 0b1101 1111
80
Practice: Two’s Complement
Fill in the below table: It’s easier to compute
base-10 for positive
numbers, so use two’s
complement first if
char x = ____; char y = -x; negative.
decimal binary decimal binary
1. -4 0b1111 1100 4 0b0000 0100
2. 0b0001 1000
3. 0b0010 0100
4. 0b1101 1111
81
Practice: Two’s Complement
Fill in the below table: It’s easier to compute
base-10 for positive
numbers, so use two’s
complement first if
char x = ____; char y = -x; negative.
decimal binary decimal binary
1. -4 0b1111 1100 4 0b0000 0100
2. 24 0b0001 1000 -24 0b1110 1000
3. 36 0b0010 0100 -36 0b1101 1100
4. -33 0b1101 1111 33 0b0010 0001
82
Underspecified question
What is the following base-2 number in sign
ed
base-10?

0b1101 15 0 1
14 unsigned 2
13 3
12 4

11 5
10 6
9 8 7

83
Underspecified question
What is the following base-2 number in sign
ed
base-10?

0b1101 15 0 1
14 unsigned 2
If 4-bit signed: -3 13 3
If 4-bit unsigned: 13 12 4
If >4-bit signed or unsigned: 13 11 5
10 6
9 8 7
You need to know the type to determine the
number! (Note by default, numeric constants
in C are signed ints)
84
Overflow
• What is happening here? Assume 4-bit numbers. sign
ed
0b1101
+ 0b0100 15 0 1
14 unsigned 2
13 3
12 4

11 5
10 6
9 8 7

85
Overflow
• What is happening here? Assume 4-bit numbers. sign
ed
0b1101
+ 0b0100 15 0 1
14 unsigned 2
13 3
12 4

Signed Unsigned 11 5
10 6
-3 + 4 = 1 13 + 4 = 1 9 8 7

No overflow Overflow

86
Limits and Comparisons
1. What is
the… Largest unsigned? Largest signed? Smallest signed?
char

int

87
Limits and Comparisons
1. What is
the… Largest unsigned? Largest signed? Smallest signed?
char 28 - 1 = 255 27 – 1 = 127 -27 = -128

int 232 - 1 = 231 - 1 = -231 =


4294967296 2147483647 -2147483648

These are available as


UCHAR_MAX, INT_MIN,
INT_MAX, etc. in the
<limits.h> header.
88

You might also like