
Lec02 C Intro

The document outlines the course structure and key topics for Computer Science 61C, focusing on the C programming language and its semantics, including pointers, memory management, and number representations. It discusses the significance of signed and unsigned integers, two's complement representation, and the basics of digital systems. Additionally, it provides an overview of historical computers and the components of a computer system, emphasizing the levels of representation in programming.


Computer Science 61C Wawrzynek and Weaver

Introduction to C

1st Edition: 1978


1
Administrivia:
• Reminder, check that you are signed up on Gradescope, Piazza, & GitHub Classroom
• GitHub allows us to see & check the progress on projects
• Defaults also keep you from screwing up and making your archives for the class public
• Also, remember to sign up for a lab section. More info: https://piazza.com/class/kqrhto1l4fa6s8?cid=14
• We have added 3 remote-only lab sections
• We have moved three of the Thursday 8-10pm lab sections to Friday 9am-11am and 11am-1pm.
• HW1 (number rep) released Tuesday (noon)
• Project 1 to be released Friday (noon)
2
Administrivia 2:
Notes on Project 1...
• Project 1 is remarkably subtle
• Designed to cover a lot of C semantics
• Items include
• Pointers and structures
• Strings
• Memory allocation and deallocation
• Generic pointers to void * and casting
• Pointers to functions
• But you don't actually need to write a lot of code!
• A little more than 100 lines of code

3
Favorites: binary and hexadecimal (plus decimal)
Hex is a convenient way to represent binary
Computer Science 61C Fall 2021 Wawrzynek and Weaver

• Hexadecimal digits: 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F
• For instance, the binary number: $1101\ 1110\ 1100\ 1010\ 1111_2 = \mathrm{DECAF}_{16}$
• $\mathrm{DAD}_{16} = 1101\ 1010\ 1101_2$
• Note: one-to-one correspondence between “nibbles” and hex digits.
• Proof of equivalence:
• $\mathrm{DAD}_{16} = 13_{10} \times 16_{10}^{2} + 10_{10} \times 16_{10}^{1} + 13_{10} \times 16_{10}^{0} = 3328_{10} + 160_{10} + 13_{10} = 3501_{10}$
• $1101\ 1010\ 1101_2 = 1 \times 2^{11} + 1 \times 2^{10} + 1 \times 2^{8} + 1 \times 2^{7} + 1 \times 2^{5} + 1 \times 2^{3} + 1 \times 2^{2} + 1 \times 2^{0} = 2048_{10} + 1024_{10} + 256_{10} + 128_{10} + 32_{10} + 8_{10} + 4_{10} + 1_{10} = 3501_{10}$
(May put blanks every group of binary or hexadecimal digits to make it easier to parse, like commas in decimal)
4
Signed and Unsigned Integers
• C, C++, and Java have signed integers, e.g., 7, -255:
int x, y, z;
• C, C++ also have unsigned integers, which are usually used for memory addresses
• 32-bit word can represent $2^{32}$ binary numbers
• Unsigned integers in 32 bit word represent 0 to $2^{32}-1$ (4,294,967,295)

5
Unsigned Integers: 32-bit example
Binary (32 bits)                          Decimal (unsigned)
0000 0000 0000 0000 0000 0000 0000 0000   0
0000 0000 0000 0000 0000 0000 0000 0001   1
0000 0000 0000 0000 0000 0000 0000 0010   2
...                                       ...
0111 1111 1111 1111 1111 1111 1111 1101   2,147,483,645
0111 1111 1111 1111 1111 1111 1111 1110   2,147,483,646
0111 1111 1111 1111 1111 1111 1111 1111   2,147,483,647
1000 0000 0000 0000 0000 0000 0000 0000   2,147,483,648
1000 0000 0000 0000 0000 0000 0000 0001   2,147,483,649
1000 0000 0000 0000 0000 0000 0000 0010   2,147,483,650
...                                       ...
1111 1111 1111 1111 1111 1111 1111 1101   4,294,967,293
1111 1111 1111 1111 1111 1111 1111 1110   4,294,967,294
1111 1111 1111 1111 1111 1111 1111 1111   4,294,967,295
6
Signed Integers and
Two’s-Complement Representation
• Signed integers in C; want ½ numbers <0, want ½ numbers >0, and want just a single 0
• Two’s complement treats 0 as positive, so 32-bit word represents $2^{32}$ integers from $-2^{31}$ (–2,147,483,648) to $2^{31}-1$ (2,147,483,647)
• Note: one negative number with no positive version
• Book lists some other options:
• All of which are worse except in very limited circumstances
• Every computer uses two’s complement integers today
• Most-significant bit (leftmost) is the sign bit, since 0 means positive (including 0), 1 means negative
• Sign bit has negative weight
• Bit 31 is most significant, bit 0 is least significant
7
Two’s-Complement Integers
Binary (32 bits, leftmost bit is the sign bit)   Decimal (two’s complement)
0000 0000 0000 0000 0000 0000 0000 0000          0
0000 0000 0000 0000 0000 0000 0000 0001          1
0000 0000 0000 0000 0000 0000 0000 0010          2
...                                              ...
0111 1111 1111 1111 1111 1111 1111 1101          2,147,483,645
0111 1111 1111 1111 1111 1111 1111 1110          2,147,483,646
0111 1111 1111 1111 1111 1111 1111 1111          2,147,483,647
1000 0000 0000 0000 0000 0000 0000 0000          –2,147,483,648
1000 0000 0000 0000 0000 0000 0000 0001          –2,147,483,647
1000 0000 0000 0000 0000 0000 0000 0010          –2,147,483,646
...                                              ...
1111 1111 1111 1111 1111 1111 1111 1101          –3
1111 1111 1111 1111 1111 1111 1111 1110          –2
1111 1111 1111 1111 1111 1111 1111 1111          –1
Ways to Make Two’s Complement
• In two’s complement the sign bit has negative weight:
• So the value of an N-bit word $[b_{N-1}\ b_{N-2}\ \ldots\ b_1\ b_0]$ is:
$-2^{N-1} \times b_{N-1} + 2^{N-2} \times b_{N-2} + \ldots + 2^{1} \times b_1 + 2^{0} \times b_0$
• For a 4-bit number, $3_{10} = 0011_2$, its two’s complement $-3_{10} = 1101_2$ ($-1000_2 + 0101_2 = -8_{10} + 5_{10}$)
• Here is an easier way:
– Invert all bits and add 1 (works both ways!)
– Computer circuits do it like this, too

$3_{10}$            $0011_2$
Bitwise invert      $1100_2$
Add 1             + $1_2$
$-3_{10}$           $1101_2$


9
Two’s complement representation makes signed
addition simple: addition examples
Carry    0010
   3     0011
 + 2     0010
   5     0101

10
Two’s-Complement Addition Examples
• Assume for simplicity 4 bit width, -8 to +7 representable

    3    0011         3    0011        -3    1101
  + 2    0010     + (-2)   1110    + (-2)    1110
    5    0101         1  1 0001        -5  1 1011

Carry into MSB = Carry out of MSB (no overflow)

    7    0111        -8    1000
  + 1    0001     + (-1)   1111
   -8    1000        +7  1 0111
Overflow!         Overflow!

Carry into MSB != Carry out of MSB

• Overflow when magnitude of result too big to fit into result representation
11
Computer Science 61C Fall 2021 Wawrzynek and Weaver

Suppose we had a 5-bit word.


What integers can be represented
in two’s complement?
☐ -32 to +31

☐ 0 to +31

☐ -16 to +15

☐ -15 to +16
In digital systems everything stored, communicated,
and manipulated is done using bits...
• A bit can represent one of two possible things: 0 or 1
• But what those things are is up to how you want to interpret them: the default is just the number 0 or the number 1
• But it can also be "False" or "True", or depending on context say, "green" or "purple"
• Likewise, a collection of N bits can represent one of $2^N$ possible things
• So far we have talked about representing integers, but could represent pixel values, sound samples, floating-point numbers, …

14
How collections of bits are treated is dependent on
the context (PL types help define the context)
• Say we have a collection of 32 bits...
• We can treat it as a single unsigned number
• 0 to $2^{32}-1$
• Or a single signed number in two's complement
• $-2^{31}$ to $2^{31}-1$
• Or even as a 16 bit unsigned number, followed by a 10 bit signed number, followed by 6 true/false bits
• So a number from 0 to $2^{16}-1$, followed by a number from $-2^{9}$ to $2^{9}-1$, followed by 6 true/false bits
• In the end, taken together, it's still representing a single instance out of $2^{32}$ distinct things
15
Summary: Number Representations
Computer Science 61C Fall 2021 Wawrzynek and Weaver

• Everything in a computer is a number, in fact only 0 and 1.


• Integers are interpreted by adhering to fixed length
• Signed numbers are represented with Two’s complement
• Overflows can be detected utilizing the carry bit
• We will get into some more representations later when we
talk about floating point
• “Sign Magnitude & Biased” representations are used in floating point
• Not going to talk about 1’s complement

16
Agenda
Computer Science 61C Wawrzynek and Weaver

• Computer Organization
• Compile vs. Interpret
• C vs Java
• Arrays and Pointers (perhaps)

17
ENIAC (U.Penn., 1946)
First Electronic General-Purpose Computer
• Blazingly fast (multiply in 2.8ms!)
• 10 decimal digits x 10 decimal digits
• But needed 2-3 days to set up new program, as programmed with patch cords and switches
• At that time & before, "computer" mostly referred to people who did calculations
18
EDSAC (Cambridge, 1949)
First General Stored-Program Computer
• 35-bit binary 2’s complement words
• Programs held as numbers in memory
• This is the revolution: It isn't just programmable, but the program is just the same type of data that the computer computes on: Bits are not just the numbers being manipulated, but the instructions on how to manipulate the numbers!

19
Components of a Computer
[Figure: block diagram of a computer. The Processor (CPU/core) contains Control and a Datapath with the PC, Registers, and Arithmetic & Logic Unit (ALU). Memory holds Program and Data as Bytes. The Processor-Memory Interface carries Enable?, Read/Write, Address, Write Data, and Read Data; “Read Data” is both instructions and data. Input and Output connect through the I/O-Memory Interfaces.]


20
Great Idea: Levels of Representation/Interpretation
High Level Language Program (e.g., C)    ← We are here!
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  ↓ Compiler
Assembly Language Program (e.g., RISC-V)
    lw t0, 0(t2)
    lw t1, 4(t2)
    sw t1, 0(t2)
    sw t0, 4(t2)
  ↓ Assembler
Machine Language Program (RISC-V)
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  ↓ Machine Interpretation
Hardware Architecture Description (e.g., block diagrams)
  ↓ Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)

Anything can be represented as a number, i.e., data or instructions
21
Introduction to C
“The Universal Assembly Language”
• Class pre-req included classes teaching Java
• “Some” experience is required before CS61C
• C++ or Java OK
• Python used in two labs
• C used for everything else "high" level
• Almost all low level assembly is RISC-V
• But Project 4 may require touching some x86…

“K&R”
Intro to C
• C is not a “very high-level” language, nor a “big” one, and is not specialized to any particular area of application. But its absence of restrictions and its generality make it more convenient and effective for many tasks than supposedly more powerful languages.
• Kernighan and Ritchie
• Enabled first operating system not written in assembly language: UNIX - A portable OS!

23
Intro to C
Computer Science 61C Wawrzynek and Weaver

• Why C?: we can write programs that allow us to exploit underlying features of
the architecture (memory management, special instructions) and do it in a
portable way (C compilers universally available for all existing processor
architectures)
• C and derivatives (C++/Obj-C/C#) still one of the most popular application
programming languages after >40 years!
• Its popularity is mainly because of momentum:
• Most Operating Systems (Linux, Windows) written in C (or C++),
• as are world’s most popular databases, including Oracle Database, MySQL, MS SQL Server, and PostgreSQL.
• However, if you are starting a new project where performance matters consider either Go or Rust
• Rust, “C-but-safe”: By the time your C is (theoretically) correct with all the necessary checks it should be no
faster than Rust.
• Go, “Concurrency”: Actually able to do practical concurrent programming to take advantage of modern multi-
core microprocessors.
24
Recommendations on using C
Computer Science 61C Wawrzynek and Weaver

• Use C/C++/Objective C if...


1. You are starting from an existing C code base
2. Or, you are targeting a very small computer
• E.G. Adafruit "trinket": 16 MHz processor,
8 kB of Flash, 512 B of SRAM, 512 B of EEPROM
• KL-02: 2mm x 2mm containing a 32b ARM at 48 MHz,
32 kB FLASH, 4 KB of SRAM
3. Or, you are learning how things really work
• This class, CS162, etc...

• Otherwise, don't...
• If you can tolerate GC pauses, go (aka golang) is really nice
• Or C#, Java, Scala, Swift, etc...
• If you can't, there is rust...
25
Disclaimer
Computer Science 61C Wawrzynek and Weaver

• You will not learn how to fully code in C in these lectures! You’ll still need
your C reference for this course
• K&R is a must-have
• Useful Reference: “Java in a Nutshell,” O’Reilly
• Chapter 2, “How Java Differs from C”
• http://oreilly.com/catalog/javanut/excerpt/index.html
• Brian Harvey’s helpful transition notes
• On CS61C class website: pages 3-19
• http://inst.eecs.berkeley.edu/~cs61c/resources/HarveyNotesC1-3.pdf

• Key C concepts: Pointers, Arrays, Implications for Memory management


• Key security concept: All of the above are unsafe: If your program contains an error in these
areas it might not crash immediately but instead leave the program in an inconsistent (and
often exploitable) state
26
Agenda
Computer Science 61C Wawrzynek and Weaver

• Computer Organization
• Compile vs. Interpret
• C vs Java

27
Compilation: Overview
• C compilers map C programs directly into architecture-specific machine code (string of 1s and 0s)
• The processor directly executes the machine code (the job of the hardware)
• Unlike Java, which converts to architecture-independent “bytecode” that is interpreted by a “virtual machine” and/or converted by a just-in-time compiler (JIT) to machine code.
• Unlike Python environments, which convert to bytecode at runtime
• These differ mainly in exactly when your program is converted to low-level machine instructions (“levels of interpretation”)
• For C, generally a two part process of compiling source files (.c) to object files (.o), then linking the .o files into executables
28
C Compilation Simplified Overview
(more later in course)
foo.c, bar.c          C source files (text)
  ↓ Compiler          (compiler/assembler combined here)
foo.o, bar.o          Machine code object files
  + lib.o             Pre-built object file libraries
  ↓ Linker
a.out                 Machine code executable file


29
Compilation: Advantages
• Excellent run-time performance: generally much faster than Scheme or Java for comparable code (because it optimizes for a given architecture)
• But these days, a lot of performance is in libraries: Plenty of people do scientific computation in Python, because they have optimized libraries
• Reasonable compilation time: enhancements in compilation procedure (Makefiles) allow only modified files to be recompiled
30
Compilation: Disadvantages
• Compiled files, including the executable, are architecture-specific, depending on processor type (e.g., MIPS vs. x86 vs. RISC-V) and the operating system (e.g., Windows vs. Linux vs. macOS)
• And even library versions under Linux. Linux is so bad we came up with "containers", that effectively ship around whole miniature OS images just to run single programs
• Executable must be rebuilt on each new system
• I.e., “porting your code” to a new architecture
• “Change → Compile → Run [repeat]” iteration cycle can be slow during development
• but make only rebuilds changed pieces, and can do compiles in parallel on multiple cores (make -j X)
• linker is sequential though → Amdahl’s Law
31
C Pre-Processor (CPP)
foo.c → CPP → foo.i → Compiler

• C source files first pass through “macro preprocessor”, CPP, before compiler sees code
• CPP commands begin with “#”
#include "file.h" /* Inserts file.h into output */
#include <stdio.h> /* Looks for file in standard location */
#define M_PI (3.14159) /* Define constant */
#if/#endif /* Conditional inclusion of text */
• CPP replaces comments with a single space
• Use -save-temps option to gcc to see result of preprocessing
• Full documentation at: http://gcc.gnu.org/onlinedocs/cpp/

32
CPP Macros:
A Warning...
• You often see C preprocessor macros defined to create small "functions"
• But they aren't actual functions, instead it just changes the text of the program
• In fact, all #include does is copy that file into the current file and replace arguments
• Example:
#define twox(x) (x + x)
twox(3); ⇒ (3 + 3);
• Could lead to interesting errors with macros
twox(y++); ⇒ (y++ + y++);

33
C vs. Java
                   C                                  Java
Type of Language   Function Oriented                  Object Oriented
Programming Unit   Function                           Class = Abstract Data Type
Compilation        gcc hello.c creates machine        javac Hello.java creates Java virtual
                   language code                      machine language bytecode
Execution          a.out loads and executes program   java Hello interprets bytecodes
hello, world       #include <stdio.h>                 public class Hello {
                   int main(void) {                     public static void main(String[] args) {
                     printf("Hello\n");                   System.out.println("Hello");
                     return 0;                          }
                   }                                  }
Storage            Manual (malloc, free)              New allocates & initializes,
                                                      Automatic (garbage collection) frees

From http://www.cs.princeton.edu/introcs/faq/c2java.html
C vs. Java
                              C                         Java
Comments                      /* … */                   /* … */ or // … end of line
Constants                     #define, const            final
Preprocessor                  Yes                       No
Variable declaration          At beginning of a block   Before you use it
Variable naming conventions   sum_of_squares            sumOfSquares
Accessing a library           #include <stdio.h>        import java.io.File;

From http://www.cs.princeton.edu/introcs/faq/c2java.html
Typed Variables in C
int variable1 = 2;
float variable2 = 1.618;
char variable3 = 'A';

• Must declare the type of data a variable will hold
– Types can't change

Type           Description                                           Example
int            Integer numbers (including negatives);                0, 78, -217, 0x7337
               at least 16 bits, can be larger
unsigned int   Unsigned integers                                     0, 6, 35102
float          Floating point decimal                                0.0, 3.14159, 6.02e23
double         Equal or higher precision floating point              0.0, 3.14159, 6.02e23
char           Single character                                      'a', 'D', '\n'
long           Longer int; size >= sizeof(int), at least 32b         0, 78, -217, 301720971
long long      Even longer int; size >= sizeof(long), at least 64b   31705192721092512

36
Integers: Python vs. Java vs. C
• C: int should be integer type that target processor works with most efficiently
• Only guarantee: sizeof(long long) ≥ sizeof(long) ≥ sizeof(int) ≥ sizeof(short)
• Also, short >= 16 bits, long >= 32 bits
• All could be 64 bits

Language   sizeof(int)
Python     >=32 bits (plain ints)
Java       32 bits
C          Depends on computer; 16 or 32 or 64
Consts and Enums in C
• Constant is assigned a typed value once in the declaration; value can't change during entire execution of program
const float golden_ratio = 1.618;
const int days_in_week = 7;
const double the_law = 2.99792458e8;
• You can have a constant version of any of the standard C
variable types
• Enums: a group of related integer constants. Ex:
enum cardsuit {CLUBS,DIAMONDS,HEARTS,SPADES};
enum color {RED, GREEN, BLUE};
38
Typed Functions in C
int number_of_people ()
{
    return 3;
}

float dollars_and_cents ()
{
    return 10.33;
}

int sum ( int x, int y)
{
    return x + y;
}

• You have to declare the type of data you plan to return from a function
• Return type can be any C variable type, and is placed to the left of the function name
• You can also specify the return type as void
– Just think of this as saying that no value will be returned
• Also necessary to declare types for values passed into a function
• As with variables, functions MUST be declared before they are used
39
Structs in C
Computer Science 61C Wawrzynek and Weaver

• Structs are structured groups of variables, e.g.,

typedef struct {
int length_in_seconds;
int year_recorded;
} Song;

/* Dot notation: x.y = value */

Song song1;
song1.length_in_seconds = 213;
song1.year_recorded = 1994;

Song song2;
song2.length_in_seconds = 248;
song2.year_recorded = 1988;

40
A First C Program: Hello World
Original C:

main()
{
    printf("\nHello World\n");
}

ANSI Standard C:

#include <stdio.h>

int main(void)
{
    printf("\nHello World\n");
    return 0;
}

41
C Syntax: main
Computer Science 61C Wawrzynek and Weaver

• When C program starts


• C executable a.out is loaded into memory by operating system
(OS)
• OS sets up stack, then calls into C runtime library,
• Runtime first initializes memory and other libraries,
• then calls your procedure named main()
• We’ll see how to retrieve command-line arguments in
main() later…

42
A Second C Program:
Compute Table of Sines
#include <stdio.h>
#include <math.h>

int main(void)
{
    int angle_degree;
    double angle_radian, pi, value;
    /* Print a header */
    printf("\nCompute a table of the sine function\n\n");
    /* obtain pi once for all */
    /* or just use pi = M_PI, where */
    /* M_PI is defined in math.h */
    pi = 4.0*atan(1.0);
    printf("Value of PI = %f \n\n", pi);
    printf("angle Sine \n");
    angle_degree = 0; /* initial angle value */
    /* scan over angle */
    while (angle_degree <= 360) /* loop until angle_degree > 360 */
    {
        angle_radian = pi*angle_degree/180.0;
        value = sin(angle_radian);
        printf (" %3d %f \n ", angle_degree, value);
        angle_degree += 10; /* increment the loop index */
    }
    return 0;
}

43
Second C Program
Sample Output
Computer Science 61C Wawrzynek and Weaver

Compute a table of the sine function

Value of PI = 3.141593

angle Sine
0 0.000000
10 0.173648
20 0.342020
30 0.500000
40 0.642788
50 0.766044
60 0.866025
70 0.939693
80 0.984808
90 1.000000
100 0.984808
110 0.939693
120 0.866025
130 0.766044
140 0.642788
....
44
C Syntax: Variable Declarations
Computer Science 61C Wawrzynek and Weaver

• Similar to Java, but with a few minor but important


differences
• All variable declarations must appear before they are used
• All must be at the beginning of a block.
• A variable may be initialized in its declaration;
if not, it holds garbage! (the contents are undefined)
• Examples of declarations:
• Correct: { int a = 0, b = 10; ...
• Incorrect: for (int i = 0; i < 10; i++) { ...

Newer C standards (C99 and later) are more flexible about this: declarations may follow statements and may appear in the for initializer


An Important Note:
Undefined Behavior…
Computer Science 61C Wawrzynek and Weaver

• A lot of C has “Undefined Behavior”


• This means it is often unpredictable behavior
• It will run one way on one compiler and computer…
• But some other way on another
• Or even just be different each time the program is executed!

• Often contributes to “heisenbugs”


• Bugs that seem random/hard to reproduce
• (In contrast to “bohrbugs” which are deterministic)

46
C Syntax : Control Flow (1/2)
• Within a function, remarkably close to Java constructs (shows Java’s legacy) in terms of control flow
• A statement can be a {} of code or just a standalone statement

• if-else
• if (expression) statement
• if (x == 0) y++;
• if (x == 0) {y++;}
• if (x == 0) {y++; j = j + y;}
• if (expression) statement1 else statement2
• There is an ambiguity in a series of if/else if/else if you don't use {}s, so use {}s to block the code
• In fact, it is a bad C habit to not always have the statement in {}s, it has resulted in some amusing errors...

• while
• while (expression) statement
• do statement while (expression);
47
C Syntax : Control Flow (2/2)
Computer Science 61C Wawrzynek and Weaver

• for
• for (initialize; check; update) statement

• switch
• switch (expression){
case const1: statements
case const2: statements
default: statements
}
• break; /* need to break out of case */
• Note: until you do a break statement things keep executing in the switch statement
• C also has goto
• But it can result in spectacularly bad code if you use it, so don’t! Makes your code hard to
understand, debug, and modify.
48
C Syntax: True or False
Computer Science 61C Wawrzynek and Weaver

• What evaluates to FALSE in C?


• 0 (integer)
• NULL (a special kind of pointer that is also 0: more on this later)
• No explicit Boolean type in old-school C
• Often you see #define bool (int)
• Then #define false 0
• Basically anything where all the bits are 0 is false
• What evaluates to TRUE in C?
• Anything that isn’t false is true
• Same idea as in Python: only 0s or empty sequences are false,
anything else is true!
49
C and Java operators nearly identical
• arithmetic: +, -, *, /, %
• assignment: =
• augmented assignment: +=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>=
• bitwise logic: ~, &, |, ^
• bitwise shifts: <<, >>
• boolean logic: !, &&, ||
• equality testing: ==, !=
• subexpression grouping: ()
• order relations: <, <=, >, >=
• increment and decrement: ++ and --
• member selection: ., ->
• This is slightly different than Java because there are both structures and pointers to structures, more later
• conditional evaluation: ? :

50
Our Tip of the Day… Valgrind

• Valgrind turns most unsafe "heisenbugs" into "bohrbugs"


• It adds almost all the checks that Java does but C does not
• The result is your program immediately crashes where you make a mistake
• It is installed on the lab machines
• Nick's scars from his 60C experience:
• First C project, spent an entire day tracing down a fault...
• That turned out to be a <= instead of a < in initializing an array!

51
Agenda
Computer Science 61C Wawrzynek and Weaver

• Pointers
• Arrays in C

52
Remember What We Said Earlier About
Buckets of Bits?
• C's memory model is that conceptually there is simply one huge bucket of bits
• Arranged in bytes
• Each byte has an address
• Starting at 0 and going up to the maximum value (0xFFFFFFFF on a 32b architecture)
• 32b architecture means the # of bits in the address
• We commonly think in terms of "words"
• Least significant bits of the address are the offset within the word
• Word size is 32b for a 32b architecture, 64b for a 64b architecture: A word is big enough to hold an address

Address      Contents
0xFFFFFFFC   xxxx xxxx xxxx xxxx
0xFFFFFFF8   xxxx xxxx xxxx xxxx
0xFFFFFFF4   xxxx xxxx xxxx xxxx
0xFFFFFFF0   xxxx xxxx xxxx xxxx
0xFFFFFFEC   xxxx xxxx xxxx xxxx
...          ... ... ... ...
0x14         xxxx xxxx xxxx xxxx
0x10         xxxx xxxx xxxx xxxx
0x0C         xxxx xxxx xxxx xxxx
0x08         xxxx xxxx xxxx xxxx
0x04         xxxx xxxx xxxx xxxx
0x00         xxxx xxxx xxxx xxxx

53
Address vs. Value
Computer Science 61C Wawrzynek and Weaver

• Consider memory to be a single huge array


• Each cell of the array has an address associated with it
• Each cell also stores some value
• For addresses do we use signed or unsigned numbers? Negative
address?!
• Don’t confuse the address referring to a memory location
with the value stored there
101 102 103 104 105 ...
... 23 42 ...

54
Pointers
• An address refers to a particular memory location; i.e., it points to a memory location
• Pointer: A variable that contains the address of a variable

[Figure: memory cells at locations (addresses) 101, 102, 103, 104, 105, ...; the variable named x holds 23, the variable named y holds 42, and the pointer named p holds the address 104.]

55
Pointer Syntax
Computer Science 61C Wawrzynek and Weaver

• int *p;
• Tells compiler that variable p is address of an int
• p = &y;
• Tells compiler to assign address of y to p
• & called the “address operator” in this context
• z = *p;
• Tells compiler to assign value at address in p to z
• * called the “dereference operator” in this context

56
Creating and Using Pointers
• How to create a pointer:
& operator: get address of a variable
int *p, x;      p: ?        x: ?
x = 3;          p: ?        x: 3
p = &x;         p: → x      x: 3
• How to get a value pointed to?
“*” (dereference operator): get the value that the pointer points to
printf("p points to %d\n",*p);
• Note the “*” gets used two different ways in this example. In the declaration to indicate that p is going to be a pointer, and in the printf to get the value pointed to by p.

57
Using Pointer for Writes
Computer Science 61C Wawrzynek and Weaver

• How to change a variable pointed to?


• Use the dereference operator * on left of assignment operator =

p x 3

*p = 5; p x 5

58
Pointers and Parameter Passing
Computer Science 61C Wawrzynek and Weaver

• Java and C pass basic parameters “by value”:


Procedure/function/method gets a copy of the parameter, so
changing the copy cannot change the original
void add_one (int x)
{
x = x + 1;
}
int y = 3;
add_one(y);

y remains equal to 3

59
Pointers and Parameter Passing
Computer Science 61C Wawrzynek and Weaver

• How can we get a function to change the value held in a variable?

void add_one (int *p)


{
*p = *p + 1;
}
int y = 3;

add_one(&y);

y is now equal to 4

60
Types of Pointers
Computer Science 61C Wawrzynek and Weaver

• Pointers are used to point to any kind of data (int, char, a struct,
etc.)
• Normally a pointer only points to one type (int, char, a struct, etc.).
• void * is a type that can point to anything (generic pointer)
• Use void * sparingly to help avoid program bugs, and security issues, and other bad things!

• You can even have pointers to functions…


• int (*fn) (void *, void *) = &foo;
• fn is a pointer to a function that accepts two void * pointers and returns an int, and is initially pointing to the function foo.
• (*fn)(x, y) will then call the function

61
More C Pointer Dangers
Computer Science 61C Wawrzynek and Weaver

• Declaring a pointer just allocates space to hold the pointer –


it does not allocate the thing being pointed to!
• Local variables in C are not initialized, they may contain
anything (aka “garbage”)
• What does the following code do?
void f()
{
int *ptr;
*ptr = 5;
}
62
Pointers and Structures
typedef struct {
    int x;
    int y;
} Point;

Point p1;
Point p2;
Point *paddr;

/* dot notation */
int h = p1.x;
p2.y = p1.y;

/* arrow notation */
int h = paddr->x;
int h = (*paddr).x;

/* This works too */
p1 = p2;

63
Pointers in C
Computer Science 61C Wawrzynek and Weaver

• Why use pointers?


• If we want to pass a large struct or array, it’s easier / faster / etc. to pass a pointer
than the whole thing
• Otherwise we’d need to copy a huge amount of data
• You notice in Java that more complex objects are passed by reference....
Under the hood this is a pointer
• In general, pointers allow cleaner, more compact code
• So what are the drawbacks?
• Pointers are probably the single largest source of bugs in C, so be careful anytime
you deal with them
• Most problematic with dynamic memory management—coming up next time
• Dangling references and memory leaks
64
Why Pointers in C?
• At time C was invented (early 1970s), compilers often didn’t produce efficient code
• Computers 100,000x faster today, compilers better
• C designed to let programmer say what they want code to do without
compiler getting in way
• Even give compilers hints which registers to use!
• Today’s compilers produce much better code, so may not need to use raw
pointers in application code
• Most other languages use “pass by reference” for objects, which is semantically similar but with
checks for misuse
• Low-level system code still needs low-level access via pointers
• And compilers basically convert "pass by reference" into pointer-based code