ITW4
Workshop
LECTURE 4
Characters and Strings
Text in String and Character Arrays
There are two ways to represent text in MATLAB. You can store text in string arrays or you can
store text in character arrays. A typical use for character arrays is to store pieces of text as
character vectors. MATLAB displays strings with double quotes and character vectors with
single quotes.
str =
"Hello, world"
Though the text "Hello, world" is 12 characters long, str itself is a 1-by-1 string, or string scalar.
String examples…
To find the number of characters in a string, use the strlength function.
n = strlength(str)
n = 12
If the text includes double quotes, use two double quotes within the definition.
str =
fahrenheit = 71;
celsius = (fahrenheit-32)/1.8;
tempText =
"temperature is 21.6667C"
tempText2 =
N = 2×3
7 6 6
6 8 3
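The three string operations on this slide can be sketched as follows. This is a minimal sketch: the quoted sentence, the use of the + operator to build the temperature text, and the 2-by-3 array of spacecraft names (with "ISS" as the last entry) are assumptions chosen to be consistent with the outputs shown above. The names quoted and craft are illustrative.

```matlab
% Two double quotes inside the definition produce one double quote in the text.
quoted = "They said, ""Welcome!"" and waved.";

% The + operator appends strings; a number is converted to text first.
fahrenheit = 71;
celsius = (fahrenheit-32)/1.8;
tempText = "temperature is " + celsius + "C";

% strlength works element by element on a string array (assumed example array).
craft = ["Mercury", "Gemini", "Apollo"; "Skylab", "Skylab B", "ISS"];
N = strlength(craft);
```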
Represent Text with Character Vectors
To store a 1-by-n sequence of characters as a character vector, using the char data type, enclose it in single
quotes.
chr =
'Hello, world'
The text 'Hello, world' is 12 characters long, and chr stores it as a 1-by-12 character vector.
whos chr
• Use character vectors to represent data that is encoded using characters. In such cases, you might need easy access to
individual characters.
seq = 'GCTAGAATCC';
Character vectors examples…
Access individual characters or subsets of characters by indexing, just as you would index into a numeric
array.
seq(4:6)
ans =
'AGA'
Concatenate character vectors with square brackets, just as you concatenate other types of arrays.
seq2 =
'GCTAGAATCCATTAGAAACC'
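Indexing and concatenation of character vectors can be sketched together as below. The appended piece 'ATTAGAAACC' is an assumption inferred from the seq2 output above, and codon and lastBase are illustrative names.

```matlab
seq = 'GCTAGAATCC';

% Parentheses index into the 1-by-10 char array, exactly as for numeric arrays.
codon = seq(4:6);            % 'AGA'
lastBase = seq(end);         % 'C'

% Square brackets join character vectors end to end (assumed second piece).
seq2 = [seq 'ATTAGAAACC'];   % 'GCTAGAATCCATTAGAAACC'
```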
Create String Arrays (from Variables)…
String arrays store pieces of text and provide a set of functions for working with text as data.
chr =
'Greetings, friend'
str = string(chr)
str =
"Greetings, friend"
chr is a 1-by-17 character vector. str is a 1-by-1 string that has the same text as the character vector.
Create String Arrays (from Variables)
Convert a numeric array to a string array.
X = [5 10 20 3.1416];
string(X)
Convert a datetime value to a string.
d = datetime('now');
string(d)
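A short sketch of what these conversions return, using the same values as above. The names strX and backToChar are illustrative, and the round trip through char is added only to show that the text is preserved.

```matlab
chr = 'Greetings, friend';
str = string(chr);       % 1-by-1 string holding the same 17 characters

X = [5 10 20 3.1416];
strX = string(X);        % 1-by-4 string array, one element per number

% char converts a string scalar back to a character vector.
backToChar = char(str);
```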
chr = ''
chr =
0x0 empty char array
Create Missing Strings
The missing string is the string equivalent to NaN for numeric arrays.
str = string(missing)
str =
<missing>
str(1) = "";
str(2) = "Gemini";
str(3) = string(missing)
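The difference between an empty string and a missing string can be checked as in the sketch below. It rebuilds the same three-element array; isEmptyText and isMissing are illustrative names.

```matlab
str = "";                      % str(1) is an empty string
str(2) = "Gemini";
str(3) = string(missing);

% A missing string is not equal to "", so only the first element matches.
isEmptyText = (str == "");     % [1 0 0]

% ismissing finds the missing element, much as isnan does for numbers.
isMissing = ismissing(str);    % [0 0 1]
```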
str(2,2)
ans =
"Skylab B"
Access Characters Within Strings
str = ["Mercury", "Gemini", "Apollo";
"Skylab", "Skylab B", "International Space Station"];
chr = str{2,2}
chr =
'Skylab B'
str{2,2}(1:3)
ans =
'Sky'
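The contrast between parentheses and curly braces can be sketched as follows, assuming the 2-by-3 array of spacecraft names that appears in the comparison example later in this lecture. The names elem and firstThree are illustrative.

```matlab
str = ["Mercury", "Gemini", "Apollo"; ...
       "Skylab", "Skylab B", "International Space Station"];

elem = str(2,2);              % parentheses return a 1-by-1 string: "Skylab B"
chr = str{2,2};               % curly braces return a char vector: 'Skylab B'
firstThree = str{2,2}(1:3);   % index into that char vector: 'Sky'
```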
Concatenate Strings into String Array
str1 = ["Mercury", "Gemini", "Apollo"];
names = split(names)
"Mary" "Jones"
"John" "Adams"
"Elizabeth" "Young"
"Paul" "Burns"
"Ann" "Spencer"
CAUTION! Every element must contain the same number of delimiters, so that split produces the same number of pieces for each element.
Split, Join, and Sort String Array…
Join the last and first names.
names = join(names)
"Jones, Mary"
"Adams, John"
"Young, Elizabeth"
"Burns, Paul"
"Spencer, Ann"
Split, Join, and Sort String Array
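The whole split, join, and sort sequence can be sketched end to end as below. The initial names array and the step that moves the last name first and appends a comma are assumptions reconstructed from the outputs shown on these slides. The names parts, joined, and sorted are illustrative.

```matlab
% Assumed starting data: one "First Last" name per row.
names = ["Mary Jones"; "John Adams"; "Elizabeth Young"; "Paul Burns"; "Ann Spencer"];

% split breaks each element at whitespace: 5-by-1 becomes 5-by-2.
parts = split(names);

% Put the last name first and append a comma to it.
parts = [parts(:,2) + ",", parts(:,1)];

% join places a space between the strings in each row: back to 5-by-1.
joined = join(parts);

% sort orders the elements alphabetically.
sorted = sort(joined);
```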
Sort the elements of names so that they are in alphabetical order.
names = sort(names)
"Adams, John"
"Burns, Paul"
"Jones, Mary"
"Spencer, Ann"
"Young, Elizabeth"
Compare String Arrays for Equality
Compare string arrays and character vectors with relational operators and with the strcmp function.
str1 = "Hello";
0
str1 = ["Mercury", "Gemini", "Apollo";...
"Skylab", "Skylab B", "International Space Station"];
0 0 1
0 0 0
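The equality comparisons above can be sketched as follows. The comparison values "World" and "Apollo" do not appear in the original and are assumptions chosen to reproduce the logical outputs shown. The names sameScalar, craft, and sameText are illustrative.

```matlab
str1 = "Hello";
str2 = "World";                       % assumed comparison value
sameScalar = (str1 == str2);          % false

craft = ["Mercury", "Gemini", "Apollo"; ...
         "Skylab", "Skylab B", "International Space Station"];
TF = (craft == "Apollo");             % 2-by-3 logical, true only at (1,3)

% strcmp accepts strings and character vectors alike.
sameText = strcmp("Hello", 'Hello');  % true
```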
Compare String Arrays with Other Relational Operators
Strings that start with uppercase letters come before strings that start with lowercase letters. Digits and some punctuation
marks also come before letters.
ans = logical
1
str = ["Sanchez", "Jones", "de Ponte", "Crosby", "Nash"];
1 0 1 0 1
str(TF)
ans = 1x3 string
"Sanchez" "de Ponte" "Nash"
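The ordering rules can be sketched as below. The threshold "Matthews" and the first two comparisons are assumptions, picked so that the results agree with the outputs shown above. The names upperFirst, digitFirst, and after are illustrative.

```matlab
% Uppercase letters sort before lowercase letters; digits sort before letters.
upperFirst = ("ABC" < "abc");     % true
digitFirst = ("9" < "A");         % true

str = ["Sanchez", "Jones", "de Ponte", "Crosby", "Nash"];

% Assumed threshold: keep the names that sort after "Matthews".
TF = (str > "Matthews");          % [1 0 1 0 1]
after = str(TF);                  % "Sanchez" "de Ponte" "Nash"
```

Note that "de Ponte" is selected because its lowercase first letter sorts after every uppercase letter.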
Search for Text
To determine if text is present, use a function that returns logical values, like contains, startsWith, or endsWith.
TF = contains(txt,"sea")
TF = logical
1
To find how many times the text occurs, use the count function.
n = 2
To locate where the text occurs, use the strfind function, which returns starting indices.
idx = strfind(txt,"sea")
idx = 1×2
11 28
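The search functions above can be tried together as in this sketch. The value of txt is not shown in the original; the sentence used here is an assumption that reproduces the outputs n = 2 and idx = [11 28]. The names startsWithShe and endsWithShore are illustrative.

```matlab
txt = "she sells seashells by the seashore";   % assumed input text

TF = contains(txt, "sea");        % true: the text occurs somewhere
n = count(txt, "sea");            % 2 occurrences
idx = strfind(txt, "sea");        % starting indices [11 28]

startsWithShe = startsWith(txt, "she");     % true
endsWithShore = endsWith(txt, "shore");     % true
```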
Building Simple Patterns
Patterns are a tool to aid in searching for and modifying text. Similar to regular expressions, a pattern
defines rules for matching text. Patterns can be used with text-searching functions like contains,
matches, and extract to specify which portions of text these functions act on.
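As a small sketch of how a pattern plugs into those functions, the example below uses digitsPattern and lettersPattern, two built-in patterns that match runs of digits and runs of letters, and chains them with the + operator. The variable names are illustrative, and pattern functions are assumed to be available (MATLAB R2020b or later).

```matlab
txt = "abc123def";

% Extract every run of digits from the text.
digits = extract(txt, digitsPattern);     % "123"

% Chain patterns with +: letters immediately followed by digits.
pat = lettersPattern + digitsPattern;
hasBoth = contains(txt, pat);             % true
firstMatch = extract(txt, pat);           % "abc123"
```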
For example, lettersPattern matches any letter characters.
txt = "abc123def";
pat = lettersPattern;
extract(txt,pat)
"abc"
"def"
Character Codes
Unicode, formally The Unicode Standard, is an information technology standard for the
consistent encoding, representation, and handling of text expressed in most of the world's writing
systems.
u = [77 65 84 76 65 66];
c = char(u)
c =
'MATLAB'
Unicode and ASCII Values
MATLAB® stores all characters as Unicode® characters using the UTF-16 encoding, where every character is
represented by a numeric code value. (Unicode incorporates the ASCII character set as the first 128 symbols, so
ASCII characters have the same numeric codes in Unicode and ASCII.)
C = 'MATLAB'
C =
'MATLAB'
unicodeValues = double(C)
unicodeValues = 1×6
77 65 84 76 65 66
You cannot convert characters in a string array directly to Unicode code values.
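A minimal sketch of why this is so, and of the usual workaround of converting to a character vector first. The names asNumber, codes, and roundTrip are illustrative.

```matlab
str = "MATLAB";

% double tries to read the string as a number, so plain text gives NaN.
asNumber = double(str);

% Convert to a character vector first to get the code values.
codes = double(char(str));          % [77 65 84 76 65 66]

% char turns the codes back into text, and string wraps it again.
roundTrip = string(char(codes));
```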