All 37 Functions
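1. SUBSTR
The SUBSTR function extracts a substring from a character variable. If the new variable has no declared
length, it gets the length of the source string. SUBSTR takes three arguments: the string, the start position
and the length (optional; when omitted, the rest of the string is returned).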
data a;
set sasuser.admit;
a= substr(actlevel,2,2);
run;
data a;
set sasuser.admit;
if substr(actlevel,1,2)='HI' then do;
substr(actlevel,1,2)='HO';
end;
run;
3. Substr counts blanks too
* The substr function counts blanks as characters, so extracting two
characters starting at a blank returns the blank plus only one visible
character;
data a;
x='(91) 9999265789';
phone=substr(x,5,2);
run;
The SAS System 14:52 Wednesday, January 7, 2009
Obs x phone
1 (91) 9999265789 9
data a;
x='(91) 9999265789';
phone=substr(x,6);
run;
data a;
set sasuser.admit;
if actlevel =: 'HI';
run;
data a;
set sasuser.admit;
where actlevel =: 'HI';
run;
2. SCAN
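The SCAN function returns the nth word of a character string and puts it in the target variable. If the target
variable has no declared length, it is 200 characters long in the SAS release used here (later releases give it
the length of the first argument). When no delimiter is specified, blanks and several special characters act as
delimiters. A negative word number counts words from the right.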
data x;
set sasuser.admit;
lname=scan(name,2,',');
run;
data a;
x='amit ka school tha kv';
school=scan(x,5);
run;
data a;
x='amit~ka~school~tha~kv';
school=scan(x,5,'~');
run;
data a;
x='~amit~ ka~ school~ tha~* kv';
school=scan(x,1,'~');
run;
data a;
name=' Amit Kumar Singh ';
lname=scan(name,3);
run;
data master;
set x;
lname=scan(name,-1);
run;
proc print data=master;
run;
data a;
set email;
domain_name=scan(id,-1,'@');
run;
3. Compress, compbl, strip
3.a. Compress
The COMPRESS function removes every occurrence of the specified characters from a string. By default it
removes blanks, so leading, trailing and internal blanks are all removed. The new variable has the same
length as the parent string.
data amit;
x='ab c d';
y=compress(x);
run;
data amit;
x='ab, c, d';
y=compress(x,',');
run;
3. Third argument
* The third argument of compress is not another delimiter: it is an
optional string of modifiers. To remove several different characters,
list them all in the second argument;
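The two ways of removing more than one kind of character can be sketched as follows. This is a minimal illustration that reuses the phone string from the SUBSTR section; the variable names clean, digits and nodigits are made up for the example.

```sas
data _null_;
x='(91) 9999265789';
/* second argument: every character listed here is removed */
clean=compress(x,'() ');
/* third argument: modifiers, k = keep instead of remove, d = digits */
digits=compress(x,,'kd');
/* d alone removes the digits and leaves the rest */
nodigits=compress(x,,'d');
put clean=;
put digits=;
put nodigits=;
run;
```

Both clean and digits come out as 919999265789, while nodigits keeps only the brackets and the blank.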
3.b. Compbl
The COMPBL function compresses multiple blanks between words to a single blank, as the name suggests.
Multiple leading blanks are also reduced to one blank only. The output variable has the same length as the
parent variable.
data amit;
x=' a b c d ';
y=compbl(x);
run;
Obs x y
1 a b c d a b c d
3.c. Strip
The STRIP function removes leading and trailing blanks and has no effect on internal blanks. The output
variable has the same length as the parent variable.
* It just strips the leading and trailing blank(s) and has no effect on
internal blanks;
data amit;
x=' x c d ';
y=strip(x);
run;
Obs x y
1 x c d x c d
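Because a listing does not show leading or trailing blanks, the effect of these functions is easier to see when the result is wrapped in asterisks. The following is a small sketch with a made-up test string; the variable names are illustrative only.

```sas
data _null_;
length x $14;
/* three leading blanks, runs of blanks inside, three trailing blanks */
x='   a   b  c   ';
/* compbl collapses each run of blanks to a single blank */
y='*'||compbl(x)||'*';
/* strip removes leading and trailing blanks, internal blanks stay */
z='*'||strip(x)||'*';
put y=;
put z=;
run;
```

In the log, z shows the text starting and ending right next to the asterisks with the internal blanks untouched, while y shows single blanks between the letters.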
********************or*************;
data _null_;
x='ab cd fg hi';
y='hi';
z=index(x,y);
put z=;
run;
2. Index, trim and compress
* The index function can be used with trim if the excerpt has
trailing blanks, as trim removes the trailing blanks;
data _null_;
x='a cc dd';
y='dd ';
z=index(x,trim(y));
put z=;
run;
4.b. Indexc
The INDEXC function searches the source string from left to right for the first occurrence of any character
present in the excerpts and returns its position. If none of the characters is found it returns 0. A good thing
about indexc is that it can take multiple excerpt strings as input.
put y=;
run;
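A short sketch of INDEXC with one and with several excerpt strings; the test string and variable names are invented for the illustration.

```sas
data _null_;
x='room no. 42';
/* position of the first digit in x */
y=indexc(x,'0123456789');
/* several excerpts: first occurrence of x, y, z or a period */
z=indexc(x,'xyz','.');
put y= z=;
run;
```

Here y is 10 (the digit 4) and z is 8 (the period), since x contains none of the letters x, y or z.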
4.c. Indexw
The indexw searches for words in the source string. The INDEXW function searches source, from left to right,
for the first occurrence of excerpt and returns the position in source of the substring's first character. If the
substring is not found in source, then INDEXW returns a value of 0. If there are multiple occurrences of the
string, then INDEXW returns only the position of the first occurrence.
The substring pattern must begin and end on a word boundary. For INDEXW, word boundaries are delimiters,
the beginning of source, and the end of source. If you use an alternate delimiter, then INDEXW does not
recognize the end of the text as the end data.
INDEXW has the following behaviour when the second argument contains blank spaces or has a length of 0:
1. If both source and excerpt contain only blank spaces or have a length of 0, then INDEXW returns a
value of 1.
2. If excerpt contains only blank spaces or has a length of 0, and source contains character or numeric
data, then INDEXW returns a value of 0.
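The word-boundary rule is easiest to see next to INDEX. The sketch below uses a made-up sentence in which the excerpt appears first inside a longer word and later as a word of its own.

```sas
data _null_;
x='asks a question as well';
/* index matches the substring anywhere, here inside "asks" */
p1=index(x,'as');
/* indexw matches only the standalone word "as" */
p2=indexw(x,'as');
put p1= p2=;
run;
```

INDEX returns 1, while INDEXW skips "asks" and returns 17.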
5. Left
This function moves the leading blanks to the end of the string, so the text gets aligned to the LEFT. The
length of the new string is equal to that of the parent string.
6. Right
This function moves the trailing blanks to the start of the string, so the text gets aligned to the RIGHT. The
length of the new string is equal to that of the parent string.
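A minimal sketch of both functions on the same value, with asterisks added so the moved blanks are visible in the log. The variable names are illustrative.

```sas
data _null_;
length x $6;
/* stored as two leading blanks, ab, two trailing blanks */
x='  ab';
l='*'||left(x)||'*';
r='*'||right(x)||'*';
put l=;
put r=;
run;
```

The log shows l=*ab    * and r=*    ab*: in both cases the value is still 6 characters long, only the position of the blanks changes.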
7.a. Trim
This function removes the trailing blanks (you will be able to see the result only when you use it in a
concatenation).
* A simple put does not show leading or trailing blanks, so we
concatenate the value with '*' to show the trailing blanks;
data x;
a='cc bb ';
b=a||'*';
c=trim(a)||'*';
put b=;
put c=;
run;
Log:
b=cc bb *
c=cc bb*
NOTE: The data set WORK.X has 1 observations and 3 variables.
7.b. TRIMN
TRIM vs. TRIMN - Both TRIM and TRIMN remove trailing blanks from a character string. The only difference is
how they deal with blank strings. If there is a blank string variable, the TRIM function returns one blank
whereas the TRIMN function returns no blank characters.
data sample;
input string $char14.;
datalines;
Mary Smith    /* contains trailing blanks */
    John Brown/* contains leading blanks */
 Alice Park   /* contains leading and trailing blanks */
 Tom    Wang  /* contains leading, trailing and multiple blanks in between */
              /* contains a blank string */
;
data sample;
set sample;
original = '*' || string || '*';
trim = '*' || trim(string) || '*';
trimn = '*' || trimn(string) || '*';
run;
8. TRANWRD
The TRANWRD function replaces all occurrences of a string in a character variable.
data amit;
set sasuser.admit;
actlevel=tranwrd(actlevel,'HI', 'ho');
run;
2. Multiple occurrence
* Converting multiple occurrences of a string;
data x;
input name$;
datalines;
highhigh
high
cc
;
run;
data a;
set x;
name=tranwrd(name,'high','hi');
run;
The SAS System 00:29 Wednesday, July 1, 2009
Obs name
1 hihi
2 hi
3 cc
9. TRANSLATE
The TRANSLATE function changes the string character by character: each character in from is replaced by
the character at the same position in to. In the example below any a in the string is changed to 1, any m to 2.
data x;
name='amit kumar';
newname=translate(name,'12','ami');
/*translate(string,to,from)*/
run;
**ALSO if from is longer than to there are no issues: the extra from
characters are changed to blanks***********;
10. LOWCASE
It converts the string to lower case (small letters).
11. UPCASE
It converts the string to upper case (capital letters).
12. PROPCASE
It converts the string to proper case: the first letter of each word in upper case and all other letters in lower case.
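The three case functions can be compared on one value. This is a small sketch with a made-up name string.

```sas
data _null_;
x='amit KUMAR singh';
low=lowcase(x);
up=upcase(x);
prop=propcase(x);
put low=;
put up=;
put prop=;
run;
```

The log shows low=amit kumar singh, up=AMIT KUMAR SINGH and prop=Amit Kumar Singh.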
14. QUARTER
The QTR function calculates the quarter of a date and returns a value ranging from 1 to 4.
data amit2;
set sasuser.empdata;
if qtr(hiredate) gt 2;
run;
15. SUM
Statements                                 Result  Remark
x1=sum(4,9,3,8);                           24
x1=9; x2=39; x3=sum(of x1-x2);             48      Sum for range
x1=55; x2=35; x3=6; x4=sum(of x1-x3, 5);   101     Range and a constant value
x1=7; x2=7; x5=sum(x1-x2);                 0       As the difference gets calculated
y1=20; y2=30; x6=sum(of y:);               50      Sum of all variables whose names start with y
*The sum function with the of keyword can be used to calculate the sum of a range of
variables;
data a;
sale1=5;
sale2=10;
sale3=15;
sale_sum=sum(of sale1-sale3);
run;
*The second way, if you want to include all sale variables, is to use the colon
wildcard;
data a;
sale1=5;
sale2=10;
sale3=15;
sale_sum=sum(of sale:);
run;
*The sum function with the of keyword can be used to calculate the sum of a range of
variables, here just the sale1-sale4 sum is calculated and variable sale5 is created
with a missing value;
data a;
sale1=5;
sale2=10;
sale3=15;
sale4=5;
sale_sum=sum(of sale1-sale5);
run;
16. DAY
The DAY function returns the day of the month from a date, a value ranging from 1 to 31.
data amit2;
set sasuser.empdata;
if day(hiredate) gt 10;
run;
17. YEAR
The year function calculates the year from a date value.
data amit2;
set sasuser.empdata;
if year(hiredate)=1992;
run;
18. WEEKDAY
The WEEKDAY function returns the day of the week from a date, a value ranging from 1 to 7, with 1 being
Sunday, 2 Monday and so on.
data amit2;
set sasuser.empdata;
if weekday(hiredate) gt 5;
run;
19. MONTH
The MONTH function returns the month from a date, a value ranging from 1 to 12.
data amit2;
set sasuser.empdata;
if month(hiredate) gt 10;
run;
20. MDY
The mdy function creates a numeric date from the values of the month day and year.
data amit1;
set amit;
attrib bdy format=date9.;
bdy=mdy(month,day,year);
run;
proc sql;
select count(*) as cnt_flag_1 from a where
flag=1;
quit;
run;
22. PUT
The PUT function is used to convert numeric values to character values.
**If we do not use the put function THE SAS LOG INDICATES THE NUMERIC TO
CHARACTER CONVERSION OF THE AGE******;
**USING THE PUT FUNCTION***;
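The difference between the two cases can be sketched as follows. The variables name, age and nameage are taken from the column names in the listing header; the values are made up, and the width 2. is an assumption that fits a two-digit age.

```sas
data _null_;
name='amit';
age=34;
/* explicit conversion: put returns the character value '34' */
nameage=name||put(age,2.);
put nameage=;
/* implicit conversion: works, but the log notes that numeric values
   have been converted to character values, and age is written with
   the BEST12. format so it arrives padded with leading blanks */
nameage2=name||age;
put nameage2=;
run;
```

With PUT the result is nameage=amit34 and the log stays free of conversion notes.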
23. INPUT
The INPUT function is used to convert character values to numeric values so that SAS can perform
calculations on them.
* Here salary has been converted from the character values to the
numeric values;
data amit;
input name $ age monthsalary $ ;
datalines;
amit 34 123,45.00
na 23 213,45.00
;
run;
data amit1;
set amit;
salary=input(monthsalary, comma9.2);
yearsalry=salary*12;
run;
proc print data=amit1;
run;
The SAS System 00:44 Wednesday, July 1, 2009
Obs  name  age  monthsalary  salary  yearsalry
1    amit  34   123,45.0     12345   148140
2    na    23   213,45.0     21345   256140
24. CATX
The CATX function concatenates character strings with a delimiter between them; there is no need for left and trim.
The CATX function first copies item-1 to the result, omitting leading and trailing blanks. Then for each subsequent
argument item-i, i=2, ..., n, if item-i contains at least one non-blank character, then CATX appends delimiter and
item-i to the result, omitting leading and trailing blanks from item-i. CATX does not insert the delimiter at the
beginning or end of the result. Blank items do not produce delimiters at the beginning or end of the result, nor do
blank items produce multiple consecutive delimiters.
data amit;
input name $ month year day;
datalines;
amit 10 1981 13
pre 04 1982 20
;
run;
Obs  name  month  year  day
1    amit  10     1981  13
2    pre   4      1982  20
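The CATX rules described above (blanks stripped from each item, blank items skipped, no doubled delimiter) can be sketched like this. The variables first, middle, last, full and dob are invented for the illustration; month, day and year mirror the columns of the data set above.

```sas
data _null_;
first='  amit ';
middle=' ';
last=' singh';
/* the blank middle item adds neither text nor a second delimiter */
full=catx(' ',first,middle,last);
put full=;
/* numeric items are converted without a note in the log */
month=10; day=13; year=1981;
dob=catx('/',month,day,year);
put dob=;
run;
```

The log shows full=amit singh and dob=10/13/1981.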
25. CAT
The cat just concatenates the value of variables.
The CAT function specifies a constant, variable, or expression, either character or numeric. If item is numeric, then its
value is converted to a character string by using the BESTw. format. In this case, leading blanks are removed and SAS
does not write a note to the log.
*CAT function with a series
*The cat just concatenates the value of variables, it just concatenates
and does not remove space, leading or trailing;
data a;
x1=' a';
x2=' b';
x3='c';
string=cat(of x1-x3);
run;
proc print data=a;
run;
Obs  x1  x2  x3  string
1    a   b   c   a bc
*CAT function with a series of vars with colon
*The cat just concatenates the value of variables, it just concatenates
and does not remove space, leading or trailing;
data a;
x1=' a';
x2=' b';
x3='c';
string=cat(of x:);
run;
proc print data=a;
run;
Obs  x1  x2  x3  string
1    a   b   c   a bc
26. CATS
The CATS function concatenates the values of variables after removing the leading and trailing blanks of each item, which is equal to applying strip, or trim(left(var)), to every item.
The CATS function specifies a constant, variable, or expression, either character or numeric. If item is numeric, then
its value is converted to a character string by using the BESTw. format. In this case, SAS does not write a note to the
log.
27. CATT
CATT is equal to TRIM. It concatenates the values of variables after applying TRIM to each of them, so only trailing blanks are removed.
The CATT function specifies a constant, variable, or expression, either character or numeric. If item is numeric, then
its value is converted to a character string by using the BESTw. format. In this case, leading blanks are removed and
SAS does not write a note to the log.
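The three functions differ only in which blanks they remove, which the following sketch shows side by side. The values and variable names are made up, and the asterisks are there only to make the blanks visible.

```sas
data _null_;
length x1 x2 $4;
x1='  a ';   /* two leading blanks, one trailing */
x2=' b  ';   /* one leading blank, two trailing */
c=cat('*',x1,x2,'*');   /* keeps every blank */
s=cats('*',x1,x2,'*');  /* strips leading and trailing blanks */
t=catt('*',x1,x2,'*');  /* trims trailing blanks only */
put c=;
put s=;
put t=;
run;
```

The log shows c=*  a  b  *, s=*ab* and t=*  a b*.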
28. FIND
* The find and index have the following differences;
1. The FIND function searches for substrings of characters in a character string, whereas the FINDC function
searches for individual characters in a character string.
2. The FIND function and the INDEX function both search for substrings of characters in a character string.
However, the INDEX function has neither the modifiers nor the start position argument.
*lets test the find function and the value returned, here score has a
value 4 as name starts at the 4th char in string;
data a;
x='my name is amit';
score=find(x,'name');
put score=;
run;
Log:
score=4
NOTE: The data set WORK.A has 1 observations and 2 variables.
*FIND function with a modifier
***The i modifier ignores the case of substring;
data a;
x='my NAME is amit';
score=find(x,'name','i');
put score=;
run;
Log:
score=4
NOTE: The data set WORK.A has 1 observations and 2 variables.
***The t modifier trims the trailing blanks in string and substring;
data a;
x='my name is amit';
score=find(x,'name','t');
put score=;
run;
Log:
score=4
NOTE: The data set WORK.A has 1 observation and 2 variables.
* The Find function helps in the searching of a string in the character
variable, it returns the position of the string if found else 0;
data amit;
set sasuser.admit;
if find(name,'be','t') gt 0; /* the t modifier here trims the trailing blanks of the name var */
run;
Obs  Name          Sex  Age  Date  Height
1    Reberson, P   F    32   9     67
2    Eberhardt, S  F    49   27    64
3    Oberon, M     F    28   17    62
4    Derber, B     M    25   23    75
29. COUNT
The COUNT function is used to count the occurrences of a substring in a string.
*Modifiers in count
***The count function again takes two modifiers similar to find, the i
and t, i for ignoring the case and t for trimming the trailing blanks;
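A minimal sketch of COUNT with and without the i modifier; the test string and variable names are invented for the example.

```sas
data _null_;
x='Hi there, hi again, HI';
/* case-sensitive: only the lower-case hi is counted */
n1=count(x,'hi');
/* i modifier: Hi, hi and HI are all counted */
n2=count(x,'hi','i');
put n1= n2=;
run;
```

The log shows n1=1 n2=3.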
30. COUNTW
This function counts the number of words in a string
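A small sketch of COUNTW with the default delimiters, using a made-up string of four blank-separated words.

```sas
data a;
x='amit ka school tha';
/* words are separated by the default delimiters, which include blank */
y=countw(x);
put y=;
run;
```

The put statement writes y=4 to the log.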
31. INT
The INT function returns the integer value of a numeric variable, thus discarding the decimal portion
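The truncation applies to negative values too, which is where INT differs from rounding down. A short sketch with made-up values:

```sas
data _null_;
x1=12.75;
x2=-12.75;
/* int drops the decimal part, it does not round */
y1=int(x1);
y2=int(x2);
put y1= y2=;
run;
```

The log shows y1=12 y2=-12.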
y1=round(x1,2);
y2=round(x2);
y3=round(x3);
y4=round(x4);
y5=round(x5);
run;
33. N
The N function counts the number of non-missing values in a row
nvars_nonmiss=n(of x1-x3);
put nvars_nonmiss=;
run;
nvars_nonmiss=n(of x:);
put nvars_nonmiss=;
run;
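A self-contained sketch of the N function, assuming three variables x1-x3 of which one is missing; the values are made up.

```sas
data _null_;
x1=2;
x2=.;   /* missing value, not counted */
x3=4;
nvars_nonmiss=n(of x1-x3);
put nvars_nonmiss=;
run;
```

The log shows nvars_nonmiss=2.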
35. INTCK
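The INTCK function returns the number of interval boundaries crossed between two dates; the interval argument can
be, for example, week, month or year. Because boundaries are counted, the result is not necessarily the number of
complete intervals that have passed.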
z3=intck('year',x1,y1);
run;
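How INTCK counts can be checked with two dates that are only one day apart but fall in different months and years. This sketch reuses the variable names x1, y1 and z3 from the fragment above; z1 and z2 are added for the comparison.

```sas
data _null_;
x1='31dec2008'd;
y1='01jan2009'd;
z1=intck('day',x1,y1);
z2=intck('month',x1,y1);
z3=intck('year',x1,y1);
put z1= z2= z3=;
run;
```

All three results are 1: a single day has passed, but one month boundary and one year boundary lie between the two dates.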
36. INTNX
The INTNX function increments a date by a given number of intervals (day, week, month or year) and returns the
resulting date; for example, an increment of 6 weeks returns a date 6 weeks ahead.
data amit;
x='13oct1981'd;
z=put(intnx('month',x,1,'b'),date9.);
put z=;
run;
Log:
z=01NOV1981
NOTE: The data set WORK.AMIT has 1 observations and 2 variables.
* The intnx function increments the day, year or month on the specified
date, if you use the e or end argument then the output date is the
last day of the following month, here 30NOV1981;
data amit;
x='13oct1981'd;
z=put(intnx('month',x,1,'e'),date9.);
put z=;
run;
'ACT/ACT'
uses the actual number of days between dates in calculating the number of years. SAS calculates this value as the
number of days that fall in 365-day years divided by 365 plus the number of days that fall in 366-day years divided by
366.
38. REVERSE
The REVERSE function reverses the string; if there are leading blanks they become trailing blanks.
data a;
x='abc';
change=reverse(x);
put change;
run;
Log:
cba
NOTE: The data set WORK.A has 1 observations and 2 variables.
data b;
set a;
dose=reverse(substr(reverse(compress(trt)),1,3));
run;