SAS Report & Frequency Table Guide

The document discusses various procedures in SAS for generating reports from data. It covers the PROC PRINT procedure for listing variable values, PROC FREQ for generating frequency tables, PROC MEANS for calculating descriptive statistics, and the REPORT procedure for customized reporting. Examples are provided for each procedure demonstrating how to produce one-way, two-way, and multi-way tables, calculate statistics, format reports, and export results to new datasets.

Uploaded by

Aymen Kortas
Copyright
© All Rights Reserved

Topic: Generating Reports

1. Generating reports using the PRINT procedure

2. Generating frequency tables using the FREQ procedure

3. Generating reports using the MEANS procedure

4. Generating reports using the REPORT procedure

5. Enhancing reports through the use of labels, SAS formats, titles, footnotes, and SAS System reporting options

6. Introduction to ODS

1. PROC PRINT

PROC PRINT lists the values of the variables in a SAS data set in the output window.
It prints the observations in the data set, using all or some of the variables.

Example 1 List containing all the variables and all the observations.

data a;
input year sales cost;
profit=sales-cost;
cards;
1981 12132 11021
1982 19823 12928
1983 16982 14002
1984 18432 14590
;
run;
proc print data=a;
title 'Simple PROC PRINT Report';
run;

Example 2 Suppressing the Obs column and limiting printing to the first 5 observations.

data exprev;
input Region $ State $ Month monyy5.
Expenses Revenues;
format month monyy5.;
datalines;
Southern GA JAN95 2000 8000
Southern GA FEB95 1200 6000
Southern FL FEB95 8500 11000
Northern NY FEB95 3000 4000
Northern NY MAR95 6000 5000
Southern FL MAR95 9800 13500
Northern MA MAR95 1500 1000
;
proc print data=exprev(obs=5) noobs;
var month state expenses;
title 'Monthly Expenses for Offices in Each State';
run;

Example 3 ID statement with unsorted observations

data bid;
input id age;
datalines;
987 65
687 75
254 55
236 70
;

proc print data=bid;
/* ID statement: the id values replace the Obs column */
id id;
run;

Example 4 ID statement, after sorting the observations.

data bid2;
input pat_id age;
datalines;
987 65
687 75
254 55
236 70
;
proc sort;
by pat_id;
run;
proc print;
*var pat_id;
id pat_id;
run;

Example 5 Selecting observations that meet a condition with the WHERE statement

data bid3;
input pat_id age;
datalines;
987 65
687 75
254 55
236 70
;
proc sort;
by pat_id;
run;
proc print noobs;
var age;
where age < 75;
run;

Example 6 SUM statement.

data trim;
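/* The colon modifier reads each value up to the next blank, so vendor keeps all
   13 characters and COMMA5. does not run into the next field */
input vendor :$13. mois quantite :comma5. tot_vendor :comma11.2;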
datalines;
Hollingsworth 04 530 10,573.50
Hollingsworth 05 1,120 22,344.00
Hollingsworth 05 1,030 20,548.50
Jones 04 1,110 22,144.50
Jones 04 675 13,466.25
Smith 04 1,715 34,214.25
Smith 06 512 10,214.40
Smith 06 1,000 19,950.00
;
run;
PROC PRINT DATA=trim NOOBS;
TITLE 'Total of variables names';
VAR vendor mois quantite tot_vendor;

WHERE quantite>500 OR tot_vendor>20000;
SUM quantite tot_vendor;
RUN;

Example 7 SUM and BY statements

data trim2;
/* Colon modifier: read each value up to the next blank */
input vendor :$13. mois quantite :comma5. tot_vendor :comma11.2;
datalines;
Hollingsworth 04 530 10,573.50
Hollingsworth 05 1,120 22,344.00
Hollingsworth 05 1,030 20,548.50
Jones 04 1,110 22,144.50
Jones 04 675 13,466.25
Smith 04 1,715 34,214.25
Smith 06 512 10,214.40
Smith 06 1,000 19,950.00
;
run;
proc sort ;
by vendor;
run;
PROC PRINT DATA=trim2 NOOBS;
TITLE 'Summary by Vendor';
by vendor;
VAR vendor mois quantite tot_vendor;
WHERE quantite>500 OR tot_vendor>20000;
SUM quantite tot_vendor;
RUN;

Example 8 PAGEBY and SUMBY statements

data trim3;
/* Colon modifier: read each value up to the next blank */
input vendor :$13. mois quantite :comma5. tot_vendor :comma11.2;
datalines;
Hollingsworth 04 530 10,573.50
Hollingsworth 05 1,120 22,344.00
Hollingsworth 05 1,030 20,548.50
Jones 04 1,110 22,144.50
Jones 04 675 13,466.25
Smith 04 1,715 34,214.25
Smith 06 512 10,214.40
Smith 06 1,000 19,950.00
;
run;
proc sort ;
by vendor;
run;
PROC PRINT DATA=trim3 NOOBS;
TITLE 'Summary by Vendor';
by vendor;
VAR vendor mois quantite tot_vendor;
WHERE quantite>500 OR tot_vendor>20000;
SUM quantite tot_vendor;
SUMBY vendor;
PAGEBY vendor;
RUN;

The PAGEBY statement controls page ejects that occur before a page is full.
The SUMBY statement limits the number of sums that appear in the report.

Example 9 LABEL and DOUBLE options with TITLE and FOOTNOTE statements

PROC PRINT DATA=trim5 NOOBS label double;
TITLE 'Summary by Vendor';
FOOTNOTE ' Results of Data-trim4';
VAR vendor mois quantite tot_vendor;
SUM quantite tot_vendor;
LABEL vendor = 'Vendor'
mois = 'Mois'
quantite = 'Quantity of Vendor'
tot_vendor = 'Total of Vendors';
RUN;

2. Generating frequency tables using the FREQ procedure

The FREQ procedure produces one-way, two-way, and n-way (crosstabulation) frequency tables.
It is commonly used to summarize data in tabular form and can be used on either character
or numeric data.

Example 1: simple output


data freq_01;
input name $ sex $ age height ;
datalines;
Allan M 25 174
Ben M 23 176
Christine F 24 164
David M 32 174
Edwards M 36 178
Frank . 38 172
Glory F 28 166
Hill M 32 170
Jack M 33 172
Katy . 26 162
Lily F 24 169
Paula F . 163
;
proc freq data=freq_01;
title1 'PROC FREQ: example 01';
title2 'No keywords specified';
run;

Example 2: One-way table


proc freq data=freq_02;
tables sex age;
title '1-way tables for sex and age specified by TABLES keyword';
run;

Example 3: Two-way tables

proc freq data=freq_03;
tables sex*age;
title 'two-way tables for sex and age specified by TABLES keyword';
run;

Example 4: Two-way tables listing the values of the variables side by side

proc freq data=freq_04;
tables sex*age/list missing;
title 'two-way tables using list and missing';
run;

Example 5: Three-way tables

proc freq data=freq_05;
tables name*sex*age ;
title '3-way table: sex by age, controlling for name';
/*The first variable represents the control variable*/
run;

Example 6: Weighted data

DATA freq_06 ;
INPUT rater1 rater2 count;
DATALINES;
1 1 5
1 2 2
1 3 1
2 1 1
2 2 7
2 3 2
3 1 1
3 2 2
3 3 9
;
PROC FREQ DATA = freq_06 ;
WEIGHT count;
TABLE rater1*rater2 ;
TITLE1 'weighted data';
RUN;

Example 7: OUT= option

proc freq data=freq_01;
tables age /out =freq_07;
title ' create new dataset using out option';
run;
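
Example 8: MISSING / MISSPRINT

proc freq data=freq_01;
/* MISSPRINT can be used instead of MISSING: it shows missing values in the
   table but leaves them out of the percentages */
tables sex /missing ;
run;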

Example 9: NLEVELS
data a;
input agegrp sex $ @@;
datalines;
1 F 1 F 1 M 2 M 2 M 3 F 3 F 3 F
;
proc freq data=a nlevels;
tables sex*agegrp/noprint;
run;

The NLEVELS option in the PROC FREQ statement displays the "Number of Variable Levels"
table, which gives the count of distinct values for each variable listed in the TABLES
statement. When the data set is large and you do not know how many distinct values each
variable has, use NLEVELS together with NOPRINT to check the number of levels before
printing the frequency tables.

Example 10: CROSSLIST
The CROSSLIST option displays crosstabulation tables in ODS column format. It creates a
table that has a table definition that you can customize by using the TEMPLATE procedure.
proc freq data=a nlevels;
tables sex*agegrp/crosslist;
run;

Warning: you cannot specify both the LIST option and the CROSSLIST option in the
same TABLES statement.

The CROSSLIST output looks similar to the LIST output, but there are a few differences, as
shown below.

                     CROSSLIST                        LIST
Statistics           FREQUENCY                        FREQUENCY
                     PERCENT                          PERCENT
                     ROW PERCENT                      CUMULATIVE FREQUENCY
                     COLUMN PERCENT                   CUMULATIVE PERCENT

Totals               Produces subtotals and           No totals
                     final totals

Rows with zero       Displays the variable levels     Suppresses the variable levels
frequencies          with zero frequencies            with zero frequencies

3. Generating report using MEANS procedure


PROC MEANS can be performed only on numeric data.
· If no options are specified, this procedure gives the following summary statistics
for each variable listed in the VAR statement: number of observations, mean,
standard deviation, minimum value, and maximum value. The standard error of the
mean is not printed by default; request it with STDERR.
· Options: You can select which statistics to print from the following list: N, NMISS
(number of missing values), MEAN, STD, VAR, MIN, MAX, SKEWNESS, and KURTOSIS.
The LCLM and UCLM options produce the lower and upper endpoints of a
100(1-alpha)% confidence interval for the mean.
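
As a minimal sketch of selecting statistics, the step below requests a few keywords together with 90% confidence limits. The data set name heights_demo and its values are made up for illustration; ALPHA= sets the alpha in the 100(1-alpha)% interval and MAXDEC= limits the number of decimal places printed.

```sas
/* Hypothetical data, for illustration only */
data heights_demo;
   input sex $ age height;
   datalines;
M 25 174
M 23 176
F 24 164
F 28 166
M 32 170
F 24 169
;
run;

/* ALPHA=0.10 requests 90% confidence limits for the mean */
proc means data=heights_demo n nmiss mean stderr lclm uclm alpha=0.10 maxdec=2;
   var age height;
run;
```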

Syntax:
PROC MEANS options;
VAR variables;
BY variables;
CLASS variables;
OUTPUT OUT=SAS-data-set keyword<(variable-list)>=new-variable(s) ... </ options>;
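
The OUTPUT statement in the syntax above can be sketched as follows. The data set names heights_demo and stats_by_sex and the new variable names are assumptions chosen for illustration. A keyword without a variable list names the output variables in VAR-statement order, while keyword(variable)= targets one variable.

```sas
/* Hypothetical data, for illustration only */
data heights_demo;
   input sex $ age height;
   datalines;
M 25 174
M 23 176
F 24 164
F 28 166
;
run;

/* NOPRINT suppresses the printed report; NWAY keeps only the rows for each sex */
proc means data=heights_demo noprint nway;
   class sex;
   var age height;
   output out=stats_by_sex
          mean=mean_age mean_height   /* one name per VAR variable, in order */
          std(height)=sd_height;      /* statistic for a single variable */
run;

proc print data=stats_by_sex noobs;
run;
```

The output data set also carries the automatic variables _TYPE_ and _FREQ_.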

The following keywords can be used with PROC MEANS to compute statistics:
Descriptive Statistics
Keyword         Description
CLM             Two-sided confidence limit for the mean
CSS             Corrected sum of squares
CV              Coefficient of variation
KURTOSIS        Kurtosis
LCLM            One-sided confidence limit below the mean
MAX             Maximum value
MEAN            Average
MIN             Minimum value
N               Number of observations with nonmissing values
NMISS           Number of observations with missing values
RANGE           Range
SKEWNESS        Skewness
STDDEV / STD    Standard deviation
STDERR          Standard error of the mean
SUM             Sum
SUMWGT          Sum of the Weight variable values
UCLM            One-sided confidence limit above the mean
USS             Uncorrected sum of squares
VAR             Variance

Quantile Statistics

Keyword Description
MEDIAN / P50 Median or 50th percentile
P1 1st percentile
P5 5th percentile
P10 10th percentile
Q1 / P25 Lower quartile or 25th percentile
Q3 / P75 Upper quartile or 75th percentile
P90 90th percentile
P95 95th percentile
P99 99th percentile
QRANGE Difference between upper and lower quartiles: Q3-Q1
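
A short sketch of requesting quantile statistics by keyword follows. The data set heights_demo and its values are hypothetical.

```sas
/* Hypothetical data, for illustration only */
data heights_demo;
   input height @@;
   datalines;
174 176 164 174 178 172 166 170 172 162 169 163
;
run;

/* Median, quartiles, interquartile range, and 90th percentile of height */
proc means data=heights_demo n median q1 q3 qrange p90 maxdec=1;
   var height;
run;
```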

Example 1: Create output using the default setting


proc means data= freq_01;
run;

Example 2: Specifying a series of keywords and optional statements

proc means data= freq_01 n mean min max sum nmiss maxdec=1;
/*Apply analysis only to "age" variable*/
var age;
/*Separate the analysis by values of sex*/
class sex;
run;

Example 3: Creating a new data set with a series of keywords and optional statements

proc sort data= freq_01;
by sex;
run;
proc means data=freq_01 n mean min max sum nmiss lclm uclm;
/*Apply the analysis only to the "age" variable*/
var age;
/*Separate the output by sex*/
by sex;
/*Create a temporary SAS data set containing the information generated by PROC MEANS */
output out=agedata;
run;

4. Generating reports using the REPORT procedure and enhancing reports through the use of labels, SAS formats, titles, footnotes, and SAS System reporting options

4.1 Overview
The REPORT procedure combines features of the PRINT, MEANS, and TABULATE
procedures with features of the DATA step in a single report-writing tool that can produce
a variety of reports. It has features that allow the presentation of detail (individual
observations - list report) and summary data (summary report), incorporated into an
organized report.

4.2 Concept
Report writing is simplified if you approach it with a clear understanding of what you
want the report to look like. Once you understand the layout of the report, use the
COLUMN and DEFINE statements in PROC REPORT to construct the layout.

The COLUMN statement lists the items that appear in the columns of the report,
describes the arrangement of the columns, and defines headers that span multiple
columns.

The DEFINE statement (in the windowing environment) defines the characteristics of an
item in the report. These characteristics include how PROC REPORT uses the item in the
report, the text of the column header, and the format to use to display values.

Planning the desired layout of a list / summary report is key to learning to use REPORT
well. A report’s layout is determined in great part by the designation of variables into
various categories.
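
As a rough sketch of how COLUMN and DEFINE work together, the step below lays out three columns and puts one header over the first two. The data set sales_demo, its values, and the header text are made up for illustration.

```sas
/* Hypothetical data, for illustration only */
data sales_demo;
   input sector $ dept $ sales;
   datalines;
sw paper 53
sw meat 130
ne paper 90
ne meat 190
;
run;

proc report data=sales_demo nowd;
   /* The quoted string inside the parentheses is a header that spans both columns */
   column ('Where and What' sector dept) sales;
   /* Each DEFINE sets the usage, the format, and the column header of one item */
   define sector / order   'Sector';
   define dept   / display 'Department';
   define sales  / analysis sum format=dollar9.2 'Sales';
run;
```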

4.3 Usage of Variables in a Report


Much of a report's layout is determined by the usages that you specify for variables in the
DEFINE statements or DEFINITION windows. For data set variables, these usages are

DISPLAY ORDER ACROSS GROUP ANALYSIS

A report can contain variables that are not in the input data set. These variables must have
a usage of COMPUTED.

4.3.1 Display Variables


A report that contains one or more display variables has a row for every observation in
the input data set. Display variables do not affect the order of the rows in the report.

4.3.2 Order Variables
PROC REPORT orders the detail rows according to the ascending, formatted values of
the order variable. You can change the default order with ORDER= and DESCENDING
in the DEFINE statement. As with a display variable, a row appears for every
observation, but the rows are ordered by the values of the order variable.
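
A minimal sketch of changing the default order is shown here. The data set sales_demo and its values are hypothetical. ORDER=DATA keeps the sectors in the order in which they occur in the input data, and DESCENDING reverses the order of the departments within each sector.

```sas
/* Hypothetical data, for illustration only */
data sales_demo;
   input sector $ dept $ sales;
   datalines;
sw meat 130
sw paper 53
ne meat 190
ne paper 90
;
run;

proc report data=sales_demo nowd;
   column sector dept sales;
   define sector / order order=data 'Sector';      /* input order: sw, then ne */
   define dept   / order descending 'Department';  /* paper before meat */
   define sales  / analysis sum format=dollar9.2 'Sales';
run;
```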

4.3.3 Group Variables


If a report contains one or more group variables, PROC REPORT tries to consolidate into
one row all observations from the data set that have a unique combination of formatted
values for all group variables. This designation groups on variable values to determine
rows in the report, similar to using a CLASS variable in other procedures.

The following table compares the effects of using order variables and group variables.

                                                 ORDER    GROUP
Rows are ordered                                 yes      yes
Repetitious printing of values is suppressed     yes      yes
Rows that have the same values are collapsed     no       yes
Type of report produced                          list     summary

4.3.4 Analysis Variables


An analysis variable is a numeric variable that is used to calculate a statistic for all the
observations represented by a cell of the report. You associate a statistic with an analysis
variable in the variable's definition or in the COLUMN statement. By default, PROC
REPORT uses numeric variables as analysis variables that are used to calculate the Sum
statistic.

4.3.5 Across usage

ACROSS defines the item, which must be a data set variable, as an across variable.
PROC REPORT creates a column for each formatted value of an across variable.
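
The sketch below illustrates across usage on a small, made-up data set (sales_demo is a hypothetical name). Each value of dept becomes its own column, and the comma in the COLUMN statement places the sales sum underneath it.

```sas
/* Hypothetical data, for illustration only */
data sales_demo;
   input sector $ dept $ sales;
   datalines;
sw paper 53
sw meat 130
ne paper 90
ne meat 190
;
run;

proc report data=sales_demo nowd;
   column sector dept,sales;
   define sector / group  'Sector';
   define dept   / across 'Department';   /* one column per department */
   define sales  / analysis sum format=dollar9.2 'Sales';
run;
```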

4.3.6 Computed Variables


Computed variables are variables that you define for the report. They are not in the input
data set, and PROC REPORT does not add them to the input data set. However,
computed variables are included in an output data set if you create one.

4.4 Procedure Syntax

PROC REPORT <option(s)>;
BREAK location break-variable </ option(s)>;
RBREAK location </ option(s)>;
BY <DESCENDING> variable-1
<...<DESCENDING> variable-n> <NOTSORTED>;
COLUMN column-specification(s);
COMPUTE location <target>
</ STYLE=<style-element-name>
<[style-attribute-specification(s)]>>;
ENDCOMP;
COMPUTE report-item </ type-specification>;
ENDCOMP;
DEFINE report-item / <usage> <option(s)>;

Common PROC REPORT statement options include HEADLINE, HEADSKIP, CENTER, NOWD, and SPLIT=.

You can use the following statistics in PROC REPORT:

Statistic    Definition
CSS          Corrected sum of squares
USS          Uncorrected sum of squares
CV           Coefficient of variation
MAX          Maximum value
MEAN         Average
MIN          Minimum value
N            Number of observations with nonmissing values
NMISS        Number of observations with missing values
RANGE        Range
STD          Standard deviation
STDERR       Standard error of the mean
SUM          Sum
SUMWGT       Sum of the Weight variable values
PCTN         Percentage of a cell or row frequency to a total frequency
PCTSUM       Percentage of a cell or row sum to a total sum
VAR          Variance
T            Student's t for testing the hypothesis that the population mean is 0
PRT          Probability of a greater absolute value of Student's t
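
One way to request several of these statistics at once is to stack them with an analysis variable in the COLUMN statement. The following is a sketch on a hypothetical data set, sales_demo, not a complete treatment.

```sas
/* Hypothetical data, for illustration only */
data sales_demo;
   input sector $ dept $ sales;
   datalines;
sw paper 53
sw meat 130
ne paper 90
ne meat 190
;
run;

proc report data=sales_demo nowd;
   /* N, MEAN, and MAX of sales are computed for each sector */
   column sector sales,(n mean max);
   define sector / group 'Sector';
   define sales  / format=8.1 'Sales';
run;
```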

5. How PROC REPORT Builds a Report

This section presents, through a progressive series of examples, some methods for
enhancing the output created by PROC REPORT.

Example 1: Simple report

data grocery;
input Sector $ Manager $ Department $ Sales @@;
datalines;
se 1 np1 50 se 1 p1 100 se 1 np2 120 se 1 p2 80
se 2 np1 40 se 2 p1 300 se 2 np2 220 se 2 p2 70
nw 3 np1 60 nw 3 p1 600 nw 3 np2 420 nw 3 p2 30

nw 4 np1 45 nw 4 p1 250 nw 4 np2 230 nw 4 p2 73
nw 9 np1 45 nw 9 p1 205 nw 9 np2 420 nw 9 p2 76
sw 5 np1 53 sw 5 p1 130 sw 5 np2 120 sw 5 p2 50
sw 6 np1 40 sw 6 p1 350 sw 6 np2 225 sw 6 p2 80
ne 7 np1 90 ne 7 p1 190 ne 7 np2 420 ne 7 p2 86
ne 8 np1 200 ne 8 p1 300 ne 8 np2 420 ne 8 p2 125
;
proc format;
value $sctrfmt 'se' = 'Southeast'
'ne' = 'Northeast'
'nw' = 'Northwest'
'sw' = 'Southwest';

value $mgrfmt '1' = 'Smith' '2' = 'Jones'
'3' = 'Reveiz' '4' = 'Brown'
'5' = 'Taylor' '6' = 'Adams'
'7' = 'Alomar' '8' = 'Andrews'
'9' = 'Pelfrey';

value $deptfmt 'np1' = 'Paper'
'np2' = 'Canned'
'p1' = 'Meat/Dairy'
'p2' = 'Produce';
run;
proc report data=grocery headline headskip nowd ;
column sector manager department sales;
define sector /order format =$sctrfmt.;
define manager/display format =$mgrfmt.;
define department /display format =$deptfmt.;
define sales /analysis format= dollar11.2;
run;

Example 2: Adding the HEADLINE and HEADSKIP options, a TITLE statement, and other options

proc report data=grocery headline headskip nowd ;
column sector manager department sales;
define sector /order format =$sctrfmt.;
define manager/group format =$mgrfmt.;
define department /group format =$deptfmt.;
define sales /analysis format= dollar11.2;
title ' All info of Sales for all Sectors';
run;

Compare the outputs of Examples 1 and 2 carefully to see the difference between display
variables and group variables.

Example 3: Adding the BREAK AFTER statement, WIDTH=, and other options

proc report data=grocery headline headskip nowd ;
column sector manager department sales;
define sector /order format=$sctrfmt. width=6;
define manager/group format =$mgrfmt. width=7;
define department /group format =$deptfmt. width=10;
define sales /analysis format= dollar11.2 width=8;
title ' All info of Sales for all Sectors';
break after sector/ page summarize skip;
run;

In this example, the paging effect is achieved with the PAGE option of the BREAK AFTER
statement, which starts a new page after each sector.
WIDTH= sets the width of each column. Try adjusting its values to see the effect.

SKIP writes a blank line after the last break line of a break.

Example 4: Adding the SPACING= modifier

proc report data=grocery headline headskip nowd ;
column sector manager department sales;
define sector /order format=$sctrfmt. width=6 spacing= 14;
define manager/group format =$mgrfmt. width=7 spacing=10;
define department /group format =$deptfmt. width=10 spacing=8;
define sales /analysis format= dollar11.2 width=8 spacing=8;
title ' All info of Sales for all Sectors';
break after sector/ page summarize skip;
run;

Note: SPACING= sets the number of blank characters between a column and the column to its
left, which can make the display more balanced. Try different values for SPACING=.

Example 5: Across usage


proc format;
value forqtr 1='1st' 2='2nd';
run;
proc report data=rept2.budget;
column dept account qtr,budget budget;
define dept/group format=$10. width=10 'Department';
define account/group format=$8. width=8 'Account';
define qtr/across format=forqtr12. width=12 '_QTR_';
define budget/sum format=dollar11.2 width=11 'BUDGET';
run;

Example 6: Adding computed variables

proc report data=grocery headline headskip nowd ;


column sector department sales profit;
define sector /group format=$sctrfmt. width=10 spacing= 10;
define department /group format =$deptfmt. width=10 spacing=6;
define sales /analysis format= dollar11.2 width=8 spacing=6;
define profit /computed format =dollar11.2 width=6 spacing=6;
compute profit;

if department = 'np1' or department ='np2'
then profit=0.4*sales.sum;
else profit = 0.25*sales.sum;
endcomp;
rbreak after/ dol dul summarize ;
compute after;
sector='TOTAL:';
endcomp;
title ' Report for Sectors';
run;

Note:
DOL: double overlines each value
DUL: double underlines each value
SUMMARIZE: includes a summary line as one of the break lines
AFTER: places the break lines at the end of the report
BEFORE: places the break lines at the beginning of the report
Exercise: replace AFTER with BEFORE in the RBREAK statement and rerun the step.

Example 7: Creating and Processing an Output Data Set


proc report data=grocery nowd
out=temp( where=(sales gt 600) );
column manager sales;
define manager / group noprint;
define sales / analysis sum noprint;
run;

6. Introduction to Output Delivery System

6.1. Introduction.

6.1.1 What is ODS?


The Output Delivery System is a set of SAS commands that allow users to manage the
output of their programs and procedures. Using ODS, one can:
· create HTML files
· select and exclude output tables from display
· manipulate the layout and format of output tables
· create SAS datasets directly from output tables
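
As a hedged sketch of the last capability in the list, the step below captures a PROC FREQ table as a data set. The names sex_demo and sex_freqs are made up for illustration. OneWayFreqs is the ODS name of the one-way frequency table, and ODS TRACE writes the names of all output tables to the log so that you can confirm it.

```sas
/* Hypothetical data, for illustration only */
data sex_demo;
   input sex $ @@;
   datalines;
M M F M F F M
;
run;

ods trace on;                       /* write each output table's name to the log */
ods output OneWayFreqs=sex_freqs;   /* capture the frequency table as a data set */
proc freq data=sex_demo;
   tables sex;
run;
ods trace off;

proc print data=sex_freqs noobs;
run;
```

An ODS SELECT or ODS EXCLUDE statement placed before the procedure uses the same table names to control which tables are displayed.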

6.1.2 Why ODS?


ODS provides an easy and convenient way to prepare SAS output for web and paper
publishing. It gives the user a high degree of control over the appearance of output.

6.1.3 Destinations.
Version 8 supports the following destinations: HTML, Printer, Listing (the usual SAS
output), RTF (rich-text format), and Output. The Listing destination is open by default
and can be closed by issuing:
ods listing close;

6.2. The HTML Destination.

The HTML destination can produce four different types of files: body, contents, page
(V8), and frame files. The body file contains the output; the other files contain
auxiliary HTML code.
/*example: creating basic HTML output*/
data sample;
input x y z;
datalines;
8.9 4.2 9.3
5.7 6.0 2.0
6.4 5.5 2.7
9.1 6.7 1.0
7.3 3.5 0.2
6.3 8.0 8.4
1.7 5.6 6.6
1.9 2.5 0.3
8.0 3.0 3.2
0.1 5.0 3.5
;

Note:
BODY= identifies the file that contains the HTML output.
CONTENTS= identifies the file that contains a table of contents to the HTML output.
The contents file links to the body file.
FRAME= identifies the file that integrates the table of contents, the page contents, and
the body file. If you open the frame file, you see a table of contents, a table of pages, or
both, as well as the body file.
PAGE= identifies the file that contains a description of each page of the body file and
links to the body file.

/* SAS programming*/;
ods html body="c:\reg.htm";
proc reg data=sample;
model y=x;
run;
quit;
ods html body= "c:\reg.htm"
contents= "c:\regc.htm"
page= "c:\regp.htm"
frame= "c:\regf.htm";
proc reg data=sample;
model y=x;
run;
quit;
ods html close;
ods pdf file = "a:\reg.pdf";
proc reg data=sample;
model y=x;
run;
quit;
/* Close the PDF destination so that the file is written */
ods pdf close;
