
BUP 02 Data Management


Dr. Md. Abdus Salam Akanda Website: http://du.ac.bd
Professor of Statistics, DU E-mail: [email protected]

Data Management

Data management comprises data manipulation and data transformation.

Data Manipulation I:
Data Manipulation: Creating a new data file from an existing data file according to the needs of the researcher.
Data Manipulation includes:
A. Inserting Variables
B. Inserting Cases
C. Go to Case / Variable
D. Merging Files

A. Inserting Variables:
Suppose you want to insert more variables into a data file you have already created. Inserting variables into an existing data file is straightforward.

First create a new data file or open an existing one. Then use any one of the following techniques to insert a new variable.

Technique 1: (Using Variable View)

Click the Variable View tab at the bottom left of the Data Editor window.
Right-click the row where you want the new variable to appear and click Insert Variable.
You can now edit the variable name, label, etc.

Technique 2: (Using Data View)

Click the Data View tab. Put the cursor on a cell of the column where you want the new variable to appear.
Right-click that column and click Insert Variable.

Technique 3: (Using Toolbar)

You can use a toolbar item to insert a variable. Put the cursor in a cell of the row/column where you want the variable and then click the toolbar item.

Technique 4: (Using Menu Bar)

You can use the Menu Bar to insert a variable. Click on Edit (Menu Bar) and then click on Insert Variable. You can now edit the variable name, label, etc.

B. Inserting Cases:
Sometimes you may need to add more cases at some position in a data file after it has been created. You can do so using any of the following techniques.

Technique 1: (Using Data View)

Right-click the row where you want the new case to be placed.
Then click on Insert Case. You can now enter the values of the different variables for that case (individual).

Technique 2:
Put the cursor in a cell of the row where you want the case.
Click on Edit (Menu Bar) and then click on Insert Case. You can now enter the values of the different variables for that case (individual).

Technique 3:
You can use a toolbar item to insert a case. Put the cursor in a cell of the row where you want the case. Then click on the toolbar item. You can now enter the values of the different variables for that case (individual).

C. Finding Cases, Variables:


The Go To dialog box finds the specified case (row) number or variable name in the Data
Editor.
Cases

For cases, from the menus choose:

Edit
Go to Case...

Enter an integer value that represents the current row number in Data View.

Note: The current row number for a particular case can change due to sorting and other
actions.
Variables

For variables, from the menus choose:

Edit
Go to Variable...


Enter the variable name or select the variable from the drop-down list.

You can use the Toolbar menu to find cases, variables and imputations.

D. Merging Files:
Sometimes you may have data files that contain (i) the same variables, under the same or different names, or (ii) different variables, and you need to combine them into a single data file. This is done through a procedure known as Merge Files. To practise it, let us create the following data files named spss-test1, spss-test2, spss-test3, spss-test4 and spss-test5.
Data file: spss-test1
Variables: Identification # (id); Education in years (edu); Age in years (age); Time spent for Internet Use in hours (inttime); Marital Status (marista)

id   edu   age   inttime   marista
1    16    30    3         1
2    18    32    5         1
3    17    37    4         2
4    15    30    1         1
5    14    26    4         2
6    13    27    1         2
7    15    28    4         2
8    11    24    2         2
9    16    32    3         1
Marital Status: Married=1, Unmarried=2

Data file: spss-test2
Variables: Identification # (id); Education in years (edu); Age in years (age); Time spent for Internet Use in hours (inttime); Marital Status (marista)

id   edu   age   inttime   marista
10   15    25    4         2
11   18    36    5         1
12   13    27    1         2
13   11    24    2         2
14   17    31    4         2
15   19    39    3         1
Marital Status: Married=1, Unmarried=2

Data file: spss-test3 (spss-test1 with different names for the variables)
Variables: Identification # (id); Education in years (ys); Age in years (ageyears); Time spent for Internet Use in hours (timeint); Marital Status (ms)

id   ys   ageyears   timeint   ms
1    16   30         3         1
2    18   32         5         1
3    17   37         4         2
4    15   30         1         1
5    14   26         4         2
6    13   27         1         2
7    15   28         4         2
8    11   24         2         2
9    16   32         3         1
Marital Status: Married=1, Unmarried=2

Data file: spss-test4
Variables: Identification # (id); Monthly Salary in Tk. (monsal); Mobile User Status (mobsta); Monthly Expenditure in Tk. (monexp)

id   monsal   mobsta   monexp
1    12000    No       9000
2    26000    Yes      16000
3    21000    Yes      20000
4    19000    Yes      15000
5    24000    No       15000
6    23000    Yes      16000
7    30000    Yes      25000
8    12000    No       6000
9    13000    Yes      11000

Data file: spss-test5 (the cases of spss-test3 with age >= 30)
Variables: Identification # (id); Education in years (ys); Age in years (ageyears); Time spent for Internet Use in hours (timeint); Marital Status (ms)

id   ys   ageyears   timeint   ms
1    16   30         3         1
2    18   32         5         1
9    16   32         3         1
3    17   37         4         2
4    15   30         1         1

Now these files can be merged in two ways:


(i) Merge by Adding cases
(ii) Merge by Adding variables

(i) Adding Cases:


You can add cases in two different ways.

A. Firstly, adding cases from two files whose variables match in number and in spelling.
Merge the files: spss-test1 & spss-test2
Keep one data file (spss-test1) open and follow the instructions given below.
Click on Data (Menu Bar)
Merge Files
Add Cases
Click Browse, select the data file whose cases you want to add to the open data file, and click Open.
Click Continue.
Then click OK.

Then save the data file under a different name (not an existing file name).
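
The effect of Add Cases can be pictured outside SPSS as stacking the rows of one file below the rows of the other. The following is only a minimal Python sketch of that idea, not SPSS itself; the function name add_cases is hypothetical, and the rows are copied from the spss-test1 and spss-test2 tables above.

```python
# Each row is (id, edu, age, inttime, marista), copied from the tables above.
spss_test1 = [
    (1, 16, 30, 3, 1), (2, 18, 32, 5, 1), (3, 17, 37, 4, 2),
    (4, 15, 30, 1, 1), (5, 14, 26, 4, 2), (6, 13, 27, 1, 2),
    (7, 15, 28, 4, 2), (8, 11, 24, 2, 2), (9, 16, 32, 3, 1),
]
spss_test2 = [
    (10, 15, 25, 4, 2), (11, 18, 36, 5, 1), (12, 13, 27, 1, 2),
    (13, 11, 24, 2, 2), (14, 17, 31, 4, 2), (15, 19, 39, 3, 1),
]


def add_cases(active, other):
    """Return a new list with the cases of `other` placed below those of `active`."""
    return list(active) + list(other)


merged = add_cases(spss_test1, spss_test2)
print(len(merged))   # number of cases after merging
print(merged[-1])    # last case comes from spss-test2
```

Because both files use the same variable names in the same order, no pairing of variables is needed here.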

B. Secondly, adding cases from two files whose variables match in number but differ in spelling.
For example, ys in spss-test3 and edu in spss-test2 have the same meaning. In such cases, you can add cases as follows:
Merge the files: spss-test3 & spss-test2
Keep one data file (spss-test3) open and follow the instructions given below.
Click on Data (Menu Bar)
Merge Files
Add Cases
Click Browse, select the data file whose cases you want to add to the open data file, and click Open.
Click Continue.
Now select the matching variables pairwise (for example, ys and edu) and move each pair into the box Variables in New Active Dataset. Then click OK.
Then save the data file under a different name (not an existing file name).
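
Pairing variables amounts to telling the merge which name in one file corresponds to which name in the other. As a rough Python sketch (not SPSS), the pairing can be written as a dictionary; the names PAIRS and rename are hypothetical, and only the first two cases of each file are shown to keep the example short.

```python
# Pairing of spss-test2 names (left) with the spss-test3 names (right).
PAIRS = {"edu": "ys", "age": "ageyears", "inttime": "timeint", "marista": "ms"}

test3_rows = [
    {"id": 1, "ys": 16, "ageyears": 30, "timeint": 3, "ms": 1},
    {"id": 2, "ys": 18, "ageyears": 32, "timeint": 5, "ms": 1},
]
test2_rows = [
    {"id": 10, "edu": 15, "age": 25, "inttime": 4, "marista": 2},
    {"id": 11, "edu": 18, "age": 36, "inttime": 5, "marista": 1},
]


def rename(row, pairs):
    """Rename the variables of one case; unpaired names (such as id) are kept."""
    return {pairs.get(name, name): value for name, value in row.items()}


# Cases of spss-test2 are renamed and then placed below those of spss-test3.
merged = test3_rows + [rename(row, PAIRS) for row in test2_rows]
for row in merged:
    print(row)
```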

(ii) Adding variables


When adding variables, the variables in the two files should be different, apart from a key variable such as id. Make sure that the files to be matched are sorted by case id.

A. Firstly, adding variables from two files with the same number of cases.

Merge the files: spss-test1 & spss-test4
Keep one data file (spss-test1) open and follow the instructions given below.
Click on Data (Menu Bar)
Merge Files
Add Variables
Click Browse, select the data file whose variables you want to add to the open data file, and click Open.
Click Continue.
Then click OK.
Then save the data file under a different name (not an existing file name).
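
When both files hold the same cases in the same order, adding variables places the columns of the second file beside those of the first. A small Python sketch of this, assuming both lists are already sorted by id (the function name add_variables is hypothetical, and only the first three cases of spss-test1 and spss-test4 are used):

```python
test1 = [
    {"id": 1, "edu": 16, "age": 30, "inttime": 3, "marista": 1},
    {"id": 2, "edu": 18, "age": 32, "inttime": 5, "marista": 1},
    {"id": 3, "edu": 17, "age": 37, "inttime": 4, "marista": 2},
]
test4 = [
    {"id": 1, "monsal": 12000, "mobsta": "No", "monexp": 9000},
    {"id": 2, "monsal": 26000, "mobsta": "Yes", "monexp": 16000},
    {"id": 3, "monsal": 21000, "mobsta": "Yes", "monexp": 20000},
]


def add_variables(active, other):
    """Pair cases by position and combine their variables into one case."""
    merged = []
    for left, right in zip(active, other):
        if left["id"] != right["id"]:
            raise ValueError("files are not aligned on id; sort both by id first")
        merged.append({**left, **right})
    return merged


merged = add_variables(test1, test4)
print(merged[0])
```

The id check illustrates why the handout asks for both files to be sorted by case id before merging.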

B. Secondly, adding variables from two files with different numbers of cases.

Case 1: The open file has fewer cases, and your study is interested in these cases.
Merge the files: spss-test5 & spss-test4
Keep spss-test5 open and follow the instructions given below.
Click on Data (Menu Bar)
Sort Cases
Sort by: id, Sort Order: Ascending
Click OK. [Note: Save the file before you close it.]
Click on Data (Menu Bar)
Merge Files
Add Variables
Click Browse, select the data file whose variables you want to add to the open data file, and click Open.
Click Continue.
Select: Match cases on key variables in sorted files
Select: Non-active dataset is keyed table
Move id from the box Excluded Variables into the box Key Variables.
Then click OK.

A warning appears. Click OK.

Then save the data file under a different name (not an existing file name).

Case 2: The open file has more cases, and your study is interested in the cases of the file that is not open (it has fewer cases than the open file).
Merge the files: spss-test4 & spss-test5
First open the file spss-test5 and follow the instructions given below.
Click on Data (Menu Bar)
Sort Cases
Sort by: id, Sort Order: Ascending
Click OK. Now save the file.
Now keep spss-test4 open as the active file and follow the instructions given below.
Click on Data (Menu Bar)
Merge Files
Add Variables
A window appears.
In the box An open dataset, select spss-test5.sav[DataSet2], the data file whose variables you want to add to the open data file.
Click Continue.
Select: Match cases on key variables in sorted files
Select: Active dataset is keyed table
Move id from the box Excluded Variables into the box Key Variables.
Then click OK.
A warning appears. Click OK.
Then save the data file under a different name (not an existing file name).
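
In both cases above, the larger file acts as a lookup (keyed) table: only the cases of the smaller file are kept, and each one picks up the extra variables of the case with the same id. A hedged Python sketch of that behaviour follows; the function name merge_with_keyed_table is hypothetical, and the rows are copied from the spss-test5 and spss-test4 tables.

```python
TEST5_NAMES = ("id", "ys", "ageyears", "timeint", "ms")
TEST4_NAMES = ("id", "monsal", "mobsta", "monexp")

# spss-test5 is listed in its original (unsorted) order.
test5 = [dict(zip(TEST5_NAMES, row)) for row in [
    (1, 16, 30, 3, 1), (2, 18, 32, 5, 1), (9, 16, 32, 3, 1),
    (3, 17, 37, 4, 2), (4, 15, 30, 1, 1),
]]
test4 = [dict(zip(TEST4_NAMES, row)) for row in [
    (1, 12000, "No", 9000), (2, 26000, "Yes", 16000), (3, 21000, "Yes", 20000),
    (4, 19000, "Yes", 15000), (5, 24000, "No", 15000), (6, 23000, "Yes", 16000),
    (7, 30000, "Yes", 25000), (8, 12000, "No", 6000), (9, 13000, "Yes", 11000),
]]


def merge_with_keyed_table(cases, keyed_table, key="id"):
    """Keep only `cases`, sorted by key, adding variables looked up in `keyed_table`."""
    lookup = {row[key]: row for row in keyed_table}
    merged = []
    for case in sorted(cases, key=lambda row: row[key]):
        merged.append({**case, **lookup.get(case[key], {})})
    return merged


merged = merge_with_keyed_table(test5, test4)
print([row["id"] for row in merged])   # ids of the cases that remain
print(merged[-1])                      # case 9 with its salary variables added
```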

End of Data Manipulation I !!!


Data Manipulation II:


Data Manipulation II includes
A. Splitting File
B. Case Selection
C. Selecting a Random Sample

A. Splitting File:
You may encounter situations where you need to analyze data separately for the categories of one or more categorical variables. For example, you might want to examine income or education by sex, either for comparison purposes or to organize the output by groups. The SPSS tool Split File does this.
Split File splits the data file into separate groups for analysis based on the values of one or
more grouping variables. If you select multiple grouping variables, cases are grouped by each
variable within categories of the prior variable on the Groups Based On list. For example, if
you select Sex as the first grouping variable and Religion as the second grouping variable,
cases will be grouped by Religion classification within each Sex category.
Note that you can specify up to eight grouping variables, and that cases should be sorted by the values of the grouping variables.

To practise splitting a file, let us create the following data file or open an existing data file:

Subject Gender Age Religion Education Income per day


1 1 25 1 15 100
2 1 32 2 15 200
3 1 30 3 16 150
4 2 24 1 13 125
5 2 20 2 12 120
6 2 27 3 19 125
7 1 25 1 12 175
8 1 33 2 13 200
9 2 39 3 15 120
10 2 40 1 17 125
11 1 32 2 11 135
12 2 27 3 10 130
13 1 37 3 14 100
14 1 19 2 12 150
15 2 25 1 15 120
Gender: 1=Male, 2=Female
Religion: 1=Muslim, 2=Hindu, 3=Christian

1st approach: Group Comparison

Suppose you want to compare the income of males and females. It is convenient to have the results for both sexes shown separately within a single table. To get output in that format, split the data file by following the instructions given below:

Click on Data (Menu Bar) and then Split File
Select Compare groups.
Now select the variable Gender and send it to the box Groups Based on:
Click OK

Any analysis you now carry out on this data file will give its results in the desired format.

You may also want to compare income across religions and sexes at the same time. Repeat the above steps, this time sending the variable Religion to the box Groups Based on: together with Gender, and click OK.
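
To see what a split analysis produces, the grouping can be imitated in Python. This is a minimal sketch and not what SPSS runs internally: it computes mean income per day first by Gender and then by Gender and Religion, using the 15 cases of the data file above. The helper name group_means and the column-index constants are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (subject, gender, age, religion, education, income per day), from the table above.
ROWS = [
    (1, 1, 25, 1, 15, 100), (2, 1, 32, 2, 15, 200), (3, 1, 30, 3, 16, 150),
    (4, 2, 24, 1, 13, 125), (5, 2, 20, 2, 12, 120), (6, 2, 27, 3, 19, 125),
    (7, 1, 25, 1, 12, 175), (8, 1, 33, 2, 13, 200), (9, 2, 39, 3, 15, 120),
    (10, 2, 40, 1, 17, 125), (11, 1, 32, 2, 11, 135), (12, 2, 27, 3, 10, 130),
    (13, 1, 37, 3, 14, 100), (14, 1, 19, 2, 12, 150), (15, 2, 25, 1, 15, 120),
]
GENDER, RELIGION, INCOME = 1, 3, 5  # column positions within each row


def group_means(rows, group_columns, value_column):
    """Mean of `value_column` within each combination of the grouping columns."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in group_columns)
        groups[key].append(row[value_column])
    return {key: mean(values) for key, values in sorted(groups.items())}


by_gender = group_means(ROWS, [GENDER], INCOME)
by_gender_religion = group_means(ROWS, [GENDER, RELIGION], INCOME)
print(by_gender)            # one mean per gender
print(by_gender_religion)   # one mean per religion within each gender
```

With two grouping variables the keys are ordered by Gender first and Religion second, mirroring the order of the Groups Based on list.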

2nd approach: Organizing output by groups

Suppose you want the results displayed separately, in a different table for each split-file group. For example, you might want to summarize the education variable for people of different religions in separate tables. To do this, follow the instructions given below:
Click on Data (Menu Bar) and then Split File
Select Organize output by groups
Now select the variable Religion and send it to the box Groups Based on:
Click OK.

Similarly, try other variables (say Gender).

You can also use a toolbar item to split a data file. Just click on the Split File tool and the Split File box will appear. You can then follow all the steps as before.

B. Case Selection
Sometimes you may be interested in particular cases only. For this situation, SPSS provides the Select Cases option. For instance, suppose you want to analyze data for male respondents/cases only. Then you have to select only the male cases. To do so, use the following instructions:
Click on Data (Menu Bar)
Select Cases
Select If condition is satisfied
Click on If

Select the variable Gender and send it to the expression box.

Click on the = sign on the calculator pad, click on 1 and then Continue
Select Unselected Cases are filtered
Click OK.

Now suppose you want to study only Hindu and Christian people. Select these cases by following the instructions given below:

Click on Data, Select Cases, Select If condition is satisfied
Click on If, send the variable Religion to the expression box
Click on the >= sign and then 2 on the calculator pad (Hindu=2, Christian=3)
Click Continue and then OK.
You might also want to study only Muslim females. You can select the desired cases by sending the Gender and Religion variables to the expression box of Select Cases: If and writing them as follows:
gender=2 & religion=1
Now click on Continue and then OK.

If you want to study the individuals who are either Muslim or female, select the desired cases by sending the Gender and Religion variables to the expression box of Select Cases: If and writing them as follows:
gender=2 | religion=1
Now click on Continue and then OK.
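
The four conditions used above can be checked by hand against the data file. The Python sketch below is only an illustration (the helper select_cases is hypothetical); it lists which subjects each condition keeps, using the Gender and Religion codes of the 15 cases.

```python
# (subject, gender, religion) for the 15 cases of the Split File data file.
CASES = [
    (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 1), (5, 2, 2),
    (6, 2, 3), (7, 1, 1), (8, 1, 2), (9, 2, 3), (10, 2, 1),
    (11, 1, 2), (12, 2, 3), (13, 1, 3), (14, 1, 2), (15, 2, 1),
]


def select_cases(cases, condition):
    """Return the subject numbers of the cases for which `condition` is true."""
    return [subject for subject, gender, religion in cases
            if condition(gender, religion)]


males = select_cases(CASES, lambda gender, religion: gender == 1)
hindu_or_christian = select_cases(CASES, lambda gender, religion: religion >= 2)
# "&" in the SPSS expression corresponds to `and`, "|" to `or`.
muslim_females = select_cases(CASES, lambda gender, religion: gender == 2 and religion == 1)
muslim_or_female = select_cases(CASES, lambda gender, religion: gender == 2 or religion == 1)

print(males)
print(hindu_or_christian)
print(muslim_females)
print(muslim_or_female)
```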

C. Selecting a Random Sample


You can also take a random sample of cases. The sample size may be given as an approximate percentage of all cases, or you can state the sample size exactly. Suppose you want to select a random sample of about 40% of all cases in your data file. To do this, follow the instructions below:

Click on Data (Menu Bar), then Select Cases
Select Random sample of cases
Click on Sample
Select Approximately and enter 40 in the box (% of all cases)
Click on Continue and then OK.
Now suppose you want to select a random sample of size 7 from all cases in your data file. To do this, follow the instructions below:
Click on Data (Menu Bar), then Select Cases
Select Random sample of cases
Click on Sample
Select Exactly and enter 7 in the 1st box and 15 in the 2nd box (7 cases from the first 15 cases)
Click on Continue and then OK.
You can also select cases by specifying a range of cases. For example, you may need to study some consecutive cases from anywhere in the data file (say, cases 5 to 12). To do this, follow the instructions given below:
Click on Data (Menu Bar), then Select Cases
Select Based on time or case range
Click on Range, enter 5 in the First Case box and 12 in the Last Case box
Click on Continue and then OK.
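
The difference between the three selection modes can be sketched in Python. This is an illustration under stated assumptions, not the algorithm SPSS uses: the approximate sample is modelled by keeping each case independently with probability 0.4, the seed value is arbitrary, and the case numbers 1 to 15 stand for the rows of the data file.

```python
import random

rng = random.Random(2024)    # arbitrary fixed seed so the sketch is repeatable
cases = list(range(1, 16))   # case numbers 1..15

# Approximately 40%: each case is kept independently with probability 0.4,
# so the number of selected cases varies from run to run (about 6 on average).
approx_sample = [case for case in cases if rng.random() < 0.40]

# Exactly 7 cases, drawn without replacement from the first 15 cases.
exact_sample = sorted(rng.sample(cases[:15], 7))

# Case range: the consecutive cases 5 to 12; no randomness is involved.
range_sample = [case for case in cases if 5 <= case <= 12]

print(approx_sample)
print(exact_sample)
print(range_sample)
```

The "Approximately" option therefore does not guarantee a fixed sample size, while "Exactly" does.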

Data Transformation:
Compute
The Compute command computes values for a variable based on numeric transformations of other variables. Using this command we can create new variables or replace the values of existing variables (for new variables we can also specify the variable type and label). Note that we can compute values for numeric or string (alphanumeric) variables only. We can also compute values selectively for subsets of data based on logical conditions. For computation we can use mathematical and/or logical operators, as well as over 70 built-in functions, including arithmetic functions, statistical functions and other functions.
The general form of the Compute command is:
Compute newvariable = Arithmetic or Logical expression.
The following steps are followed to compute variables:
I. From the menus choose
Transform
Compute
SPSS will show the Compute Variable dialog box.
II. Type the name of a single target variable. It can be an existing variable or a new variable.
III. Write an arithmetic or logical expression in the Numeric Expression field. To build an expression, either paste components from the variable list into the Expression field and then edit them, or type directly in the Expression field. A numeric expression can use existing variable names, arithmetic operators, constants and functions. The Calculator Pad, Variable List and Function List are also available for building it.

Calculator Pad
We can use the calculator pad to build an arithmetic or logical expression. To use the calculator pad, click its buttons with the mouse. Complex expressions can be built with the calculator pad. There are three types of operators and one function on the calculator pad.

I. Arithmetic Operators: Arithmetic operators are used to build numeric expressions. The subtraction operator also serves as the negative sign. The arithmetic operators are:
Operator   Meaning/use
+          Addition
-          Subtraction (or negative sign)
/          Division
*          Multiplication
**         Exponentiation (to the power)

II. Relational Operators: Relational operators compare elements/variables of the same type. For instance, a string variable is compared with another string variable, and a numeric variable/value with another numeric variable/value. The relational operators are:
Operator   Meaning/use
<          Less than
>          Greater than
>=         Greater than or equal
<=         Less than or equal
<> or ~=   Not equal
=          Equal

III. Logical Operators: Logical operators are used to build more complex expressions. Suppose we want the people whose age is greater than or equal to 25 and less than or equal to 60; then we can write: Age >= 25 AND Age <= 60. The AND used in this expression is a logical operator. The logical operators are as follows:

Operator   Use
AND, &     True when both conditions are true
OR, |      True when at least one of the conditions is true
NOT        True when the condition is not satisfied

Functions
There are more than 70 built-in functions, which include:
• Arithmetic Functions
• Statistical Functions
• Logical Functions
• Missing Value Functions, etc.

1) Arithmetic Functions: Some of the arithmetic functions are discussed below:

ABS (numexpr): Returns the absolute value of numexpr, where numexpr stands for a numeric expression. For example, if the value of the variable Scale is -4.7, then ABS (Scale) gives 4.7 and ABS (Scale) + 5 gives 9.7.
EXP (numexpr): Returns e raised to the power numexpr, where e is the base of the natural logarithms.
SQRT (numexpr): Returns the positive square root of numexpr, which must be numeric and cannot be negative.
LN (numexpr): Returns the base-e (natural) logarithm of numexpr, which must be numeric and greater than 0.
LG10 (numexpr): Returns the base-10 logarithm of numexpr, which must be numeric and greater than 0.

2) Statistical Functions: Some of the statistical functions are discussed below:

SUM (numexpr, numexpr, ...): Returns the sum of its arguments that have valid values. This function requires two or more arguments, which must be numeric.
MEAN (numexpr, numexpr, ...): Returns the arithmetic mean of its arguments that have valid values. This function requires two or more arguments, which must be numeric.
SD (numexpr, numexpr, ...): Returns the standard deviation of its arguments that have valid values. This function requires two or more arguments, which must be numeric.
VARIANCE (numexpr, numexpr, ...): Returns the variance of its arguments that have valid values. This function requires two or more arguments, which must be numeric.
MAX (value, value, ...): Returns the maximum value of its arguments that have valid values. This function requires two or more arguments, which must be numeric.
MIN (value, value, ...): Returns the minimum value of its arguments that have valid values. This function requires two or more arguments, which must be numeric.
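
The phrase "arguments that have valid values" means that missing values are skipped rather than spoiling the result. The following Python sketch imitates that behaviour for SUM, MEAN, MAX and MIN; it is an illustration only, with hypothetical function names, and it assumes that a missing value is represented by None and that the result is missing (None) when no argument is valid.

```python
def valid(*args):
    """Keep only the arguments that are not missing (None stands for missing)."""
    return [a for a in args if a is not None]


def sum_valid(*args):
    values = valid(*args)
    return sum(values) if values else None


def mean_valid(*args):
    values = valid(*args)
    return sum(values) / len(values) if values else None


def max_valid(*args):
    values = valid(*args)
    return max(values) if values else None


def min_valid(*args):
    values = valid(*args)
    return min(values) if values else None


# A case with a missing middle value: only 20 and 23 are used.
print(sum_valid(20, None, 23))
print(mean_valid(20, None, 23))
print(max_valid(20, None, 23), min_valid(20, None, 23))
```

By contrast, an arithmetic expression such as a + b + c has no valid result as soon as one of its terms is missing, which is the practical reason for preferring these functions when data are incomplete.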

3) Random Number Functions:

NORMAL (stddev): Generates a random number from a normal distribution with mean 0 and the specified standard deviation stddev.

Illustrative Examples

Example 1:
Suppose we want to compute the total marks obtained by the competitors in the written test for a job in a firm, from the following data:

English   Mathematics   General Knowledge
20        45            23
18        40            22
16        35            19
21        41            21
15        38            20

To get the total marks, we use the following formula:

Total Marks = Marks in English + Marks in Mathematics + Marks in General Knowledge.
We will denote the Total Marks as Tmarks. To compute it, we follow these steps:
(a) Click Transform and then Compute.
(b) In the Compute Variable dialog box, type Tmarks in the Target Variable box at the upper left corner of the dialog box.
(c) Using either the calculator pad or the keyboard, write English + Mathematics + General Knowledge in the Numeric Expression box.
(d) Click OK.

A new variable Tmarks is then created automatically in the rightmost column of the data sheet. The data sheet now looks like:

English   Mathematics   General Knowledge   Tmarks
20        45            23                  88
18        40            22                  80
16        35            19                  70
21        41            21                  83
15        38            20                  73
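
The same row-by-row computation can be written as a short Python sketch, which may help to see that Compute evaluates the expression once for every case. The list name marks is hypothetical; the figures are those of the table above.

```python
# (English, Mathematics, General Knowledge) marks for the five competitors.
marks = [
    (20, 45, 23),
    (18, 40, 22),
    (16, 35, 19),
    (21, 41, 21),
    (15, 38, 20),
]

# Tmarks = English + Mathematics + General Knowledge, evaluated case by case.
tmarks = [english + mathematics + general_knowledge
          for english, mathematics, general_knowledge in marks]
print(tmarks)
```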

Example 2:
Suppose we want to compute the average marks obtained in 3 subjects on a test from the following data:
Mathematics   Statistics   Economics
25            23           18
20            22           20
17            19           21
21            21           13
18            20           15
To get the average marks, we use the following formula:
Average Marks = (Marks in Mathematics + Marks in Statistics + Marks in Economics)/3.

We will denote the Average Marks as avmarks. To compute it, we follow these steps:
(a) Click Transform and then Compute.
(b) In the Compute Variable dialog box, type avmarks in the Target Variable box at the upper left corner of the dialog box.
(c) Using either the calculator pad or the keyboard, write (Mathematics + Statistics + Economics)/3 in the Numeric Expression box.
(d) Click OK.

Example 3:
Suppose we want to compute the yearly increment of the employees on the basis of their salary, from the following data, using the formula:
Increment = (10% of the salary) + 1000

Salary
10000
15000
12000
13000
14000
17000
15500
16500
17500
To do this, we follow these steps:
(a) Click Transform and then Compute.
(b) In the Compute Variable dialog box, type increm in the Target Variable box at the upper left corner of the dialog box.
(c) Using either the calculator pad or the keyboard, write (Salary*0.10) + 1000 in the Numeric Expression box.
(d) Click OK.
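
As a quick check of the formula, the increments can be worked out in a few lines of Python. This is only a sketch; the 10% is written as salary * 10 / 100 (an equivalent form chosen here so that the results are exact floating-point values).

```python
salary = [10000, 15000, 12000, 13000, 14000, 17000, 15500, 16500, 17500]

# Increment = (10% of the salary) + 1000, evaluated for every case.
increm = [s * 10 / 100 + 1000 for s in salary]

for s, inc in zip(salary, increm):
    print(s, inc)
```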

Data Management
CONDITIONAL TRANSFORMATIONS

Compute variable: If cases

Conditional transformation using the If Cases dialog box allows us to apply data transformations to selected subsets of cases. A conditional expression returns a value of true, false, or missing for each case.
• If the result of a conditional expression is true, the transformation is applied to the case.
• If the result of a conditional expression is false or missing, the transformation is not applied to the case.
• Most conditional expressions use one or more of the relational and logical operators (discussed earlier).
To specify the conditional expression, click If in the Compute Variable dialog box; SPSS will then show the If Cases dialog box. Then select the option Include if case satisfies condition.

Illustrative Examples
Example 4:
Suppose we want to find the deduction from salary for the transport facility for the employees of a firm, using the following salary data. A deduction of 5% of the salary applies if the salary is greater than 12,000 taka.
Salary
10000
15000
12000
13000
14000
17000
15500
16500
17500
To perform this computation, we follow these steps:
(a) From the menus choose
Transform
Compute...
The Compute Variable dialog box will then open.

(b) In the Compute Variable dialog box, type deduct in the Target Variable box at the upper left corner of the dialog box.

(c) Using either the calculator pad or the keyboard, write

salary * 0.05

in the Numeric Expression box.

(d) Click the If button below the calculator pad. This opens the Compute Variable: If Cases dialog box.

(e) Select Include if case satisfies condition at the top of the dialog box.

(f) Using either the calculator pad or the keyboard, write

salary > 12000

in the Compute Variable: If Cases dialog box.

(This condition specifies that the new variable deduct will be computed only for cases/records whose value of the variable salary is greater than 12000. For the cases that do not satisfy this condition, the new variable deduct will be equal to the system-missing value.)

(g) Click Continue to return to the Compute Variable dialog box.
(h) Click OK.

A new variable deduct has now been created automatically in the rightmost column of the data sheet. The data sheet now looks like this:

Salary deduct
10000 .
15000 750
12000 .
13000 650
14000 700
17000 850
15500 775
16500 825
17500 875

Next, fill in the missing values by setting deduct to 0 for the cases that did not satisfy the condition.

To perform this computation, we follow these steps:
(a) From the menus choose
Transform
Compute...
The Compute Variable dialog box will then open.

(b) In the Compute Variable dialog box, type deduct in the Target Variable box at the upper left corner of the dialog box.
(c) Using either the calculator pad or the keyboard, write 0 in the Numeric Expression box.
(d) Click the If button below the calculator pad. This opens the Compute Variable: If Cases dialog box.
(e) Select Include if case satisfies condition at the top of the dialog box.
(f) Using either the calculator pad or the keyboard, write
salary <= 12000
in the Compute Variable: If Cases dialog box.
(g) Click Continue to return to the Compute Variable dialog box.
(h) Click OK.
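
The two conditional computations can be traced in Python to confirm the values in the table. This is a minimal sketch, assuming None stands for the system-missing value; the name deduct_first is hypothetical, and 5% is written as salary * 5 / 100.

```python
salary = [10000, 15000, 12000, 13000, 14000, 17000, 15500, 16500, 17500]

# First computation: 5% of salary, only where salary > 12000.
# Cases that fail the condition are left missing (None).
deduct_first = [s * 5 / 100 if s > 12000 else None for s in salary]

# Second computation: deduct = 0, only where salary <= 12000.
# Cases that fail this condition keep the value they already have.
deduct = [0 if s <= 12000 else d for s, d in zip(salary, deduct_first)]

print(deduct_first)
print(deduct)
```

Note that a salary of exactly 12000 does not satisfy "greater than 12000", so it receives 0 in the second step.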

Example 5:
Suppose we want to compute the yearly increment of the employees of an institution who satisfy the following condition, from the following data:
Condition: Increment = 15% of the salary if Job Category = 3 and Experience is greater than or equal to 5 years.
salary   jobcat   exper
10000    1        5
15000    2        6
17000    3        7
21000    3        8
18000    3        5
To find the increment, we follow these steps:

(a) From the menus choose
Transform
Compute...
The Compute Variable dialog box will then open.

(b) In the Compute Variable dialog box, type increm in the Target Variable box at the upper left corner of the dialog box.
(c) Using either the calculator pad or the keyboard, write
salary * 0.15
in the Numeric Expression box.
(d) Click the If button below the calculator pad. This opens the Compute Variable: If Cases dialog box.
(e) Select Include if case satisfies condition at the top of the dialog box.
(f) Using either the calculator pad or the keyboard, write
jobcat = 3 & exper >= 5
in the Compute Variable: If Cases dialog box.
(g) Click Continue to return to the Compute Variable dialog box.
(h) Click OK.
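
A condition that combines two tests with & can be traced the same way. In the Python sketch below (an illustration only, with None standing for the system-missing value), the increment is computed for the cases with jobcat = 3 and exper >= 5 and left missing for the others; 15% is written as salary * 15 / 100.

```python
# (salary, jobcat, exper) for the five employees in the table above.
employees = [
    (10000, 1, 5),
    (15000, 2, 6),
    (17000, 3, 7),
    (21000, 3, 8),
    (18000, 3, 5),
]

# increm = 15% of salary when both parts of the condition hold, otherwise missing.
increm = [
    salary * 15 / 100 if jobcat == 3 and exper >= 5 else None
    for salary, jobcat, exper in employees
]
print(increm)
```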
Recoding Values
We can modify data values by recoding them. This is particularly useful for collapsing or combining categories. We can recode the values within existing variables, or we can create new variables based on the recoded values of existing variables. That is, two types of recoding are possible:

1. Recode into same variable
2. Recode into different variable

1. Recode into same variable

Recode into Same Variables reassigns the values of existing variables or collapses ranges of existing values into new values. For example, we can collapse Marks into Marks Range categories. We can recode numeric and string variables, but we cannot recode numeric and string variables together. If we select multiple variables, they must all be of the same type.

To recode the values of a variable into the same variable, we follow these steps:
(a) From the menus choose:
Transform
Recode
Into Same Variables

(b) Select the variable which we want to recode. (If we select multiple variables, they must all be of the same type, numeric or string.)

(c) Click Old and New Values.

(d) We shall see the Recode into Same Variables: Old and New Values dialog box.

We can define the values to recode in this dialog box. All value specifications must be of the same data type (numeric or string) as the variables selected in the main dialog box. The value to be recoded is entered as the Old Value; after entering its New Value, we click the Add button. We can recode more than one Old Value into one New Value, but we cannot recode one Old Value into more than one New Value.


Old Value: The values to be recoded. We can recode single values or ranges of values.
New Value: The single value into which each old value or range of values is recoded.
Old --> New: The list of specifications that will be used to recode the variable(s). We can add, change and remove specifications from the list.

Illustrative Examples
Example 1:
Suppose we want to define ‘Educational Status’ on the basis of ‘year of schooling’ from the
following data using the following specifications:

Year of schooling    New value (code)    Meaning (value label)

0                    1                   Illiterate
1-5                  2                   Primary
6-10                 3                   Secondary
11-12                4                   Higher Secondary
13-16                5                   Graduate
17                   6                   Post Graduate
18+                  7                   Higher

yearsch
15
7
14
8
13
0
18
6
20
11
10
5
12
16

Now, to recode these data into new values, we follow these steps:
(a) From the menus choose:
Transform
Recode
Into Same Variables...
This will open the Recode into Same Variables dialog box.

(b) Select yearsch from the variable list (left window) and then left-click the arrow button on
the vertical bar of the dialog box.
(c) Click Old and New Values.
(d) We shall see the Recode into Same Variables: Old and New Values dialog box.

(e) Using the Old Value and New Value options, enter each recoding specification from the
table above and click Add after each one.
(f) Click Continue to return to the Recode into Same Variables dialog box.
(g) Click OK.

We shall see that the values of yearsch have been replaced by the new codes in the existing
variable.
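
The effect of Example 1 can also be sketched outside SPSS. The Python sketch below is an illustration only and is not SPSS syntax; the names RULES and recode_value are hypothetical. It assumes that a value matching no rule is left unchanged, which is how recoding into the same variable treats unspecified values.

```python
# Recode rules for Example 1 as (lowest, highest, new code).
# None as the highest value means the range is open-ended ("18+").
RULES = [
    (0, 0, 1),       # Illiterate
    (1, 5, 2),       # Primary
    (6, 10, 3),      # Secondary
    (11, 12, 4),     # Higher Secondary
    (13, 16, 5),     # Graduate
    (17, 17, 6),     # Post Graduate
    (18, None, 7),   # Higher
]


def recode_value(value, rules):
    """Return the new code for value, or value itself if no rule matches."""
    for low, high, new_code in rules:
        if value >= low and (high is None or value <= high):
            return new_code
    return value


yearsch = [15, 7, 14, 8, 13, 0, 18, 6, 20, 11, 10, 5, 12, 16]

# "Into same variable": each entry of the existing list is overwritten.
for i, years in enumerate(yearsch):
    yearsch[i] = recode_value(years, RULES)

print(yearsch)  # [5, 3, 5, 3, 5, 1, 7, 3, 7, 4, 3, 2, 4, 5]
```

After the loop the original years of schooling are no longer available, which is the main practical difference from recoding into a different variable.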

Example 2:
Suppose we want to define the ‘Social Status’ on the basis of Income Variable given below
using the following specifications:

Income (monthly)     New value (code)    Meaning (value label)

Less than 3000       1                   Lower Class
3001-10000           2                   Lower Middle Class
10001-25000          3                   Middle Class
25001-100000         4                   Higher Middle Class
100001+              5                   Higher Class

income
20000
1800
35000
56000
3200
17000
78000
22000
900
7000
32000
125000
45000
245000

Now, to recode these data into new values, we follow these steps:

(a) From the menus choose:

Transform
Recode
Into Same Variables...

This will open the Recode into Same Variables dialog box.

(b) Select income from the variable list (left window) and then left-click the arrow button on
the vertical bar of the dialog box.

(c) Click Old and New Values.

(d) We shall see the Recode into Same Variables: Old and New Values dialog box.

(e) Using the Old Value and New Value options, enter each recoding specification from the
table above and click Add after each one.

(f) Click Continue to return to the Recode into Same Variables dialog box.

(g) Click OK.

We shall see that the values of income have been replaced by the new codes in the existing
variable.
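
As a rough illustration of Example 2 outside SPSS, the Python sketch below applies the income specification in place. The function name social_status is hypothetical. Note that the specification as written ("less than 3000", then "3001-10000") does not cover an income of exactly 3000; the sketch assumes such a value is left unchanged, as an unspecified value would be when recoding into the same variable.

```python
def social_status(income):
    """Return the Example 2 code for a monthly income.

    A value that falls in none of the listed ranges (for example
    exactly 3000) is returned unchanged.
    """
    if income < 3000:
        return 1  # Lower Class
    if 3001 <= income <= 10000:
        return 2  # Lower Middle Class
    if 10001 <= income <= 25000:
        return 3  # Middle Class
    if 25001 <= income <= 100000:
        return 4  # Higher Middle Class
    if income >= 100001:
        return 5  # Higher Class
    return income  # no matching range: keep the old value


income = [20000, 1800, 35000, 56000, 3200, 17000, 78000,
          22000, 900, 7000, 32000, 125000, 45000, 245000]

# Overwrite the existing list, as "Into Same Variables" overwrites the column.
income[:] = [social_status(value) for value in income]

print(income)  # [3, 1, 4, 4, 2, 3, 4, 3, 1, 2, 4, 5, 4, 5]
```

None of the fourteen incomes equals 3000, so every case receives a code from 1 to 5 here.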

2. Recode into Different Variables

Recode into Different Variables reassigns the values of existing variables or collapses ranges
of existing values into new values for a new variable; the original variable is left unchanged.
For example, we can collapse Marks into a new variable containing Marks-Range categories.
We can recode numeric and string variables, but we cannot recode numeric and string variables
together. If we select multiple variables, they must all be of the same type. We can also
recode numeric variables into string variables and string variables into numeric variables.

To recode the values of a variable into a different variable, we follow these steps:

(a) From the menus choose:

Transform
Recode
Into Different Variables

(b) Select the variable which we want to recode. (If we select multiple variables, they must all
be of the same type, numeric or string).

(c) Enter an output (new) variable name for each new variable and click Change.
(d) Click Old and New Values and specify how to recode values.
• We can define the values to recode in the Old and New Values dialog box. All value
specifications must be of the same data type (numeric or string) as the variables selected
in the main dialog box.
• We can recode more than one Old Value into one New Value, but we cannot recode
one Old Value into more than one New Value.
• If we want to recode a numeric variable into a string variable, we must also select
Output variables are strings.
• Any old values that are not specified are not included in the new variable, and cases
with those values will be assigned the system-missing value for the new variable. To
include all old values that do not require recoding, select All Other Values for the old
value and Copy old value(s) for the new value.
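
The treatment of unspecified old values can be illustrated with a short Python sketch. This is not SPSS syntax: the function recode_into_new, the flag copy_other, and the sample codes are hypothetical, and None merely stands in for the system-missing value.

```python
SYSMIS = None  # stand-in for the system-missing value (assumption of this sketch)


def recode_into_new(values, mapping, copy_other=False):
    """Return a new list of recoded values; the input list is not modified."""
    new_values = []
    for value in values:
        if value in mapping:
            new_values.append(mapping[value])
        elif copy_other:
            # Plays the role of "All Other Values" -> "Copy old value(s)".
            new_values.append(value)
        else:
            # Old value not specified: missing in the new variable.
            new_values.append(SYSMIS)
    return new_values


answer = [1, 2, 8, 9, 1]   # hypothetical codes; 8 and 9 both mean "no answer"
mapping = {8: 0, 9: 0}     # two old values recoded into one new value

without_copy = recode_into_new(answer, mapping)
with_copy = recode_into_new(answer, mapping, copy_other=True)

print(without_copy)  # [None, None, 0, 0, None]
print(with_copy)     # [1, 2, 0, 0, 1]
```

Because a dictionary key can hold only one value, the mapping cannot send one old value to two new values, which mirrors the restriction stated in the list above.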

Illustrative Examples

Example 3:
Suppose we want to define the ‘Social Status’ on the basis of Income Variable given below
using the following specifications:

Income (monthly)     New value (code)    Meaning (value label)

Less than 3000       1                   Lower Class
3001-10000           2                   Lower Middle Class
10001-25000          3                   Middle Class
25001-100000         4                   Higher Middle Class
100001+              5                   Higher Class

income
20000
1800
35000
56000
3200
17000
78000
22000
900
7000
32000
125000
45000
245000

Now, to recode these data into new values, we follow these steps:
(a) From the menus choose:
Transform
Recode
Into Different Variables...
This will open the Recode into Different Variables dialog box.
(b) Select income from the variable list (left window) and then left-click the arrow button on
the vertical bar of the dialog box.
(c) We shall see income --> ? in the Numeric Variable -> Output Variable box.
(d) Type the name of the new variable, i.e. nincome, in the Output Variable Name box.
(e) Label the new variable using the Output Variable Label box.
(f) Click Change.
(g) Click Old and New Values.
(h) We shall see the Recode into Different Variables: Old and New Values dialog box.
(i) Using the Old Value and New Value options, enter each recoding specification from the
table above and click Add after each one.
(j) Click Continue to return to the Recode into Different Variables dialog box.
(k) Click OK.
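
The outcome of Example 3 can be sketched in Python as follows. This is an illustration under stated assumptions, not SPSS syntax: INCOME_RULES and code_for are hypothetical names, incomes are assumed to be whole numbers (so "less than 3000" is written as "up to 2999"), and None stands in for the system-missing value given to any income the specification does not cover.

```python
# (lowest, highest, new code); None means the bound is open.
INCOME_RULES = [
    (None, 2999, 1),      # Lower Class (less than 3000, whole numbers assumed)
    (3001, 10000, 2),     # Lower Middle Class
    (10001, 25000, 3),    # Middle Class
    (25001, 100000, 4),   # Higher Middle Class
    (100001, None, 5),    # Higher Class
]


def code_for(value, rules):
    """Return the code of the first range containing value, else None."""
    for low, high, code in rules:
        if (low is None or value >= low) and (high is None or value <= high):
            return code
    return None  # unspecified value: missing in the new variable


rows = [{"income": value} for value in
        [20000, 1800, 35000, 56000, 3200, 17000, 78000,
         22000, 900, 7000, 32000, 125000, 45000, 245000]]

# "Into different variable": add nincome and leave income untouched.
for row in rows:
    row["nincome"] = code_for(row["income"], INCOME_RULES)

for row in rows:
    print(row["income"], row["nincome"])
```

Each case now carries both the original income and its class code, so the recoding can be checked or redone later with different ranges.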

Example 4:
Suppose we want to define ‘Educational Status’ on the basis of ‘year of schooling’ from the
following data using the following specifications:
Year of schooling    New value (code)    Meaning (value label)
0                    1                   Illiterate
1-5                  2                   Primary
6-10                 3                   Secondary
11-12                4                   Higher Secondary
13-16                5                   Graduate
17                   6                   Post Graduate
18+                  7                   Higher

yearsch
15
7
14
8
13
0
18
6
20
11
10
5
12
16

Now, to recode these data into new values, we follow these steps:

(a) From the menus choose:

Transform
Recode
Into Different Variables...

This will open the Recode into Different Variables dialog box.

(b) Select yearsch from the variable list (left window) and then left-click the arrow button on
the vertical bar of the dialog box.

(c) We shall see yearsch --> ? in the Numeric Variable -> Output Variable box.

(d) Type the name of the new variable, i.e. scstatus, in the Output Variable Name box.

(e) Label the new variable using the Output Variable Label box.

(f) Click Change.

(g) Click Old and New Values.

(h) We shall see the Recode into Different Variables: Old and New Values dialog box.

(i) Using the Old Value and New Value options, enter each recoding specification from the
table above and click Add after each one.

(j) Click Continue to return to the Recode into Different Variables dialog box.

(k) Click OK.

We will see that yearsch has been recoded into the new variable scstatus, while yearsch itself
remains unchanged.
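
As a final illustration, the Python sketch below builds scstatus from yearsch, attaches the value labels from the table, and counts the cases in each category. It is a sketch under stated assumptions rather than SPSS syntax: the names UPPER_BOUNDS and VALUE_LABELS are hypothetical, and the lookup assumes years of schooling are non-negative whole numbers.

```python
from bisect import bisect_left
from collections import Counter

# Highest year of schooling belonging to codes 1 to 6; anything above 17 is code 7.
UPPER_BOUNDS = [0, 5, 10, 12, 16, 17]

VALUE_LABELS = {
    1: "Illiterate",
    2: "Primary",
    3: "Secondary",
    4: "Higher Secondary",
    5: "Graduate",
    6: "Post Graduate",
    7: "Higher",
}

yearsch = [15, 7, 14, 8, 13, 0, 18, 6, 20, 11, 10, 5, 12, 16]

# bisect_left finds the first upper bound that is >= years;
# its position plus one is the category code.
scstatus = [bisect_left(UPPER_BOUNDS, years) + 1 for years in yearsch]

counts = Counter(scstatus)
for code, label in VALUE_LABELS.items():
    print(code, label, counts[code])
```

With these fourteen cases the counts are 1, 1, 4, 2, 4, 0 and 2 for codes 1 to 7, and yearsch still holds the original years of schooling.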
