SAS - Statistical Analysis System

Domain: Retail
Name of the author: Shalini Balasubramani ([email protected])
Date created: 04/28/2009

AGENDA
• Statistical Analysis System
• Components of SAS
• DATA and PROC step flow
• DATA step
• Input statement
• Output statement
• PROC step

18 September 2009

SAS – Statistical Analysis System
• It is a powerful system for statistical analysis and data manipulation
• It offers extensive support for spreadsheet-style and graphical analysis
• It includes a complete programming language as well as modules for
  - Econometric and time series analysis
  - Project management, engineering and statistical research
  - Linear programming
  - Operations research
• It provides multidimensional data analysis (OLAP, Online Analytical Processing), query and reporting, EIS (Executive Information System), data mining and data visualization

Components of SAS
• DATA and PROC steps are the building blocks of a SAS program
• A typical program starts with a DATA step, either alone or followed by one or more PROC steps
• The DATA step creates data sets and passes them to the PROC step for further processing
• The DATA step holds the information about the variables declared in the data set

DATA and PROC step flow
RAW DATA (input data)
→ DATA step (variable declaration)
→ SAS DATASET
→ PROC step (data manipulation as per the function)
→ REPORTS (output data)
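The flow above can be sketched as a two-step program. This is only an illustrative sketch: the data set name `work.items` and the variables `item` and `qty` are hypothetical and do not come from the original slides.

```sas
/* RAW DATA -> DATA step -> SAS data set */
DATA work.items;          /* hypothetical data set name */
   INPUT item $ qty;      /* variable declaration: item is character, qty is numeric */
   CARDS;
Soap 10
Rice 4
;
RUN;

/* SAS data set -> PROC step -> report */
PROC PRINT DATA=work.items;
RUN;
```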

SAS – DATA step
• The step begins with a DATA statement
• SAS data sets are produced by the DATA step
• The DATA step holds the information about the declared variables, including each variable's name, type (character or numeric), length (storage size), and position (starting position) within the data set
• It passes the data to the PROC step for further manipulation
• It reads and modifies the data
Syntax: DATA data-set-name;
E.g.: DATA data1;

Instream data
• Instream data are read directly within the DATA step
• The data lines are supplied in free format inside the DATA step, after the CARDS statement
Syntax:
DATA dataset-name;
INPUT [variable] [format];
CARDS;
value[1-n]
;
RUN;
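A minimal sketch of instream data, assuming a hypothetical retail data set named `sales` with made-up variables and values. In SAS, the data lines follow CARDS and are ended by a line containing only a semicolon.

```sas
/* Hypothetical data set and values, for illustration only */
DATA sales;
   INPUT store $ item $ qty price;   /* $ marks a character variable */
   CARDS;
S01 Soap 10 2.50
S02 Rice 4 12.00
S01 Milk 6 1.75
;
RUN;
```

The values are separated by blanks, which is what list (free-format) input expects.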

INSTREAM DATA – INPUT CARD (E.g.)

INSTREAM DATA – SAS LOG

Reading data from external file
• The INPUT statement declares the variables to read from the file, with their formats and lengths
• The INFILE statement identifies the external file from which the data are read
Syntax:
DATA dataset-name;
INFILE 'external-file';
INPUT [variable] [format];
RUN;
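A minimal sketch of reading an external file. The file name `sales.txt` and its layout (blank-separated store, item, quantity and price on each line) are assumptions made for illustration.

```sas
/* Assumption: 'sales.txt' is a hypothetical blank-delimited text file */
DATA sales;
   INFILE 'sales.txt';               /* names the external file to read */
   INPUT store $ item $ qty price;   /* declares the variables to read from each line */
RUN;
```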

SAS – DATA step (E.g.)

Output statement
• The PUT statement writes data either to an external file or to the SAS log
• When no external file is specified (with a FILE statement), PUT writes to the SAS log by default
Syntax: PUT variable-name format.;
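A minimal sketch of PUT writing to an external file. The data set, the variables and the output path `sales_out.txt` are hypothetical. `DATA _NULL_` runs the step without creating a new data set, and the FILE statement directs PUT to the file; without it, the same lines would go to the SAS log.

```sas
/* Hypothetical input data */
DATA sales;
   INPUT store $ qty price;
   CARDS;
S01 10 2.50
S02 4 12.00
;
RUN;

DATA _NULL_;
   SET sales;
   FILE 'sales_out.txt';             /* hypothetical output path */
   PUT store $8. qty 5. price 8.2;   /* each variable followed by its format */
RUN;
```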

OUTPUT DATA – EXTERNAL FILE (E.g.)

SAS PROC step
• The PROC step receives the data set produced by the DATA step
• It processes the received data according to the procedure's function
Syntax:
PROC PRINT DATA=data-set;
[TITLE 'title-text';]
RUN;
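A minimal sketch of a PROC step that lists a data set. The data set `sales` and the title text are hypothetical. Note that DATA= takes the data set name without quotes.

```sas
/* Hypothetical input data */
DATA sales;
   INPUT store $ qty price;
   CARDS;
S01 10 2.50
S02 4 12.00
;
RUN;

PROC PRINT DATA=sales;
   TITLE 'Retail sales listing';   /* optional report title */
RUN;
```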

PROC PRINT (E.g.)

PROC – SORT
• PROC SORT sorts the data in either ascending or descending order
• It sorts the data set using the variables named in the BY statement as keys
• Ascending order is the default
Syntax:
PROC SORT DATA=input-data-set OUT=output-data-set [options];
BY [DESCENDING] key-variable;
RUN;
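A minimal sketch of PROC SORT, using hypothetical data set and variable names. In SAS, OUT= is an option on the PROC SORT statement itself, and the DESCENDING keyword goes before the BY variable it applies to.

```sas
/* Hypothetical input data */
DATA sales;
   INPUT store $ qty;
   CARDS;
S02 4
S01 10
S03 6
;
RUN;

/* Default: ascending order of store; the sorted copy goes to sales_by_store */
PROC SORT DATA=sales OUT=sales_by_store;
   BY store;
RUN;

/* Descending order of qty */
PROC SORT DATA=sales OUT=sales_by_qty;
   BY DESCENDING qty;
RUN;
```

Leaving out OUT= sorts the input data set in place.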

PROC – SORT (E.g.)

PROC MEANS
• The MEANS procedure produces simple descriptive statistics for numeric variables
Syntax:
PROC MEANS DATA=FILE1;
VAR variable(1-n);
RUN;
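A minimal sketch of PROC MEANS on hypothetical data. The VAR statement lists the numeric variables to analyse. With no statistics requested, the procedure reports its defaults (N, mean, standard deviation, minimum and maximum).

```sas
/* Hypothetical input data */
DATA sales;
   INPUT store $ qty price;
   CARDS;
S01 10 2.50
S02 4 12.00
S01 6 1.75
;
RUN;

PROC MEANS DATA=sales;
   VAR qty price;   /* numeric analysis variables */
RUN;
```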

PROC MEANS (E.g.)

MEANS – SAS LOG

PROC FREQ
• The FREQ procedure counts how often each value of the chosen key variable occurs in the SAS data set
Syntax:
PROC FREQ DATA=dataset-name;
TABLES variable;
RUN;
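A minimal sketch of PROC FREQ on hypothetical data, counting how many observations fall under each value of `store`.

```sas
/* Hypothetical input data */
DATA sales;
   INPUT store $ qty;
   CARDS;
S01 10
S02 4
S01 6
;
RUN;

PROC FREQ DATA=sales;
   TABLES store;   /* one-way frequency table for store */
RUN;
```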

PROC – FREQ (E.g.)

FREQ – SAS LOG

MERGE statement
• It combines SAS data sets and matches their observations on an identifier (the BY variable); the input data sets must be sorted by that variable
Syntax:
DATA data-set;
MERGE data-set1 data-set2;
BY key-variable;
RUN;
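A minimal sketch of a match-merge, using hypothetical data sets `items` and `stock` that share the identifier `item_id`. Matching on an identifier requires a BY statement, and both inputs are sorted by that variable first.

```sas
/* Hypothetical input data sets sharing the key item_id */
DATA items;
   INPUT item_id name $;
   CARDS;
2 Rice
1 Soap
;
RUN;

DATA stock;
   INPUT item_id qty;
   CARDS;
1 10
2 4
;
RUN;

/* Both inputs must be in key order before a BY merge */
PROC SORT DATA=items; BY item_id; RUN;
PROC SORT DATA=stock; BY item_id; RUN;

DATA item_stock;
   MERGE items stock;
   BY item_id;       /* match observations on the identifier */
RUN;
```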

MERGE (E.g.)

MERGE - SAS LOG

MERGE statement (Contd.)

Input records:
Dataset A: Record 1 (key value "A"), Record 2 (key value "B"), Record 3 (key value "C"), Record 4 (key value "C")
Dataset B: Record 1 (key value "A"), Record 2 (key value "B"), Record 3 (key value "B"), Record 4 (key value "C")

Merged records:
DATASET A                 | DATASET B                 | MERGE TYPE
Record 1, key value "A"   | Record 1, key value "A"   | 1 – 1 Merge
Record 2, key value "B"   | Record 2, key value "B"   | 1 – Many Merge
Record 2, key value "B"   | Record 3, key value "B"   | 1 – Many Merge
Record 3, key value "C"   | Record 4, key value "C"   | Many – 1 Merge
Record 4, key value "C"   | Record 4, key value "C"   | Many – 1 Merge
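The pairings above can be explored with a small sketch. The variable names `key`, `rec_a` and `rec_b` are hypothetical; they simply record which input record each merged observation came from. Under SAS match-merge rules, values from the data set with fewer observations in a BY group are retained for the rest of that group, which should yield the five pairings listed above.

```sas
/* Dataset A: keys A, B, C, C (rec_a is the record number) */
DATA a;
   INPUT key $ rec_a;
   CARDS;
A 1
B 2
C 3
C 4
;
RUN;

/* Dataset B: keys A, B, B, C (rec_b is the record number) */
DATA b;
   INPUT key $ rec_b;
   CARDS;
A 1
B 2
B 3
C 4
;
RUN;

/* Both inputs are already in key order, so no sort is needed here */
DATA merged;
   MERGE a b;
   BY key;
RUN;

PROC PRINT DATA=merged;
RUN;
```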

UPDATE statement
• The UPDATE statement performs a modified version of a horizontal merge, in which values on the original (master) records are overlaid with new information from a transaction data set
• A missing value in the transaction data set does not overlay the corresponding value in the master data set, so setting a transaction value to missing leaves the master value unchanged
Syntax:
DATA data-set;
UPDATE master-data-set transaction-data-set;
BY key-variable;
RUN;
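A minimal sketch of UPDATE with hypothetical master and transaction data sets. A period in the transaction data stands for a missing value, so the matching master value is kept. In SAS, UPDATE requires a BY statement, and the master data set should have one observation per BY value.

```sas
/* Hypothetical master data */
DATA master;
   INPUT item_id price qty;
   CARDS;
1 2.50 10
2 12.00 4
;
RUN;

/* Hypothetical transactions: "." means "leave the master value as it is" */
DATA trans;
   INPUT item_id price qty;
   CARDS;
1 2.75 .
2 . 6
;
RUN;

DATA master_new;
   UPDATE master trans;
   BY item_id;
RUN;
```

Here item 1 takes the new price and keeps its quantity, while item 2 keeps its price and takes the new quantity.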

UPDATE (E.g.)

UPDATE – SAS LOG

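MODIFY statement
• It extends the capabilities of the DATA step, enabling you to manipulate a SAS data set in place without creating an additional copy
Syntax:
DATA data-set;
MODIFY data-set [transaction-data-set];
[BY key-variable;]
RUN;
The data set named in the DATA statement is the one being modified; the BY statement applies when a transaction data set is supplied.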
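A minimal sketch of MODIFY changing a data set in place. The data set `prices` and the 10% increase are hypothetical. The same data set name appears in both the DATA and MODIFY statements, and each observation is rewritten in place at the end of the step iteration.

```sas
/* Hypothetical data set to be modified */
DATA prices;
   INPUT item $ price;
   CARDS;
Soap 2.50
Rice 12.00
;
RUN;

/* Raise every price by 10% without creating a second copy */
DATA prices;
   MODIFY prices;
   price = price * 1.10;
RUN;
```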

MODIFY (E.g.)

MODIFY – SAS LOG

THANK YOU
