SAS graphs and example

 

Hello! Today's blog is about SAS graphs in SAS OnDemand for Academics.

Graphs using the GPLOT procedure:

Code:


  • PROC GPLOT draws simple scatter plots.
  • The PLOT statement takes the form: plot y-axis variable * x-axis variable.
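
As a minimal sketch of the two points above, here is a scatter plot drawn with PROC GPLOT. The sashelp.class sample table and its weight and height variables are only a stand-in chosen for illustration; they are not the data used later in this blog.

```sas
/* Simple scatter plot: y-axis variable * x-axis variable.          */
/* sashelp.class is a sample table that ships with SAS; it is used  */
/* here only as a stand-in for your own data.                       */
proc gplot data=sashelp.class;
  plot weight*height;   /* weight on the y-axis, height on the x-axis */
run;
quit;
```

The quit statement ends the procedure, since GPLOT keeps running after run until it is told to stop.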


Output

Code

  • There is also the SYMBOL statement.
  • It is a global statement, just like the TITLE and FOOTNOTE statements.
  • The i=join (interpol=join) option turns the scatter plot into a line plot.
  • The value= (v=) option changes the shape of the data points.
  • The color= (c=) option changes the colour of the data points.
  • A GOPTIONS reset (for example, goptions reset=symbol;) clears the SYMBOL statements, so in the next step of code you can define new symbols rather than reusing the same symbol for another graph.
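
The SYMBOL options listed above could be combined roughly like this. Again, sashelp.class is only an assumed stand-in dataset, and class_sorted is a name made up for this sketch. The data is sorted first because interpol=join connects the points in the order they appear in the dataset.

```sas
/* Sort so that INTERPOL=JOIN connects the points from left to right */
proc sort data=sashelp.class out=class_sorted;
  by height;
run;

goptions reset=symbol;     /* clear any earlier SYMBOL definitions */

/* join the points with a line, draw each point as a blue dot */
symbol1 interpol=join value=dot color=blue;

proc gplot data=class_sorted;
  plot weight*height;
run;
quit;
```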

Code for symbol statement
Output when we mention symbol statement


Example dataset: the same example appears in the book referenced at the end of this blog.

Univariate procedure and scatterplot/ dataset:

Glimpse of example dataset

Code and tasks in comment section


Code:

We have two questions to answer. The first is about the relationship between the mortality and hardness variables, which we examine with a scatter plot. The second is a hypothesis: do mortality and water hardness differ between cities in the north and cities in the south of the country?

  • The plan is to draw a simple scatter plot first.
  • Then generate basic statistics and histograms to see the distribution of these variables.
  • Finally, get probability plots (probplot).
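
The statistics, histograms and probability plots in this plan could be requested in one PROC UNIVARIATE step, sketched below. It assumes the data has already been read into a dataset called water (a name assumed here) with numeric variables named mortality and hardness; adjust the names to match your own data step.

```sas
/* Assumes a dataset WATER with numeric variables MORTALITY and HARDNESS */
proc univariate data=water normal;      /* NORMAL adds tests for normality */
  var mortality hardness;
  histogram mortality hardness / normal; /* overlay a fitted normal curve  */
  probplot mortality hardness;           /* normal probability plots       */
run;
```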


  • From the univariate procedure, the histogram shows that the mortality variable approximately follows a normal distribution.
  • Hardness of water follows a more skewed distribution, so to answer the second question we either have to use a non-parametric test or apply a transformation, such as a log transformation.
  • Using a conditional statement at the start, we created a geofactor variable that records which side of the country each city is on, based on the location attribute present in the raw data.
  • Next comes a simple scatter plot.
Code for the simple scatter plot of mortality versus hardness of water


Output graph: mortality versus hardness of water

  • PLOT statement: plot y-axis variable * x-axis variable = categorical variable. The points are then drawn with a different symbol for each level of the categorical variable.
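
Here is a rough sketch of the grouped plot, together with the conditional statement that builds geofactor. The dataset name water and the coding location = 'N' for northern cities are assumptions made for illustration only; check how location is actually coded in your raw data before using this.

```sas
/* Assumes a dataset WATER with MORTALITY, HARDNESS and a character */
/* variable LOCATION. The value 'N' for north is an assumed coding. */
data water;
  set water;
  length geofactor $ 5;
  if location = 'N' then geofactor = 'north';
  else geofactor = 'south';
run;

goptions reset=symbol;
symbol1 value=dot    color=blue;   /* first level of geofactor  */
symbol2 value=circle color=red;    /* second level of geofactor */

proc gplot data=water;
  plot mortality*hardness=geofactor;
run;
quit;
```

Each SYMBOLn statement is matched to one level of the categorical variable, so two levels need two symbol definitions.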

Sorting and Correlation of variables:

  • Sort the data by the north-south geofactor variable.
  • Then use the correlation procedure to get the correlation between the variables separately for the southern and northern cities.
The code below gets the correlation between the variables for both the southern and northern cities.
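
A minimal sketch of these two steps, assuming a dataset named water that already contains mortality, hardness and the geofactor variable (the dataset and variable names are assumptions):

```sas
/* BY-group processing needs the data sorted by the BY variable first */
proc sort data=water;
  by geofactor;
run;

/* Pearson and Spearman correlations, one table per geofactor level */
proc corr data=water pearson spearman;
  var mortality hardness;
  by geofactor;
run;
```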




Correlation procedure output for northern cities
Correlation procedure output for southern cities

Interpretation of Correlation Procedure:

In the northern cities, mortality and hardness have a negative relationship, with a Spearman coefficient of -0.40.
In the southern cities, mortality and hardness have a negative relationship, with a Spearman coefficient of -0.59.


Transformation of hardness variable for normal approximation:

  • Transform the hardness variable by creating a new variable that holds its log.
  • Use the univariate procedure to look at the distribution of the new log-transformed variable.

Code for log transformation of variable and univariate procedure
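
The transformation and the follow-up check might look like the sketch below. The dataset name water and the new variable name log_hardness are assumptions for illustration.

```sas
/* Create the log-transformed variable (natural log).     */
/* LOG returns a missing value for zero or negative input. */
data water;
  set water;
  log_hardness = log(hardness);
run;

/* Look at the distribution of the transformed variable */
proc univariate data=water normal;
  var log_hardness;
  histogram log_hardness / normal;
  probplot log_hardness;
run;
```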



  • The transformed variable approximately follows a normal distribution.

Now we move on to our hypothesis.

Code for T test and Wilcoxon test
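
For the t test part, a two-sample comparison between the north and south groups could be sketched as follows. It assumes a dataset named water containing geofactor, mortality and a log-transformed variable called log_hardness (all names are assumptions).

```sas
ods graphics on;   /* needed for the distribution plots if not already on */

/* Two-sample t tests comparing north and south cities */
proc ttest data=water;
  class geofactor;              /* grouping variable with two levels */
  var mortality log_hardness;   /* one test per analysis variable    */
run;
```

The output also includes a test for equality of variances, which helps decide whether to read the pooled or the Satterthwaite result.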

Output for T test procedure - 1

Distribution plot in output for mortality variable
Output of T test procedure for log hardness (transformed) variable- 2



Distribution of log transformed hardness variable

Interpretation of T test procedure:

From the outputs and plots above, we can say that there is a significant difference in mean mortality between the two groups: the northern cities have a higher mortality rate than the southern cities (p < 0.001).

For the log-transformed hardness variable, there is also a significant difference between the mean values (p = 0.0003). The distribution plot shows that log-transformed hardness is higher in the southern cities than in the northern cities.

Wilcoxon test: a non-parametric test for the untransformed hardness of water variable

Code for Wilcoxon test
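
A minimal sketch of the Wilcoxon test with PROC NPAR1WAY, again assuming a dataset named water with the geofactor and hardness variables (names are assumptions):

```sas
/* Wilcoxon rank-sum test on the untransformed hardness variable */
proc npar1way data=water wilcoxon;
  class geofactor;   /* north versus south  */
  var hardness;      /* untransformed values */
run;
```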


Output for Wilcoxon test for hardness of water - 1


Plot for the untransformed hardness variable

Interpretation of Wilcoxon test:

From the table and plot we can see that there is a significant difference in hardness between the southern and northern cities.

Reference

  • Der, G. and Everitt, B. S., A Handbook of Statistical Analyses Using SAS, Second Edition, Chapter 2, Section 2.1.
