Analyzing and Visualizing Data
Selecting a Graph
Pie Charts
Compare a certain sector to the total.
Useful when there are only two sectors, for example yes/no or queued/finished.
Instant understanding of proportions when few sectors are used as dimensions.
With 10 sectors or fewer, the pie chart keeps its visual efficiency.
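As a rough illustration of the points above, the sketch below computes each sector's share of the total and checks the sector count against the 10-sector guideline. The job statuses are invented sample data, and `sector_shares` is a hypothetical helper name, not part of any library.

```python
from collections import Counter

# Hypothetical job statuses; the data is invented for illustration.
statuses = ["finished", "queued", "finished", "finished", "queued",
            "finished", "finished", "queued", "finished", "finished"]

def sector_shares(values):
    """Return each category's share of the total, as a fraction of 1."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}

shares = sector_shares(statuses)

# Guideline from the list above: a pie chart stays readable with 10 sectors or fewer.
pie_is_readable = len(shares) <= 10

for category, share in shares.items():
    print(f"{category}: {share:.0%}")
```

With only two sectors, as here, the proportions can be read at a glance, which is the case a pie chart handles best.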
Selecting a Graph cont.
Bar Charts/Plots
Ordinal and nominal data sets
Compare values across different groups or track changes over time
When measuring change over time, bar graphs work best when the changes are large
Display and compare the number, frequency or other measure (e.g. mean) for different discrete categories of data
A flexible chart type with several variations on the standard bar chart, including horizontal, grouped (component), and stacked bar charts.
Frequency for each category of a categorical variable
Relative frequency (%) for each category
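The frequency and relative frequency mentioned above can be sketched in a few lines. The survey responses are invented sample data, and the text-only bars stand in for a real plotting library.

```python
from collections import Counter

# Hypothetical survey responses (nominal data), invented for illustration.
responses = ["bus", "car", "bike", "car", "car", "bus", "walk", "car"]

counts = Counter(responses)
total = sum(counts.values())

# Frequency and relative frequency (%) for each category, most common first.
table = {cat: (n, 100 * n / total) for cat, n in counts.most_common()}

# A text-only horizontal bar chart: one '#' per observation.
for cat, (n, pct) in table.items():
    print(f"{cat:<5} {'#' * n} {n} ({pct:.1f}%)")
```

Sorting the categories by frequency is a design choice that suits nominal data; for ordinal data the natural order of the categories would normally be kept instead.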
Selecting a Graph cont.
Histograms
Plot how often data values fall within certain ranges (bins).
Underlying frequency distribution (shape) of a set of continuous data.
Reveal the underlying distribution (e.g., normal distribution), outliers, skewness, etc.
Plot the frequency of score occurrences in a continuous data set that has been divided into classes
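Dividing a continuous data set into classes, as described above, might look like the following sketch. The scores are invented, `histogram_counts` is a hypothetical helper, and equal-width classes are an assumption; other binning rules exist.

```python
def histogram_counts(data, low, high, bins):
    """Count how many values fall into each of `bins` equal-width classes
    covering [low, high]. The top class includes `high` itself."""
    width = (high - low) / bins
    counts = [0] * bins
    for x in data:
        if low <= x <= high:
            # Clamp so that x == high lands in the last class.
            index = min(int((x - low) / width), bins - 1)
            counts[index] += 1
    return counts

# Hypothetical test scores, invented for illustration.
scores = [52, 58, 61, 64, 67, 68, 70, 71, 73, 74, 75, 77, 79, 82, 85, 88, 93, 100]
counts = histogram_counts(scores, low=50, high=100, bins=5)

# Text-only histogram: one '#' per score in each class.
for i, n in enumerate(counts):
    start = 50 + i * 10
    print(f"{start:>3}-{start + 10:<3} {'#' * n}")
```

For this sample the tallest class is 70-80, and the counts fall away on both sides, which is the kind of shape a histogram is meant to expose.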
Selecting a Graph cont.
Box plots
Distribution of a continuous measure by some grouping variable
They show the spread of the data, much as the standard deviation does.
The line in the middle of the box is the median.
The box itself represents the middle 50% of the data.
The box edges are the 25th and 75th percentiles.
The vertical size of the box is the interquartile range, or IQR.
The tops and bottoms of the boxes are referred to as “hinges”.
Whiskers: They represent the reasonable extremes of the data. That is, they mark the minimum and maximum values that lie within a certain distance (commonly 1.5 × IQR) of the middle 50% of the data.
If no points exceed that distance, then the whiskers are simply the minimum and maximum values.
Outliers: data points that are too large or too small compared with the rest of the data, plotted individually beyond the whiskers.
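The pieces of a box plot listed above can be computed directly. This is a minimal sketch under stated assumptions: it uses the common 1.5 × IQR whisker rule and the "inclusive" quartile method of Python's `statistics.quantiles`; other tools may use slightly different quartile conventions. The sample data and the `box_plot_summary` helper are hypothetical.

```python
import statistics

def box_plot_summary(data, k=1.5):
    """Return the hinges, median, IQR, whisker ends, and outliers.
    Assumes the common rule that whiskers reach at most k * IQR past the hinges."""
    ordered = sorted(data)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - k * iqr, q3 + k * iqr
    # Whiskers end at the most extreme values that are still inside the fences.
    inside = [x for x in ordered if lower_fence <= x <= upper_fence]
    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": [x for x in ordered if x < lower_fence or x > upper_fence],
    }

# Hypothetical measurements, invented for illustration.
summary = box_plot_summary([2, 4, 4, 5, 6, 7, 8, 9, 10, 11, 30])
print(summary)
```

In this sample the value 30 lies beyond the upper fence, so it is reported as an outlier and the upper whisker stops at 11. With no such point, the whiskers would simply be the minimum and maximum values.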
Selecting a Graph cont.
Scatter plots
View the potential relationship of two continuous variables
Graphical view of the relationship between two sets of numbers.
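Scatter plots show how strongly one variable is associated with another; an association alone does not show that one variable causes the other.
The strength and direction of the relationship between two variables is called their correlation.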
Find potential relationships between values and spot outliers in data sets.
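One common way to put a number on the correlation mentioned above is the Pearson correlation coefficient, which measures linear association on a scale from -1 to 1. The sketch below is one possible implementation; the study-hours data is invented and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical paired observations, invented for illustration.
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [52, 55, 61, 70, 72]

r = pearson_r(hours_studied, exam_scores)
print(f"r = {r:.2f}")
```

A value of r close to 1 or -1 suggests the points in the scatter plot fall near a straight line, while a value near 0 suggests little linear relationship. The coefficient is sensitive to outliers, so it is worth reading alongside the plot rather than instead of it.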