Load Dataset
We will continue using the simulated dataset credit for
demonstration. The dataset contains information about a large number of
credit card holders.
credit <- read.csv('https://albums.yuanting.lu/sta126/data/credit.csv')
Dataset Overview
- How large is this dataset?
dim(credit) # The dimension (rows by columns) of the dataset
- What are the variables (columns) in the dataset?
- Which of the variables are categorical?
Which ones are numerical? We can take a peek at the dataset
by viewing its first few rows (e.g., the first 10 rows).
head(credit, 10) # The head function shows the first few rows of the dataset.
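One way to answer the two questions above is to ask R for the type of each column. The sketch below uses a small made-up data frame, credit_demo, as a stand-in for credit so that it runs without a download; its column names and values are assumptions for illustration only. Columns stored as text or factors are usually categorical, and numeric or integer columns are numerical.

```r
# Hypothetical stand-in for the credit dataset (all values are made up).
credit_demo <- data.frame(
  Income  = c(20.5, 35.0, 48.2, 61.7),
  Cards   = c(2L, 3L, 4L, 3L),
  Student = c("No", "Yes", "No", "No"),
  stringsAsFactors = FALSE
)

str(credit_demo)                         # Compact overview: one line per column

col_types <- sapply(credit_demo, class)  # Named vector of column classes
col_types

# Columns stored as text or factors are the likely categorical variables.
categorical <- names(col_types)[col_types %in% c("character", "factor")]
categorical
```

With the real dataset, the same calls would be str(credit) and sapply(credit, class). A numeric column can still be categorical in meaning (for example, a 0/1 code), so the result is a starting point rather than a final answer.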
Dataset Summary
- Create a frequency table for a
variable (e.g., Student).
table(credit$Student) # Counts of each category of Student
To convert the frequency table to a relative frequency
table, divide the counts in the table by the
total number of customers in the dataset. If you remember that the total
number of customers is 400, use
table(credit$Student) / 400. Otherwise, use
the length function to compute the total:
table(credit$Student) / length(credit$Student).
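The division described above can be tried on a tiny example. This is a minimal sketch that uses a made-up vector, student, as a stand-in for credit$Student; the values are assumptions for illustration. It also shows the base R function prop.table, which performs the same division without needing the total.

```r
# Hypothetical stand-in for credit$Student (values are made up).
student <- c("No", "No", "Yes", "No", "Yes", "No", "No", "No")

freq_demo <- table(student)               # Counts per category
rel_freq  <- freq_demo / length(student)  # Divide by the number of observations
rel_freq

# prop.table() divides each count by the sum of all counts.
prop.table(freq_demo)
```

Relative frequencies always add up to 1, which is a quick sanity check: sum(rel_freq).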
- Create a pie chart.
freq <- table(credit$Student)
pie(freq, main = "Student Status")

- The code is equivalent to
pie(table(credit$Student), main = "Student Status").
- The arrow sign (<-) in the first line of this code chunk assigns
the value on the right-hand side of the arrow (i.e., the student status
frequency table) to the R variable on the left-hand side of the
arrow (i.e., freq). From now on, whenever we need the
frequency table, we can simply use freq instead of typing
table(credit$Student) over and over again.
- The option main = "Student Status" adds a title to the
graph.
- Create a bar graph. We have already
assigned the frequency table to freq in the previous
code chunk, so we can continue using it.
barplot(freq, main = "Student Status")

Practice: Make the y-axis display relative
frequency.
[Plus] Graphic Skills
- Customized colors.
- The col = c("lightblue", "lightcoral") option supplies
two colors to the graph, one per group.
- The c function combines its arguments (e.g., "lightblue",
"lightcoral") into a vector.
- Customized group tags.
- The labels option allows us to change the group tags in a
pie chart.
- The names.arg option (abbreviated as names in the code below) allows us to
change the group tags in a bar graph.
- Customize text labels on the y-axis.
- The ylab option allows us to change the text label on the
y-axis.
- Use lines to shade bars and slices.
- The density option sets the density of the shading lines (in lines
per inch) for each bar or slice.
- The angle option sets the angle of the shading lines (in
degrees).
freq <- table(credit$Student) / length(credit$Student) # Relative frequency table
pie(freq,
    col = c("lightblue", "lightcoral"),
    labels = c("Non-student", "Student"),
    main = "Student Status")

barplot(freq,
        density = c(20, 10),
        angle = 60,
        names = c("Non-student", "Student"),
        ylab = "Relative Frequency",
        main = "Student Status")
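Custom group tags are matched to the table by position, so it can help to check the order of the categories before typing the tags. The sketch below is one possible way to do that. It uses a made-up vector, student, in place of credit$Student, and it assumes the categories are coded "No" and "Yes"; the names freq_demo, tag_lookup, and tags are hypothetical.

```r
# Hypothetical stand-in for credit$Student (values are made up).
student <- c("No", "Yes", "No", "No", "Yes")
freq_demo <- table(student)

names(freq_demo)  # Category order used by pie() and barplot()

# Look up each display tag by category value rather than by position.
# The "No"/"Yes" coding is an assumption about the Student column.
tag_lookup <- c(No = "Non-student", Yes = "Student")
tags <- unname(tag_lookup[names(freq_demo)])
tags

barplot(freq_demo, names.arg = tags, main = "Student Status")
```

Because the tags are looked up by name, they stay attached to the right bars even if the category order changes.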
