Time Series Plots
Parag Verma
Introduction
Time series is one of the few kinds of analysis where plots directly drive the choice of model components. Just by looking at a plot, one can comment on the seasonality and trend and get an idea of how to build a model around the data. In this blog we will look at how to plot a time series variable using ggplot. The dataset for our analysis is the winning times (in minutes) for the Boston Marathon Men's Open Division, 1897-2016, which is available from the fpp2 package as marathon.
Installing the libraries: dplyr, tidyr, fpp2 and ggplot2
# Install any package that is missing, then attach it
package.name <- c("dplyr", "tidyr", "fpp2", "ggplot2")
for (i in package.name) {
  if (!require(i, character.only = TRUE)) {
    install.packages(i)
  }
  library(i, character.only = TRUE)
}
# fpp2 package has the 'marathon' data
data(marathon)
df<-marathon
head(df)
Time Series:
Start = 1897
End = 1902
Frequency = 1
[1] 175.1667 162.0000 174.6333 159.7333 149.3833 163.2000
Let's create the data frame and plot the time series
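# Build the time index from the start, end and frequency of the series;
# consecutive observations are 1/frequency time units apart
interim.df <- data.frame(TimeUnits = seq(start(df)[1], end(df)[1], by = 1 / frequency(df)),
                         Values = as.numeric(df))
# A constant color is set outside aes(), so no manual scale is needed
ggplot(interim.df, aes(x = TimeUnits, y = Values)) +
  geom_line(color = "orange", linewidth = 1) +  # use size = 1 on ggplot2 < 3.4.0
  theme_minimal()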
Note how we created the data frame from the time series by using its start, end and frequency properties. Because marathon is annual (frequency 1), the time index advances by one year per observation.
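For series that are not annual, the same idea can be sketched with time(), which returns the time index of a ts object directly. The helper name ts_to_df and the quarterly series below are hypothetical and made up for illustration; they are not part of the marathon data.

```r
# Hypothetical helper (not from the original post): turn any ts into a data frame
ts_to_df <- function(x) {
  data.frame(
    TimeUnits = as.numeric(time(x)),  # fractional years, spaced 1/frequency(x) apart
    Values = as.numeric(x)            # drop the ts class so plotting sees plain numbers
  )
}

# Made-up quarterly series covering 2015 Q1 to 2016 Q4
quarterly <- ts(1:8, start = c(2015, 1), frequency = 4)
quarterly.df <- ts_to_df(quarterly)
head(quarterly.df, 3)
```

The resulting data frame can be passed to ggplot() in the same way as interim.df.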
Area Plot
Let's also plot the time series variable as an area plot. Area plots are commonly used to show a long-run view of stock prices, inventory levels, asset values, etc.
# geom_area() fills from the line down to y = 0; constant colors go outside aes()
ggplot(interim.df, aes(x = TimeUnits, y = Values)) +
  geom_area(color = "orange", fill = "orange", alpha = 0.5)
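Because geom_area() always fills down to zero, a series whose values sit far above zero can look flatter than it is. One possible alternative, sketched here under assumptions, is geom_ribbon() with an explicit lower bound. The data frame area.df and its values are made up for illustration, and taking the series minimum as the baseline is only one of several reasonable choices.

```r
# Made-up annual values in a range similar to the marathon times
area.df <- data.frame(
  TimeUnits = 2000:2009,
  Values = c(150, 148, 149, 146, 147, 145, 144, 146, 143, 142)
)

# Assumed baseline: fill down to the smallest observed value instead of 0
baseline <- min(area.df$Values)

# Plot only if ggplot2 is installed, so the block runs either way
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(area.df, aes(x = TimeUnits)) +
    geom_ribbon(aes(ymin = baseline, ymax = Values),
                fill = "orange", color = "orange", alpha = 0.5) +
    theme_minimal()
  print(p)
}
```

A baseline above zero exaggerates differences visually, so the y-axis should be read with that in mind.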
Final Comments
We can see that a few lines of ggplot code are enough to plot a time series and start building an understanding of the data.
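To make the trend mentioned in the introduction easier to read off a line plot, one option is to overlay a moving average. The sketch below is an illustration, not part of the original analysis: the series sim_ts is simulated so the block runs without fpp2, and the 5-year window is an arbitrary assumption.

```r
# Simulated annual series standing in for `marathon` (values are made up)
set.seed(1)
years <- 1950:1999
values <- 160 - 0.4 * (years - 1950) + rnorm(length(years), sd = 3)
sim_ts <- ts(values, start = 1950, frequency = 1)

# Centered 5-year moving average; the stats:: prefix matters because
# dplyr masks filter() once it is attached
trend <- stats::filter(sim_ts, rep(1 / 5, 5), sides = 2)

trend.df <- data.frame(
  TimeUnits = as.numeric(time(sim_ts)),
  Values = as.numeric(sim_ts),
  Trend = as.numeric(trend)  # NA for the first and last two years
)

# Plot only if ggplot2 is installed, so the block runs either way
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(trend.df, aes(x = TimeUnits, y = Values)) +
    geom_line(color = "orange") +
    geom_line(aes(y = Trend), color = "black", na.rm = TRUE) +
    theme_minimal()
  print(p)
}
```

A wider window gives a smoother trend line at the cost of more missing values at both ends of the series.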
List of Datasets for Practice
https://vincentarelbundock.github.io/Rdatasets/datasets.html