Ifelse Tricks in R
Parag Verma
Introduction
In this blog we will look at a convenient way to write ifelse. Many people find ifelse statements difficult to navigate when there are multiple conditions.
Installing the libraries: dplyr, tidyr and stringr
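# require() attaches the package when it is available and returns FALSE otherwise,
# so each package is installed only if missing and then loaded
if(!require("dplyr")){
  install.packages("dplyr")
  library(dplyr)
}
if(!require("tidyr")){
  install.packages("tidyr")
  library(tidyr)
}
if(!require("stringr")){
  install.packages("stringr")
  library(stringr)
}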
Ifelse for a simple example
A simple example involves a single condition with two possible outcomes.
x<- c(1:10)
y<-ifelse(x > 8,"More","Less")
df<-data.frame(X=x,Y=y,stringsAsFactors = FALSE)
df
X Y
1 1 Less
2 2 Less
3 3 Less
4 4 Less
5 5 Less
6 6 Less
7 7 Less
8 8 Less
9 9 More
10 10 More
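As a quick check of what ifelse does in this example, the sketch below spells out the same rule element by element with a plain loop. The name `y_loop` is introduced only for this illustration.

```r
x <- 1:10

# Vectorised form: the test x > 8 is evaluated for every element of x at once
y <- ifelse(x > 8, "More", "Less")

# Equivalent explicit loop, for illustration only
y_loop <- character(length(x))
for (i in seq_along(x)) {
  if (x[i] > 8) {
    y_loop[i] <- "More"
  } else {
    y_loop[i] <- "Less"
  }
}

# Both approaches give the same character vector
identical(y, y_loop)
```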
Ifelse for a Complex example
Let us now look at the Medical Expenses dataset (MedExp) from the Ecdat library.
# Install Ecdat only if it is missing, then load it
if(!require("Ecdat")){
  install.packages("Ecdat")
  library(Ecdat)
}
data(MedExp)
df<-MedExp
head(df)
med lc idp lpi fmde physlim ndisease health linc lfam
1 62.07547 0 yes 6.907755 0 no 13.73189 good 9.528776 1.386294
2 0.00000 0 yes 6.907755 0 no 13.73189 excellent 9.528776 1.386294
3 27.76280 0 yes 6.907755 0 no 13.73189 excellent 9.528776 1.386294
4 290.58220 0 yes 6.907755 0 no 13.73189 good 9.528776 1.386294
5 0.00000 0 yes 6.109248 0 no 13.73189 good 8.538699 1.098612
6 2.39521 0 yes 6.109248 0 yes 13.00000 good 8.538699 1.098612
educdec age sex child black
1 12 43.87748 male no no
2 12 17.59138 male yes no
3 12 15.49966 female yes no
4 12 44.14305 female no no
5 12 14.54962 female yes no
6 12 16.28268 female yes no
Let's create AgeType based on the age variable
The AgeType variable will have values based on the below rules:
- Kids: Less than 10 years
- Teenagers: 10 to 20
- Adults: above 20, up to 40
- Seniors: above 40, up to 60
- Retirees: More than 60
These rules are for illustrative purposes only. They will give you a sense of how to handle cases with multiple classes.
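For comparison, the rules above written by hand as a nested ifelse look roughly like the sketch below. The `ages` vector is made up for illustration and is not part of the MedExp data.

```r
# Hypothetical ages, chosen so that every class appears once
ages <- c(5, 15, 30, 50, 70)

# One nested ifelse per rule; the last argument is the default class
age_type <- ifelse(ages < 10, "Kids",
            ifelse(ages <= 20, "Teenagers",
            ifelse(ages <= 40, "Adults",
            ifelse(ages <= 60, "Seniors", "Retirees"))))

age_type
```

Every additional class adds another level of nesting and another closing parenthesis to keep track of, which is the bookkeeping the approach in this post generates automatically.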
condition<-c("age < 10",
"age <= 20",
"age <= 40",
"age <= 60")
value<-c("Kids",
"Teenagers",
"Adults",
"Seniors",
"Retirees")
# Retirees is the default value, used when none of the conditions is met
# Generating the evaluation expression
ls.exp<-list()
for(i in seq_along(value)){
  if(i==length(value)){
    # The last value has no condition: it is the final "else" branch
    ls.exp[[i]]<-paste0("'",value[i],"'")
  }else{
    # Open one nested ifelse( per condition
    ls.exp[[i]]<-paste0("ifelse(",condition[i],", '",value[i],"',")
  }
}
# Combining the elements in the list and adding one closing parenthesis per condition
ifclause<-paste0(do.call(paste,ls.exp),strrep(")",length(condition)))
# Using eval and parse functions to execute the ifelse statement
interim.df<-df%>%
mutate(X=eval(parse(text = ifclause)))%>%
select(age,X)%>%
head()
interim.df
age X
1 43.87748 Seniors
2 17.59138 Teenagers
3 15.49966 Teenagers
4 44.14305 Seniors
5 14.54962 Teenagers
6 16.28268 Teenagers
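The ‘condition’ and ‘value’ vectors hold the rules and their labels; ‘value’ has one extra element, which acts as the default. After this initialisation, a for loop builds the nested ifelse expression piece by piece in ls.exp, and the pieces are then pasted into a single string, ifclause.
The real trick is to use eval(parse(text = ifclause)) in the mutate step, which parses that string as R code and evaluates it against the columns of the data frame.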
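To make the generated expression visible, here is a minimal, self-contained sketch of the same idea in base R. The helper name `build_ifelse` and the `toy` data frame are assumptions introduced for this illustration; they are not part of the original code, and base `eval()` with `envir` stands in for the mutate step.

```r
# Hypothetical helper: builds a nested ifelse expression as a string.
# `conditions` has one entry fewer than `values`; the last value is the default.
build_ifelse <- function(conditions, values) {
  stopifnot(length(conditions) >= 1,
            length(values) == length(conditions) + 1)
  # One "ifelse(<condition>, '<value>', " opener per condition
  opening <- paste0("ifelse(", conditions, ", '",
                    values[seq_along(conditions)], "', ")
  default <- paste0("'", values[length(values)], "'")
  # One closing parenthesis per opened ifelse(
  closing <- strrep(")", length(conditions))
  paste0(paste(opening, collapse = ""), default, closing)
}

condition <- c("age < 10", "age <= 20", "age <= 40", "age <= 60")
value <- c("Kids", "Teenagers", "Adults", "Seniors", "Retirees")

ifclause <- build_ifelse(condition, value)
print(ifclause)  # inspect the generated code before evaluating it

# Made-up data; eval() looks up `age` among the columns of the data frame
toy <- data.frame(age = c(5, 15, 30, 50, 70))
toy$AgeType <- eval(parse(text = ifclause), envir = toy)
toy
```

Printing the string before evaluating it is a cheap way to confirm that the conditions, labels and parentheses line up as intended.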
Final Comments
Multiple conditions in ifelse often lead to lengthy code that is error-prone. The above hack can come in handy when we are dealing with such a scenario and can make life easier for the programmer.
List of Datasets for Practise
https://hofmann.public.iastate.edu/data_in_r_sortable.html
https://vincentarelbundock.github.io/Rdatasets/datasets.html