Wednesday, March 25, 2020

Blog 16: Ifelse tricks

Ifelse Tricks in R


Introduction

In this blog we will look at a convenient way to write ifelse statements. Many people find an ifelse statement hard to write and read once it involves multiple conditions.


Installing and loading the libraries: dplyr, tidyr and stringr

# require() loads the package and returns FALSE if it is not installed.
# In that case we install it and then load it.
if(!require("dplyr")){
  
  install.packages("dplyr")
  library(dplyr)
}

if(!require("tidyr")){
  
  install.packages("tidyr")
  library(tidyr)
}

if(!require("stringr")){
  
  install.packages("stringr")
  library(stringr)
}


Ifelse for a simple example

A simple example has a single condition with two possible outcomes

x<- c(1:10)
y<-ifelse(x > 8,"More","Less")
df<-data.frame(X=x,Y=y,stringsAsFactors = FALSE)
df
    X    Y
1   1 Less
2   2 Less
3   3 Less
4   4 Less
5   5 Less
6   6 Less
7   7 Less
8   8 Less
9   9 More
10 10 More
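
Because ifelse is vectorised, it tests every element and returns a vector of the same length. The short sketch below shows how a missing value is handled; the vector z is made up for illustration and is not part of the original example.

```r
# Hypothetical input containing a missing value
z <- c(3, NA, 12)

# ifelse tests each element in turn; an NA test gives NA in the result
res <- ifelse(z > 8, "More", "Less")
print(res)
```

The NA is passed through rather than being assigned to either class, which is worth keeping in mind before chaining several conditions.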


Ifelse for a complex example

Let us now look at the Medical Expenses dataset (MedExp) from the Ecdat library

# Install Ecdat if it is missing, then load it
if(!require("Ecdat")){
  
  install.packages("Ecdat")
  library(Ecdat)
}

data(MedExp)
df<-MedExp
head(df)
        med lc idp      lpi fmde physlim ndisease    health     linc     lfam
1  62.07547  0 yes 6.907755    0      no 13.73189      good 9.528776 1.386294
2   0.00000  0 yes 6.907755    0      no 13.73189 excellent 9.528776 1.386294
3  27.76280  0 yes 6.907755    0      no 13.73189 excellent 9.528776 1.386294
4 290.58220  0 yes 6.907755    0      no 13.73189      good 9.528776 1.386294
5   0.00000  0 yes 6.109248    0      no 13.73189      good 8.538699 1.098612
6   2.39521  0 yes 6.109248    0     yes 13.00000      good 8.538699 1.098612
  educdec      age    sex child black
1      12 43.87748   male    no    no
2      12 17.59138   male   yes    no
3      12 15.49966 female   yes    no
4      12 44.14305 female    no    no
5      12 14.54962 female   yes    no
6      12 16.28268 female   yes    no


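Let's create AgeType based on the age variable

The AgeType variable will take its values from the rules below. Since age is not a whole number in this dataset, the ranges are written so that they leave no gaps:

  • Kids: less than 10 years
  • Teenagers: 10 to 20
  • Adults: over 20, up to 40
  • Seniors: over 40, up to 60
  • Retirees: more than 60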

These rules are for illustrative purposes only. They give a sense of how to handle a case with multiple classes
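
For comparison, this is roughly what the rules look like when written by hand as a nested ifelse. It is a minimal sketch on a made-up age vector (not the MedExp data), with ages chosen to hit the class boundaries.

```r
# Hypothetical ages chosen to cover every class, including the boundaries
age <- c(5, 10, 20, 35, 60, 72)

# Each 'no' branch holds the next ifelse, so the conditions are checked in order
age_type <- ifelse(age < 10, "Kids",
            ifelse(age <= 20, "Teenagers",
            ifelse(age <= 40, "Adults",
            ifelse(age <= 60, "Seniors", "Retirees"))))

print(data.frame(age = age, age_type = age_type))
```

Every extra class adds another level of nesting and another closing bracket to keep track of, which is where mistakes tend to creep in.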

condition<-c("age < 10",
             "age <= 20",
             "age <= 40",
             "age <= 60")

value<-c("Kids",
         "Teenagers",
         "Adults",
         "Seniors",
         "Retirees")

# Retirees is the default value: it has no condition of its own and
# applies when none of the conditions above is TRUE

# Generating the evaluation expression
ls.exp<-list()
for(i in seq_along(value)){
  
  if(i==length(value)){
    
    # Last element: the quoted default value ends the chain
    ls.exp[[i]]<-paste0("'",value[i],"'")
    
  }else{
    
    # Opening part of one nested call: ifelse(<condition>, '<value>',
    ls.exp[[i]]<-paste0("ifelse(",condition[i],", '",value[i],"',")
  }
  
}

# Combining the elements in the list and adding one closing bracket per condition
ifclause<-paste0(do.call(paste,ls.exp),strrep(")",length(condition)))

# Using eval and parse functions to execute the ifelse statement
interim.df<-df%>%
  mutate(X=eval(parse(text = ifclause)))%>%
  select(age,X)%>%
  head()

interim.df
       age         X
1 43.87748   Seniors
2 17.59138 Teenagers
3 15.49966 Teenagers
4 44.14305   Seniors
5 14.54962 Teenagers
6 16.28268 Teenagers


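The ‘condition’ and ‘value’ vectors are first assigned their values. After this initialisation, a for loop builds the pieces of the nested ifelse expression in ls.exp, and the pieces are then pasted together into a single string, ifclause

The real trick is in using eval(parse(text = ...)) in the mutate step: parse turns the string into an R expression and eval executes it, so that age is looked up among the columns of df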
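
To see what the generated string looks like, the same idea can be wrapped in a small function. This is only a sketch: build_ifelse is a hypothetical helper name, the toy data frame is made up, and it uses base R instead of mutate so that it runs without any packages.

```r
# build_ifelse is a hypothetical helper, not part of any package.
# It expects one more value than conditions; the last value is the default.
build_ifelse <- function(condition, value) {
  stopifnot(length(value) == length(condition) + 1)
  # Start from the quoted default and wrap it from the innermost call outwards
  expr <- paste0("'", value[length(value)], "'")
  for (i in rev(seq_along(condition))) {
    expr <- paste0("ifelse(", condition[i], ", '", value[i], "', ", expr, ")")
  }
  expr
}

condition <- c("age < 10", "age <= 20", "age <= 40", "age <= 60")
value <- c("Kids", "Teenagers", "Adults", "Seniors", "Retirees")

# Inspect the generated code before running it
ifclause <- build_ifelse(condition, value)
print(ifclause)

# Evaluate the string against a small made-up data frame
toy <- data.frame(age = c(8, 17, 44, 65))
toy$X <- eval(parse(text = ifclause), envir = toy)
print(toy)
```

Printing ifclause before evaluating it is a cheap way to check that the brackets and quotes line up as intended.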


Final Comments

Multiple conditions in ifelse often lead to lengthy, error-prone code. The above hack can come in handy when we deal with such a scenario and can make life easier for the programmer.
