Match string patterns
Parag Verma
02 April, 2022
Introduction
There are situations where we need to flag whether a certain word or piece of text is present in a string. For instance, in a drug regimen administered to a patient, we may have to check whether a drug is given as a standalone or in combination with others. In this blog we will see how this can be done.
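As a quick illustration of the basic idea, here is a minimal sketch of a plain presence check. It assumes the stringr package is already installed, and the `regimens` vector is made up purely for this example.

```r
library(stringr)

# Hypothetical regimen strings, used only for illustration
regimens <- c("Drug 1", "Drug 1 + Drug 3", "Drug 2")

# TRUE where the literal text "Drug 1" occurs anywhere in the string
has_drug1 <- str_detect(regimens, fixed("Drug 1"))
has_drug1
```

A check like this only says whether the text occurs at all; it does not distinguish a standalone drug from a combination, which is what the rest of the post deals with.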
# Install (if missing) and load the required packages
package.name <- c("dplyr", "stringr")
for (i in package.name) {
  if (!require(i, character.only = TRUE)) {
    install.packages(i)
  }
  library(i, character.only = TRUE)
}
Step 1: Creating dummy data frame
df <- data.frame(Patient = 1:3,
                 Line_of_Therapy = c("Drug 1 | Drug 2",
                                     "Drug 1 + Drug 3 | Drug 4 + Drug 5 | Drug 3",
                                     "Drug 1"))
df
  Patient                            Line_of_Therapy
1       1                            Drug 1 | Drug 2
2       2 Drug 1 + Drug 3 | Drug 4 + Drug 5 | Drug 3
3       3                                     Drug 1
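Here "|" separates the regimens within a line of therapy and "+" joins drugs that are given together. We can see that Drug 1 is given in combination (with Drug 3) for Patient 2, while it is given as a standalone drug for Patients 1 and 3.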
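To make the delimiter convention concrete, here is a minimal sketch of how a single Line_of_Therapy value breaks apart. It assumes that "|" separates regimens and "+" joins drugs given together, which is how the code in Step 2 treats them; the variable names `line`, `regimens` and `is_combo` are only illustrative.

```r
library(stringr)

# One line of therapy, taken from Patient 2 in the dummy data
line <- "Drug 1 + Drug 3 | Drug 4 + Drug 5 | Drug 3"

# Split on the literal "|" and strip the surrounding spaces
regimens <- str_trim(str_split(line, fixed("|"))[[1]])
regimens

# A regimen is treated as a combination when it contains a literal "+"
is_combo <- str_detect(regimens, fixed("+"))
is_combo
```

The first two regimens contain a "+" and the last one does not, so only "Drug 3" counts as a standalone regimen in this line.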
Step 2: Pattern Matching
df2 <- df %>%
  mutate(Mono_Indicator = sapply(Line_of_Therapy, function(x) {
    # Example inputs for testing the function interactively:
    # x <- "Drug 5 | Drug 1 + Drug 2"
    # x <- "Drug 1 | Drug 2"
    # Split the line of therapy into its individual regimens
    y <- str_split(x, "[|]")[[1]]
    z <- trimws(y)
    # Position of the first regimen that is exactly "Drug 1" (NA if there is none)
    z1 <- which(z == "Drug 1")[1]
    # If Drug 1 never appears on its own, flag the line as "Not Mono"
    if (is.na(z1)) {
      z2 <- "Not Mono"
    } else {
      z2 <- "Mono"
    }
    return(z2)
  }, USE.NAMES = FALSE),
  Combo_Indicator = sapply(Line_of_Therapy, function(x) {
    # Split the line of therapy into its individual regimens
    y <- str_split(x, "[|]")[[1]]
    z <- trimws(y)
    # Position of the first regimen that contains "Drug 1" together with a literal "+"
    z1 <- which(str_detect(z, "Drug 1") & str_detect(z, fixed("+")))[1]
    # If Drug 1 never appears in a combination, flag the line as "Not Combo"
    if (is.na(z1)) {
      z2 <- "Not Combo"
    } else {
      z2 <- "Combo"
    }
    return(z2)
  }, USE.NAMES = FALSE)
  )
df2
  Patient                            Line_of_Therapy Mono_Indicator
1       1                            Drug 1 | Drug 2           Mono
2       2 Drug 1 + Drug 3 | Drug 4 + Drug 5 | Drug 3       Not Mono
3       3                                     Drug 1           Mono
  Combo_Indicator
1       Not Combo
2           Combo
3       Not Combo
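The two indicators above hard-code "Drug 1" and use a substring match for the combination check, so a name such as "Drug 10" would also count as a hit. As a hedged sketch, and not the method used above, the same logic can be wrapped in a helper that takes the drug name as an argument and compares whole drug names. The function name `flag_drug` and its arguments are hypothetical, and the sketch assumes the same "|" and "+" delimiters as the dummy data.

```r
library(stringr)

# Hypothetical helper: classify how `drug` appears in one line of therapy
flag_drug <- function(line, drug) {
  # Regimens are separated by a literal "|"
  regimens <- str_trim(str_split(line, fixed("|"))[[1]])
  # Within each regimen, drugs are joined by a literal "+"
  drugs <- lapply(str_split(regimens, fixed("+")), str_trim)
  # Does each regimen contain the drug as a whole name?
  has_drug <- vapply(drugs, function(d) drug %in% d, logical(1))
  n_drugs <- lengths(drugs)
  c(Mono  = any(has_drug & n_drugs == 1),
    Combo = any(has_drug & n_drugs > 1))
}

flag_drug("Drug 1 + Drug 3 | Drug 4 + Drug 5 | Drug 3", "Drug 1")
flag_drug("Drug 10 + Drug 2 | Drug 1", "Drug 1")
```

Because whole names are compared with `%in%`, the second call does not treat "Drug 10" as a match for "Drug 1"; it reports Drug 1 only as a standalone regimen.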
Parting Comments
In this blog we looked at a very simple example of how we can use the dplyr and stringr libraries to check for the presence of a string in a column within a data frame.