Showing posts with label pharma. Show all posts

Saturday, April 2, 2022

Determine Mono and Combo therapies using stringr

Match string patterns


Introduction

There are situations where we need to flag whether a certain word or piece of text is present in a string. For instance, in a drug regimen administered to a patient, we may have to check whether a drug is given as a standalone (mono) therapy or in combination with other drugs. In this blog we will see how this can be done using dplyr and stringr.


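package.name<-c("dplyr","stringr")

for(i in package.name){

  # Install the package only if it cannot be loaded
  if(!require(i,character.only = TRUE)){

    install.packages(i)
  }
  # Attach the package (needed after a fresh install)
  library(i,character.only = TRUE)

}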


Step 1: Creating a dummy data frame

df<-data.frame(Patient=1:3,
               Line_of_Therapy=c("Drug 1 | Drug 2","Drug 1 + Drug 3 | Drug 4 + Drug 5 | Drug 3","Drug 1"))

df
  Patient                            Line_of_Therapy
1       1                            Drug 1 | Drug 2
2       2 Drug 1 + Drug 3 | Drug 4 + Drug 5 | Drug 3
3       3                                     Drug 1

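In each Line_of_Therapy string, "|" separates the individual regimens and "+" joins drugs that are given together. We can see that Drug 1 is given in combination (with Drug 3) for Patient 2, while it is given as a standalone drug for Patients 1 and 3.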
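To make the structure of these strings concrete, here is a minimal sketch that breaks one Line_of_Therapy value into its regimens and then into the drugs within each regimen. The variable names regimens and drugs are only for illustration, and the reading of "|" and "+" as separators is an assumption based on how the dummy data is laid out.

```r
library(stringr)

# One line-of-therapy string taken from the dummy data above
x <- "Drug 1 + Drug 3 | Drug 4 + Drug 5 | Drug 3"

# Split on the literal "|" and strip the surrounding spaces
regimens <- trimws(str_split(x, fixed("|"))[[1]])
regimens

# Split each regimen on the literal "+" to list the drugs given together
drugs <- lapply(str_split(regimens, fixed("+")), trimws)
drugs
```

Using fixed() treats "|" and "+" as plain characters, so neither needs to be escaped as a regular expression.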


Step 2: Pattern Matching

df2<-df%>%
  mutate(Mono_Indicator=sapply(Line_of_Therapy, function(x){

    # Split the line of therapy into its individual regimens and trim whitespace
    y<-str_split(x,"[|]")[[1]]
    z<-trimws(y)

    # Position of the first regimen that is exactly "Drug 1" (NA if there is none)
    z1<-which(z=="Drug 1")[1]

    # If Drug 1 never appears on its own, flag the line as "Not Mono"
    if(is.na(z1)){

      z2<-"Not Mono"

    }else{

      z2<-"Mono"

    }

    return(z2)

  }, USE.NAMES = FALSE),
         Combo_Indicator=sapply(Line_of_Therapy, function(x){

    # Split the line of therapy into its individual regimens and trim whitespace
    y<-str_split(x,"[|]")[[1]]
    z<-trimws(y)

    # Position of the first regimen that contains both "Drug 1" and a literal "+"
    # (fixed("+") matches the plus sign only; "['+']" would also match a quote character)
    z1<-which(str_detect(z,"Drug 1") & str_detect(z,fixed("+")))[1]

    # If Drug 1 is never part of a combination, flag the line as "Not Combo"
    if(is.na(z1)){

      z2<-"Not Combo"

    }else{

      z2<-"Combo"

    }

    return(z2)

  }, USE.NAMES = FALSE)
  )
  
df2
  Patient                            Line_of_Therapy Mono_Indicator
1       1                            Drug 1 | Drug 2           Mono
2       2 Drug 1 + Drug 3 | Drug 4 + Drug 5 | Drug 3       Not Mono
3       3                                     Drug 1           Mono
  Combo_Indicator
1       Not Combo
2           Combo
3       Not Combo
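One caveat with substring matching is that str_detect(z,"Drug 1") also matches names such as "Drug 10". The sketch below is one possible way around that, not the method used above: a hypothetical helper, flag_drug(), splits each regimen on "+" and compares drug names exactly. The helper name, its arguments, and the extra "Drug 10 + Drug 2" test string are assumptions added for illustration.

```r
library(stringr)

# Hypothetical helper (not part of the original post): classify how `drug`
# appears in a single line-of-therapy string.
flag_drug <- function(line, drug) {
  # Regimens are separated by "|"; drugs within a regimen by "+"
  regimens <- str_split(line, fixed("|"))[[1]]
  drugs <- lapply(str_split(regimens, fixed("+")), str_trim)

  # A regimen counts only if one of its drugs equals `drug` exactly
  has_drug <- vapply(drugs, function(d) drug %in% d, logical(1))
  n_drugs  <- lengths(drugs)

  c(Mono  = any(has_drug & n_drugs == 1),
    Combo = any(has_drug & n_drugs > 1))
}

lines <- c("Drug 1 | Drug 2",
           "Drug 1 + Drug 3 | Drug 4 + Drug 5 | Drug 3",
           "Drug 1",
           "Drug 10 + Drug 2")   # made-up string to show exact matching

# One row per line of therapy, one logical column per flag
flags <- t(vapply(lines, flag_drug, logical(2), drug = "Drug 1", USE.NAMES = FALSE))
flags
```

Returning logical flags instead of text labels keeps the result easy to filter on; the labels "Mono"/"Not Mono" can be derived afterwards with ifelse() if they are needed for reporting.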


Parting Comments

In this blog we looked at a very simple example of how we can use the dplyr and stringr libraries to check for the presence of a string in a column of a data frame.
