De-compressing Rows
Parag Verma
11th Nov, 2022
Introduction
In this blog we will see how we can de-compress aggregated customer records, turning a single delimited string into one row per item.
# Install any missing packages, then load them
package.name<-c("dplyr","stringr")
for(i in package.name){
  if(!require(i,character.only = TRUE)){
    install.packages(i)
  }
  library(i,character.only = TRUE)
}
Step 1: Creating a data frame
Let's create a dummy data frame with the compressed shopping history of a single customer. The purchased products are stored in one string, separated by the pipe character (|).
df<-data.frame(Customer_Num="Customer 1",
Product_Name=c("A|B|AB|C|X")
)
head(df)
Customer_Num Product_Name
1 Customer 1 A|B|AB|C|X
Just by looking at the data we can't get a hang of what's happening. What we require here is a granular view of things, with one row for each product the customer bought.
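To see why the compressed form is awkward to query, here is a minimal sketch in base R. The record "B|AB" is a hypothetical example and not part of the data above; it only illustrates that searching the raw string can give a misleading answer, while splitting it first does not.

```r
# Hypothetical compressed record: the customer bought B and AB, but never A
compressed <- "B|AB"

# A naive substring search reports a match because "A" occurs inside "AB"
naive.match <- grepl("A", compressed, fixed = TRUE)

# Splitting into individual products first allows an exact comparison
products <- strsplit(compressed, "|", fixed = TRUE)[[1]]
exact.match <- "A" %in% products

naive.match
exact.match
```

Once every product sits in its own element (or its own row), exact matching, counting, and filtering become straightforward.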
Now let's try to get this granular view using str_split from stringr together with lapply.
Step 2: Creating granular records at Customer Level
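# Build one small data frame per customer, with one row per product
l1<-lapply(seq_len(nrow(df)), function(x){
  # Split the pipe-delimited string into individual product names
  x2<-str_split(df[x,"Product_Name"],"[|]")[[1]]
  interim.df<-data.frame(Customer = df[x,"Customer_Num"],
                         Product = x2)
  return(interim.df)
})
# Stack the per-customer data frames into a single data frame
final.df<-do.call(rbind.data.frame,l1)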
final.df
Customer Product
1 Customer 1 A
2 Customer 1 B
3 Customer 1 AB
4 Customer 1 C
5 Customer 1 X
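The same de-compression can probably be done without an explicit loop. The sketch below is an alternative in base R, not the approach used above: it splits every row at once with strsplit and repeats each customer label once per product. The second customer ("Customer 2" with "B|D") is a hypothetical addition, included only to show that the idea extends to several rows.

```r
# Dummy data; "Customer 2" is a hypothetical extra row for illustration
df <- data.frame(Customer_Num = c("Customer 1", "Customer 2"),
                 Product_Name = c("A|B|AB|C|X", "B|D"),
                 stringsAsFactors = FALSE)

# Split every compressed string on the literal pipe character
parts <- strsplit(df$Product_Name, "|", fixed = TRUE)

# Repeat each customer once per product, then flatten the product list
final.df <- data.frame(Customer = rep(df$Customer_Num, lengths(parts)),
                       Product = unlist(parts),
                       stringsAsFactors = FALSE)

final.df
```

Using fixed = TRUE treats the pipe as a plain character rather than a regular expression operator, so no escaping or character class is needed.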