Machine Learning Made Easy: optimal sequences

Sales Roadmap for an Asset Management firm using Markov Chain

Introduction

Companies that provide financial instruments such as bonds and mutual funds (Bank of America, Capital One, Wells Fargo) need the services of broker-dealers such as Morgan Stanley and Metlife Securities to spread awareness and enable market adoption. To facilitate this, sales reps from these companies stay in regular touch with the brokers to apprise them of the latest launch details, hot instruments, retirement plans and so on. All these activities can be bucketed under two sets of marketing strategies: Direct and Indirect.

Direct Marketing is when the rep visits the advisor in person and briefs them on the product details
Indirect Marketing is when the advisor is sent mails, ads, brochures etc. about the product

All of these activities incur expenses, and with the growing need to optimize marketing spend, companies are beginning to rationalize their marketing budgets. This naturally raises the question of how a sales rep should organize their daily activities. Can we come up with a plan of action that tells the sales rep to perform activity X at a certain time t1? Specifically, can we create a plan or roadmap that answers the following questions:

Which activity should be performed at what time (a detailed marketing plan)?
What is the time gap between two marketing activities?
What is the sales gain from each sequence (a sequence consists of various activities performed one after the other), so that we can shortlist the sequence with maximum gain?
What is the optimal number of activities in each sequence? Should it contain 7 activities or 8?

For this blog, we will be using:

Data: The ChannelAttribution library ships a masked activity dataset, which is sufficient for understanding the whole approach
Model: We will use a Markov chain to model the marketing activities of the sales reps. Since the number of activities is fixed and the reps generally perform these activities one after the other, a Markov chain is a good fit

Again, this blog is not a detailed explanation of Markov chains. There are several books on the subject, and even they do not cover every concept. My aim is to provide a simple step-by-step guide to leveraging a powerful model to answer some business questions, without going into the technicalities and maths of the process. Let's start by importing the libraries and the dataset.

Step 1: Read Data

The dataset is part of the ChannelAttribution package. The markovchain package is imported to create the Markov chain model. Let's import the dataset and look at the first few records.

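# Packages used in this post
package.name<-c("dplyr","tidyr","stringr","ChannelAttribution","markovchain")

# Install any package that is missing, then attach it
for(i in package.name){
  if(!require(i,character.only = TRUE)){
    install.packages(i)
  }
  library(i,character.only = TRUE)
}

# The dataset: Data is the object made available by data(PathData)
data(PathData)
df<-Data
head(df[["path"]])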

[1] "eta > iota > alpha > eta"                                                                                                                                      
[2] "iota > iota > iota > iota"                                                                                                                                     
[3] "alpha > iota > alpha > alpha > alpha > iota > alpha > alpha > alpha > alpha > alpha"                                                                           
[4] "beta > eta"                                                                                                                                                    
[5] "iota > eta > theta > lambda > lambda > theta > lambda"                                                                                                         
[6] "alpha > eta > alpha > alpha > eta > iota > iota > iota > alpha > alpha > alpha > alpha > alpha > alpha > theta > alpha > alpha > alpha > alpha > alpha > alpha"

It is a simulated data set containing information on ten thousand sales reps. Each row holds the marketing activities of one sales rep. For instance, for sales rep 1 the activities are ‘eta > iota > alpha > eta’. This means that the rep does the following:

eta first
iota as second activity
alpha as third activity
again eta as fourth activity

So this sequence contains only 4 activities. The other sequences can be read in the same way. Our goal is to find optimal sequences, with detail on which activities should be part of each sequence.
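As a quick illustration of how a path string breaks down into ordered activities, here is a minimal base R sketch. The variable names are made up for this example and are not part of the dataset.

```r
# Split one path string into its ordered activities (base R only)
path <- "eta > iota > alpha > eta"

# strsplit() cuts on ">", trimws() removes the surrounding spaces
activities <- trimws(strsplit(path, ">", fixed = TRUE)[[1]])
activities

# Number of activities in this sequence
n_activities <- length(activities)
n_activities
```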

Step 2: Understanding data engineering requirements for the Model

To model the data as a Markov chain, we need to bring it into a specific form. We need just two columns in the table: Present and Next.

A Markov chain is based on the logic that the next activity in the sequence depends only on the immediately preceding activity and not on the activities that happened before it. Hence we need just the Present and Next activity in our analysis. But how exactly do we transform the data to meet our requirements? The illustration below represents the steps to create the dataset that goes into the model:

We start with the marketing activities per rep, separated by ‘>’
Each sequence is then divided into Present and Next activity columns, with repeating activities indicated by a counter suffixed at the end, like eta1, eta2 etc.
The data is then grouped at the Present and Next activity level to get the total count of instances where each combination is present
The table is then pivoted on Next Activity such that row names represent the Present Activity and column names represent the Next Activity
The resulting data is then divided by the row sum for each row, so that every row sums to 1. The resulting matrix is called the transition matrix
The transition matrix is given as an input to the Markov chain model

Figure: Data Engineering Steps
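The steps above can be sketched end to end in a few lines of base R. This is a simplified illustration, not the code used later in this post: the two paths are made up, and the repetition suffix (eta1, eta2) is left out to keep the example short.

```r
# Toy paths, made up for illustration
paths <- c("eta > iota > alpha > eta", "iota > eta > iota")

# Split each path and pair every activity with the one that follows it
pairs <- do.call(rbind, lapply(paths, function(p) {
  a <- trimws(strsplit(p, ">", fixed = TRUE)[[1]])
  data.frame(Present = head(a, -1), Next = tail(a, -1),
             stringsAsFactors = FALSE)
}))

# Count each (Present, Next) combination; table() also does the pivot
counts <- table(pairs$Present, pairs$Next)

# Divide each row by its row sum so that every row adds up to 1
transition <- prop.table(counts, margin = 1)
transition
```

Each row of `transition` can be read as "given the present activity, how likely is each next activity".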

Step 3: Developing the logic for Present and Next Activity columns based on sample activity and training the Markov Chain Model

In this section we will take a marketing activity sequence, “eta > iota > alpha > eta”, and develop the logic for converting it into the required data frame. (The code below uses a slightly longer sample, “eta > iota > alpha > eta > lamda > iota”.) We will use lapply and the dplyr library to build this logic. The steps are summarised below.

Step 3a: Creating Present Activity and Next Activity

Split the sequence into individual elements using the str_split function
Remove spaces
Create an index column based on the resulting elements. So the index for “eta > iota > alpha > eta” will be 1,1,1,2. We have 2 here as eta is repeating
Filter out indexes greater than 5. The assumption here is that we don't want to keep more than 5 repetitions (anywhere in the sequence) of a single activity
Merge the index with the remaining activities to get each activity suffixed with its repetition count
Create a new column called ‘Next Activity’, which is equal to the lead of the Present Activity column
Remove records where Next Activity is equal to NA
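To make the index step concrete, here is a small base R sketch that reproduces the 1,1,1,2 index for “eta > iota > alpha > eta”. It uses `ave()` instead of dplyr purely for illustration; the variable names are hypothetical.

```r
activities <- c("eta", "iota", "alpha", "eta")

# ave() applies seq_along() within each group of identical activities,
# giving a running count per activity
index <- ave(seq_along(activities), activities, FUN = seq_along)
index

# Keep at most 5 repetitions of any activity, then add the suffix
keep <- index <= 5
present <- paste0(activities[keep], index[keep])

# Pair each activity with the one that follows it (the last one has no successor)
pairs <- data.frame(Present_Activity = head(present, -1),
                    Next_Activity = tail(present, -1))
pairs
```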


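# Sample path; the same code works on a vector of many paths
y<-"eta > iota > alpha > eta > lamda > iota"

l1<-lapply(y,function(x){

  # Split on ">" and strip the spaces around each activity
  y1<-gsub(" ","",str_split(x,">")[[1]])

  # Index = running count of each activity within the path (eta1, eta2, ...)
  z<-data.frame(First_Activity=y1)%>%
    group_by(First_Activity)%>%
    mutate(Index=1:n(),
           First_Activity_v1=str_c(First_Activity,Index))%>%
    ungroup()

  # Keep at most 5 repetitions per activity, then pair each activity with the next one
  z1<-z%>%
    filter(Index < 6)%>%
    select(First_Activity_v1)%>%
    rename(Present_Activity=First_Activity_v1)%>%
    mutate(Next_Activity=lead(Present_Activity))%>%
    filter(!is.na(Next_Activity))

  return(z1)

})

# Stack the per-path data frames into one
interim.df<-do.call(rbind.data.frame,l1)
interim.df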

# A tibble: 5 x 2
  Present_Activity Next_Activity
  <chr>            <chr>        
1 eta1             iota1        
2 iota1            alpha1       
3 alpha1           eta2         
4 eta2             lamda1       
5 lamda1           iota2

Step 3b: Pivoting on Next Activity Column to create transition matrix

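# Count each (Present, Next) pair, then pivot Next_Activity into columns (missing pairs filled with 0)
interim.df2<-interim.df%>%
  group_by(Present_Activity ,Next_Activity)%>%
  dplyr::summarise(Total_Count=n())%>%
  ungroup()%>%
  spread(Next_Activity,Total_Count,fill=0)%>%
  as.data.frame()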

interim.df2

  Present_Activity alpha1 eta2 iota1 iota2 lamda1
1           alpha1      0    1     0     0      0
2             eta1      0    0     1     0      0
3             eta2      0    0     0     0      1
4            iota1      1    0     0     0      0
5           lamda1      0    0     0     1      0

Step 3c: Creating the transition matrix

# Creating the row names and then deleting the column containing the Present Activity values
row.names(interim.df2)<-interim.df2$Present_Activity
interim.df2$Present_Activity<-NULL

interim.df2

       alpha1 eta2 iota1 iota2 lamda1
alpha1      0    1     0     0      0
eta1        0    0     1     0      0
eta2        0    0     0     0      1
iota1       1    0     0     0      0
lamda1      0    0     0     1      0

# Creating the row sum column and then dividing each row by its row sum
# (across() replaces mutate_at() with funs(), which is deprecated in current dplyr)

interim.df3<-interim.df2%>%
  mutate(Row_Sum=rowSums(.))%>%
  filter(Row_Sum!=0)%>%
  mutate(across(-Row_Sum, ~ . / Row_Sum))%>%
  select(-Row_Sum)%>%
  as.matrix()

interim.df3

       alpha1 eta2 iota1 iota2 lamda1
alpha1      0    1     0     0      0
eta1        0    0     1     0      0
eta2        0    0     0     0      1
iota1       1    0     0     0      0
lamda1      0    0     0     1      0
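One thing to watch in the matrix above: the row labels (alpha1, eta1, eta2, iota1, lamda1) and the column labels (alpha1, eta2, iota1, iota2, lamda1) are not the same set, because eta1 never appears as a next activity and iota2 never appears as a present activity. A transition matrix is normally defined over a single set of states, so rows and columns should line up. The sketch below shows one possible way to pad the matrix in base R. It is not part of the workflow in this post, and treating a state with no observed outgoing transition as one that stays where it is, is an assumption made only for this illustration.

```r
# Rebuild the sample matrix from Step 3c (values copied from the output above)
m <- matrix(0, nrow = 5, ncol = 5,
            dimnames = list(c("alpha1", "eta1", "eta2", "iota1", "lamda1"),
                            c("alpha1", "eta2", "iota1", "iota2", "lamda1")))
m["alpha1", "eta2"]   <- 1
m["eta1",   "iota1"]  <- 1
m["eta2",   "lamda1"] <- 1
m["iota1",  "alpha1"] <- 1
m["lamda1", "iota2"]  <- 1

# Use the union of row and column labels as the common state set
states <- sort(union(rownames(m), colnames(m)))
square <- matrix(0, length(states), length(states),
                 dimnames = list(states, states))
square[rownames(m), colnames(m)] <- m

# Assumption: a state with no observed outgoing transition (here iota2)
# keeps all its probability on itself
dead_ends <- which(rowSums(square) == 0)
square[cbind(dead_ends, dead_ends)] <- 1
square
```

With this padding every row still sums to 1 and each label means the same state in the rows and in the columns.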

Step 3d: Training Markov Chain Model

ActivityMc_Sample <- new("markovchain", transitionMatrix = interim.df3,name = "markov_chain")

Step 3e: Plotting Markov Chain Model

plot(ActivityMc_Sample, vertex.size = 40)

Step 3f: Generate sequences using the Markov Chain Model

Once the model is trained, we can generate a sequence of any length, provided we give a starting point. In the example below, we generate a sequence with:

lamda1 as the starting point (t0 in the argument refers to the initial/starting point)
n = 5 transitions, so the output has 6 states once the starting point is included
include.t0=T, which indicates that we also want to show the starting point in the sequence

sample_seq<-rmarkovchain(n = 5, object = ActivityMc_Sample, t0 = "lamda1",include.t0=T)
sample_seq

[1] "lamda1" "iota1"  "alpha1" "eta1"   "eta2"   "lamda1"

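In general, generating a sequence repeatedly from the same starting point can give a different sequence each time, because of the stochastic (random) nature of Markov chains. In this small sample, however, every transition has probability 1, so the generated sequence is always the same.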
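To see the random behaviour on a chain where the transitions are not all certain, here is a minimal base R sketch. The 3-state matrix `P` and the helper `simulate_chain()` are hypothetical, written only for this illustration; they are not part of the markovchain package.

```r
# Hypothetical 3-state transition matrix, made up for illustration
P <- matrix(c(0.0, 0.5, 0.5,
              0.3, 0.0, 0.7,
              0.6, 0.4, 0.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("alpha", "eta", "iota"),
                            c("alpha", "eta", "iota")))

# Hypothetical helper: walk n steps from `start`, drawing each next state
# from the row of P that belongs to the current state
simulate_chain <- function(P, start, n) {
  states <- rownames(P)
  path <- character(n + 1)
  path[1] <- start
  for (i in seq_len(n)) {
    path[i + 1] <- sample(states, size = 1, prob = P[path[i], ])
  }
  path
}

set.seed(1)  # fixes the random draws so the run is reproducible
simulate_chain(P, "eta", 5)
simulate_chain(P, "eta", 5)  # a second call may follow a different route
```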

Step 3g: Time taken from one activity to another in a generated sequence

The meanFirstPassageTime function gives the mean time (number of steps) required to go from one state of the sequence to another for the first time.

time_transition<-meanFirstPassageTime(ActivityMc_Sample)
time_transition

       alpha1 eta1 eta2 iota1 lamda1
alpha1      0    1    2     4      3
eta1        4    0    1     3      2
eta2        3    4    0     2      1
iota1       1    2    3     0      4
lamda1      2    3    4     1      0

time_taken_df<-data.frame(Present_Activity=sample_seq)%>%
  mutate(Next_Activity=lead(Present_Activity))%>%
  filter(!is.na(Next_Activity))%>%
  mutate(Combined=str_c(Present_Activity,"_",Next_Activity))%>%
  mutate('Time_Taken(day(s))'=sapply(Combined, function(x){
                                  
        present_activity<-str_split(x,"_")[[1]][1]                          
        next_activity<-str_split(x,"_")[[1]][2]
        time_taken<-time_transition[present_activity,next_activity]
        return(time_taken)
                                }))

time_taken_df

  Present_Activity Next_Activity     Combined Time_Taken(day(s))
1           lamda1         iota1 lamda1_iota1                  1
2            iota1        alpha1 iota1_alpha1                  1
3           alpha1          eta1  alpha1_eta1                  1
4             eta1          eta2    eta1_eta2                  1
5             eta2        lamda1  eta2_lamda1                  1

If the unit of time in the data is a day (that is, the activities are recorded at a daily level), then the transition time is reported in days. If it is weeks, then it is reported in weeks, based on the values from the above matrix. Also note that the diagonal values are zero, as no time is taken to transition within the same state (activity).
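The numbers in the matrix above can be checked by hand. For a target state j, the mean first passage time from any other state i satisfies m(i, j) = 1 + the sum over k ≠ j of P(i, k) × m(k, j), which is a small linear system. The sketch below solves it in base R for the five transitions that appear in the generated sequence above; it illustrates the idea and is not how the markovchain package computes the result internally.

```r
states <- c("alpha1", "eta1", "eta2", "iota1", "lamda1")

# Transitions as they appear in the generated sequence, each with probability 1
P <- matrix(0, 5, 5, dimnames = list(states, states))
P["lamda1", "iota1"]  <- 1
P["iota1",  "alpha1"] <- 1
P["alpha1", "eta1"]   <- 1
P["eta1",   "eta2"]   <- 1
P["eta2",   "lamda1"] <- 1

mfpt <- matrix(0, 5, 5, dimnames = list(states, states))
for (j in states) {
  others <- setdiff(states, j)
  Q <- P[others, others]  # transitions that do not enter the target j
  # Solve (I - Q) m = 1 for the expected steps from each other state to j
  mfpt[others, j] <- solve(diag(length(others)) - Q, rep(1, length(others)))
}
mfpt
```

The first row reproduces the alpha1 row of the matrix above: 1 step to eta1, 2 to eta2, 3 to lamda1 and 4 to iota1.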

Model on complete data in the next blog

In the next blog, we will look at:

The Markov chain model for the entire dataset
How to select the optimal sequence between competing sequences with the same starting activity
How to determine the optimal length of the generated sequence

Link to Previous R Blogs

R complete Guide Python Coding Exercise

Link to Youtube Channel

https://www.youtube.com/playlist?list=PL6ruVo_0cYpV2Otl1jR9iDe6jt1jLXgAq

Friday, February 11, 2022