Reset a Running Counter based on a condition
Parag Verma
15 June, 2022
Introduction
There are certain situations where we have to identify instances where:
- Sales has reduced consecutively
- Stock Prices have declined
- A certain medicine has been consumed prior to another drug
These are some of the examples where we need to leverage the usage of a running counter and reset it based on the condition/criteria specified.We will take an example and explain how it can done.
package.name<-c("dplyr","stringr","data.table")
for(i in package.name){
if(!require(i,character.only = T)){
install.packages(i)
}
library(i,character.only = T)
}
Step 1: Creating dummy data frame
We will create a data frame with week number and weekly sales data. The goal is to identify most weeks of consecutive weekly decline.
df<-data.frame(Product=rep("A",10),
Week=1:10,
Sales=c(10,9,8,5,12,12,11,10,10,15))
df
Product Week Sales
1 A 1 10
2 A 2 9
3 A 3 8
4 A 4 5
5 A 5 12
6 A 6 12
7 A 7 11
8 A 8 10
9 A 9 10
10 A 10 15
Step 2: Difference between Sales and Lagged Sales
interim.df<-df%>%
mutate(Lagged_Sales=lag(Sales))%>%
mutate(Diff_Sales=Sales-Lagged_Sales)%>%
mutate(Flag=ifelse(Diff_Sales < 0 ,1,0))%>%
mutate(Flag2=ifelse(is.na(Flag),0,Flag))%>%
select(-Flag)%>%
rename(Flag = Flag2)
interim.df
Product Week Sales Lagged_Sales Diff_Sales Flag
1 A 1 10 NA NA 0
2 A 2 9 10 -1 1
3 A 3 8 9 -1 1
4 A 4 5 8 -3 1
5 A 5 12 5 7 0
6 A 6 12 12 0 0
7 A 7 11 12 -1 1
8 A 8 10 11 -1 1
9 A 9 10 10 0 0
10 A 10 15 10 5 0
Step 3: Create a Counter based on Flag
We will be creating a grouping variable using cumsum and lag of Flag variable and assign row_number as counter for every group.Whenever there is a transition to 1 either from a 0 or 1, the Counter changes value and gets incremented by 1.For other values, it remains as it is
interim.df2<-interim.df%>%
select(-Lagged_Sales,-Diff_Sales)%>%
mutate(Counter = cumsum(Flag == 1))
interim.df2
Product Week Sales Flag Counter
1 A 1 10 0 0
2 A 2 9 1 1
3 A 3 8 1 2
4 A 4 5 1 3
5 A 5 12 0 3
6 A 6 12 0 3
7 A 7 11 1 4
8 A 8 10 1 5
9 A 9 10 0 5
10 A 10 15 0 5
We can see that Counter variable gets incremented by 1 whenver the value of Flag is 1.For ther values, it takes the first lagged value
Step 4: Getting weeks of Most consecutive Sales Decline
We can use the rleid function to get this information
interim.df3<-interim.df2%>%
mutate(Counter2 = rleid(Flag))
interim.df3
Product Week Sales Flag Counter Counter2
1 A 1 10 0 0 1
2 A 2 9 1 1 2
3 A 3 8 1 2 2
4 A 4 5 1 3 2
5 A 5 12 0 3 3
6 A 6 12 0 3 3
7 A 7 11 1 4 4
8 A 8 10 1 5 4
9 A 9 10 0 5 5
10 A 10 15 0 5 5
interim.df3%>%
group_by(Counter2)%>%
summarise(Total_Such_Weeks=n())%>%
arrange(desc(Total_Such_Weeks))
# A tibble: 5 x 2
Counter2 Total_Such_Weeks
<int> <int>
1 2 3
2 3 2
3 4 2
4 5 2
5 1 1
We can see that most week of consecutive sales decline is 3
Parting Comments
In this blog we looked at a very simple example of how We can use a simple strategy to reset a counter in R
R Complete Guide
Python Complete Guide
https://www.aimlmadeeasy.com/2021/09/python-complete-guide.html