Wednesday, June 15, 2022

Reset a running counter based on a value in R

Reset a Running Counter based on a condition


Introduction

There are certain situations where we have to identify instances where:

  • Sales has reduced consecutively
  • Stock Prices have declined
  • A certain medicine has been consumed prior to another drug

These are some of the examples where we need to leverage the usage of a running counter and reset it based on the condition/criteria specified.We will take an example and explain how it can done.


package.name<-c("dplyr","stringr","data.table")

for(i in package.name){

  if(!require(i,character.only = T)){

    install.packages(i)
  }
  library(i,character.only = T)

}


Step 1: Creating dummy data frame

We will create a data frame with week number and weekly sales data. The goal is to identify most weeks of consecutive weekly decline.

df<-data.frame(Product=rep("A",10),
               Week=1:10,
               Sales=c(10,9,8,5,12,12,11,10,10,15))
df
   Product Week Sales
1        A    1    10
2        A    2     9
3        A    3     8
4        A    4     5
5        A    5    12
6        A    6    12
7        A    7    11
8        A    8    10
9        A    9    10
10       A   10    15


Step 2: Difference between Sales and Lagged Sales

interim.df<-df%>%
  mutate(Lagged_Sales=lag(Sales))%>%
  mutate(Diff_Sales=Sales-Lagged_Sales)%>%
  mutate(Flag=ifelse(Diff_Sales < 0 ,1,0))%>%
  mutate(Flag2=ifelse(is.na(Flag),0,Flag))%>%
  select(-Flag)%>%
  rename(Flag = Flag2)

interim.df
   Product Week Sales Lagged_Sales Diff_Sales Flag
1        A    1    10           NA         NA    0
2        A    2     9           10         -1    1
3        A    3     8            9         -1    1
4        A    4     5            8         -3    1
5        A    5    12            5          7    0
6        A    6    12           12          0    0
7        A    7    11           12         -1    1
8        A    8    10           11         -1    1
9        A    9    10           10          0    0
10       A   10    15           10          5    0


Step 3: Create a Counter based on Flag

We will be creating a grouping variable using cumsum and lag of Flag variable and assign row_number as counter for every group.Whenever there is a transition to 1 either from a 0 or 1, the Counter changes value and gets incremented by 1.For other values, it remains as it is

interim.df2<-interim.df%>%
  select(-Lagged_Sales,-Diff_Sales)%>%
  mutate(Counter = cumsum(Flag == 1))

interim.df2
   Product Week Sales Flag Counter
1        A    1    10    0       0
2        A    2     9    1       1
3        A    3     8    1       2
4        A    4     5    1       3
5        A    5    12    0       3
6        A    6    12    0       3
7        A    7    11    1       4
8        A    8    10    1       5
9        A    9    10    0       5
10       A   10    15    0       5

We can see that Counter variable gets incremented by 1 whenver the value of Flag is 1.For ther values, it takes the first lagged value


Step 4: Getting weeks of Most consecutive Sales Decline

We can use the rleid function to get this information

interim.df3<-interim.df2%>%
  mutate(Counter2 = rleid(Flag))

interim.df3
   Product Week Sales Flag Counter Counter2
1        A    1    10    0       0        1
2        A    2     9    1       1        2
3        A    3     8    1       2        2
4        A    4     5    1       3        2
5        A    5    12    0       3        3
6        A    6    12    0       3        3
7        A    7    11    1       4        4
8        A    8    10    1       5        4
9        A    9    10    0       5        5
10       A   10    15    0       5        5


interim.df3%>%
  group_by(Counter2)%>%
  summarise(Total_Such_Weeks=n())%>%
  arrange(desc(Total_Such_Weeks))
# A tibble: 5 x 2
  Counter2 Total_Such_Weeks
     <int>            <int>
1        2                3
2        3                2
3        4                2
4        5                2
5        1                1

We can see that most week of consecutive sales decline is 3

Parting Comments

In this blog we looked at a very simple example of how We can use a simple strategy to reset a counter in R

My Youtube Channel

Web Scraping Tutorial 4- Getting the busy information data from Popular time page from Google

Popular Times Popular Times In this blog we will try to scrape the ...