Consecutive count of values based on multiple conditions

4 views (last 30 days)
I have a very big table consisting of approximately 3 million rows. The table consists of four columns [ID period amount pbehind], see the following screenshot including the output column I want.
For every ID, I want to count the consecutive amount of times of behind payment (pbehind). But there are a few conditions to this:
  1. when pbehind <= 0.5, then output = 0
  2. when pbehind > 1.5, but value in previous period is 0, then output = 0;
  3. when pbehind > 0.5, but amount in next period (which is last period) is zero, then output=0.
There are a couple of things to note:
  1. period does not have to start at 1 and increase by 1 (see ID = 2). However it is ordered by ID and period using sortrows(table,[1,2]).
  2. An ID can make multiple payments for a certain period, however pbehind will always be equal to for that period (see ID=3).
I know for the first condition that I can just do the following:
output = pbehind;
output(output<=0.5)=0;
But I cannot figure out how to efficiently implement the other conditions. Does anyone know how to do this?

Answers (2)

Peter Perkins
Peter Perkins on 19 Nov 2020
Put your data in a table, and use rowfun with ID as the grouping variable. Write a function that does all the logic you need to enforce. Your function will be applied to one group of rows at a time, and should return a column vector of output values. Here's a simple example:
>> t = table([1;1;1;2;2],rand(5,1),rand(5,1),'VariableNames',{'ID' 'X' 'Y'})
t =
5×3 table
ID X Y
__ ________ ________
1 0.82202 0.26854
1 0.041591 0.63908
1 0.91635 0.031734
2 0.17678 0.33395
2 0.92236 0.8908
>> t2 = rowfun(@(x,y) x - mean(y),t,'GroupingVariable','ID')
t2 =
5×3 table
ID GroupCount Var3
__ __________ ________
1 3 0.50891
1 3 -0.27152
1 3 0.60323
2 2 -0.4356
2 2 0.30998
>> t.Output = t2.Var3
t =
5×4 table
ID X Y Output
__ ________ ________ ________
1 0.82202 0.26854 0.50891
1 0.041591 0.63908 -0.27152
1 0.91635 0.031734 0.60323
2 0.17678 0.33395 -0.4356
2 0.92236 0.8908 0.30998
Obviously you will need to write your own function.
  1 Comment
mtango
mtango on 19 Nov 2020
The problem is that I cannot figure out how to create code to count the consecutive amount of times of behind payment (pbehind) for each ID...

Sign in to comment.


Setsuna Yuuki.
Setsuna Yuuki. on 19 Nov 2020
you should try with a series of if..else. For example:
for i = length(output)
if(pbehind(i) > 1.5 && pbehind(i-1) == 0)
output(i) =0;
elseif(condition)
...
end

Categories

Find more on Tables in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!