Generate hourly timeseries data from "START,END,VALUE" data
Show older comments
Hello everyone,
I got the attached data file of events with the following structure:
- START_DATE (dd.mm.yyyy)
- START_TIME (hh:mm:ss) [24h]
- END_DATE (dd.mm.yyyy) can be > than START_DATE (event goes form one day to another)
- END_TIME (hh:mm:ss) [24h]
- VALUE for the event [< 0]
I assume that the Value is equal distributed during the time. I would like the transform the Data in hourly data and have one sum of value for every hour. Maybe this makes it a little bit more clear.

The duration of the events can be very short (not less than one minute) and long (more than one day overnight).
Has anyone here any Idea how to do that with matlab? I tried to generate a timeseries with every possible hour and then sum if the Data points fits in that timeslot but this doesn’t really work out. I’m really looking forward to your answers.
Best wishes,
Lukas
2 Comments
dpb
on 9 May 2022
Hmmm.....this is seemingly simple problem, but I don't see a ready-build solution in MATLAB that "just works" out of the box, at least not OTOMH.
It doesn't quite fit the timetable; it might be closer to the timeseries with events, but neither is set up to handle the input form very directly.
I've got an idea that could work -- it would entail building a timetable for each even, then synchronize() all of those to build the one composite timetable. That done, then retime would be able to do the dirty for the output summation hourly.
Let me play around here a little and see if this goes anywhere...
dpb
on 9 May 2022
There is more time on an hourly basis that doesn't have an event than there is that does so your graph above is somewhat misleading -- the following is the beginning of the above idea for the first eight (8) records in the data file (I just used head(ttT) as a working dataset after creating a time table from each row by using the start/end times and duplicating the VALUE.
tT=readtable('exemple_Data_points.xlsx'); % read as a table
tmp=head(tT); % just save a few for testing ideas
tmp.RT1=tmp.START_DATE+days(tmp.START_TIME); % got to turn into full datetimes
tmp.RT2=tmp.END_DATE+days(tmp.END_TIME);
for i=1:height(tmp) % turn each record into a timetable of two times
ttT{i}=table2timetable( ...
table([tmp.RT1(i);tmp.RT2(i)],repmat(tmp.VALUE(i),2,1), ...
'VariableNames',{'Time','Value'}),'RowTimes','Time');
end
ttTT=ttT{1}; % now start with the first to combine/synchronize
for i=2:numel(ttT) % add all the others at their times; catenates VALUE horizontally
ttTT=synchronize(ttTT,ttT{i});
end
ttTT.Properties.VariableNames=compose('V%d',1:numel(ttTT)); % just some short variable names
stackedplot(ttTT,'Color','k','LineWidth',3) % show what the first eight look like

Which shows the sizable gaps between events...
So, that raises Q? of what to do in between when we retime (which does work, btw, to fill in between the start/stop times) -- the default filler is missing value or NaN which is what shows up as empty space on above plot; a zero would give a solid line at zero for each set.
Also, how much granularity is needed? The original Q? asked for hourly; that will truncate items of less length than that with the normal synchronize method I believe although the shortest in the above data is 2 hr
>> tmp.DURATION=tmp.RT2-tmp.RT1
tmp =
8×8 table
START_DATE START_TIME END_DATE END_TIME VALUE RT1 RT2 DURATION
___________ __________ ___________ ________ _____ ____________________ ____________________ ________
02-Apr-2013 0.54167 02-Apr-2013 0.66667 800 02-Apr-2013 13:00:00 02-Apr-2013 16:00:00 03:00:00
04-Apr-2013 0.10417 04-Apr-2013 0.33333 485 04-Apr-2013 02:30:00 04-Apr-2013 08:00:00 05:30:00
04-Apr-2013 0.10417 04-Apr-2013 0.33333 495 04-Apr-2013 02:30:00 04-Apr-2013 08:00:00 05:30:00
05-Apr-2013 0.5625 05-Apr-2013 0.80208 288 05-Apr-2013 13:30:00 05-Apr-2013 19:15:00 05:45:00
06-Apr-2013 0.083333 06-Apr-2013 0.33333 475 06-Apr-2013 02:00:00 06-Apr-2013 08:00:00 06:00:00
06-Apr-2013 0.11458 06-Apr-2013 0.25 325 06-Apr-2013 02:45:00 06-Apr-2013 06:00:00 03:15:00
06-Apr-2013 0.25 06-Apr-2013 0.33333 100 06-Apr-2013 06:00:00 06-Apr-2013 08:00:00 02:00:00
07-Apr-2013 0.072917 07-Apr-2013 0.20833 16 07-Apr-2013 01:45:00 07-Apr-2013 05:00:00 03:15:00
>>
is the created initial table from which the timetables were created.
Accepted Answer
More Answers (0)
Categories
Find more on Logical in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!