Determine data cells in Excel

1 view (last 30 days)
Erik Eriksson
Erik Eriksson on 26 Apr 2024
Commented: Sam Chak on 30 May 2024
I have a large Excel-file (frequencydata.xls) containing data for grid frequency. The file contains data for one day (24 h) with 0.1s resulotion. Meaning that the file has 864 000 rows. Each row looks like this: 2023-01-01 00:00:00:0 50.0000
I know want to determine each rows that has the frequency decrease below 49.9000 AND also whihc rows has a frequency greater than 50.1000.
SO the desired output would be something like this:
Frequency > 50.1000: 2023-01-03 00:00:00:0 50.1500
2023-01-03 00:00:00:1 50.1510
2023-01-03 00:00:00:2 50.1520
2023-02-05 06:00:00:2 50.1620 etc
Frequency < 49.9000: 2023-05-03 00:00:00:0 49.8500
2023-05-03 00:00:00:1 49.8510
2023-07-03 00:50:00:2 49.8520
2023-07-05 06:00:00:2 49.81620 etc
If it would be possible to get the duration for each consecutive deviation as output also it would be awesome! Appriciate all help!
  1 Comment
Saurav
Saurav on 26 Apr 2024
Can you attach your "frequencydata.xls" file if possible for better help?

Sign in to comment.

Answers (1)

sai charan sampara
sai charan sampara on 26 Apr 2024
Edited: sai charan sampara on 30 Apr 2024
Hello Erik,
The following code might help you. I have used random data in the Excel file to verify the code. To get the timestamps that satisfy the required condition on frequency you can use logical indexing as shown below:
data = readtable("frequencydata.xlsx");
timestamps = data{:, 1};
frequencies = data{:, 2};
idx1=(frequencies > 50.1);
timestamps_gt_50 = timestamps(idx1);
disp(timestamps_gt_50);
{'2023-01-03 00:00:00:0' } {'2023-01-03 00:00:00:3' } {'2023-01-03 00:00:00:4' } {'2023-01-03 00:00:00:5' } {'2023-01-03 00:00:00:6' } {'2023-01-03 00:00:00:10'} {'2023-01-03 00:00:00:12'} {'2023-01-03 00:00:00:14'} {'2023-01-03 00:00:00:18'} {'2023-01-03 00:00:00:21'} {'2023-01-03 00:00:00:22'} {'2023-01-03 00:00:00:23'} {'2023-01-03 00:00:00:24'}
To get the duration you can use the "datetime" data type in MATLAB. But to use that you must convert the data into the required format. I have done that using a function "convtodatetime". This function uses regular expressions to extract the required data and define the new "datetime" variable. Once the data is in "datetime" format you can use the "between" function to get the duration between 2 consecutive deviations.
for i=1:length(timestamps_gt_50)-1
disp(between(conv2datetime(timestamps_gt_50{i}),conv2datetime(timestamps_gt_50{i+1})))
end
0h 0m 0.003s 0h 0m 0.001s 0h 0m 0.001s 0h 0m 0.001s 0h 0m 0.004s 0h 0m 0.002s 0h 0m 0.002s 0h 0m 0.004s 0h 0m 0.003s 0h 0m 0.001s 0h 0m 0.001s 0h 0m 0.001s
function out=conv2datetime(str)
expression ='[\s:-]';
splitStr = regexp(str,expression,'split');
newstr=str2double(splitStr);
out=datetime(newstr(1),newstr(2),newstr(3),newstr(4),newstr(5),newstr(6),newstr(7));
end
Similar steps can be followed to get timestamps and duartions for frequencies less than 49.9.
  4 Comments
Erik Eriksson
Erik Eriksson on 30 Apr 2024
I have managed to get my code to write out the amount of seconds for each hour that the deviation is great or less than 50.1 and 49.9 respectively. However my code save everything in a map of 31x1 cell containing 31 maps of 24x1 of data. However i would like to have all this data in one signle map, that means that it only saves one map for values "timestamp_gt_50_all" and "timestamps_lt_49_all".
Hope you understands.
this is my code:
clear all; clc
% Collects a list of all files in folder "2023-01"
fileList = dir('2023-01/*.csv');
% Creates cell arrays to store timestamps for each day.
timestamps_gt_50_all = cell(numel(fileList), 1);
timestamps_lt_49_all = cell(numel(fileList), 1);
% Loopa over each file in the list
for i = 1:numel(fileList)
% Creat the complete file path for the actual file.
filename = fullfile('2023-01/', fileList(i).name);
% Read in data from file
data = readtable(filename);
% Extract timestamps and frequency from data
timestamps = data{:, 1};
frequencies = data{:, 2};
% Calculate index for frequencies over 50.1 and beloq 49.9.
idx1 = (frequencies > 50.1);
idx2 = (frequencies < 49.9);
% Extract timestamps for frequencies over 50.1 and below 49.9
timestamps_lt_49 = timestamps(idx2);
% Split timestamps for each hour.
hourly_timestamps = split_timestamps(timestamps);
% Calculate the amount of seconds for deviations over 50.1 for each hour.
gt_50_seconds_per_hour = count_seconds(hourly_timestamps, timestamps_gt_50);
% Calculate the amount of seconds for deviations under 49.9 for each hour.
lt_49_seconds_per_hour = count_seconds(hourly_timestamps, timestamps_lt_49);
% Svae the results for each day.
timestamps_gt_50_all{i} = gt_50_seconds_per_hour;
timestamps_lt_49_all{i} = lt_49_seconds_per_hour;
end
% Function for splitting the timestamps for each hour.
function hourly_timestamps = split_timestamps(timestamps)
hourly_timestamps = cell(24, 1); % Creat a cell array for each hour.
for h = 0:23
hour_start = datetime(year(timestamps(1)), month(timestamps(1)), day(timestamps(1)), h, 0, 0);
hour_end = hour_start + hours(1) - seconds(1);
idx = timestamps >= hour_start & timestamps <= hour_end;
hourly_timestamps{h+1} = timestamps(idx);
end
end
% Function for calculating amount of seconds with devaiotns for each hour.
function seconds_per_hour = count_seconds(hourly_timestamps, timestamps)
seconds_per_hour = zeros(24, 1);
for h = 0:23
if isempty(hourly_timestamps{h+1})
continue;
end
hour_start = datetime(year(hourly_timestamps{h+1}(1)), month(hourly_timestamps{h+1}(1)), day(hourly_timestamps{h+1}(1)), h, 0, 0);
hour_end = hour_start + hours(1) - seconds(1);
idx = timestamps >= hour_start & timestamps <= hour
Sam Chak
Sam Chak on 30 May 2024
Is there any update on this unresolved issue? If you wish to pursue it further, providing feedback would enable @sai charan sampara or other interested users to review and update the old solution.

Sign in to comment.

Products


Release

R2023b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!