How to detect repetition in data?

Jacqueline on 2 Jul 2013
Hello,
So I have various data files containing different information, such as engine speed, engine torque, etc. Each file has about 10,000 points, one for each second (so the data was gathered for over two hours). I'm trying to analyze the data so that if the value stays the same for 60 seconds, it is flagged as an error in the data. For example, if the engine speed was 79.356 for 60 consecutive data points, there is an error.
How do I go about doing this?

Accepted Answer

Evan on 2 Jul 2013
Edited: Evan on 2 Jul 2013
Do you only want to identify adjacent points of repetition in your data, or any points that are not unique? If you want the former, you could try loading the data into a numerical matrix and using the "diff" command across your vector. Any point where the difference is zero is a repeated value. In this way you can determine the beginning, extent, etc. of data repetition for whatever conditions you need to meet to throw an error.
Example:
data = 100*rand(1,10000); %random dataset
data(1,50:120) = 79.356; %set some data to constant value
datarep = ~diff(data);
Now, you can count the run-length of each set of repeated data. There might be other ways of doing it, but for run lengths I often convert to a string and use "regexp."
s = regexprep(num2str(datarep),' ',''); %convert to string, remove spaces
[ids, runs] = regexp(s,'1+','start','match'); %start index and text of each run of 1s
l = cellfun('length',runs); %length of each run
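In this way, ids tells you where each set of repeated values starts, and l tells you the length of each run in datarep. Because each 1 in datarep marks a pair of equal neighbours, a run of length l corresponds to l+1 identical data points, so 60 identical points in a row show up as a run with l >= 59. This gives you enough information to check whether your error conditions are met.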
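For the 60-second condition specifically, the run lengths can also be found without going through strings. The following is a minimal sketch rather than part of the original answer: the names minRun, isSame, d, runStart, runEnd, runLen, badStart and badLen are chosen here for illustration, it assumes data is a row vector, and it assumes "the same" means exactly equal values.

```matlab
data = 100*rand(1,10000);          % random dataset, as in the example above
data(1,50:120) = 79.356;           % 71 identical points in a row
minRun = 60;                       % assumed threshold: 60 identical samples

isSame   = diff(data) == 0;        % true where a point equals the next one
d        = diff([0, isSame, 0]);   % +1 where a run starts, -1 where it ends
runStart = find(d == 1);           % index in data of the first point of each run
runEnd   = find(d == -1);          % index in data of the last point of each run
runLen   = runEnd - runStart + 1;  % number of identical points in each run

bad      = runLen >= minRun;       % runs long enough to count as an error
badStart = runStart(bad);          % where each offending run begins
badLen   = runLen(bad);            % how many points each offending run spans
```

For the example data this should report a single offending run starting at index 50 with a length of 71 points. If badStart is empty, no value was held for 60 or more consecutive points.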
  1 Comment
Jacqueline on 3 Jul 2013
I kind of get what you're doing. I understand the diff function, but you lost me with the rest. When I use the diff function on a variable, it gives me a long list of numbers. I want to know if/where there are 60 zeros in a row, because that is where the data has not changed for 60 seconds. How do I do that?


More Answers (1)

Kwen on 2 Jul 2013
I would use a loop and the unique function.
I'm not sure your problem is well defined, though. Even with the unique function, a value can occur more than 60 times in total without being repeated consecutively, and that would not necessarily be an error?
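To illustrate that distinction, here is a small sketch (not from the original answer; the data and variable names are made up for illustration). It contrasts the total count of a value, obtained with unique and accumarray, with the longest consecutive run, obtained with a plain loop.

```matlab
% Value 5 occurs 70 times in total, but never twice in a row.
data = zeros(1,140);
data(1:2:end) = 5;                 % 5 at every odd index
data(2:2:end) = 101:170;           % distinct values in between

% Total occurrences of each distinct value (ignores adjacency).
[vals, ~, idx] = unique(data);
counts = accumarray(idx(:), 1);
totalOfFive = counts(vals == 5);

% Longest run of consecutive identical values, using a loop.
longest = 1;
current = 1;
for k = 2:numel(data)
    if data(k) == data(k-1)
        current = current + 1;     % still inside the same run
    else
        current = 1;               % value changed, start a new run
    end
    longest = max(longest, current);
end
```

Here totalOfFive is 70 while longest is 1, so a count based on unique alone would flag data that never actually stays constant.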
