How do I identify close matched regions from this vector?

Hello,
I have this 1D vector, posted on Pastebin.
Just by looking at it I can tell that I have 4 different regions: 0 to 122, then 375 to 563, 1145 to 1292, and 1697 to 2242. This is based on how much the values "jump" from one region to the next.
Is there any way in MATLAB to identify these regions from this vector?
Thank you

Accepted Answer

Star Strider on 20 Mar 2014
Edited: Star Strider on 21 Mar 2014
There may be more efficient ways to do what you want, but this at least seems to work:
v = [ 9 18 21 58 59 60 63 66 69 70 72 74 ...
dv = diff([0 v]); % Create difference vector
dvs = [mean(dv) std(dv)]; % Determine mean & std
dvd = [0 find(dv > 1.96*dvs(2)+dvs(1)) length(v)]; % Use ‘dvs’ to detect discontinuities & create index reference vector
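for k1 = 1:length(dvd)-1
    % Each range runs from dvd(k1)+1 to dvd(k1+1)-1, so the samples at the indices stored in 'dvd' are not included
    vs{k1} = v(dvd(k1)+1:dvd(k1+1)-1); % Create cell array of regions-of-interest
    vi{k1} = [dvd(k1)+1 dvd(k1+1)-1]; % Create reference array of start-end indices for 'vs' regions
end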
The v vector is your data, the vi cell array contains the beginning and end indices of your segments-of-interest, and the vs cell array contains the data in your segments-of-interest. A cell array is necessary for vs because the vectors are of different lengths. (See the documentation for cell2mat to convert cell arrays back into doubles.)
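For anyone who wants to try this without the Pastebin data, here is a minimal self-contained sketch on a made-up vector. The values in v are an assumption for illustration, not the asker's data. It uses the same mean-plus-1.96-standard-deviations threshold, but as a variation it pads the difference vector with v(1) instead of 0 and keeps the sample at each jump as the first element of the next segment, so the index arithmetic differs slightly from the loop above. The names thr, starts, stops, segs and idx are hypothetical.

```matlab
% Synthetic stand-in for the real data (assumed values): four regions separated by large jumps
v = [1:20, 101:120, 201:220, 301:320];

dv  = diff([v(1) v]);                 % difference vector; first entry is 0 by construction
thr = mean(dv) + 1.96*std(dv);        % same adaptive threshold rule as in the answer

starts = [1 find(dv > thr)];          % index of the first sample of each segment
stops  = [starts(2:end)-1 numel(v)];  % index of the last sample of each segment

nseg = numel(starts);
segs = cell(1, nseg);                 % data in each segment (lengths may differ)
idx  = zeros(nseg, 2);                % [start end] indices, one row per segment
for k = 1:nseg
    segs{k}   = v(starts(k):stops(k));
    idx(k, :) = [starts(k) stops(k)];
end
```

Because every segment here is stored as a row vector, [segs{:}] concatenates them back into the original v, which is a quick way to check that no samples were lost.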
  9 Comments
Star Strider on 28 Mar 2014
Edited: Star Strider on 28 Mar 2014
Thank you, Image Analyst.
Faraz, I don’t respond to MATLAB Answers e-mail for the reasons Image Analyst listed (and others). I believe everything should be kept posted here for the sake of continuity. I check my profile page to see if there has been any activity in anything I’ve answered.
The difference vector does two things: (1) it approximates the slope of the data between the discontinuities, thereby removing the offset, and (2) it creates ‘spikes’ at the discontinuities, making them much easier to detect. The 95% confidence limits are simply an adaptive way of making the code work for a large number of different data sets. In the dvd statement, my code looks for the beginning and end of each segment as defined by the discontinuities. The lower limits are offset by 1, so I started dvd with 0; that way the code picks data from index 1 up to the first discontinuity, continues from each discontinuity to the next, and finishes with the stretch from the last discontinuity to the end of the vector.
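To make the ‘spikes’ concrete, here is a small sketch of just the detection step. The vector v is made up for illustration (two regions with a slope of 2 and one large step), and the names thr and jumps are hypothetical; the dv and dvd lines follow the answer's code.

```matlab
% Assumed example data: two regions with slope 2, separated by one large step
v = [2:2:20, 200:2:218];

dv    = diff([0 v]);                 % roughly constant inside a region, one spike at the step
thr   = mean(dv) + 1.96*std(dv);     % adaptive threshold taken from the data itself
jumps = find(dv > thr);              % index of the first sample after each discontinuity
dvd   = [0 jumps length(v)];         % index reference vector, as in the answer
```

Here dv is 2 everywhere except at index 11, where it is 180, so only that entry exceeds the threshold and dvd becomes [0 11 20]. The loop in the answer would then take indices 1 to 10 and 12 to 19.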
This works for your data because they are not noisy, but it is not a general solution. Noisy data or data with significant variations between the step discontinuities would pose different problems.
Probability and statistics are fascinating areas, and I wish I knew more about them than I do. I certainly suggest you take courses in them.
Faraz on 28 Mar 2014
@StarStrider and @ImageAnalyst
Thanks a lot, you guys were very helpful in your explanations. I believe I have a firm grip of the solution now.
Duly noted and agreed with the email thing, will post here from now on.


More Answers (1)

Image Analyst on 20 Mar 2014
What do you mean by identify? It seems like you just did when you described the range of values that each region takes. Do you want the indexes of each class? Do you want those intensity ranges specified automatically depending on each vector, as if you used kmeans() or something? You (or someone) tagged it with Image Processing, so do you want to do connected components analysis (useful if some regions in an intensity range are not touching each other)?
  14 Comments
Star Strider on 26 Mar 2014
I had to go read about regionprops and related functions, since I don’t do much image processing. (I intend to explore the File Exchange for demos and tutorials, but not just now.)
I feel as though I managed to jump into the middle of something here without knowing the wider context, and I’m still not certain I do. I always assume that Faraz and others who post here have designed their studies carefully, knowing how they intend to acquire and analyse their data, and post here for help in dealing with unanticipated problems. (Too often that is not the situation, and people decide to design their studies after they have gathered their data, but I did not get that impression here.)
I’m glad I could help, and I apologise for contributing to any confusion.
Image Analyst on 26 Mar 2014
You've been very helpful. It can get confusing when people don't give the whole context, so we don't know what the big picture is. Even more frustrating is when you know the big picture but people are dead set on going down a path that you know is a dead end, and when you suggest a workable approach they continue to try their dead-end approach.
