How to extract specific rows from big data table.(87000x35)
Show older comments
I have big data in spreadsheet and i imported it in MATLAB as a table. Now I want to filter data according to status of the machine i.e RUNNING. How to extract rows which will give me only the details i.e of all rows, when the machine is running.
4 Comments
Bob Thompson
on 5 Mar 2018
Most basic way is to run for loop to examine each row and include an if statement. You can also do logic checks like ismember() or isnan() if you have NaN values somewhere. Have you tried any of these things? Have you run into any problems? Is there some more specific problem you are facing?
Rashdeep Singh
on 5 Mar 2018
Edited: Rashdeep Singh
on 5 Mar 2018
Bob Thompson
on 6 Mar 2018
Are you only examining the one column or are you doing multiple checks? If you need multiple things one of the things you can do is run a for loop for each line and check for multiple things. It will get a little bit messy in the if statement, but it will allow you to only look through the table once, rather than multiple times.
Walter Roberson
on 6 Mar 2018
Processing a row at a time will be slower than needed. MATLAB tables are arranged as having one data type attribute per variable rather than for each row entry of the variable, so if you access an entire column (variable) you only have the data type checking overhead once per variable, but if you access a row at a time, you have the data type checking for each variable for each row. When memory is available it is more efficient to create a logical mask or group index for each column and join the results with & or | to create logical masks . Or consider using findgroups() and rowfun().
Accepted Answer
More Answers (2)
Rashdeep Singh
on 5 Mar 2018
Edited: Rashdeep Singh
on 5 Mar 2018
0 votes
2 Comments
Bob Thompson
on 6 Mar 2018
What do you mean by "create the table variable"? From your code posted above you have HPperformance as the matlab table which is your entire excel data. You shouldn't need to reinvoke the readtable() command unless you are looking at a new excel file.
If you are asking if you can avoid reading the entire HPperformance table then my answer is more situational.
The reality is you have 87000 pieces of data in one column, even with the best of matlab that still a fair bit of data, and is going to take some time to read through.
Rashdeep Singh
on 6 Mar 2018
Peter Perkins
on 8 Mar 2018
0 votes
In addition to what othjers have suggested, it's almost certain that you want to convert the machine status to a categorical variable rather than leaving it as text. Smaller and faster.
Categories
Find more on Spreadsheets in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!