How to extract specific rows from big data table.(87000x35)

I have big data in spreadsheet and i imported it in MATLAB as a table. Now I want to filter data according to status of the machine i.e RUNNING. How to extract rows which will give me only the details i.e of all rows, when the machine is running.

4 Comments

Most basic way is to run for loop to examine each row and include an if statement. You can also do logic checks like ismember() or isnan() if you have NaN values somewhere. Have you tried any of these things? Have you run into any problems? Is there some more specific problem you are facing?
yes i am using ismember() but because the data is too large, MATLAB is taking too much time to execute the code. Anyways it'd be great if you can check and comment on my code.
HPperformance = readtable('HPperformacne.xlsx'); %reading the excel file into HPperformance%
col_seapassage = HPperformance.State; %reading only the State coloumn%
seapassage_rows = ismember(col_seapassage,'Sea passage'); %checking the rows which has Sea passage as its state%
seapassage_table = HPperformance(seapassage_rows,:); %storing the filtered data in seapassage_table%
Are you only examining the one column or are you doing multiple checks? If you need multiple things one of the things you can do is run a for loop for each line and check for multiple things. It will get a little bit messy in the if statement, but it will allow you to only look through the table once, rather than multiple times.
Processing a row at a time will be slower than needed. MATLAB tables are arranged as having one data type attribute per variable rather than for each row entry of the variable, so if you access an entire column (variable) you only have the data type checking overhead once per variable, but if you access a row at a time, you have the data type checking for each variable for each row. When memory is available it is more efficient to create a logical mask or group index for each column and join the results with & or | to create logical masks . Or consider using findgroups() and rowfun().

Sign in to comment.

 Accepted Answer

More Answers (2)

Instead of reading the table everytime, can't i create the table variable for my spreadsheet and keep invoking that as and when required in the code?? (just to reduce the processing time)

2 Comments

What do you mean by "create the table variable"? From your code posted above you have HPperformance as the matlab table which is your entire excel data. You shouldn't need to reinvoke the readtable() command unless you are looking at a new excel file.
If you are asking if you can avoid reading the entire HPperformance table then my answer is more situational.
The reality is you have 87000 pieces of data in one column, even with the best of matlab that still a fair bit of data, and is going to take some time to read through.

Sign in to comment.

In addition to what othjers have suggested, it's almost certain that you want to convert the machine status to a categorical variable rather than leaving it as text. Smaller and faster.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!