MATLAB Answers

Matlab Novice here - struggling to truncate a column array

Asked by Alex Herron on 10 Jun 2019
Latest activity Edited by Jan
on 11 Jun 2019
Currently working with a large data set, where one of the columns is a time series of the following format: 1234567.0002.2018.0113
The first 7 digits are an ID number, the next 4 are hour and min, next 4 are year, and the last 4 are month and date. I want to shave this column down to just the month, as this is the only important data point at the moment.
After taking the column out of the original data table, I was able to shorten a single data point, but couldn't apply the same thing to the entire column.
(C is the column)
ex: C{1,1}(18:19) yields the 01 for the first month
However, C{:}(18:19) yields the following error message: "Expected one output from a curly brace or dot indexing expression, but there were 78024 results."
Furthermore, C{1:78024, 18:19} yields "Index in position 2 exceeds array bounds (must not exceed 1)."
Not quite sure how to proceed, and any help is much appreciated!

  2 Comments

Are your cell contents stored as strings? I.e. is 1234567.0002.2018.0113 actually '1234567.0002.2018.0113'? If so, I think you would be better off using regexp to split this all at once.
Unfortunately, the error you're running into is because you have your data stored in cells. Any time you have data elements that have non-singular contents, working with multiple elements at once is tricky.
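A minimal sketch of the regexp idea, assuming the cells hold char vectors in the dotted layout quoted in the question (the sample values here are made up). regexp accepts a cell array of char vectors directly, so no type conversion is needed first:

```matlab
% Hypothetical sample data in the dotted layout from the question
C = {'1234567.0002.2018.0113'; ...
     '8901234.0103.2018.0214'; ...
     '8675309.0211.2018.0315'};

% Match the two digits that follow a dot and are themselves followed
% by exactly two more digits at the end of the text (the month).
monthStr = regexp(C, '(?<=\.)\d{2}(?=\d{2}$)', 'match', 'once');

% Convert the cell array of char vectors to a numeric column
monthNum = str2double(monthStr);
```

With 'match' and 'once', each cell of monthStr holds a single char vector, so str2double can convert the whole column in one call.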
So the cell contents are stored as 1x1 cell arrays, meaning they are indeed in the '1234567.0002.2018.0113' format.
Regexp is new to me - but it seems like most files need to be in character format or strings, is there a way to use cells, or do I need to change the data type?
It has become increasingly clear that the data is quite difficult to work with, since it is stored in cells. I've been taking the original data (which is in table form), using table2array (which has been turning it into cells), then using str2double (to turn it back into doubles). Is there an easier/simpler way to do this? I feel like I'm doing extra steps here.
Furthermore, is there anything else I should keep an eye out for with working with non-singular contents/multiple elements?
Thanks for the help, I really appreciate it!
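One possible way to skip the table2array and str2double round trip described above is to pull each column out of the table with dot indexing, which keeps that column's own type. This is only a sketch: the table T and the variable names Reading and Stamp are hypothetical stand-ins for the real data.

```matlab
% Hypothetical table; Reading and Stamp are assumed variable names
T = table([10; 20; 30], ...
          {'1234567.0002.2018.0113'; ...
           '8901234.0103.2018.0214'; ...
           '8675309.0211.2018.0315'}, ...
          'VariableNames', {'Reading', 'Stamp'});

r = T.Reading;        % numeric column stays a double array
C = T.Stamp;          % text column comes out as a cell array of char vectors
S = string(T.Stamp);  % same text as a string array, easier to process in bulk

% Month as a number for every row (characters 19 and 20 in this layout)
monthNum = double(extractBetween(S, 19, 20));
```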


2 Answers

Answer by Steven Lord
on 10 Jun 2019
Edited by Steven Lord
on 10 Jun 2019
 Accepted Answer

If the data is of a fixed width, convert it from a cell array containing char vectors into a string array and use extractBetween.
data = {'1234567.0002.2018.0113'; ...
'8901234.0103.2018.0214'; ...
'8675309.0211.2018.0315'}
dataString = string(data)
monthdata = extractBetween(dataString, 19, 20)
You could convert the string array monthdata into a char array or even a double array.
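A short sketch of those two conversions, using a hand-written monthdata with the same shape that extractBetween would return for the sample data:

```matlab
% Stand-in for the extractBetween result: a 3x1 string array
monthdata = ["01"; "02"; "03"];

monthNum  = double(monthdata);  % numeric column: 1, 2, 3
monthChar = char(monthdata);    % 3-by-2 char array, one row per month
```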
Another alternative, depending on what you need to do with your time data, would be to extract the appropriate sections from dataString and turn them into a datetime array then ask for the month of that datetime array.
timeAndDateData = extractBetween(dataString, 9, 22)
dt = datetime(timeAndDateData, 'InputFormat', 'HHmm.yyyy.MMdd')
month(dt)
Once you have your datetime array you could consider turning your table into a timetable using table2timetable. This would allow you to perform certain date and time-related operations on the timetable like using retime to change the time basis (for instance, to make your data uniformly spaced in time.)
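A rough sketch of the timetable route, assuming a table with one numeric measurement per row. The Value column and its numbers are invented for illustration; the real table's variables will differ.

```matlab
dataString = ["1234567.0002.2018.0113"; ...
              "8901234.0103.2018.0214"; ...
              "8675309.0211.2018.0315"];

% Parse the time and date portion (characters 9 to 22) into datetimes
dt = datetime(extractBetween(dataString, 9, 22), ...
              'InputFormat', 'HHmm.yyyy.MMdd');

% Hypothetical measurement column, one value per row
T = table(dt, [1.5; 2.5; 3.5], 'VariableNames', {'Time', 'Value'});

% Use the Time variable as the row times of the timetable
TT = table2timetable(T, 'RowTimes', 'Time');

% Aggregate onto a monthly time basis, averaging values within each month
monthly = retime(TT, 'monthly', 'mean');
```

Note that the 'mean' method expects the timetable variables to be numeric, so text columns would need to be removed or converted first.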

  1 Comment

Thank you so much! This had been driving me nuts for a bit!



Answer by Jan
on 10 Jun 2019

cellfun(@(x) x(18:19), C, 'UniformOutput', false)
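A small check of which character positions hold the month, since that depends on the exact layout. The first sample is the dotted string quoted in the question; the second is an assumed layout of the form IDNUMBER.HRMIN.YEARMONTHDAY with no dot between year and month.

```matlab
withDot = {'1234567.0002.2018.0113'};  % sample string from the question
noDot   = {'1234567.0002.20180113'};   % assumed layout, no dot after the year

a = cellfun(@(x) x(18:19), withDot, 'UniformOutput', false);  % {'.0'}
b = cellfun(@(x) x(19:20), withDot, 'UniformOutput', false);  % {'01'}
c = cellfun(@(x) x(18:19), noDot,   'UniformOutput', false);  % {'01'}
```

So 18:19 picks out the month only when there is no dot between the year and the month; with the dotted layout the month sits at 19:20.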

  5 Comments

example_of_inputs.PNG
Here's an example of what the input data looks like: multiple data points for a given time, in the format IDNUMBER.HRMIN.YEARMONTHDAY. I've scrolled through the data and have not been able to find any empty elements in the cells (however, this data set consists of 78,000 data points, so I'm certainly not sure).
Use:
find(cellfun(@isempty, data))
to locate cells that are empty. Or if you just want to eliminate those empty cells, omit the find and use the resulting logical array to delete the empty cells or keep the non-empty cells.
notEmpty = ~cellfun('isempty', C);
Data = cellfun(@(x) x(18:19), C(notEmpty), 'UniformOutput', false)
The error tells you that some elements are empty. You have to find out why, and decide how you want to treat them.
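One possible treatment, sketched here on made-up data, is to keep the original row count and mark the empty cells as NaN instead of dropping them. The layout without a dot between year and month is an assumption, chosen so that 18:19 is the month.

```matlab
% Hypothetical data with one empty cell (assumed layout, month at 18:19)
C = {'1234567.0002.20180113'; ''; '8675309.0211.20180315'};

monthNum = nan(size(C));            % NaN marks rows with no data
notEmpty = ~cellfun('isempty', C);  % logical mask of usable cells

% Extract and convert the month only where the cell is not empty
monthNum(notEmpty) = cellfun(@(x) str2double(x(18:19)), C(notEmpty));
```

Keeping the NaN placeholders means monthNum still lines up row for row with the other columns of the original table.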
