File Ensemble Datastore with Measured Data
In predictive-maintenance algorithm design, you often work with large sets of data collected from operation of your system under varying conditions. The fileEnsembleDatastore object helps you manage and interact with such data. For this example, create a fileEnsembleDatastore object that points to ensemble data on disk. Configure it with functions that read data from and write data to the ensemble.
Structure of the Data Files
For this example, you have two data files containing healthy operating data from a bearing system, baseline_01.mat and baseline_02.mat. You also have three data files containing faulty data from the same system, FaultData_01.mat, FaultData_02.mat, and FaultData_03.mat. In practice, you might have many more data files.
Each of these data files contains one data structure, bearing. Load and examine the data structure from the first healthy data set.
unzip fileEnsData.zip  % extract compressed files
load baseline_01.mat bearing
bearing = struct with fields:
sr: 97656
gs: [5000x1 double]
load: 270
rate: 25
The structure contains a vector of accelerometer data gs, the sample rate sr at which that data was recorded, and other data variables.
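To make the file layout concrete, the following sketch builds a small stand-in file with the same structure and inspects it. The file name demo_baseline.mat and the random signal values are hypothetical placeholders, not the measured data shipped with the example.

```matlab
% Build a stand-in file with the same layout as baseline_01.mat.
% The signal values are synthetic placeholders, not measured data.
bearing = struct('sr',97656,'gs',randn(5000,1),'load',270,'rate',25);
demoFile = fullfile(tempdir,'demo_baseline.mat');   % hypothetical file name
save(demoFile,'bearing');

% List the variables stored in the file without loading their contents.
info = whos('-file',demoFile);
disp({info.name})

% Load only the structure and derive the signal duration in seconds.
s = load(demoFile,'bearing');
duration = numel(s.bearing.gs)/s.bearing.sr;
fprintf('Signal duration: %.4f s\n',duration);
```

Checking the stored variable names this way is a quick method for deciding which names belong in the data variables of an ensemble.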
Create and Configure File Ensemble Datastore
To work with this data for predictive maintenance algorithm design, first create a file ensemble datastore that points to the data files in the current folder.
fensemble = fileEnsembleDatastore(pwd,'.mat');
Before you can interact with data in the ensemble, you must create functions that tell the software how to process the data files to read variables into the MATLAB® workspace and to write data back to the files. For this example, use the following provided functions:
readBearingData: Extract requested variables from a structure, bearing, and other variables stored in the file. This function also parses the file name for the fault status of the data. The function returns a table row containing one table variable for each requested variable.
writeBearingData: Take a structure and write its variables to a data file as individual stored variables.
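The provided readBearingData function is not listed here. As a rough illustration of what a read function with this behavior might look like, the following sketch takes a file name and a string array of variable names and returns a single table row. The name readBearingDataSketch, the "Healthy" label, the rule that faulty file names start with "Fault", and the NaN placeholder for missing variables are all assumptions made for illustration, not the shipped implementation.

```matlab
% Demo: create a stand-in faulty file and read four variables from it.
bearing = struct('sr',48828,'gs',randn(5000,1),'load',0,'rate',25);
demoFile = fullfile(tempdir,'FaultData_99.mat');    % hypothetical file name
save(demoFile,'bearing');
row = readBearingDataSketch(demoFile,["label";"file";"sr";"gs"]);
disp(row)

function data = readBearingDataSketch(filename,variables)
% Return one table row with one table variable per requested name.
    stored = load(filename);              % all variables stored in the file
    [~,fname] = fileparts(filename);
    data = table();
    for k = 1:numel(variables)
        name = char(variables(k));
        switch name
            case 'file'
                val = string(fname);
            case 'label'
                % Assumed convention: faulty file names start with "Fault".
                if startsWith(fname,"Fault")
                    val = "Faulty";
                else
                    val = "Healthy";
                end
            otherwise
                if isfield(stored,name)
                    val = stored.(name);           % top-level stored variable
                elseif isfield(stored,'bearing') && isfield(stored.bearing,name)
                    val = stored.bearing.(name);   % field of the structure
                else
                    val = NaN;                     % assumed placeholder
                end
        end
        % Wrap nonscalar values in a cell so that they fit in one table row.
        if ~isscalar(val)
            val = {val};
        end
        data.(name) = val;
    end
end
```

Checking top-level variables before fields of bearing means that derived values written back to the file later can be read through the same function.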
Assign these functions to the ReadFcn and WriteToMemberFcn properties of the ensemble datastore, respectively.
fensemble.ReadFcn = @readBearingData;
fensemble.WriteToMemberFcn = @writeBearingData;
Finally, set properties of the ensemble to identify data variables and condition variables.
fensemble.DataVariables = ["gs";"sr";"load";"rate"];
fensemble.ConditionVariables = ["label";"file"];
Examine the ensemble. The functions and the variable names are assigned to the appropriate properties.
fensemble
fensemble =
  fileEnsembleDatastore with properties:
                 ReadFcn: @readBearingData
        WriteToMemberFcn: @writeBearingData
           DataVariables: [4x1 string]
    IndependentVariables: [0x0 string]
      ConditionVariables: [2x1 string]
       SelectedVariables: [0x0 string]
                ReadSize: 1
              NumMembers: 5
          LastMemberRead: [0x0 string]
                   Files: [5x1 string]
Read Data from Ensemble Member
The functions you assigned tell the read and writeToLastMemberRead commands how to interact with the data files that make up the ensemble datastore. Thus, when you call the read command, it uses readBearingData to read all the variables in fensemble.SelectedVariables.
Specify variables to read, and read them from the first member of the ensemble. The read command reads data from the first ensemble member into a table row in the MATLAB workspace. The software determines which ensemble member to read first.
fensemble.SelectedVariables = ["file";"label";"gs";"sr";"load";"rate"];
data = read(fensemble)
data=1×6 table
label file gs sr load rate
________ ______________ _______________ _____ ____ ____
"Faulty" "FaultData_01" {5000x1 double} 48828 0 25
Write Data to Ensemble Member
Suppose that you want to analyze the accelerometer data gs by computing its power spectrum, and then write the power spectrum data back into the ensemble. To do so, first extract the data from the table and compute the spectrum.
gsdata = data.gs{1};
sr = data.sr;
[pdata,fpdata] = pspectrum(gsdata,sr);
pdata = 10*log10(pdata); % Convert to dB
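As a sanity check on the spectrum step, the following sketch applies the same two calls to a synthetic signal with a known frequency and confirms that the spectral peak lands where expected. The 3 kHz tone, the noise level, and the variable names pdataDB and peakFreq are assumptions chosen for illustration. The sketch requires Signal Processing Toolbox for pspectrum.

```matlab
% Synthetic stand-in for gs: a 3 kHz tone in noise (placeholder values).
sr = 48828;                    % sample rate in Hz, as in the faulty files
t = (0:4999)'/sr;              % 5000 samples, matching the length of gs
gsdata = sin(2*pi*3000*t) + 0.1*randn(size(t));

% Same calls as in the example: power spectrum, then conversion to dB.
[pdata,fpdata] = pspectrum(gsdata,sr);
pdataDB = 10*log10(pdata);

% The largest spectral value should sit near the 3 kHz tone.
[~,idx] = max(pdataDB);
peakFreq = fpdata(idx);
fprintf('Peak at %.0f Hz\n',peakFreq);

% Optional visual check of the spectrum.
plot(fpdata,pdataDB)
xlabel('Frequency (Hz)')
ylabel('Power (dB)')
```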
You can write the frequency vector fpdata and the power spectrum pdata to the data file as separate variables. First, add the new variables to the list of data variables in the ensemble datastore.
fensemble.DataVariables = [fensemble.DataVariables;"freq";"spectrum"];
fensemble.DataVariables
ans = 6x1 string
"gs"
"sr"
"load"
"rate"
"freq"
"spectrum"
Next, write the new values to the file corresponding to the last-read ensemble member. When you call writeToLastMemberRead, it converts the data to a structure and calls fensemble.WriteToMemberFcn to write the data to the file.
writeToLastMemberRead(fensemble,'freq',fpdata,'spectrum',pdata);
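The provided writeBearingData function is not listed here. A write function with the described behavior can be very short, because save can unpack a structure into individual stored variables. The following sketch shows one possible form. The name writeBearingDataSketch, the file name demo_write.mat, and the sample values are hypothetical.

```matlab
% Demo: start from a stand-in member file that holds only 'bearing'.
bearing = struct('sr',48828,'gs',randn(5000,1),'load',0,'rate',25);
demoFile = fullfile(tempdir,'demo_write.mat');      % hypothetical file name
save(demoFile,'bearing');

% The structure's fields become separate variables in the file.
newData = struct('freq',(0:3)','spectrum',[-10;-20;-30;-40]);
writeBearingDataSketch(demoFile,newData);

% Confirm that the file now holds the original and the new variables.
info = whos('-file',demoFile);
names = sort({info.name});
disp(names)

function writeBearingDataSketch(filename,data)
% Append each field of the structure DATA to the MAT-file as its own
% variable, leaving the variables already in the file untouched.
    save(filename,'-struct','data','-append');
end
```

Because the sketch appends rather than overwrites, the original bearing structure stays in the file alongside the derived variables.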
You can add the new variables to fensemble.SelectedVariables or other properties for identifying variables, as needed.
Calling read again reads the data from the next file in the ensemble datastore and updates the property fensemble.LastMemberRead.
data = read(fensemble)
data=1×6 table
label file gs sr load rate
________ ______________ _______________ _____ ____ ____
"Faulty" "FaultData_02" {5000x1 double} 48828 50 25
You can confirm that this data is from a different member by the load variable in the table. Here, its value is 50, while in the previously read member, it was 0.
Batch-Process Data from All Ensemble Members
You can repeat the processing steps to compute and append the spectrum for this ensemble member. In practice, it is more useful to automate the process of reading, processing, and writing data. To do so, reset the ensemble datastore to a state in which no data has been read. (The reset operation does not change fensemble.DataVariables, which contains the two new variables you already added.) Then loop through the ensemble and perform the read, process, and write steps for each member.
reset(fensemble)
while hasdata(fensemble)
    data = read(fensemble);
    gsdata = data.gs{1};
    sr = data.sr;
    [pdata,fpdata] = pspectrum(gsdata,sr);
    writeToLastMemberRead(fensemble,'freq',fpdata,'spectrum',pdata);
end
The hasdata command returns false when every member of the ensemble has been read. Now, each data file in the ensemble includes the spectrum and freq variables derived from the accelerometer data in that file. You can use techniques like this loop to extract and process data from your ensemble files as you develop a predictive-maintenance algorithm. For an example illustrating in more detail the use of a file ensemble datastore in the algorithm-development process, see Rolling Element Bearing Fault Diagnosis. That example also shows the use of Parallel Computing Toolbox™ to speed up the processing of a larger ensemble.
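The same read, process, and write pattern can be tried end to end without the example data. The following sketch creates two synthetic member files in a scratch folder, computes an RMS value for each member, writes it back, and then reads only the derived values. The folder, the file names, the derived variable gsRMS, and the helper functions readSketch and writeSketch are all hypothetical stand-ins. The sketch assumes that Predictive Maintenance Toolbox is installed.

```matlab
% Create a scratch folder with two synthetic member files (placeholder data).
demoDir = tempname;
mkdir(demoDir);
for k = 1:2
    bearing = struct('sr',1000,'gs',k*ones(100,1),'load',50*k,'rate',25);
    save(fullfile(demoDir,sprintf('demo_%02d.mat',k)),'bearing');
end

% Point a datastore at the folder and attach the sketch functions.
ens = fileEnsembleDatastore(demoDir,'.mat');
ens.ReadFcn = @readSketch;
ens.WriteToMemberFcn = @writeSketch;
ens.DataVariables = ["gs";"gsRMS"];
ens.SelectedVariables = "gs";

% Read, process, and write back for every member.
while hasdata(ens)
    data = read(ens);
    gs = data.gs{1};
    writeToLastMemberRead(ens,'gsRMS',sqrt(mean(gs.^2)));
end

% Read back only the derived variable to confirm that it was stored.
reset(ens)
ens.SelectedVariables = "gsRMS";
results = table;
while hasdata(ens)
    results = [results; read(ens)]; %#ok<AGROW>
end
disp(results)

function data = readSketch(filename,variables)
% Return one table row; look at top-level variables first, then at the
% fields of the stored 'bearing' structure.
    stored = load(filename);
    data = table;
    for k = 1:numel(variables)
        name = char(variables(k));
        if isfield(stored,name)
            val = stored.(name);
        else
            val = stored.bearing.(name);
        end
        if ~isscalar(val)
            val = {val};       % wrap vectors so they fit in one row
        end
        data.(name) = val;
    end
end

function writeSketch(filename,data)
% Append the fields of DATA to the file as individual variables.
    save(filename,'-struct','data','-append');
end
```

Each synthetic signal is constant, so its RMS equals its amplitude, which makes the stored values easy to verify by eye.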
To confirm that the derived variables are present in the file ensemble datastore, read them from the first and second ensemble members. To do so, reset the ensemble again, and add the new variables to the selected variables. In practice, after you have computed derived values, it can be useful to read only those values without rereading the unprocessed data, which can take significant space in memory. For this example, read selected variables that include the new variables but do not include the unprocessed data, gs.
reset(fensemble)
fensemble.SelectedVariables = ["label","load","freq","spectrum"];
data1 = read(fensemble)
data1=1×4 table
label load freq spectrum
________ ____ _______________ _______________
"Faulty" 0 {4096x1 double} {4096x1 double}
data2 = read(fensemble)
data2=1×4 table
label load freq spectrum
________ ____ _______________ _______________
"Faulty" 50 {4096x1 double} {4096x1 double}
See Also
fileEnsembleDatastore | read | writeToLastMemberRead