
This example shows how to predict the remaining useful life (RUL) of engines by using deep convolutional neural networks (CNNs) [1]. The advantage of a deep learning approach is that there is no need for manual feature extraction or feature selection for your model to predict RUL. Furthermore, prior knowledge of machine health prognostics and signal processing is not required for developing a deep learning-based RUL prediction model.

This example uses the Turbofan Engine Degradation Simulation Dataset (C-MAPSS) [2]. The ZIP file contains run-to-failure time-series data for four different sets (namely FD001, FD002, FD003, and FD004) simulated under different combinations of operational conditions and fault modes.

This example uses only the FD001 dataset, which is further divided into training and test subsets. The training subset contains simulated time-series data for 100 engines. Each engine has several sensors whose values are recorded at each cycle of a continuous process. Hence, the sequences of recorded data vary in length, and each corresponds to a full run-to-failure (RTF) instance. The test subset contains 100 partial sequences and the corresponding values of the remaining useful life at the end of each sequence.

Download the Turbofan Engine Degradation Simulation dataset to a file named `CMAPSSData.zip` and unzip it to a folder called `data` in the current directory.

```
filename = "CMAPSSData.zip";
if ~exist(filename,'file')
    url = "https://ti.arc.nasa.gov/c/6/";
    websave(filename,url);
end

dataFolder = "data";
if ~exist(dataFolder,'dir')
    mkdir(dataFolder);
end
unzip(filename,dataFolder)
```

The data folder now contains text files with 26 columns of numbers, separated by spaces. Each row is a snapshot of data taken during a single operational cycle, and each column represents a different variable:

Column 1: Unit number

Column 2: Time-stamp

Columns 3–5: Operational settings

Columns 6–26: Sensor measurements 1–21

Load the data using the function `localLoadData`. The function extracts the data from `filenamePredictors` and returns a table that contains the training predictors and the corresponding response (that is, RUL) sequences. Each row represents a different engine.

```
filenameTrainPredictors = fullfile(dataFolder,"train_FD001.txt");
rawTrain = localLoadData(filenameTrainPredictors);
```

Examine run-to-failure data for one of the engines.

```
head(rawTrain.X{1},8)
```

```
ans = 8×26 table

id timeStamp op_setting_1 op_setting_2 op_setting_3 sensor_1 sensor_2 sensor_3 sensor_4 sensor_5 sensor_6 sensor_7 sensor_8 sensor_9 sensor_10 sensor_11 sensor_12 sensor_13 sensor_14 sensor_15 sensor_16 sensor_17 sensor_18 sensor_19 sensor_20 sensor_21
1 1 -0.0007 -0.0004 100 518.67 641.82 1589.7 1400.6 14.62 21.61 554.36 2388.1 9046.2 1.3 47.47 521.66 2388 8138.6 8.4195 0.03 392 2388 100 39.06 23.419
1 2 0.0019 -0.0003 100 518.67 642.15 1591.8 1403.1 14.62 21.61 553.75 2388 9044.1 1.3 47.49 522.28 2388.1 8131.5 8.4318 0.03 392 2388 100 39 23.424
1 3 -0.0043 0.0003 100 518.67 642.35 1588 1404.2 14.62 21.61 554.26 2388.1 9052.9 1.3 47.27 522.42 2388 8133.2 8.4178 0.03 390 2388 100 38.95 23.344
1 4 0.0007 0 100 518.67 642.35 1582.8 1401.9 14.62 21.61 554.45 2388.1 9049.5 1.3 47.13 522.86 2388.1 8133.8 8.3682 0.03 392 2388 100 38.88 23.374
1 5 -0.0019 -0.0002 100 518.67 642.37 1582.8 1406.2 14.62 21.61 554 2388.1 9055.1 1.3 47.28 522.19 2388 8133.8 8.4294 0.03 393 2388 100 38.9 23.404
1 6 -0.0043 -0.0001 100 518.67 642.1 1584.5 1398.4 14.62 21.61 554.67 2388 9049.7 1.3 47.16 521.68 2388 8132.9 8.4108 0.03 391 2388 100 38.98 23.367
1 7 0.001 0.0001 100 518.67 642.48 1592.3 1397.8 14.62 21.61 554.34 2388 9059.1 1.3 47.36 522.32 2388 8132.3 8.3974 0.03 392 2388 100 39.1 23.377
1 8 -0.0034 0.0003 100 518.67 642.56 1583 1401 14.62 21.61 553.85 2388 9040.8 1.3 47.24 522.47 2388 8131.1 8.4076 0.03 391 2388 100 38.97 23.311
```

Examine the response data for one of the engines.

```
rawTrain.Y{1}(1:8)
```

```
ans = 8×1

   191
   190
   189
   188
   187
   186
   185
   184
```

Visualize time-series data for some of the predictors.

```
stackedplot(rawTrain.X{1},[3,5,6,7,8,15,16,24],'XVariable','timeStamp')
```

**Remove Features with Less Variability**

Features that remain constant for all time steps can negatively impact the training. Use the `prognosability` function to measure the variability of features at failure.

```
prog = prognosability(rawTrain.X,"timeStamp");
```

For some features, the prognosability value is zero or NaN. Discard these features.

```
idxToRemove = prog.Variables==0 | isnan(prog.Variables);
featToRetain = prog.Properties.VariableNames(~idxToRemove);
for i = 1:height(rawTrain)
    rawTrain.X{i} = rawTrain.X{i}{:,featToRetain};
end
```

**Normalize Training Predictors**

Normalize the training predictors to have zero mean and unit variance.

```
[~,Xmu,Xsigma] = zscore(vertcat(rawTrain.X{:}));
preTrain = table();
for i = 1:numel(rawTrain.X)
    preTrain.X{i} = (rawTrain.X{i} - Xmu) ./ Xsigma;
end
```

**Clip Responses**

This step is optional. To make the network focus on the part of the data where engines are more likely to fail (the end of an engine's life), clip the responses at a threshold of 150. This makes the network treat instances with higher RUL values as equal.

```
clipResponses = true;
if clipResponses
    rulThreshold = 150;
    for i = 1:numel(rawTrain.Y)
        preTrain.Y{i} = min(rawTrain.Y{i},rulThreshold);
    end
end
```

This figure shows the first observation and the corresponding clipped response.

The CNN architecture used in this example takes input data in 2D format (similar to an image), where one dimension represents the sequence length and the other dimension represents the number of features. However, the time-series data for each engine varies in length. Therefore, extract sequences of fixed length from the time-series data to make them compatible with CNN training.

**Generate Sequences of Fixed Length**

Use the `localGenerateSequences` function to generate sequences of fixed window length from the time-series data of each engine. The corresponding response variable for each sequence represents the RUL at the last time stamp of that sequence.

- `WindowLength` is the length of the time-series window used for predicting RUL at a given time stamp.
- `Stride` is the shift between two consecutive windows of length `WindowLength`.

To better understand how `localGenerateSequences` works, consider a sequence of length 10. If a `WindowLength` of 4 and a `Stride` of 1 are used to create sequences, then `localGenerateSequences` returns a table with a total of 7 sequences of length 4.

The `WindowLength` of 40 and `Stride` of 1 were chosen based on experimentation. You can set them to other values to investigate model performance.
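As a quick sanity check on the windowing, the number of sequences extracted from a series of length `L` is `floor((L - WindowLength)/Stride) + 1`. A minimal sketch of this counting (illustrative only; it does not call the actual `localGenerateSequences` helper):

```matlab
% Count fixed-length windows extracted from one example sequence
seq = (1:10)';                   % example sequence of length 10
WindowLength = 4;
Stride = 1;

numWindows = floor((numel(seq) - WindowLength)/Stride) + 1;      % 7 windows
idxShift = 0:Stride:numel(seq) - WindowLength;                   % start offsets 0..6
firstWindow = seq(1 + idxShift(1):WindowLength + idxShift(1));   % elements 1..4
```

This matches the example above: a length-10 series with a window length of 4 and a stride of 1 yields 7 sequences.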

```
WindowLength = 40;
Stride = 1;
dataTrain = localGenerateSequences(preTrain,WindowLength,Stride);
```

**Reshape Sequence Data for imageInputLayer**

The `imageInputLayer` (Deep Learning Toolbox) used in the CNN architecture expects the size of the input data, specified as a row vector of integers `[h w c]`, where `h`, `w`, and `c` correspond to the height, width, and number of channels, respectively. Therefore, reshape the data to have three dimensions as `[WindowLength numFeatures 1]`.

```
numFeatures = size(dataTrain.X{1},2);
InputSize = [WindowLength numFeatures 1];
for i = 1:size(dataTrain,1)
    dataTrain.X{i} = reshape(dataTrain.X{i},InputSize);
end
```

The deep convolutional neural network architecture used for RUL estimation is described in [1].

Input data is prepared in 2D format, with the first dimension representing the length of the time sequence and the second dimension representing the number of selected features. Convolutional layers are stacked for feature extraction, followed by a fully connected layer.

The selected network architecture applies 1D convolution along the time-sequence direction only. This implies that the order of features in `InputSize` does not impact the training, and only trends in one feature at a time are considered.
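One consequence of unpadded `[k 1]` convolutions is that each layer shrinks the time dimension by `k - 1` while leaving the feature dimension untouched. A small sketch of this size bookkeeping (assuming the filter heights 5, 5, 5, and 3 used in this example, and the default zero padding of `convolution2dLayer`):

```matlab
% Track the height (time) dimension through the unpadded convolutions
h = 40;                 % WindowLength at the input
for k = [5 5 5 3]       % filter heights of the four convolution layers
    h = h - (k - 1);    % a 'valid' convolution shrinks height by k-1
end
% h is now 26: the time extent of the feature maps
% entering the fully connected layer
```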

Define the network architecture. Create a CNN that consists of four consecutive sets of a convolution layer and a ReLU layer, with `filterSize` and `numFilters` as the two input arguments for `convolution2dLayer`, followed by a fully connected layer of size `numHiddenUnits` and a dropout layer with a dropout probability of 0.5. Setting the second dimension of `filterSize` to 1 results in 1D convolution. Because the network predicts the remaining useful life (RUL) of the turbofan engine, set `numResponses` to 1.

```
filterSize = [5, 1];
numHiddenUnits = 100;
numResponses = 1;

layers = [
    imageInputLayer(InputSize)
    convolution2dLayer(filterSize,10)
    reluLayer()
    convolution2dLayer(filterSize,20)
    reluLayer()
    convolution2dLayer(filterSize,10)
    reluLayer()
    convolution2dLayer([3 1],5)
    reluLayer()
    fullyConnectedLayer(numHiddenUnits)
    reluLayer()
    dropoutLayer(0.5)
    fullyConnectedLayer(numResponses)
    regressionLayer()];
```

Specify `trainingOptions` (Deep Learning Toolbox). Train for 20 epochs with mini-batches of size 512 using the 'adam' solver. Set `LearnRateSchedule` to `piecewise` to drop the learning rate during training, and set `LearnRateDropFactor` to 0.3. Specify an initial learning rate of 0.003 to slow down learning, and shuffle the data every epoch to make the network robust. Turn on the training progress plot, and turn off the command window output (`Verbose`).

```
maxEpochs = 20;
miniBatchSize = 512;

options = trainingOptions('adam', ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropFactor',0.3, ...
    'MaxEpochs',maxEpochs, ...
    'MiniBatchSize',miniBatchSize, ...
    'InitialLearnRate',0.003, ...
    'Shuffle','every-epoch', ...
    'Plots','training-progress', ...
    'Verbose',0);
```

Train the network using `trainNetwork`.

```
net = trainNetwork(dataTrain,layers,options);
```

Plot the layer graph of the network to visualize the underlying network architecture.

```
figure;
lgraph = layerGraph(net.Layers);
plot(lgraph)
```

The test data contains 100 partial sequences and corresponding values of the remaining useful life at the end of each sequence.

```
filenameTestPredictors = fullfile(dataFolder,'test_FD001.txt');
filenameTestResponses = fullfile(dataFolder,'RUL_FD001.txt');
dataTest = localLoadData(filenameTestPredictors,filenameTestResponses);
```

Prepare the test dataset for predictions by performing the same pre-processing steps that were done for the training dataset.

```
for i = 1:numel(dataTest.X)
    dataTest.X{i} = dataTest.X{i}{:,featToRetain};
    dataTest.X{i} = (dataTest.X{i} - Xmu) ./ Xsigma;
    if clipResponses
        dataTest.Y{i} = min(dataTest.Y{i},rulThreshold);
    end
end
```

The network cannot make predictions for time series shorter than `WindowLength`. Therefore, remove test units whose sequences are not longer than `WindowLength`.

```
unitLengths = zeros(numel(dataTest.Y),1);
for i = 1:numel(dataTest.Y)
    unitLengths(i) = numel(dataTest.Y{i,:});
end
dataTest(unitLengths < WindowLength+1,:) = [];
```

Create a table for storing the predicted response (`YPred`) along with the true response (`Y`).

```
predictions = table('Size',[height(dataTest) 2],'VariableTypes',["cell","cell"],'VariableNames',["Y","YPred"]);
for i = 1:height(dataTest)
    unit = localGenerateSequences(dataTest(i,:),WindowLength,Stride);
    predictions.Y{i} = unit.Y;
    predictions.YPred{i} = predict(net,unit,'MiniBatchSize',miniBatchSize);
end
```

**Performance Metrics**

Compute root-mean-square error (RMSE) across all time-cycles of the test sequences to compare how well the network performed on the test data.

```
for i = 1:size(predictions,1)
    predictions.RMSE(i) = sqrt(mean((predictions.Y{i} - predictions.YPred{i}).^2));
end
```

Create a histogram to visualize the distribution of RMSE values across all test engines.

```
figure;
histogram(predictions.RMSE,'NumBins',10);
title("RMSE ( Mean: " + round(mean(predictions.RMSE),2) + " , StDev: " + round(std(predictions.RMSE),2) + " )");
ylabel('Frequency');
xlabel('RMSE');
```

Additionally, to see how the network performs throughout the given sequence of data for a test engine, use the `localLambdaPlot` function to plot the predicted RUL against the true RUL of any test engine.

```
figure;
localLambdaPlot(predictions,"random");
```

The result shows that the CNN deep learning architecture is a viable alternative approach for estimating the RUL of the turbofan engine data. The RMSE values at all time stamps indicate that the network performs well toward the end of the given test sequence data. This suggests that it is important to have some short history of the sensor values when trying to predict RUL.

**Load Data Function**

This function loads run-to-failure data from the provided text file and groups the time-series data and its corresponding RUL values into a table of predictors and responses.

```
function data = localLoadData(filenamePredictors,varargin)

if isempty(varargin)
    filenameResponses = [];
else
    filenameResponses = varargin{:};
end

% Load the text file as a table
rawData = readtable(filenamePredictors);

% Add variable names to the table
VarNames = {...
    'id', 'timeStamp', 'op_setting_1', 'op_setting_2', 'op_setting_3', ...
    'sensor_1', 'sensor_2', 'sensor_3', 'sensor_4', 'sensor_5', ...
    'sensor_6', 'sensor_7', 'sensor_8', 'sensor_9', 'sensor_10', ...
    'sensor_11', 'sensor_12', 'sensor_13', 'sensor_14', 'sensor_15', ...
    'sensor_16', 'sensor_17', 'sensor_18', 'sensor_19', 'sensor_20', ...
    'sensor_21'};
rawData.Properties.VariableNames = VarNames;

if ~isempty(filenameResponses)
    RULTest = dlmread(filenameResponses);
end

% Split the signals for each unit ID
IDs = rawData{:,1};
nID = unique(IDs);
numObservations = numel(nID);

% Initialize a table for storing the data
data = table('Size',[numObservations 2], ...
    'VariableTypes',{'cell','cell'}, ...
    'VariableNames',{'X','Y'});

for i = 1:numObservations
    idx = IDs == nID(i);
    data.X{i} = rawData(idx,:);
    if isempty(filenameResponses)
        % Calculate RUL from the time column for training data
        data.Y{i} = flipud(rawData.timeStamp(idx))-1;
    else
        % Use RUL values from filenameResponses for test data
        sequenceLength = sum(idx);
        endRUL = RULTest(i);
        data.Y{i} = (endRUL+sequenceLength-1:-1:endRUL)';
    end
end
end
```

**Generate Sequences Function**

This function generates sequences from time-series data, given the `WindowLength` and `Stride`. The output is a table of sequences as matrices and the corresponding RUL values as vectors.

```
function seqTable = localGenerateSequences(dataTable,WindowLength,Stride)

getX = @(X) generateSequences(X,WindowLength,Stride);
getY = @(Y) Y(WindowLength:Stride:numel(Y));

seqTable = table;
temp = cellfun(getX,dataTable.X,'UniformOutput',false);
seqTable.X = vertcat(temp{:});
temp = cellfun(getY,dataTable.Y,'UniformOutput',false);
seqTable.Y = vertcat(temp{:});
end

% Sub-function: returns a cell array of sequences from time-series data
% using WindowLength and Stride
function seqCell = generateSequences(tsData,WindowLength,Stride)

% Function to extract a single sequence given a start offset
getSeq = @(idx) tsData(1+idx:WindowLength+idx,:);

% Determine the starting offsets for the sequences
idxShift = num2cell(0:Stride:size(tsData,1)-WindowLength)';

% Extract the sequences
seqCell = cellfun(getSeq,idxShift,'UniformOutput',false);
end
```

**Lambda Plot Function**

This helper function accepts the `predictions` table and a `lambdaCase` argument, and plots the predicted RUL against the true RUL throughout the sequence (at every time stamp) to visualize how the prediction changes with each time stamp. The second argument, `lambdaCase`, can be a test engine number or one of the strings "random", "best", "worst", or "average", which select an engine accordingly.

```
function localLambdaPlot(predictions,lambdaCase)

if isnumeric(lambdaCase)
    idx = lambdaCase;
else
    switch lambdaCase
        case {"Random","random","r"}
            idx = randperm(height(predictions),1); % Randomly chosen test case
        case {"Best","best","b"}
            idx = find(predictions.RMSE == min(predictions.RMSE)); % Best case
        case {"Worst","worst","w"}
            idx = find(predictions.RMSE == max(predictions.RMSE)); % Worst case
        case {"Average","average","a"}
            err = abs(predictions.RMSE-mean(predictions.RMSE));
            idx = find(err==min(err),1); % Case closest to the mean RMSE
    end
end

y = predictions.Y{idx};
yPred = predictions.YPred{idx};
x = 0:numel(y)-1;

plot(x,y,x,yPred)
legend("True RUL","Predicted RUL")
xlabel("Time stamp (Test data sequence)")
ylabel("RUL (Cycles)")
title("RUL for Test engine #"+idx+" ("+lambdaCase+" case)")
end
```

[1] Li, X., Q. Ding, and J.-Q. Sun. "Remaining Useful Life Estimation in Prognostics Using Deep Convolution Neural Networks." *Reliability Engineering & System Safety*, vol. 172, Apr. 2018, pp. 1–11.

[2] Saxena, Abhinav, and Kai Goebel. "Turbofan Engine Degradation Simulation Data Set." *NASA Ames Prognostics Data Repository*, https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/#turbofan, NASA Ames Research Center, Moffett Field, CA.

`prognosability` | `imageInputLayer` (Deep Learning Toolbox) | `trainingOptions` (Deep Learning Toolbox)

- Learn About Convolutional Neural Networks (Deep Learning Toolbox)
- Sequence-to-Sequence Regression Using Deep Learning (Deep Learning Toolbox)
- Similarity-Based Remaining Useful Life Estimation