**You are now following this question**

- You will see updates in your followed content feed.
- You may receive emails, depending on your communication preferences.

# Effect of Humidity on Electrochemical Sensor Signal and Seeking Correction Approaches

5 views (last 30 days)

Show older comments

Hi everyone,

I've uploaded my dataset, which consists of a timetable with a model predictions, REF values (which my model should ideally match closely), humidity, and temperature readings. I've also included the plots I've generated based on this data.

The dataset I'm working with represents corrected signals from an electrochemical sensor. I've applied regression modeling and temperature correction to this data. Additionally, for the purpose of this analysis, I've removed outliers to ensure the data's reliability. Upon examining the plots, it's evident that as humidity levels fluctuate, they influence the overall signal. I believe there's a correlation here, especially since the impact spans over several hours, necessitating hours of data monitoring for accurate prediction.

While the sensor's output closely follows the trend of the reference signal, noticeable discrepancies arise in line with humidity changes. Specifically, the signal tends to overestimate when humidity increases and underestimate as it decreases. I've attempted to address this using the Regression Learner App and by comparing the error with humidity readings. However, I've not yet been successful in deriving a suitable correction factor or model. The challenge seems to be the need to assess prolonged data periods and discern if the humidity is rising or falling during those times.

Could anyone provide guidance or suggest an approach to tackle this issue?

### Accepted Answer

Star Strider
on 24 Oct 2023

Relative humidity will of course be affected by temperature, inbcreases as the temperature decreases, and it can never exceed 100%. It might be better to express humidity in terms of the dewpoint (the environmental temperature can never fall below the dewpoint) or absolute humidity, although this is also a function of temperature.

Plotting the data (‘Temperature’, ‘Humidity’, and the difference between ‘Predicted_DATA’ and ‘REF’) fails (to me,a t least) to reveal any patterns I might use to model them.

LD = load('REG_Model_Data.mat');

T1 = LD.cleaned_timetable_REG

VN = T1.Properties.VariableNames;

delta = T1{:,3}-T1{:,2};

Lv = delta >= 0;

figure

scatter3(T1{Lv,4}, T1{Lv,5}, delta(Lv), '.g')

hold on

scatter3(T1{~Lv,4}, T1{~Lv,5}, delta(~Lv), '.r')

patch([xlim flip(xlim)], [[1 1]*min(ylim) [1 1]*max(ylim)], zeros(1,4), [1 1 1]*0.5, 'FaceAlpha',0.25)

hold off

grid on

xlabel(VN{4})

ylabel(VN{5})

zlabel([VN{3} ' - ' strrep(VN{2},'_','\_')])

view(-60,30)

figure

scatter(T1.Temperature, T1.Humidity,'.')

grid

xlabel('Temperature (°C)')

ylabel('Relative Humidity (%)')

% delta = TT1.Predicted_DATA - TT1.REF;

% pred = [TT1.Humidity TT1.Temperature];

% mdl = fitlm(pred, delta)

I would include humidity as an additional independent variable in your model (if you have not already) and proceed from there. I doubt that there is any way to predict it, other than noting the rate and direction of change in both humidity and temperature.

.

##### 23 Comments

Dharmesh Joshi
on 24 Oct 2023

Star Strider
on 24 Oct 2023

Edited: Star Strider
on 25 Oct 2023

That most likely has to do with the physical dynamics of the sensor. It might be possible to model the humidity changes in real time, and then have the sensor do minor corrections based on that, however it is not obvious to me how to incorporate that otherwise. Perhaps using a state-space representation of the system, where inputs are the instantaneous humidity and NOx concentrations, and the output is a correct NOx concentration could work. The state-space model would take into consideration the changing dynamics and correct for them.

Beyond that, I have no suggestions other than that the System Identification Toolbox could be helpful in estimating that system from the known inputs and known output.

EDIT — (25 Oct 2023 at 01:22)

I gave some thought to the state space model of temperature and humidity to predict the ‘REF’ data and came up with this preliminary result. The model is large for such models (this one is order 8, and a smaller system might do reasonably well — your decision, not mine) however if you are interested in this sort of approach, this should get you started.

The idea here is to account for the dynamics of the sytem as a function of time, not simply the instantaneous values of the input variables. Since you want to track the changes in humidity, similar approaches (such as a Kalman filter) might also be appropriate to estimate the directions of the various inputs to your system. (I do not have sufficient experience with Kalman filters or knowledge of your system and instrumentation to attempt that.)

The inputs here are ‘Humidity’ and ‘Temperature’ (in that order), and the output is.‘REF’.

A preliminary approach —

LD = load('REG_Model_Data.mat');

T1 = LD.cleaned_timetable_REG

T1 = 5305×5 table

Time Predicted_DATA REF Humidity Temperature
____________________ ______________ ____ ________ ___________
16-Jan-2023 13:00:00 29.283 18.2 84.169 6.0208
16-Jan-2023 14:00:00 25.142 23.9 81.138 5.8618
16-Jan-2023 15:00:00 26.77 22.3 82.871 5.2763
16-Jan-2023 16:00:00 32.312 21.8 85.829 3.8196
16-Jan-2023 17:00:00 38.172 25.6 86.509 2.4975
16-Jan-2023 18:00:00 42.949 34.4 87.315 1.4005
16-Jan-2023 19:00:00 46.67 40.3 87.347 0.57429
16-Jan-2023 20:00:00 48.916 43.5 80.602 0.0084591
16-Jan-2023 21:00:00 53.266 47.8 80.225 -0.74979
16-Jan-2023 22:00:00 55.718 56.3 81.496 -1.2985
16-Jan-2023 23:00:00 55.289 54.3 82.048 -1.6424
17-Jan-2023 00:00:00 58.95 64.5 87.314 -1.5495
17-Jan-2023 01:00:00 59.48 66.4 91.808 -1.9128
17-Jan-2023 02:00:00 60.96 65.5 93.916 -2.2129
17-Jan-2023 03:00:00 58.813 62.3 95.008 -2.4581
17-Jan-2023 04:00:00 59.082 59.2 95.457 -2.5598

figure

plot3(T1.Humidity, T1.Temperature, T1.REF)

grid on

xlabel('Humidity')

ylabel('Temperature')

zlabel('REF')

figure

plot(T1.Time, T1{:,[4 5 3]})

grid

legend('Humidity','Temperature','REF', 'Location','best')

SSData = iddata(T1{:,3}, T1{:,[4 5]}, 1, 'TimeUnit','hours')

SSData =
Time domain data set with 5305 samples.
Sample time: 1 hours
Outputs Unit (if specified)
y1
Inputs Unit (if specified)
u1
u2

sys = ssest(SSData, 8)

sys =
Continuous-time identified state-space model:
dx/dt = A x(t) + B u(t) + K e(t)
y(t) = C x(t) + D u(t) + e(t)
A =
x1 x2 x3 x4 x5 x6 x7 x8
x1 -0.1013 0.032 0.2238 -0.186 0.2464 -0.2537 0.4956 -0.5676
x2 0.01809 -0.02022 -0.4512 0.1267 -0.05051 0.1583 -0.3409 0.3828
x3 -0.0647 0.4163 -0.08732 0.3096 -0.3656 0.101 -0.2997 0.3619
x4 -0.006939 -0.02292 -0.1742 -0.05125 0.8283 -0.1018 0.2725 -0.1963
x5 -0.03864 0.02864 0.06362 -0.2127 0.2571 0.8126 -2.245 1.823
x6 -0.002079 0.0209 0.05216 -0.02102 0.2297 0.4657 -2.605 -0.1297
x7 -0.03093 0.0767 -0.04126 -0.05281 1.005 2.616 -0.9631 0.827
x8 -0.002164 -0.02393 -0.04675 0.01628 -0.1768 0.283 0.1163 -0.2318
B =
u1 u2
x1 -9.759e-05 -1.763e-05
x2 5.337e-05 -0.0003011
x3 0.0001201 0.0002809
x4 -0.0001012 -0.0003922
x5 0.0002635 0.001872
x6 0.0002405 0.001803
x7 0.0004984 0.001369
x8 6.601e-05 -8.003e-05
C =
x1 x2 x3 x4 x5 x6 x7 x8
y1 -789.5 46.67 -29.02 -27.38 -7.18 -11.43 -0.5373 8.824
D =
u1 u2
y1 0 0
K =
y1
x1 -0.001253
x2 0.0001764
x3 -5.457e-05
x4 -6.103e-05
x5 -2.063e-05
x6 -5.142e-05
x7 -8.944e-05
x8 3.302e-05
Parameterization:
FREE form (all coefficients in A, B, C free).
Feedthrough: none
Disturbance component: estimate
Number of free coefficients: 96
Use "idssdata", "getpvec", "getcov" for parameters and their uncertainties.
Status:
Estimated using SSEST on time domain data "SSData".
Fit to estimation data: 64.69% (prediction focus)
FPE: 21.82, MSE: 21.56

figure

compare(SSData, sys)

grid

.

Dharmesh Joshi
on 25 Oct 2023

Thank you; I will review the suggested documents.

In terms of humidity, its effects differ from temperature as they are not immediate but may take a few hours to manifest. This delay depends on whether the humidity is increasing or decreasing and at what intensity. Consequently, incorporating humidity into the standard model doesn't significantly impact, based on my experience. Visually comparing the rise and fall of humidity over a span of several hours reveals a correlation. To illustrate this more clearly, I have attached a screenshot of the data collected over a few days.

Star Strider
on 25 Oct 2023

I am not certain that this analysis adds much, however it might be of interest —

LD = load('REG_Model_Data.mat');

T1 = LD.cleaned_timetable_REG

T1 = 5305×5 table

Time Predicted_DATA REF Humidity Temperature
____________________ ______________ ____ ________ ___________
16-Jan-2023 13:00:00 29.283 18.2 84.169 6.0208
16-Jan-2023 14:00:00 25.142 23.9 81.138 5.8618
16-Jan-2023 15:00:00 26.77 22.3 82.871 5.2763
16-Jan-2023 16:00:00 32.312 21.8 85.829 3.8196
16-Jan-2023 17:00:00 38.172 25.6 86.509 2.4975
16-Jan-2023 18:00:00 42.949 34.4 87.315 1.4005
16-Jan-2023 19:00:00 46.67 40.3 87.347 0.57429
16-Jan-2023 20:00:00 48.916 43.5 80.602 0.0084591
16-Jan-2023 21:00:00 53.266 47.8 80.225 -0.74979
16-Jan-2023 22:00:00 55.718 56.3 81.496 -1.2985
16-Jan-2023 23:00:00 55.289 54.3 82.048 -1.6424
17-Jan-2023 00:00:00 58.95 64.5 87.314 -1.5495
17-Jan-2023 01:00:00 59.48 66.4 91.808 -1.9128
17-Jan-2023 02:00:00 60.96 65.5 93.916 -2.2129
17-Jan-2023 03:00:00 58.813 62.3 95.008 -2.4581
17-Jan-2023 04:00:00 59.082 59.2 95.457 -2.5598

mdl = fitlm(T1.Humidity, T1.REF)

mdl =

Linear regression model:
y ~ 1 + x1
Estimated Coefficients:
Estimate SE tStat pValue
________ ________ _______ ___________
(Intercept) -1.4102 0.7868 -1.7923 0.073146
x1 0.26959 0.010118 26.644 6.7542e-147
Number of observations: 5305, Error degrees of freedom: 5303
Root Mean Squared Error: 12.4
R-squared: 0.118, Adjusted R-Squared: 0.118
F-statistic vs. constant model: 710, p-value = 6.75e-147

figure

plot(mdl) % x1 = Humidity, y = REF

grid

[r, lags] = xcorr(T1.Humidity, T1.REF, 'coeff'); % Return Correlation Coefficient

[rmax,idx] = max(r);

Delay = lags(idx)

Delay = 1

figure

stairs(lags, r)

grid

xlabel('Hours')

ylabel('R_{x,y}')

text(lags(idx), rmax, sprintf('\\leftarrow R_{x,y} = %.3f\n \\Delta = %d hours', rmax, Delay), 'Horiz','left', 'Vert','top')

There does appear to be a correlation, and no significant delay.

If these are not the variables you want to examine, substitute the correct variables.

.

Dharmesh Joshi
on 25 Oct 2023

Thanks i will give that a go.

For your information this what is mentioned on the datasheet. But as humdity is change constantly, this transient issue is much longer.

Star Strider
on 25 Oct 2023

My pleasure.

The state space approach should have a way to deal with the transients, however it needs to have them as part of its data in order to model them effectively. That (or some version of it) may do what you want.

Dharmesh Joshi
on 30 Oct 2023

Hi

I have cleaned up my original signal (per minute data sample) as mentioned in my previous post.MATLAB Tools to Evaluate and Mitigate Sensor Noise - MATLAB Answers - MATLAB Central (mathworks.com)

For your reference, I have attached both the figure and raw data. The raw data comprises the REF data (hourly), raw data (per minute), and temperature/humidity (hourly). The goal is to correct the raw data so that, in principle, it looks similar to the REF signal; I am not expecting it to match 100%. I am currently evaluating the original 1-minute signal before it undergoes any type of regression modeling. Previously, I had provided the hourly averaged signal, but it was altered by a regression model. This time, it's the raw signal that I aim to correct due to humidity. Please see the plots below. In the plot, the blue represents REF data, black represents the raw data, and cyan represents humidity.

As you can observe, the raw signal decreases as humidity increases, and vice versa: it increases as humidity decreases. The intensity of these changes also correlates with the fluctuations in the raw signal.

To address this issue, I am exploring methods to correct the signal using a compensation function rather than creating a regression model at this stage. What approach would you recommend for developing such a compensation function? Ideally using the REF as abousulte signal, but use it as reference shaping signal

I belive after this correction, we could so a simple linear regression to the REF data for final value.

Star Strider
on 30 Oct 2023

Star Strider
on 31 Oct 2023

I have been looking at this for a while (interrupted by computer problems), and note that ‘smooth_difference’ lags ‘REF’ by 15 hours, and leads ‘RH’ by 9.

In estimating this using the System Identification Toolbox (which is the only approach I consider appropriate here), what should the input (or inputs) be, and what should the output be? I do not completely understand what you want to do in that respect.

I used retime to resample (interpolate) ‘smooth_diffference’ to the time vectors shared by all the other signals (I checked to be sure that they all share the same time, so I could use it with the other signals), and created a new structure for it

My analysis so far —

LD = load('plotData_needs...orrection.mat')

LD = struct with fields:

plot1: [1×1 struct]
plot2: [1×1 struct]
plot3: [1×1 struct]
plot4: [1×1 struct]

plot1 = LD.plot1

plot1 = struct with fields:

Time: [2112×1 datetime]
RH: [2112×1 double]
DisplayName: 'Humidity'

plot2 = LD.plot2

plot2 = struct with fields:

Time: [129599×1 datetime]
smooth_difference: [129599×1 double]
DisplayName: 'RAW Signal WE-Baseline'

plot3 = LD.plot3

plot3 = struct with fields:

Time: [2112×1 datetime]
REF: [2112×1 double]
DisplayName: 'REF DATA'

plot4 = LD.plot4

plot4 = struct with fields:

Time: [2112×1 datetime]
T: [2112×1 double]
DisplayName: 'Temperture'

P2 = timetable(plot2.Time, plot2.smooth_difference); % Resample 'plot2' To Common 'Time'

P2r = retime(P2, plot1.Time);

plot2r.Time = P2r.Time;

plot2r.smooth_difference = P2r{:,1} % Create New Structure

plot2r = struct with fields:

Time: [2112×1 datetime]
smooth_difference: [2112×1 double]

figure

hold on

plot(plot1.Time, plot1.RH, 'DisplayName',plot1.DisplayName)

plot(plot2r.Time, plot2r.smooth_difference, 'DisplayName',plot2.DisplayName)

plot(plot3.Time, plot3.REF, 'DisplayName',plot3.DisplayName)

plot(plot4.Time, plot4.T, 'DisplayName',plot4.DisplayName)

hold off

grid

legend('Location','northoutside', 'NumColumns',2)

D1 = finddelay(plot1.RH, plot2r.smooth_difference) % Units: Hours

D1 = -9

D2 = finddelay(plot3.REF, plot2r.smooth_difference) % Units: Hours

D2 = 15

D3 = finddelay(plot3.REF, plot1.RH) % Units: Hours

D3 = 1

% maxlag = 24;

% Dn = finddelay(plot3.REF, [plot1.RH; plot2r.smooth_difference; plot4.T], maxlag)

figure

hold on

% plot(plot1.Time, plot1.RH, 'DisplayName',plot1.DisplayName)

plot(plot2r.Time, plot2r.smooth_difference*4, 'DisplayName',plot2.DisplayName)

plot(plot3.Time, plot3.REF, 'DisplayName',plot3.DisplayName)

% plot(plot4.Time, plot4.T, 'DisplayName',plot4.DisplayName)

hold off

grid

xlim(plot1.Time(1)+[0 caldays(14)])

legend('Location','northoutside', 'NumColumns',2)

I believe it to be appropriate to model this as a dynamic system in state space using system identification, based on my understanding of what you are doing (that may not be correct).

.

Dharmesh Joshi
on 31 Oct 2023

Thanks, In regards to your question.

I am currently working on a project where my primary objective is to utilize the REF signal as a reference to correct the smooth_difference signal. I have observed that the peaks of these signals rise and fall, and my aim is to align their general trends and shapes. While they do not need to be identical, achieving a close resemblance would be ideal.

The smooth_difference is derived from my sensor’s output, while REF serves as a reference sensor. Both sensors are simultaneously monitoring the same elements. Over a 24-hour period, one can observe fluctuations in humidity, with corresponding changes in the smooth_difference signal. Specifically, as humidity increases, the smooth_difference tends to decrease, and vice versa. By visually inspecting the data, it becomes apparent that the behavior of smooth_difference is inversely related to humidity levels.

My goal is to correct the discrepancies in the amplitude of the smooth_difference signal, particularly during periods of rising and falling humidity. To accurately assess and address this issue, a detailed analysis is required, potentially focusing on small increments of data and considering the rate of change of humidity, rather than its absolute value.

For data alignment purposes, I am considering averaging the smooth_difference signal over one-hour intervals, ensuring that it matches the time samples of the REF signal.

Do you also see this pattern?

Star Strider
on 31 Oct 2023

I am having a difficult time visualising this. It seems that the input to the system are all four variables collected simultaneously, and thge output is a correction signal that is (‘REF’ - ‘smooth_difference’).

Testing that hypothesis —

LD = load('plotData_needs...orrection.mat')

LD = struct with fields:

plot1: [1×1 struct]
plot2: [1×1 struct]
plot3: [1×1 struct]
plot4: [1×1 struct]

plot1 = LD.plot1

plot1 = struct with fields:

Time: [2112×1 datetime]
RH: [2112×1 double]
DisplayName: 'Humidity'

plot2 = LD.plot2

plot2 = struct with fields:

Time: [129599×1 datetime]
smooth_difference: [129599×1 double]
DisplayName: 'RAW Signal WE-Baseline'

plot3 = LD.plot3

plot3 = struct with fields:

Time: [2112×1 datetime]
REF: [2112×1 double]
DisplayName: 'REF DATA'

plot4 = LD.plot4

plot4 = struct with fields:

Time: [2112×1 datetime]
T: [2112×1 double]
DisplayName: 'Temperture'

P2 = timetable(plot2.Time, plot2.smooth_difference); % Resample 'plot2' To Common 'Time'

P2r = retime(P2, plot1.Time);

plot2r.Time = P2r.Time;

plot2r.smooth_difference = P2r{:,1} % Create New Structure

plot2r = struct with fields:

Time: [2112×1 datetime]
smooth_difference: [2112×1 double]

figure

hold on

plot(plot1.Time, plot1.RH, 'DisplayName',plot1.DisplayName)

plot(plot2r.Time, plot2r.smooth_difference, 'DisplayName',plot2.DisplayName)

plot(plot3.Time, plot3.REF, 'DisplayName',plot3.DisplayName)

plot(plot4.Time, plot4.T, 'DisplayName',plot4.DisplayName)

hold off

grid

legend('Location','northoutside', 'NumColumns',2)

SysData = iddata((plot2r.smooth_difference-plot3.REF), [plot1.RH plot2r.smooth_difference plot3.REF plot4.T], 1, 'TimeUnit','hours')

SysData =
Time domain data set with 2112 samples.
Sample time: 1 hours
Outputs Unit (if specified)
y1
Inputs Unit (if specified)
u1
u2
u3
u4

SSsys = ssest(SysData, 12)

SSsys =
Continuous-time identified state-space model:
dx/dt = A x(t) + B u(t) + K e(t)
y(t) = C x(t) + D u(t) + e(t)
A =
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12
x1 0.5106 -0.1781 1.967 -0.245 -0.5576 -1.439 -0.03361 1.26 -0.00693 -0.2496 0.2962 0.847
x2 -8.465 0.7301 -0.4308 -1.135 -1.487 -1.381 -1.127 1.713 -1.455 0.04107 -0.6515 2.224
x3 -14.24 1.996 -7.355 0.05811 0.7366 0.6082 -0.4565 -1.35 -1.55 1.612 -1.279 0.004257
x4 -12.51 2.84 -7.793 -1.042 1.185 -0.4785 -0.6123 0.7043 -1.058 1.428 -0.5808 1.231
x5 -7.215 0.4767 0.7689 -1.871 -1.207 1.017 -1.271 0.8841 -0.3676 -0.1248 -0.2788 1.105
x6 7.603 -5.861 0.2214 -0.6416 -3.373 -3.481 -0.2554 2.188 0.838 1.164 1.457 0.7437
x7 -0.7314 1.012 1.612 0.3901 -0.8523 -1.465 -0.4914 1.613 -0.8073 -0.5706 -0.7314 0.24
x8 -8.92 2.512 -2.153 -0.7739 -0.4705 0.5273 -1.402 -0.3049 -1.267 1.061 -0.6433 0.132
x9 0.9532 4.903 4.468 -0.01212 -1.665 -0.008972 0.4311 1.978 -0.4806 -2.714 -0.2778 0.2569
x10 14.27 -1.472 6.738 0.6501 1.599 1.075 2.017 -1.492 3.412 -1.762 1.585 -2.498
x11 4.806 1.286 8.059 1.536 1.223 0.5706 1.755 -0.03797 1.173 -2.143 0.07518 -1.906
x12 4.752 -1.638 3.204 1.102 1.532 1.441 1.512 -0.7283 1.423 -0.6985 0.6571 -1.974
B =
u1 u2 u3 u4
x1 -2.647e+13 9.436e+13 2.432e+12 -3.549e+13
x2 -3.951e+13 1.797e+14 7.539e+12 -3.443e+13
x3 -2.706e+13 1.581e+14 1.199e+12 -1.967e+12
x4 -6.722e+13 3.155e+14 -1.277e+13 -3.144e+13
x5 -1.247e+09 -3.182e+13 3.339e+12 3.047e+12
x6 -1.203e+14 5.267e+14 -4.952e+12 -1.563e+14
x7 1.129e+13 -6.149e+13 1.477e+13 -1.052e+13
x8 -1.674e+13 5.739e+12 5.641e+12 -5.855e+13
x9 9.862e+13 -3.782e+14 1.003e+12 1.09e+14
x10 6.192e+13 -3.121e+14 -8.459e+12 7.024e+13
x11 1.032e+14 -4.889e+14 1.467e+13 9.936e+13
x12 5.611e+13 -2.488e+14 3.851e+12 7.662e+13
C =
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12
y1 -8.233e-14 -2.502e-13 1.17e-14 -4.361e-14 -9.762e-14 1.047e-13 -8.7e-14 -3.142e-14 1.083e-14 4.99e-14 -3.908e-14 3.084e-14
D =
u1 u2 u3 u4
y1 0 0 0 0
K =
y1
x1 -8.228e+11
x2 2.444e+12
x3 3.085e+12
x4 2.816e+11
x5 1.674e+12
x6 -4.18e+12
x7 7.485e+11
x8 2.66e+12
x9 1.516e+12
x10 -4.087e+12
x11 2.65e+12
x12 3.619e+11
Parameterization:
FREE form (all coefficients in A, B, C free).
Feedthrough: none
Disturbance component: estimate
Number of free coefficients: 216
Use "idssdata", "getpvec", "getcov" for parameters and their uncertainties.
Status:
Estimated using SSEST on time domain data "SysData".
Fit to estimation data: 48.47% (prediction focus)
FPE: 15.83, MSE: 14.79

figure

compare(SysData, SSsys)

grid

Tsim = 1:height(plot1.Time);

correction_vector_output = lsim(SSsys, [plot1.RH plot2r.smooth_difference plot3.REF plot4.T], Tsim);

smooth_difference_corrected = plot3.REF + correction_vector_output;

figure

plot(plot2r.Time, plot2r.smooth_difference, 'DisplayName','Original')

hold on

plot(plot2r.Time, smooth_difference_corrected, 'DisplayName','Corrected')

plot(plot3.Time, plot3.REF, 'DisplayName','REF')

hold off

grid

legend('Location','best')

figure

plot(plot2r.Time, plot2r.smooth_difference, 'DisplayName','Original')

hold on

plot(plot2r.Time, smooth_difference_corrected, 'DisplayName','Corrected')

plot(plot3.Time, plot3.REF, 'DisplayName','REF')

hold off

grid

legend('Location','best')

xlim([0 caldays(14)]+plot3.Time(1))

This actually appears to calculate the correction signal ‘(smooth_difference - REF)’ reasonably well. To get the corrected value of ‘smooth_difference’ add the ‘REF’ vector to it —

syms sd REF

correction_vector_output = sd - REF

correction_vector_output =

smooth_difference_corrected = REF + correction_vector_output

smooth_difference_corrected =

sd

I was hoping that the corrected signal would come out closer to ‘REF’ than to the original signal. I may have missed something in your description, so experiment with this approach, since you understand the problem much better than I do.

.

Dharmesh Joshi
on 31 Oct 2023

Please see the plot below. This is the trend I am referring to.

This is a zoomed-in view.

You can observe that the reference signal (in blue) is fairly flat. Then, there's the humidity signal (in cyan), which rises and falls throughout the day, affecting the smooth_difference signal. You'll notice that the smooth_difference signal increases its amplitude during the periods when the humidity drops, and does the opposite when it rises.

This is the impact of humidity. I have also included temperature data in case it provides any additional insights, but temperature can be ignored for now.

Additionally, I will review the steps you outlined to see if they help address the issue.

Dharmesh Joshi
on 1 Nov 2023

Star Strider
on 1 Nov 2023

Dharmesh Joshi
on 1 Nov 2023

Well, my inputs will be my Predicted_DATA signal and Humidity. The ref signal is simply there to provide a bit of visual guidance. The Predicted_DATA fluctuates when humidity changes or when the rate of humidity change varies, and this is what I am trying to correct.

Is it possible to just analyze all the rising and falling peaks, calculate the rate of change between the key points, and then adjust a block of samples from Predicted_DATA to be either multiplied or divided by a factor, so it proportionally compensates for what humidity has done?

Star Strider
on 1 Nov 2023

I had to run this offline because it takes longer than the allotted 55 seconds (it needs about 4½ times that).

load('REG_Model_Data.mat')

cleaned_timetable_REG

% NrMissing = nnz(ismissing(cleaned_timetable_REG))

% cleaned_timetable_REG = fillmissing(cleaned_timetable_REG, 'nearest')

cleaned_timetable_REG_TT = table2timetable(cleaned_timetable_REG);

cleaned_timetable_REG_TT = retime(cleaned_timetable_REG_TT, 'hourly');

cleaned_timetable_REG_TTs = cleaned_timetable_REG_TT(:,[1 3 4 2])

Data = iddata(cleaned_timetable_REG_TTs)

Data = misdata(Data)

SSsys = ssest(Data, 8)

figure

compare(Data, SSsys)

grid

It produces this system:

SSsys =

Continuous-time identified state-space model:

dx/dt = A x(t) + B u(t) + K e(t)

y(t) = C x(t) + D u(t) + e(t)

A =

x1 x2 x3 x4 x5 x6 x7 x8

x1 -0.09591 0.1753 0.1319 0.1004 -0.07117 -0.1091 -0.01476 -0.1967

x2 0.09113 -0.1586 -0.2084 -0.09554 0.1261 0.1156 0.02182 0.2023

x3 -0.1483 0.2734 -0.06624 -0.4684 0.2209 0.154 0.02348 0.1978

x4 0.06936 -0.09895 0.4544 -0.08761 0.2892 0.1671 0.06083 0.2402

x5 0.04081 -0.06385 -0.0145 -0.02059 -0.178 -0.9404 0.07352 -1.255

x6 0.04111 -0.04807 -0.141 -0.1098 0.7128 -0.1866 -1.424 -0.09405

x7 0.01679 -0.01909 0.01431 -0.03315 -0.04821 1.374 0.005255 -1.766

x8 0.02773 -0.04175 -0.09524 -0.08299 0.7382 -0.4315 1.609 -0.3404

B =

Predicted_DA Humidity Temperature

x1 -4.047e-05 3.267e-05 0.0001242

x2 3.706e-05 -2.963e-05 -0.0001252

x3 4.425e-05 -8.677e-06 5.426e-05

x4 6.102e-05 -2.236e-05 5.346e-05

x5 2.801e-05 -1.11e-05 -2.314e-07

x6 -6.455e-06 3.922e-06 1.297e-05

x7 4.199e-06 -1.922e-06 -1.146e-06

x8 -8.82e-07 1.541e-06 1.093e-05

C =

x1 x2 x3 x4 x5 x6 x7 x8

REF -1751 160.9 -33.37 43.05 20.1 10.11 9.504 2.154

D =

Predicted_DA Humidity Temperature

REF 0 0 0

K =

REF

x1 -0.0005337

x2 0.0002469

x3 -0.0001914

x4 0.0002226

x5 0.0001757

x6 5.587e-06

x7 0.0001112

x8 4.167e-05

Parameterization:

FREE form (all coefficients in A, B, C free).

Feedthrough: none

Disturbance component: estimate

Number of free coefficients: 104

Use "idssdata", "getpvec", "getcov" for parameters and their uncertainties.

Status:

Estimated using SSEST on time domain data "Data".

Fit to estimation data: 87.19% (prediction focus)

FPE: 19.05, MSE: 18.78

Model Properties

Elapsed time is 231.098728 seconds.

The inputs are:

Predicted_DATA

Humidity

Temperature

and the output is:

REF

The order 8 system does a reasonably accurate approximation.

What would you want from a model such as this?

.

Dharmesh Joshi
on 1 Nov 2023

Star Strider
on 1 Nov 2023

It will output something approximating ‘REF’ since that is how I designed it. My impression is that is what you want, given those inputs. It does not output a compensation signal, although I attempted exactly that earlier, although with ‘REF’ also as an input.

I can experiment with the same sort of difference output however not with ‘REF’ as an input if that makes sense. The ‘Predicted_DATA’ would still be an input however, since that is what you specified. The output would be the difference signal vector, as previously. The only difference from the previous approach would be that ‘REF’ would not be an input.

Dharmesh Joshi
on 2 Nov 2023

Star Strider
on 2 Nov 2023

It can output a vector of differences, similar to the approach in my earlier Comment. That approach models the time-related dynamics of the system, including the transients.

If the ranges of all the variables are accounted for in the data available here, and if ‘REF’ will not always be available, then the regression approach might be the best option. The next step would be to determine the relationships abetween the variables. If you want to model them all as linear, then that would be straightforward (multiple linear regression) and would return coefficients for each variable, for example —

If the relationships were nonlinear (and that depends on the physics and chemistry involved, and could be a polynomial relation or a more complicated functional relation), that would have to be modeled for each of them and combined into a single function. If you have a mathematical description of these relationships, estimating the parameters would be straightforward. Those models and parameters could then be incorporated into a full model and given the inputs, would produce a single output that would (within a specific error tolerance that could also be estimated and modeled), be the result. This approach might not model the transients well, however it could model a steady-state system.

Dharmesh Joshi
on 2 Nov 2023

Thanks I will be expirment with your code. What i found it that using Regression Leaner App with GPR model is if i was to convert my Humidity data as Rate differnce as shown in the code below, it gives a little more insight into the effect Humditity has. I aim to feed this data also into your example.

timeIntervals = [1, 2, 3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24]

% Ensure the timetable is sorted in ascending order by time

temptime_table = sortrows(temptime_table);

% Loop through each time interval and calculate the rate of change

for i = 1:length(timeIntervals)

interval = timeIntervals(i);

% Initialize a column for the rate of change

rateChange = NaN(height(temptime_table), 1);

% Calculate the rate of change for each row

for j = 1:height(temptime_table)

% Find the time point exactly 'interval' hours before the current time point

targetTime = temptime_table.Time(j) - hours(interval);

% Find the closest previous data point to the target time

previousIndex = find(temptime_table.Time <= targetTime, 1, 'last');

if ~isempty(previousIndex) && all(~isnan(temptime_table.Humidity([previousIndex, j])))

rateChange(j) = (temptime_table.Humidity(j) - temptime_table.Humidity(previousIndex)) / interval;

end

end

% Add the new column to the timetable

new_column_name = sprintf('HumidityRate_%dhr', interval);

temptime_table.(new_column_name) = rateChange;

end

Star Strider
on 2 Nov 2023

It might also be possible to use the gradient function with each of the associated variables as an additional input (covariate). It computes the numeric derivative (in this instance with respect to time), however like all high-pass filters (and the gradient or derivative is one such), it accentuates the noise, so noise has to be minimised first by smoothing the signal. (I prefer using the sgolayfilt function for this, using a 3-order polynomial and varying the ‘framelen’ until I get the result I want. The smoothdata function is also an option.)

If the data are not regularly sampled or the sampling interval is not known, compute the derivative as:

dydt = gradient(y) ./ gradient(t);

(assuming vector arguments of the same dimensions) and proceed from there.

Plotting the variables and their derivatives as functions of time will demonstrate if this is a suitable option.

It might also be appropriate to calculate and plot the derivatives of the viariables with respect to each other. This might provide insight into their relationships, even if those results are not included in the model.

.

### More Answers (0)

### See Also

### Categories

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!**An Error Occurred**

Unable to complete the action because of changes made to the page. Reload the page to see its updated state.

Select a Web Site

Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .

You can also select a web site from the following list

How to Get Best Site Performance

Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.

Americas

- América Latina (Español)
- Canada (English)
- United States (English)

Europe

- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)

- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)

Asia Pacific

- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)