How to calculate this standard deviation correction factor in MATLAB?

Hello guys,
I want to normalize my data matrix using the formula below.
Could anyone help me write the appropriate piece of code for that, if possible?
Normalization_formula.jpg
where g(i,j) is the value of feature i in sample j, sd(g(i)) is the standard deviation across samples for feature i, and sd10(g) is the 10-percentile value of standard deviations across features.
My data matrix looks like this; it has 8 features and 16 samples in this example:
my_data_matrix.jpg
I appreciate any help!

17 Comments

@Stephen Cobeldick, @Guillaume I'm hoping for some help from you, if possible.
Can you provide your attempts? What have you tried? What about a for loop?
@darova Thanks for your response. OK, I will share what I have tried.
Here is what I have tried:
xx = importdata('Dataset.txt','\t');
x = xx.data;
[n,m]=size(x);
for i1 = 1:n
for i2 = 1:m
Norm= x(i2,i1)/(std(x(i2)))+std(prctile(x,10));
end
end
I made some changes
xx = importdata('Dataset.txt','\t');
x = xx.data;
Norm = x*0;
std10 = 0.1*std(x);
[m,n]=size(x);
for i1 = 1:m
    for i2 = 1:n
        Norm(i1,i2) = x(i1,i2)/(std(x(i1,:))+std10);
    end
end
Seems simple enough to calculate, except I've no idea what "the 10-percentile value of standard deviations across features" means. Standard deviation turns an array into a scalar, and percentile also turns an array into a scalar. Hence you can neither take the percentile of a standard deviation nor the standard deviation of a percentile.
The rest is trivial and certainly doesn't need a loop.
@darova Hi, I have tried your code and I'm getting this error:
Error using /
Matrix dimensions must agree.
Error in stages_overlapped (line 46)
Norm(i1,i2) = x(i1,i2)/(std(x(i1,:))+std10);
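A likely cause of this error, sketched on a small made-up matrix: with std10 = 0.1*std(x), std works down each column and returns a row vector, so the denominator is a row vector and / (mrdivide) is asked to divide a scalar by it. The variable names other than x and std10 are hypothetical, and using std(x(:)) is only one way to get a scalar correction term.

```matlab
% Small stand-in for the data: 3 features (rows) by 4 samples (columns).
x = [1 2 3 4; 2 4 6 8; 1 3 5 7];

std10_vec = 0.1*std(x);       % std of each column: 1-by-4 row vector
std10_all = 0.1*std(x(:));    % std over every element: scalar

rowStd = std(x(1,:));         % scalar: spread of feature 1 across samples

denomVec = rowStd + std10_vec;   % 1-by-4, so x(1,1)/denomVec raises the error
denomAll = rowStd + std10_all;   % scalar, so the division is well defined

val = x(1,1)/denomAll;           % one normalized element
```

Checking size(std10) before the loop is a quick way to confirm which case applies.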
@darova Hi, to be clear, m is the number of rows, which in my data is the number of features, and n is the number of columns, which in my data is the number of samples.
Hi @darova, now there is no error and the code is running. Let me see what I get.
@Guillaume Hi, I also didn't understand that and just checked in MATLAB. Perhaps it is as @darova suggested: std10 = 0.1*std(x(:))
@darova Hi darova, sorry for disturbing you again, but the code is still running. It seems it will take a long time; is that reasonable?
Impossible
Can you attach the data?
@darova Sorry, the data exceeds 5 MB; that may be why it took so long.
If the data is that large, how about posting a smaller subset of the data (say a dozen or so rows) and telling us what you would expect the answer to be for that smaller subset of data.
For example, let's take this sample dataset. What would you expect the result to be for it?
rng default % Make sure we can each create the same A
A = randn(12, 8);
@Steven Lord Sorry about that. I am sharing a subset of it just to show what it looks like and the nature of its values; however, my data is of size 219*25172.
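At that size, a double loop that calls std on a whole row for every element makes about 5.5 million std calls (219*25172), which may explain the long run time. A minimal sketch of computing each row's standard deviation once instead, on a small made-up matrix; rowStd is a hypothetical name, and std10 = 0.1*std(x(:)) is just the correction term tried earlier in this thread, not a confirmed reading of the formula.

```matlab
x = randn(20, 300);          % small made-up stand-in for the real data
std10 = 0.1*std(x(:));       % scalar correction term, as tried in this thread
rowStd = std(x, 0, 2);       % 20-by-1: one std per row, computed only once

Norm = zeros(size(x));       % preallocate the result
for i1 = 1:size(x,1)
    % row vector divided by a scalar: normalizes the whole row in one step
    Norm(i1,:) = x(i1,:) / (rowStd(i1) + std10);
end
```

This keeps the loop over rows but removes the inner loop and the repeated std calls.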


Answers (1)

I think you first need to clarify what is meant by "the 10-percentile value of standard deviations across features". I very much doubt it's a tenth of the standard deviation, and obviously the interpretation is going to greatly influence your results.
However, since we don't know, here is the implementation with 1/10 of the standard deviation:
%input: g, an NxM matrix where rows are samples and columns are features
result = g ./ (std(g, 0, 1) + 0.1*std(g, 0, 'all')); %R2018b or later
%result = g ./ (std(g, 0, 1) + 0.1*std(g(:))); %R2016b or later
%result = bsxfun(@rdivide, g, std(g, 0, 1) + 0.1*std(g(:))); %prior to R2016b
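For comparison, here is a sketch of another possible reading, which the thread does not confirm: sd10 as the 10th percentile of the per-feature standard deviations. It assumes the layout described in the question (rows are features, columns are samples), uses made-up data, and relies on prctile, which requires the Statistics and Machine Learning Toolbox. The names sdFeature, sd10 and normalized are hypothetical.

```matlab
rng default
g = randn(8, 16);                      % made-up data: 8 features, 16 samples

sdFeature = std(g, 0, 2);              % 8-by-1: std across samples, one per feature
sd10 = prctile(sdFeature, 10);         % scalar: 10th percentile of those stds

normalized = g ./ (sdFeature + sd10);  % implicit expansion, R2016b or later
% normalized = bsxfun(@rdivide, g, sdFeature + sd10);  % prior to R2016b
```

If the rows are samples instead, the same idea applies with std(g, 0, 1), which gives a row vector of per-feature standard deviations.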

2 Comments

Hi @Guillaume, in the mathematical formula attached above it is written as sd10, so maybe it is exactly what you assumed?
I'm going to try your code. Thanks a lot for your cooperation.
Hi @Guillaume, the last line works well, but the results differ between your code and @darova's.



Asked:

on 12 Jan 2020

Commented:

on 13 Jan 2020
