data frequency conversion problem

Dear all,
Is there any function for converting data that are available every 2 months into monthly data?
thank you
EDIT [01 Aug 2012, 19:33 BST - OK] Added example input from comment
Here is an example of a panel data set with 3 individuals
A = {
1 '1-2 2004' 0.256 0.385
1 '3-4 2004' 0.268 3.0394
1 '5-6 2004' 0.0504 0.6475
1 '7-8 2004' 14.0985 148.2583
1 '9-10 2004' 0.1128 1.1506
1 '11-12 2004' NaN 148.2583
1 '1-2 2005' NaN 148.2583
1 '3-4 2005' 2.5852 34.0146
1 '5-6 2005' 0.322 3.2846
1 '7-8 2005' 14.0985 148.2583
1 '9-10 2005' 2.5852 NaN
1 '11-12 2005' 0.2938 2.854
2 '1-2 2004' 0.256 0.385
2 '3-4 2004' 0.268 3.0394
2 '5-6 2004' 0.0504 0.6475
2 '7-8 2004' 14.0985 148.2583
2 '9-10 2004' 0.1128 1.1506
2 '11-12 2004' NaN 148.2583
2 '1-2 2005' NaN 148.2583
2 '3-4 2005' 2.5852 34.0146
2 '5-6 2005' 0.322 3.2846
2 '7-8 2005' 14.0985 148.2583
2 '9-10 2005' 2.5852 NaN
2 '11-12 2005' 0.2938 2.854
3 '1-2 2004' 0.256 0.385
3 '3-4 2004' 0.268 3.0394
3 '5-6 2004' 0.0504 0.6475
3 '7-8 2004' 14.0985 148.2583
3 '9-10 2004' 0.1128 1.1506
3 '11-12 2004' NaN 148.2583
3 '1-2 2005' NaN 148.2583
3 '3-4 2005' 2.5852 34.0146
3 '5-6 2005' 0.322 3.2846
3 '7-8 2005' 14.0985 148.2583
3 '9-10 2005' 2.5852 NaN
3 '11-12 2005' 0.2938 2.854}
Note that I have 30000 individuals (instead of 3) and 20 numerical columns (instead of the 2 displayed above). The interpolation should be done separately for each individual i = 1, 2, 3.

10 Comments

could you please help me?
thanks
Why do you want to use elaborate methods? Have you tried interpolation? Also, sometimes it's "no data, no party": if you don't have monthly data, you can't do research that way.
salva
salva on 1 Aug 2012
Edited: salva on 1 Aug 2012
Thank you Oleg. Could I send an example of my data set so that you can give me some guidelines on how to use interpolation? Interpolation is actually what I had in mind, but I kept my question general in case I was missing something.
thank you
Please, post a brief example here so that everyone can contribute.
Thank you oleg.
Here is an example of a panel data set with 3 individuals
A = {
1 'MA 2009' [ 0.256] [ 0.385]
1 'MJ 2009' [ 0.2680] [ 3.0394]
1 'JA 2009' [ 0.0504] [ 0.6475]
1 'SO 2009' [ 14.0985] [ 148.2583]
1 'ND 2009' [ 0.1128] [ 1.1506]
1 'JF 2010' [ NaN] [ 148.2583]
1 'MA 2010' [ 2.5852] [ 34.0146]
1 'MJ 2010' [ 0.3220] [ 3.2846]
1 'JA 2010' [ 14.0985] [ 148.2583]
1 'SO 2010' [ 2.5852] [ NaN]
1 'ND 2010' [ 0.2938] [ 2.8540]
1 'JF 2011' [ 0.1128] [ 1.1506]
1 'MA 2011' [ 14.0985] [ 148.2583]
1 'MJ 2011' [ 2.1091] [ 15.0233]
1 'JA 2011' [ 2.43] [ 3.1]
2 'MA 2009' [ 14.0985] [ 148.2583]
2 'MJ 2009' [ 2.7827] [ 18.9879]
2 'JA 2009' [ 11.8755] [ 126.4359]
2 'SO 2009' [ 0.0589] [ 0.6685]
2 'ND 2009' [ 11.8755] [ 126.4359]
2 'JF 2010' [ 0.0504] [ 0.6475]
2 'MA 2010' [ 11.8755] [ 126.4359]
2 'MJ 2010' [ 0.0504] [ 0.6475]
2 'JA 2010' [ 3.56] [ 7.21]
2 'SO 2010' [ 0.0248] [ 0.2823]
2 'ND 2010' [ 4.21] [ 9.370]
2 'JF 2011' [ 2.5852] [ 34.0146]
2 'MA 2011' [ 0.0207] [ 0.2282]
2 'MJ 2011' [ 11.8755] [ 126.4359]
2 'JA 2011' [ 14.0985] [ 148.2583]
3 'MA 2009' [ 2.1091] [ 15.0233]
3 'MJ 2009' [ 0] [ 0]
3 'JA 2009' [ 0.1128] [ 1.1506]
3 'SO 2009' [ 0.0207] [ 0.2282]
3 'ND 2009' [ 5.56] [ 3.56]
3 'JF 2010' [ NaN] [ 1.1506]
3 'MA 2010' [ 0] [ 0]
3 'MJ 2010' [ 2.1091] [ 15.0233]
3 'JA 2010' [ 0] [ 0]
3 'SO 2010' [ 2.7827] [ NaN]
3 'ND 2010' [ 0] [ 0]
3 'JF 2011' [ 0.0207] [ 0.2282]
3 'MA 2011' [ 2.5852] [ 34.0146]
3 'MJ 2011' [ 0] [ 0]
3 'JA 2011' [ 11.8755] [ 126.4359]
}
Note that I have 30000 individuals (instead of 3) and 20 numerical columns (instead of the 2 displayed above). The interpolation should be done separately for each individual i = 1, 2, 3.
thank you so much
I hope this example helps
cheers
If you're going from bimonthly data to monthly data, why not just average the bimonthly data?
Are the data stocks or flows? Because in the former case you can interpolate while in the latter case you have to split.
These data are stocks, so interpolation is needed. It would be wrong to divide them by two.
thank you oleg
Is it possible to have some code as a help?
thanks
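To make the stock/flow distinction above concrete, here is a minimal sketch. The series y is made up, and placing each bimonthly observation on the first month of its period is an assumption, not something stated in the question.

```matlab
% Hypothetical bimonthly series for one individual (values are made up)
y = [10; 14; 12];              % one observation per 2-month period
nPer = numel(y);

% Stock: put each observation on the odd positions of a monthly grid
% (assumption: it refers to the first month of the period) and fill the
% months in between by linear interpolation.
tObs = (1:2:2*nPer)';          % months 1, 3, 5
tAll = (1:2*nPer-1)';          % months 1..5
stockMonthly = interp1(tObs, y, tAll);   % linear is the default method

% Flow: each bimonthly total is split evenly over its two months.
flowMonthly = kron(y, [1; 1]) / 2;
```

For a stock the observed levels are kept and only the in-between months are filled in; for a flow the monthly values still add up to the bimonthly totals.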


 Accepted Answer

Oleg Komarov
Oleg Komarov on 2 Aug 2012
Edited: Oleg Komarov on 4 Aug 2012
EDIT#2: I didn't notice at first that the data contain several series. I also added linear interpolation of NaNs:
% Inpaint NaNs and keep a numeric matrix (easier to work with than cell
% arrays), using inpaint_nans() from the FEX.
data = inpaint_nans(cell2mat(A(:,3:4)),2);
% Partition interpolation in blocks (first column)
blocks = [A{:,1}];
unBlocks = unique(blocks);
% Preallocate
interpData = cell(numel(unBlocks),1);
% Interpolate each block
for b = unBlocks
    idxBlock = b == blocks;   % index the block
    n = nnz(idxBlock)*2;      % twice the block length, i.e. its number of months
    interpData{b} = interp1((1:2:n)', data(idxBlock,:),(1:n-1)');
end
% This plot gives an idea of the type of interpolation (for the first block/series only)
subplot(311)
plot(1:2:n,[A{1:15,3}],'-or')
legend('original')
subplot(312)
plot(1:n-1,interpData{1}(:,1)','-db')
legend('linear interp')
subplot(313)
plot(1:2:n,[A{1:15,3}],'-or',1:n-1,interpData{1}(:,1)','-db')
As you can see, you need to decide what to do with NaN's (especially to avoid losing info)
% Concatenate the cells into one numeric matrix (optional)
interpData = cat(1,interpData{:});
WARNING: with this approach I assume every month has same length.
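If the equal-month-length assumption matters, one possible workaround is to interpolate on calendar dates instead of on the index 1, 2, 3, ... The sketch below is only an illustration: the variable names are mine, and dating each observation on the 1st of the first month of its period is an assumption.

```matlab
% First five bimonthly values of individual 1 from the 2004 example
y = [0.256; 0.268; 0.0504; 14.0985; 0.1128];
startYear = 2004;              % assumption: the series starts in January 2004
nPer = numel(y);

% Assumption: each observation is dated on the 1st of the first month of
% its period (Jan, Mar, May, ...).
tObs = datenum(startYear, (1:2:2*nPer)', 1);
tAll = datenum(startYear, (1:2*nPer-1)', 1);   % the 1st of every month

% Linear interpolation on serial date numbers, so that a 31-day month
% weighs more than a 29-day one.
yMonthly = interp1(tObs, y, tAll);
```

With this grid the February value is no longer the plain midpoint of January and March, because January 2004 has 31 days and February 2004 has 29.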

4 Comments

I observe that in "out", the values in rows 1 3 5 7 9 and so forth are actually the values from A. And the values in rows 2,4,6 and so forth are the interpolated values. Am I right? But if this is the case, then we still have some bimonthly data in H. What I was thinking is that all the values in H must be interpolated.
thanks
Oleg Komarov
Oleg Komarov on 4 Aug 2012
Edited: Oleg Komarov on 4 Aug 2012
The interpolation retains the true values and "invents" the others according to a certain rule. The linear rule is very conservative and the least intrusive. To use a different rule you should have a valid reason/justification. As I said in a comment to Azzi's answer, the less you have to invent, the better.
In the code I proposed there are two open issues:
  • How to avoid losing data when interpolating (trivially solvable with union()).
  • How to treat NaNs. On this point you need to decide, and then I can propose a solution.
salva
salva on 4 Aug 2012
Edited: salva on 4 Aug 2012
Thank you Oleg for your reply. Regarding how to treat NaNs I would like to interpolate over them before changing the data frequency. That is, we can just replace them via interpolation.
I found some material on this
The second link is maybe what I am looking for
Having done that, then we can change the frequency.
thank you
Added NaN interpolation with John D'Errico's inpaint_nans() (but you could also use interp1).
See the graph for the result.
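For the interp1 route mentioned above, a minimal sketch could look like the following. It is not the code from the answer: the block X is a small excerpt built from values in the example, and it handles one individual at a time, column by column.

```matlab
% Hypothetical block for one individual: rows are bimonthly periods,
% columns are variables (values taken from the example data)
X = [0.1128   1.1506
     NaN    148.2583
     2.5852  34.0146
     2.5852  NaN
     0.2938   2.8540];

t = (1:size(X,1))';            % position of each period within the block
Xfilled = X;
for c = 1:size(X,2)
    ok = ~isnan(X(:,c));
    if any(~ok) && nnz(ok) >= 2
        % Linear interpolation over interior NaNs; leading or trailing
        % NaNs lie outside the observed range and are left as NaN.
        Xfilled(~ok,c) = interp1(t(ok), X(ok,c), t(~ok));
    end
end
```

Running this per individual, before the frequency conversion, keeps values from one series from leaking into the next.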
