Analyzing asymmetric data sets
Dear all,
I have some data on prices for a specific good across time and across countries. The first problem is that the start and end dates differ from country to country. For example:
Austria Belgium
"2/11/08" "07/12/08"
"30/11/08" "04/01/09"
"28/12/08" "01/02/09"
"25/01/09" "01/03/09"
"22/02/09" "29/03/09"
The second problem is that the time span for each country is different. For example, the data for France are available for 39 periods of 4 weeks (28 days), whereas the data for Belgium are available for 36 periods of 4 weeks.
The third problem is that there are jumps: the next date does not always fall exactly 4 weeks after the previous one. Put differently, the gap between two successive dates is not always 28 days; in some cases it is 29, 27 or 34 days.
Is there anything I can do (any function, perhaps?) to solve these 3 problems? If I do not solve them, I will not be able to use the data set for analysis. Please be as specific as you can.
Thanks
10 Comments
Walter Roberson
on 4 Jun 2012
You have not said anything about how you intend to analyze the data, so it is difficult for us to make suggestions.
antonet
on 4 Jun 2012
Walter Roberson
on 4 Jun 2012
For a particular time, t, if the price exists in one location, but the price in another location starts after t or ends before t, what would you _like_ to have happen?
The jumps can be taken care of (approximately, at least) by using interp1(), so the main thing you need to define is what you want done at times when the data for a country has not yet started or has already ended.
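As a rough illustration of the interp1() suggestion, here is a minimal sketch that resamples one irregularly spaced series onto a regular 28-day grid. The start date is Austria's from the question, but the gaps (28, 29, 27, 34 days) and all prices are made-up placeholders, and linear interpolation is only one possible choice of method.

```matlab
% Hypothetical observation dates: gaps of 28, 29, 27 and 34 days,
% starting on 2 Nov 2008 (serial date numbers, in days).
t = datenum(2008, 11, 2) + [0 28 57 84 118];

% Hypothetical prices, one per observation date.
p = [10 11 12 13 14];

% Regular grid: one point every 28 days inside the observed range.
tGrid = t(1):28:t(end);

% Linear interpolation of the observed prices onto the regular grid.
pGrid = interp1(t, p, tGrid, 'linear');

% Show the resampled dates next to the interpolated prices.
for k = 1:numel(tGrid)
    fprintf('%s  %.4f\n', datestr(tGrid(k), 'dd/mm/yy'), pGrid(k));
end
```

Because the grid stays inside the observed range here, no extrapolation is involved; values on the grid that coincide with an observation date are returned unchanged.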
antonet
on 4 Jun 2012
Walter Roberson
on 4 Jun 2012
You can extrapolate from the values that do exist for that one country; you can return a constant result such as 0, -inf or NaN at those locations; you can simply not produce a value for those locations; or you could extrapolate based upon the values that exist in that time frame over all of the countries.
Caution: extrapolation usually has quite a wide margin of error!
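The first three options can be sketched with interp1() as follows. The time axis (days since the first observation) and the prices are hypothetical, and the fourth option (extrapolating from the other countries) is not shown because it depends on a model you would have to choose yourself.

```matlab
% Hypothetical series: days since the first observation, and prices.
t = [0 28 56 84];
p = [10 11 12 13];

% Query times: one before the data starts, two inside, one after it ends.
tq = [-28 0 42 112];

% Default for linear interpolation: NaN outside the observed range.
pNaN = interp1(t, p, tq, 'linear');

% Constant fill value (here 0) outside the observed range.
pZero = interp1(t, p, tq, 'linear', 0);

% Linear extrapolation from the nearest observed points.
pExt = interp1(t, p, tq, 'linear', 'extrap');

% Produce no value at all outside the range: keep in-range queries only.
inRange = tq >= t(1) & tq <= t(end);
tKeep = tq(inRange);
pKeep = interp1(t, p, tKeep, 'linear');
```

With NaN as the placeholder, later steps have to skip the missing entries explicitly, which at least keeps the gaps visible instead of hiding them behind extrapolated numbers.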
antonet
on 4 Jun 2012
Walter Roberson
on 4 Jun 2012
The most preferable is up to you, depending on your needs.
Generally speaking, in locations where there is no data, you either need to refrain from performing a meaningful calculation there, or you need to calculate new data there based upon existing data. Your situation is one in which there is no reasonable mathematical model to predict the past or future behavior with accuracy (there are too many factors, and too much psychology and politics, involved in the prices).
antonet
on 5 Jun 2012
Oleg Komarov
on 5 Jun 2012
interp1() will interpolate every requested date directly from the existing data points that you provide, i.e. it will NOT interpolate the first date and THEN interpolate the others from the previously interpolated value.
Check the graph in the documentation: http://www.mathworks.co.uk/help/matlab/ref/interp1.html
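One way to convince yourself of this is to compare a single vectorized call with separate calls for each date. The dates and prices below are hypothetical placeholders.

```matlab
% Hypothetical irregular series: days since the first observation, prices.
t = [0 28 57 84 118];
p = [10 12 11 15 14];

% Regular 28-day query grid inside the observed range.
tq = 0:28:112;

% All query dates in one call.
pAll = interp1(t, p, tq, 'linear');

% Each query date in its own call, using only the original data points.
pOne = zeros(size(tq));
for k = 1:numel(tq)
    pOne(k) = interp1(t, p, tq(k), 'linear');
end

% Largest difference between the two approaches.
maxDiff = max(abs(pAll - pOne));
disp(maxDiff);
```

If earlier interpolated values fed into later ones, the two results would differ; here each query date depends only on the original observations that bracket it.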
antonet
on 6 Jun 2012
Answers (0)