Get and interpolate missed daily data
    7 views (last 30 days)
  
       Show older comments
    
    Stefania Avvedimento
 on 14 Mar 2022
  
    
    
    
    
    Commented: Peter Perkins
    
 on 14 Mar 2022
            Hi all,
I am working with daily climate data for future projections on a basin catchment (see file attached). I just noticed that the daily precipitation data are reported considering all months of 30 days. So, I need to add rows for month of 31 days (January, March, May, July, August, October,December) and delete row for the month of February (in this case also considering the 29 days for 2032, 2036, 2040, 2044, 2048, 2052, 2056, 2060). Is there any way to automatize the process?
Also how can I interpolate the missing data?
Thanks
0 Comments
Accepted Answer
  Peter Perkins
    
 on 14 Mar 2022
        
      Edited: Peter Perkins
    
 on 14 Mar 2022
  
      Tis is a one-liner with timetables and retime:
>> tt = readtimetable("missed data.xlsx")
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names. 
tt =
  11158×3 timetable
       Date         Lat       Lon      Prec_mm_day_
    ___________    ______    ______    ____________
    01-Jan-2030    45.341    8.7889             0  
    02-Jan-2030    45.341    8.7889             0  
    03-Jan-2030    45.341    8.7889        3.0156  
[snip]
    28-Dec-2060    45.341    8.7889             0  
    29-Dec-2060    45.341    8.7889             0  
    30-Dec-2060    45.341    8.7889             0  
((In older releases of MATLAB, use readtable and table2timetable.) There are a bunch of NaT's in the times, where the data is "at" Feb 29 in a non-leap year and Feb 30 in any year. TGhose are easy to delete.
>> tt(415:420,:)
ans =
  11×3 timetable
       Date         Lat       Lon      Prec_mm_day_
    ___________    ______    ______    ____________
    27-Feb-2031    45.341    8.7889       0.30469  
    28-Feb-2031    45.341    8.7889             0  
    NaT            45.341    8.7889     0.0078125  
    NaT            45.341    8.7889             0  
    01-Mar-2031    45.341    8.7889     0.0078125  
    02-Mar-2031    45.341    8.7889             0  
>> tt(ismissing(tt.Date),:) = [];
Now interpolate the missing days.
>> tt2 = retime(tt,"daily","linear")
tt2 =
  11322×3 timetable
       Date         Lat       Lon      Prec_mm_day_
    ___________    ______    ______    ____________
    01-Jan-2030    45.341    8.7889             0  
    02-Jan-2030    45.341    8.7889             0  
    03-Jan-2030    45.341    8.7889        3.0156  
[snip]
    29-Dec-2060    45.341    8.7889             0  
    30-Dec-2060    45.341    8.7889             0  
>> tt(28:33,:)
ans =
  7×3 timetable
       Date         Lat       Lon      Prec_mm_day_
    ___________    ______    ______    ____________
    28-Jan-2030    45.341    8.7889        16.398  
    29-Jan-2030    45.341    8.7889        20.563  
    30-Jan-2030    45.341    8.7889        2.6094  
    01-Feb-2030    45.341    8.7889        4.5938  
    02-Feb-2030    45.341    8.7889       0.14844  
    03-Feb-2030    45.341    8.7889     0.0078125  
>> tt2(28:34,:)
ans =
  7×3 timetable
       Date         Lat       Lon      Prec_mm_day_
    ___________    ______    ______    ____________
    28-Jan-2030    45.341    8.7889        16.398  
    29-Jan-2030    45.341    8.7889        20.563  
    30-Jan-2030    45.341    8.7889        2.6094  
    31-Jan-2030    45.341    8.7889        3.6016  
    01-Feb-2030    45.341    8.7889        4.5938  
    02-Feb-2030    45.341    8.7889       0.14844  
    03-Feb-2030    45.341    8.7889     0.0078125  
The Lat/Lon seem to be constant. I don't know if they should be stored separately, or if this is only art of your data and you have not told us about needing to do this calculation at all your locations.
3 Comments
  Peter Perkins
    
 on 14 Mar 2022
				My response already shows that the value on the 31st of each long month is interpolated.
More Answers (1)
  Arif Hoq
      
 on 14 Mar 2022
        A=(datetime(2030,1,1):datetime(2060,12,31))';  % generating normal date
A1=datenum(A);
C=readtable('missed data.xlsx');
D1=datenum(table2cell(C(:,1)));
E=ismember(A1,D1);   
idx=find(E==0); % index of missing date
F=datetime(A1(idx),'ConvertFrom','datenum')  % missing date
2 Comments
  Peter Perkins
    
 on 14 Mar 2022
				
      Edited: Peter Perkins
    
 on 14 Mar 2022
  
			DON'T use datenum. It has been old for many years. Can't say this strongly enough. Use datetime.

See Also
Categories
				Find more on Calendar in Help Center and File Exchange
			
	Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!

