Simple data extraction from notepad
Hi there. I have financial data in a notepad in the form of:
10/21/2002,0609,0.97270,0.97270,0.97260,0.97260,0,0
10/21/2002,0610,0.97260,0.97260,0.97260,0.97260,0,0
10/21/2002,0611,0.97280,0.97280,0.97280,0.97280,0,0
10/21/2002,0612,0.97290,0.97290,0.97290,0.97290,0,0
10/21/2002,0613,0.97290,0.97290,0.97290,0.97290,0,0
10/21/2002,0614,0.97290,0.97290,0.97290,0.97290,0,0
To brief you: this is 1-minute data, recorded 24 hours a day, 5 days a week. Each entry is on a new line with no spaces.
I want to transfer this data to MATLAB, but I want an easy method to select certain periods. For instance, let's say I want only the 0600-0800 period across the historical data.
Additionally, for anybody very clever: is there a way I can select a certain date and time constraint, like 10/28/2003 0600-0800?
I look forward to some answers.
Thanks
Accepted Answer
Eric
on 30 Mar 2012
Here's an approach I would try:
1. Use csvread() to read in only the first two columns, the dates and times.
2. Use the datenum() function to convert these to serial date numbers.
3. Use the datenum() function to convert the desired dates and times to serial date numbers.
4. You should now be able to figure out exactly which rows to read. Use csvread() again to read only those lines of data.
The goal is to read only the data of interest from the file rather than reading in the whole file and then selecting the data of interest. I'm assuming that partially reading in a file using csvread() is faster than reading the whole thing, which I haven't tried. I would hope csvread() is that intelligent, though.
If your data truly are quite repeatable, then step 1 above could be replaced by reading in only the first date/time present in the first row. You could figure out the desired rows from just the first entry if they are absolutely repeatable. That would save you from having to read in all of the first two columns.
Good luck,
Eric
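The two-pass idea in this answer can be sketched as follows. One caveat: csvread() accepts only numeric data, so it would likely fail on the mm/dd/yyyy date column; this sketch swaps in textscan() for both passes. The file name, the sample rows (copied from the question), and the 0610-0612 window are assumptions chosen so the sketch runs on its own.

```matlab
% Write the sample rows from the question to a temporary file so the sketch
% is runnable as-is. The file name is hypothetical; point fname at real data.
fname = fullfile(tempdir, 'sample_minute_data.txt');
sample = {
    '10/21/2002,0609,0.97270,0.97270,0.97260,0.97260,0,0'
    '10/21/2002,0610,0.97260,0.97260,0.97260,0.97260,0,0'
    '10/21/2002,0611,0.97280,0.97280,0.97280,0.97280,0,0'
    '10/21/2002,0612,0.97290,0.97290,0.97290,0.97290,0,0'
    '10/21/2002,0613,0.97290,0.97290,0.97290,0.97290,0,0'
    '10/21/2002,0614,0.97290,0.97290,0.97290,0.97290,0,0'
    };
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', sample{:});
fclose(fid);

% Pass 1: read only the date and time columns; %*[^\n] skips the rest of
% each line.
fid = fopen(fname);
K = textscan(fid, '%s %s %*[^\n]', 'Delimiter', ',');
fclose(fid);
t = datenum(strcat(K{1}, K{2}), 'mm/dd/yyyyHHMM');

% Convert the desired window to serial date numbers (assumed window).
t0 = datenum('10/21/20020610', 'mm/dd/yyyyHHMM');
t1 = datenum('10/21/20020612', 'mm/dd/yyyyHHMM');
rows = find(t >= t0 & t <= t1);

% Pass 2: skip the rows before the window and read only the wanted block.
% This assumes the file is sorted by time, so the wanted rows are contiguous.
out = zeros(0, 6);
if ~isempty(rows)
    fid = fopen(fname);
    D = textscan(fid, '%*s %*s %f %f %f %f %f %f', numel(rows), ...
        'Delimiter', ',', 'HeaderLines', rows(1) - 1, 'CollectOutput', 1);
    fclose(fid);
    out = D{1};
end
```

Whether the second pass is actually faster than reading the whole file once is not tested here; it is worth timing both on the real data.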
More Answers (3)
Andrei Bobrov
on 30 Mar 2012
Try this code:
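% read the file: C{1} holds the date and time strings, C{2} the six numeric columns
fid = fopen('yourtxtfile.txt');
C = textscan(fid,'%s %s %f %f %f %f %f %f','Delimiter',',','CollectOutput',1);
fclose(fid);
% join date and time in each row, then convert to serial date numbers
mdyhm = arrayfun(@(x)[C{1}{x,:}],(1:size(C{1},1))','un',0);
nmdyhm = datenum(mdyhm,'mm/dd/yyyyHHMM');
% input your period
mdy = '10/28/2003';
hm = ['0600';'0800'];
bd = strcat(mdy,hm);
nbd = datenum(bd,'mm/dd/yyyyHHMM');
% keep the rows whose timestamp falls inside the period
out = C{2}(nmdyhm >= nbd(1) & nmdyhm <= nbd(2),:);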
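The question also asks for a time-of-day window on every date in the history. A minimal sketch of that case, assuming the same textscan() layout as above: the HHMM strings are converted to integers, which compare correctly within a day. The file name, the sample rows (copied from the question), and the 610/612 bounds are assumptions so the sketch runs on its own; use 600 and 800 for the 0600-0800 window.

```matlab
% Write a few sample rows from the question to a temporary file
% (hypothetical file name; replace with the real file).
fname = fullfile(tempdir, 'sample_time_of_day.txt');
sample = {
    '10/21/2002,0609,0.97270,0.97270,0.97260,0.97260,0,0'
    '10/21/2002,0610,0.97260,0.97260,0.97260,0.97260,0,0'
    '10/21/2002,0611,0.97280,0.97280,0.97280,0.97280,0,0'
    '10/21/2002,0612,0.97290,0.97290,0.97290,0.97290,0,0'
    '10/21/2002,0613,0.97290,0.97290,0.97290,0.97290,0,0'
    };
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', sample{:});
fclose(fid);

% Same read as above: C{1} is an N-by-2 cell of date and time strings,
% C{2} is an N-by-6 numeric matrix.
fid = fopen(fname);
C = textscan(fid, '%s %s %f %f %f %f %f %f', ...
    'Delimiter', ',', 'CollectOutput', 1);
fclose(fid);

% Convert the HHMM strings to numbers, e.g. '0609' becomes 609.
hhmm = str2double(C{1}(:, 2));

% Keep the rows inside the time-of-day window, whatever the date.
lo = 610;   % assumed lower bound (use 600 for 0600)
hi = 612;   % assumed upper bound (use 800 for 0800)
keep = hhmm >= lo & hhmm <= hi;
out = C{2}(keep, :);
```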
2 Comments
Jason Ross
on 30 Mar 2012
That's the file ID. The next line is what reads the data from the file ID.
http://www.mathworks.com/help/techdoc/ref/fopen.html
Mate 2u
on 30 Mar 2012
1 Comment
Jason Ross
on 30 Mar 2012
Rather than importing all the data, then throwing away what you don't want, figure out a way to organize the data into smaller file chunks so you only have to open what you want. For example, you could create five files, one for each day, or you could create files by date and hour. This would give you a well-known pattern you can search against since you can get a directory listing very quickly and discard the files that don't contain the data you need.
The actual scheme for the file naming is up to you. You could use some sort of YYMMDDHH layout, or if it's all relative to now, you could use .0 (today), .1 (yesterday) and on back.
Of course, at some point you are essentially re-implementing a database. If you are getting this data from a database already, you can figure out how to make a query to the database for only the data you want, dump that to a file, and then you don't need to search in MATLAB since you already have narrowed the data set.
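As a rough illustration of the file-chunking idea, the sketch below splits one large file into one file per calendar day and then uses a directory listing to pick out the days of interest. The yyyymmdd.txt naming, the output folder, and the input file name are all assumptions; the sample rows are copied from the question so the sketch runs on its own.

```matlab
% Write a few sample rows from the question to a temporary file
% (hypothetical file name; replace with the real file).
fname = fullfile(tempdir, 'sample_to_split.txt');
sample = {
    '10/21/2002,0609,0.97270,0.97270,0.97260,0.97260,0,0'
    '10/21/2002,0610,0.97260,0.97260,0.97260,0.97260,0,0'
    '10/21/2002,0611,0.97280,0.97280,0.97280,0.97280,0,0'
    };
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', sample{:});
fclose(fid);

% Output folder for the per-day files (assumed location).
outdir = fullfile(tempdir, 'minute_data_by_day');
if ~exist(outdir, 'dir')
    mkdir(outdir);
end

% Read every line of the big file as one string per row.
fid = fopen(fname);
L = textscan(fid, '%s', 'Delimiter', '\n');
fclose(fid);
txt = L{1};

% The first 10 characters of each line are the mm/dd/yyyy date.
dates = cellfun(@(s) s(1:10), txt, 'UniformOutput', false);
[ud, ~, idx] = unique(dates);

% Write one file per date, named yyyymmdd.txt so names sort chronologically.
for k = 1:numel(ud)
    name = [datestr(datenum(ud{k}, 'mm/dd/yyyy'), 'yyyymmdd') '.txt'];
    fo = fopen(fullfile(outdir, name), 'w');
    fprintf(fo, '%s\n', txt{idx == k});
    fclose(fo);
end

% Later, a directory listing finds the wanted days without opening any file,
% here every file from October 2002.
listing = dir(fullfile(outdir, '200210*.txt'));
```

Splitting by date and hour, as suggested above, would follow the same pattern with a longer key taken from each line.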