Extract data from URL Human Mortality Database

Hi!
I'm having trouble trying to extract data from mortality.org.
urldoc = urlread('http://email:passwor@www.mortality.org/hmd/PRT/STATS/Mx_1x1.txt');
Error using urlreadwrite (line 36)
Either this URL could not be parsed or the protocol is not supported.
Error in urlread (line 36)
[s,status] = urlreadwrite(mfilename,catchErrors,url,varargin{:});
Thanks,
Melânia

Accepted Answer

Cedric on 26 Sep 2013
Edited: Cedric on 26 Sep 2013
MATLAB doesn't support authentication well. If you have R2013a, you can try this workaround.
If not, save the following function (written, I believe, by Andrew Janke) as urlread_auth.m:
function [s,info] = urlread_auth(url, user, password)
%URLREAD_AUTH Like URLREAD, with basic authentication
%
% [s,info] = urlread_auth(url, user, password)
%
% Returns bytes. Convert to char if you're retrieving text.
%
% Examples:
% sampleUrl = 'http://browserspy.dk/password-ok.php';
% [s,info] = urlread_auth(sampleUrl, 'test', 'test');
% txt = char(s)

% Matlab's urlread() doesn't do HTTP Request params, so work directly with Java
jUrl = java.net.URL(url);
conn = jUrl.openConnection();
conn.setRequestProperty('Authorization', ['Basic ' base64encode([user ':' password])]);
conn.connect();
info.status = conn.getResponseCode();
info.errMsg = char(readstream(conn.getErrorStream()));
s = readstream(conn.getInputStream());

function out = base64encode(str)
% Uses Sun-specific class, but we know that is the JVM Matlab ships with
encoder = sun.misc.BASE64Encoder();
out = char(encoder.encode(java.lang.String(str).getBytes()));

function out = readstream(inStream)
%READSTREAM Read all bytes from stream to uint8
try
    import com.mathworks.mlwidgets.io.InterruptibleStreamCopier;
    byteStream = java.io.ByteArrayOutputStream();
    isc = InterruptibleStreamCopier.getInterruptibleStreamCopier();
    isc.copyStream(inStream, byteStream);
    inStream.close();
    byteStream.close();
    out = typecast(byteStream.toByteArray', 'uint8'); %'
catch err
    out = []; %HACK: quash
end
and then execute
url = 'http://www.mortality.org/hmd/PRT/STATS/Mx_1x1.txt';
user = 'yourLogin';
password = 'yourPassword';   % not named pwd, which would shadow MATLAB's built-in pwd
buffer = char(urlread_auth(url, user, password));
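For reference, the Authorization header that urlread_auth sends is just the text 'Basic ' followed by the base64 encoding of user:password. The sketch below rebuilds that header value on its own, using the 'test'/'test' credentials from the function's help text. It assumes a release that ships matlab.net.base64encode (R2016b or later, so not the R2013a discussed here); on older releases the sun.misc.BASE64Encoder call inside the function plays the same role.

```matlab
% Rebuild the HTTP Basic authentication header value by hand.
% Assumption: matlab.net.base64encode is available (R2016b or later).
user = 'test';
password = 'test';

credentials = [user ':' password];                  % 'test:test'
encoded = matlab.net.base64encode(credentials);     % base64 of the credential bytes
authHeader = ['Basic ' encoded];

disp(authHeader)   % Basic dGVzdDp0ZXN0
```

Base64 is an encoding, not encryption, so over plain http the login is readable by anyone who can see the traffic.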
You can write buffer to file using e.g. FWRITE. You can also generate url dynamically using SPRINTF if you want to loop over files.
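A minimal sketch of that loop, assuming the files for other countries follow the same URL pattern as the PRT one. The country code 'ESP', the output folder, and the output file names are illustrative assumptions. So that the sketch runs offline, the download is replaced by a stand-in string; in real use, swap in the commented urlread_auth call.

```matlab
% Generate one URL per country with SPRINTF and write each buffer with FWRITE.
countries = {'PRT', 'ESP'};   % 'PRT' is from the URL above; 'ESP' is assumed
outDir = tempdir;             % assumed output folder

for k = 1:numel(countries)
    url = sprintf('http://www.mortality.org/hmd/%s/STATS/Mx_1x1.txt', countries{k});

    % Real use: buffer = char(urlread_auth(url, user, password));
    buffer = sprintf('stand-in content for %s\n', url);   % offline placeholder

    outFile = fullfile(outDir, sprintf('Mx_1x1_%s.txt', countries{k}));
    fid = fopen(outFile, 'w');
    assert(fid ~= -1, 'Could not open %s for writing.', outFile);
    fwrite(fid, buffer, 'char');   % write the text as-is
    fclose(fid);
end
```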
EDIT: you could actually extract the data on the fly and save a .mat file instead of a text file. For example:
% The following two replacements should be fine-tuned to your needs.
buffer = strrep(buffer, '+', ' ') ;   % Remove the + (e.g. in the age label 110+).
buffer = strrep(buffer, ' .', '-1') ; % Replace missing-value . entries with -1.
% Text -> cell array of numeric columns, skipping the 3 header lines.
data = textscan(buffer, '%f %f %f %f %f', Inf, 'HeaderLines', 3) ;
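To finish the idea, here is a sketch that goes from the text buffer to a saved .mat file. The sample buffer is invented to mimic the layout of an Mx_1x1.txt file (three header lines, then Year, Age, Female, Male, Total); its values are not real HMD data. Turning the -1 markers into NaN, the variable name M, and the output file name are my own choices, not part of the original answer.

```matlab
% Hypothetical sample standing in for the downloaded buffer (values invented).
buffer = sprintf([ ...
    'Portugal, Death rates (period 1x1)\n' ...
    '\n' ...
    '  Year          Age             Female            Male           Total\n' ...
    '  1940           0             0.140000        0.160000        0.150000\n' ...
    '  1940         110+                   .               .               .\n']);

% Same clean-up and parsing as above.
buffer = strrep(buffer, '+', ' ');
buffer = strrep(buffer, ' .', '-1');
data = textscan(buffer, '%f %f %f %f %f', Inf, 'HeaderLines', 3);

% Cell array of columns -> numeric matrix, with the -1 markers turned into NaN.
% This is safe only because genuine years, ages, and rates are never -1.
M = [data{:}];
M(M == -1) = NaN;

% Save to a .mat file (file name and location are assumptions).
matFile = fullfile(tempdir, 'Mx_1x1_PRT.mat');
save(matFile, 'M');
```

Loading the file later with load(matFile) restores M without any re-parsing.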

More Answers (1)

Image Analyst on 26 Sep 2013
Does it change all the time, such that you need to read it in dynamically via MATLAB? If not, then just download it to a local file via your browser then have MATLAB use the local file instead.
