if-else statement to check the claim identity of URL
How can I check whether one URL string contains more than one URL (2, 3, or more)? The purpose of this feature is to detect URLs hidden inside another URL: if there are any, return 1, otherwise return 0. Examples: www.abc.com/www.koko.my, http://www.abc.com=https://www.koko.my, www.abc.com.www.koko.my, etc. Here is my code; I am having trouble with the condition that checks the URL. I have more than 100 URLs saved in a file named 'URL', and I want to run the 'is_double_url' function on each of them to get the results.
| *is_double_url.m* |
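function out = is_double_url(url_path1)
% Return 1 when url_path1 appears to contain more than one URL, else 0.
out = false;
f1 = strfind(url_path1,'www.');   % start index of every 'www.'
if isempty(f1)
    return;
end
f2 = strfind(url_path1,'/');      % index of every '/'
count_dots = zeros(numel(f1),1);
for k = 1:numel(f1)
    % Candidate host: from this 'www.' to the next '/', or to the end
    % of the string when no '/' follows (avoids indexing an empty f2).
    stop = f2(find(f2 > f1(k),1));
    if isempty(stop)
        stop = numel(url_path1);
    end
    str2 = url_path1(f1(k):stop);
    if ~isempty(strfind(str2,'..'))
        continue    % skip candidates with consecutive dots
    end
    count_dots(k) = numel(strfind(str2,'.'));
end
% Only a second or later 'www.' can indicate a hidden URL; each one
% must contain at least two dots to count.
if numel(f1) > 1
    out = ~any(count_dots(2:end) < 2);
end
% A '://' that appears after the first '/' also marks an embedded URL.
if ~isempty(f2) && any(strfind(url_path1,'://') > f2(1))
    out = true;
end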
| *f10.m* |
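data = importdata('url');          % indexed below as a cell array of URL strings
sizeData = numel(data);
feature10 = zeros(1,sizeData);     % preallocate the result vector
for i = 1:sizeData
    feature10(i) = is_double_url(data{i});
end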
Answers (1)
Walter Roberson
on 21 Mar 2014
This turns out to be quite tough to get right.
You need to consider percent-encoding, UTF-8 encoding, and Unicode strings. Then you have to worry about Internationalized Domain Name encoding.
Note: your example,
http://www.abc.com=https://www.koko.my
is not a valid URL. The "com=https:" would be considered to be all one component, but neither "=" nor ":" is permitted as a character in host name components.
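For the simple cases in the question, a rough heuristic may be enough. The sketch below counts URL-like start markers (a scheme such as "http://" or "https://", optionally followed by "www.", or a bare "www.") after decoding just the two percent-encoded characters that most often hide a scheme. The function name has_multiple_urls is hypothetical and not part of the original code. It is only a sketch under these assumptions: it does not handle general percent-encoding, Unicode, or Internationalized Domain Names, and it can misfire on strings such as "awww.com".

```matlab
function out = has_multiple_urls(url)
% HAS_MULTIPLE_URLS  Heuristic check for more than one URL in a string.
% Hypothetical helper for illustration; not a full URL parser.

    % Decode only "%3A" (":") and "%2F" ("/"); a complete decoder
    % would have to handle every %XX sequence.
    url = regexprep(url, '%3[aA]', ':');
    url = regexprep(url, '%2[fF]', '/');

    % A start marker is a scheme with an optional "www." after it,
    % or a bare "www." with no scheme in front.
    starts = regexpi(url, '(https?://(www\.)?|www\.)', 'start');

    % More than one marker suggests an embedded URL.
    out = numel(starts) > 1;
end
```

With the file data from the question, it could be applied to every entry with cellfun(@has_multiple_urls, data), assuming data is a cell array of character vectors.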