Hello,
I have a large dataset of patents with the following columns:
year, region, type of patent, and regional share.
2000 FR01 0 0.137
2000 FR01 1 0.135
2000 FR01 1 1
2000 FR02 0 0.144
2000 FR02 1 0.135
2000 FR02 1 1
2001 FR01 0 0.143
2001 FR01 1 0.135
2001 FR01 1 1
2001 FR02 0 0.155
2001 FR02 1 0.175
2001 FR02 1 1
.........................................................................................
I want to compute the total regional share for each region, by year and by type of patent.
I would appreciate it if someone could help me.
Thank you.

 Accepted Answer

Chunru on 30 Jun 2021
Edited: Chunru on 30 Jun 2021
It could be something like this:
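% Assume a table T with these variables: year, region, type_of_patents, regional_share.
% If region is a cell array of character vectors, convert it first with
% T.region = string(T.region) so that == compares element-wise.
% Find the total regional share for each region, by year and by type of patent.
u_type_of_patents = unique(T.type_of_patents);
u_year = unique(T.year);
u_region = unique(T.region);
% totalshare(ip, iy, ir) holds the sum for one (type, year, region) combination
totalshare = zeros(length(u_type_of_patents), length(u_year), length(u_region));
for ip = 1:length(u_type_of_patents)
    for iy = 1:length(u_year)
        for ir = 1:length(u_region)
            totalshare(ip, iy, ir) = sum(T.regional_share( ...
                T.type_of_patents == u_type_of_patents(ip) & ...
                T.year == u_year(iy) & ...
                T.region == u_region(ir)));
            %fprintf(...)
        end
    end
end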

5 Comments

Sorry, there is an error:
Operands to the logical and (&&) and or (||) operators must be convertible to logical scalar values.
Error in untitled (line 10)
T.type_of_patents==u_type_of_patents(ip) && ...
Apparently there are ndef entries in the year column, which I was not aware of previously.
I have to try to take care of them first.
Thank you.
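One way to handle such entries is sketched below. It assumes the year column was imported as text and that the bad entries are the literal text "ndef"; the sample values and the variable name year_raw are hypothetical, and the real data may need a different cleanup.

```matlab
% Hypothetical sample: year imported as text, with one "ndef" entry
year_raw = ["2000"; "2000"; "ndef"; "2001"];
region = ["FR01"; "FR02"; "FR01"; "FR02"];
regional_share = [0.137; 0.144; 0.135; 0.155];
T = table(year_raw, region, regional_share);

% str2double returns NaN for text that is not a number, such as "ndef"
T.year = str2double(T.year_raw);
bad = isnan(T.year);
fprintf('Dropping %d row(s) with an undefined year\n', nnz(bad));

T = T(~bad, :);     % keep only rows with a valid numeric year
T.year_raw = [];    % the text column is no longer needed
```

Whether dropping the rows is appropriate, rather than repairing them, depends on the data.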
Use "&" instead of "&&". If a table column contains strings, use an appropriate string comparison (for example strcmp). (I have edited the && in the answer above.)
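A minimal sketch of the difference, using a few rows in the shape of the question's data. It assumes region is stored as a cell array of character vectors, which is why strcmp is used instead of ==.

```matlab
% Small sample table; region is a cell array of character vectors (assumption)
T = table([2000; 2000; 2001], {'FR01'; 'FR02'; 'FR01'}, [0; 1; 1], [0.137; 0.135; 0.135], ...
    'VariableNames', {'year', 'region', 'type_of_patents', 'regional_share'});

% Element-wise & combines whole columns into one logical mask.
% && accepts only scalar operands, which is what triggered the error above.
mask = (T.year == 2000) & (T.type_of_patents == 1) & strcmp(T.region, 'FR02');

% Sum the shares of the rows selected by the mask
total = sum(T.regional_share(mask));
```

Only the second row satisfies all three conditions here, so total is just that row's share.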
Hello,
I corrected the data and tried the code, but even after 6 hours the process had not completed. There are more than 13 million rows to consider, so I guess the loop is what makes it so slow. Is there any way to avoid the loop and still perform the task?
Thank you.
Hi Saptorshee,
Try the following and see whether the performance is better.
>> rowfun(@sum,t,"InputVariables","regional share",'GroupingVariables',["year" "type of patents" "region"],"OutputVariableNames","total region share")
ans =
  8×5 table
    year    type of patents     region     GroupCount    total region share
    ____    _______________    ________    __________    __________________
    2000           0           {'FR01'}        1               0.137
    2000           0           {'FR02'}        1               0.144
    2000           1           {'FR01'}        2               1.135
    2000           1           {'FR02'}        2               1.135
    2001           0           {'FR01'}        1               0.143
    2001           0           {'FR02'}        1               0.155
    2001           1           {'FR01'}        2               1.135
    2001           1           {'FR02'}        2               1.175
Thanks,
Lei
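As an alternative to rowfun, the same grouped sum can be sketched with groupsummary, which also avoids the explicit loop. This is a swapped-in technique, not the code from the comment above. The variable names with underscores follow the accepted answer and are assumptions about the real table. Storing region as categorical is also an assumption; it may reduce memory use on a table with millions of rows.

```matlab
% Sample rows from the question, rebuilt as a table
year = [2000; 2000; 2000; 2000; 2000; 2000; 2001; 2001; 2001; 2001; 2001; 2001];
region = categorical(["FR01"; "FR01"; "FR01"; "FR02"; "FR02"; "FR02"; ...
                      "FR01"; "FR01"; "FR01"; "FR02"; "FR02"; "FR02"]);
type_of_patents = [0; 1; 1; 0; 1; 1; 0; 1; 1; 0; 1; 1];
regional_share = [0.137; 0.135; 1; 0.144; 0.135; 1; 0.143; 0.135; 1; 0.155; 0.175; 1];
T = table(year, region, type_of_patents, regional_share);

% One row per (year, type, region) group; the summed column is named sum_regional_share
G = groupsummary(T, ["year" "type_of_patents" "region"], "sum", "regional_share");
disp(G)
```

On these sample rows, G should contain the same eight groups and totals as the rowfun output above.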
Hello Lei,
Thank you very very much indeed.

Release

R2021a
