You are now following this question
- You will see updates in your followed content feed.
- You may receive emails, depending on your communication preferences.
How to run MATLAB function on each row of database?
1 view (last 30 days)
Show older comments
I have wrote a MATALAB function to get an output. I import the PostgreSQL database to MATLAB workspace and then work from there. For the function, I fetch the inputs from the database. Now I need to run the function on each row of the database and put the results back in the database in a new column. How should I proceed with this? Any suggestion is appreciated.
18 Comments
dpb
on 21 May 2022
Too little detail of just what you did/what form the data are in or even what they are...
If the data a numeric array, more than likely you can operate on the entire array in a vector form that works over the whole array. If you really must deal with on a row-wise basis, there are syntax forms specific for that as well, but depends on just how you've got the data to deal with...
dpb
on 22 May 2022
"potentially use rowfun" -- if the data are returned as a table or are converted thereto -- why "it depends" on the storage technique OP is using and I waffled on just how, precisely. (I know you know, just amplifying that there are conditions).
Walter Roberson
on 22 May 2022
User mentioned a database. When you do a database query, one of the offered return formats is table()
dpb
on 22 May 2022
Yeah, I know...but we don't know if that's what he's using or not...he only mentions "the database" so we're still guessing.
dpb
on 22 May 2022
Can't do anything w/ image, although it is a table; that's about all I can even read.
Attach a .mat file containing the result of
tmp=head(sorted_data,30);
save testdata tmp
Then we need to see what you tried w/ rowfun and tell us what you're trying to actually do...
Walter Roberson
on 22 May 2022
If you had a function that accepted one row, then which of the row variables would it need?
Atanu
on 23 May 2022
Here is the mat file attched. I have several output. I need all the column variables to get output. I hope that's more understandable.
Walter Roberson
on 23 May 2022
Okay, so take one row, say the first row, row1 = subject_data(1,:) . It consists of a mix of categorical and datetime entries and cells. Some of the cells turn out to contain scalar values, but others turn out to contain about 311 x 1 double (exact size varies.)
So, given that row1 table of height 1... what would you like to have happen with the entries?
You say something about "get output", but that is not specific. Do you need a separate .mat file to be generated for each row? Do you need to convert the overall table to a struct array? Do you need to join all of those 311 x 1 (or 310 x 1, or 298 x 1 or varies a bit) into a single 2d array per variable, so for N rows you would want (say) a 311 x N numeric array to be output for that variable ?
Atanu
on 23 May 2022
Edited: Atanu
on 23 May 2022
My original function reads like
function [logical_out, run_time, reaction_time, average_position, position_when_offered] ...
= reaction_time_function5(name,day,Trial)
name = subjectid, day = date, and Trial = trialname in the database columns
First 3 outputs are scalar, last 2 outputs are [x, y] coordinates. To get these outputs I need to sometime join the 311 x 1 (or, so on) arrays from all trials to compare with the special trial under observation. But each row entry should give these 5 unique outputs and I plan to tabulate them. I could attach the .tar file (postgres database) and my .m script for your reference but it might be too hectic for you to go through. But, please let me know.
Walter Roberson
on 23 May 2022
Edited: Walter Roberson
on 23 May 2022
"To get these outputs I need to sometime join the 311 x 1 (or, so on) arrays from all trials"
Your summary diagram at https://www.mathworks.com/matlabcentral/answers/1724475-how-to-run-matlab-function-on-each-row-of-database#comment_2172585 shows the first two rows being associated with Trial 5, and the next 4 being associated with Trial 6.
Do I understand correctly that the x coordinates for those first two (Trial 5) should be joined together, and the y coordinates for the first two (Trial 5) should be joined together, to form joint information that should somehow be compared to a particular trial -- but at the same time, that each of the two Trial 5 lines should produce independent output (even that all the Trial 5 information got pooled) ??
You can use findgroups() to group by trial, and you can splitapply(), which would permit you to feed all of the rows for each trial together into one function call.
If you had a function that was being feed all of the rows for one Trial at a time, how would you want to process the information?
Atanu
on 23 May 2022
Edited: Atanu
on 23 May 2022
Each 'subjectid' has 40 trials on a single date. For example 'scar' has 40 trials on '03-Apr-2022'. To obtain 'logical_out' all 'subjectid' coordinatetimes2, xcoordinatetimes2, xcoordinatetimes2 on that date (for example, '03-Apr-2022'). So if there is 5 'subjectid' on '03-Apr-2022', 5 x 40 trails need to be pooled together.
Whereas, to obtain 'average_position' only 40 trials of the 'scar' on that date need to be pooled.
By the way there are more than 25000 entries of data in the table. So, I can not run all at a time. I need to run 10-20 rows at a time.
Walter Roberson
on 23 May 2022
findgroups() passing in subjectid and date information. Then splitapply() passing in coordinatetimes2, xcoordinatetimes2, xcoordinatetimes2 (and possibly subjectid and date as well.) If you are emitting more than one row, return a cell array containing the data; you can always vertcat() the expanded cell contents afterwards.
Atanu
on 23 May 2022
Thank you Sir for your help. I will let you know if I can figure it out. I am not very proficient, might take some time. :)
dpb
on 23 May 2022
"...I can not run all at a time. I need to run 10-20 rows at a time."
Why's that? 25K rows is all that much data unless there are thousands of values per row. Unless retrieving the data from the database hangs up or somesuch? I've never used the database connection so have no idea how efficient it is, but if you can retrieve the data, I see no reason it shouldn't be able to be handled in toto in memory.
We're still pretty limited in what you've said about what the function(s) need to do -- I see Walter had time to poke at the data a little, I've not had that opportunity yet and may not until later in the day today with other commitments, but "so if there is 5 'subjectid' on '03-Apr-2022', 5 x 40 trails need to be pooled together." what does "pooled" mean, specifically?
Atanu
on 30 May 2022
The problem has been solved. I used a for loop for this. I imported the database as a table 'all_data'.
loadfile = load('all_data.mat');
all_data = loadfile.all_data;
rows = 4;
logical_out = cell(rows,1); run_time = cell(rows,1);
reaction_time = cell(rows,1); average_position = cell(rows,1);
position_when_offered = cell(rows,1);
for row = 1:rows
[logical_out{row}, run_time{row}, reaction_time{row}, average_position{row}, ...
position_when_offered{row}] ...
= new_table2(string(all_data.subjectid(row)),string(all_data.date(row)),string(all_data.trialname(row)));
end
Accepted Answer
Atanu
on 30 May 2022
The problem has been solved. I used a for loop for this. I imported the database as a table 'all_data'.
loadfile = load('all_data.mat');
all_data = loadfile.all_data;
rows = 4;
logical_out = cell(rows,1); run_time = cell(rows,1);
reaction_time = cell(rows,1); average_position = cell(rows,1);
position_when_offered = cell(rows,1);
for row = 1:rows
[logical_out{row}, run_time{row}, reaction_time{row}, average_position{row}, ...
position_when_offered{row}] ...
= new_table2(string(all_data.subjectid(row)),string(all_data.date(row)),string(all_data.trialname(row)));
end
More Answers (0)
See Also
Categories
Find more on Database Toolbox in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!An Error Occurred
Unable to complete the action because of changes made to the page. Reload the page to see its updated state.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom(English)
Asia Pacific
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)