Bar chart from Excel with hidden columns

Hi there
I want to create the attached chart (plus an average line per section) with Matlab. I can do it super quick with Excel, but unfortunately it didn´t work well with Matlab.
There are three complications for me. 1 - How can I can exclude the hidden columns? 2 - What is the best way to deal with German numbers with comma instead of point? 3 -And Is there any way to have an average line per section?
Seems the "readtable" doesn´t work well with commas.
Thanks

7 Comments

Please upload the file, using the paperclip button.
What code did you try? Matlab should be able to deal with the locale properly (it generally does for me with a Dutch locale, which uses the comma as decimal separator as well).
Stephen23
Stephen23 on 7 Aug 2024
Edited: Stephen23 on 7 Aug 2024
"What is the best way to deal with German numbers with comma instead of point?"
Why do you think you need to do this? Regardless of the locale settings you used to save an XLSX file, MS Excel saves numeric values in using a decimal point. But this is merely an internal detail, irrelevant to the user, but consistent for MATLAB to import. What you see displayed in the Excel GUI depends on your locale:
If you are using another file format... then you need to tell us this important information.
"Seems the "readtable" doesn´t work well with commas."
I import numeric data all the time from data sources that use decimal commas. I have never had any problems using READTABLE. But perhaps it depends on the file format you are using, which you have not told us.
As Voss wrote two hours ago, please click the paperclip button to upload a sample data file.
% Daten aus der Excel-Datei einlesen
data = readtable("Auswertung.xlsx");
% Kategorien und Werte zuweisen
category = data.Cell;
values = [data.Can; data.CurrentCollectorAnode; data.CoatingAnode; data.Separator; data.CurrentCollectorCathode; data.CoatingCathode; data.TabCopper; data.TabAluminium];
% Balkendiagramm erstellen
bar(values);
% Achsenbeschriftungen setzen
xlabel('categories');
ylabel('values');
% Y-Achse logarithmisch skalieren
set(gca, 'YScale', 'log');
% Titel des Diagramms
% title('Balkendiagramm');
For length the number import is fine, but for diameter it doesn´t work.
Stephen23
Stephen23 on 8 Aug 2024
Edited: Stephen23 on 8 Aug 2024
"For length the number import is fine, but for diameter it doesn´t work."
Most likely there is something non-numeric in that column. Of course with some minor fiddling you can convert that text to numeric (i.e. STRREP & STR2DOUBLE), but the best solution is to fix the data in the XLSX document.
EDIT: judging by the screenshot that you uploaded, the used range of the worksheet includes many more rows than the numeric data that you are trying to import. Of course you will need to tell READTABLE how to handle that (e.g. using the MISSINGRULE option), otherwise you will just get nonsense in your table.
If you want more help with this then please upload a sample data file by clicking the paperclip button.
"I raise it to Q once I know the error isn´t me haha"
In fact the data in column C is actually text. You can check this yourself using the TYPE function (in german TYP):
You can also check the type by viewing the Open Office XML file content. The Open Office standard specifies that the t="s" attribute&value defines that cell content as being text. Lets take a look at cell C2:
Yep, C2 is definitely marked as being text content. As are all other values in column C.
(As an aside, the value 23 refers to the sharedString where that text content is actually defined)
Here is a third approach to check the content of column C. Lets import the data using READCELL:
C = readcell('Auswertung - Kopie.xlsx')
C = 11x13 cell array
Columns 1 through 9 {'Cell' } {'Length' } {'Diameter' } {'Can' } {'Anode total'} {'Current Collec...'} {'Coating Anode'} {'Separator'} {'Cathode total'} {'Blau' } {[ 65]} {'26' } {[ 0.8000]} {[ 0.1100]} {[ 0.0700]} {[ 0.0200]} {[ 0.0200]} {[ 0.1100]} {'Grau' } {[ 65]} {'26' } {[ 0.8000]} {[ 0.1100]} {[ 0.0700]} {[ 0.0200]} {[ 0.0200]} {[ 0.1100]} {'Orange' } {[ 65]} {'18' } {[ 0.8000]} {[ 0.1100]} {[ 0.0700]} {[ 0.0200]} {[ 0.0200]} {[ 0.1100]} {'Rot' } {[ 65]} {'26' } {[ 0.8000]} {[ 0.1100]} {[ 0.0700]} {[ 0.0200]} {[ 0.0200]} {[ 0.1100]} {'Grün' } {[ 65]} {'26' } {[ 0.8000]} {[ 0.1100]} {[ 0.0700]} {[ 0.0200]} {[ 0.0200]} {[ 0.1100]} {'Literature 1'} {[<missing>]} {[<missing>]} {[<missing>]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing>]} {[<missing> ]} {'Literature 2'} {[<missing>]} {[<missing>]} {[<missing>]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing>]} {[<missing> ]} {'Literature 3'} {[<missing>]} {[<missing>]} {[<missing>]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing>]} {[<missing> ]} {'Literature 4'} {[<missing>]} {[<missing>]} {[<missing>]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing>]} {[<missing> ]} {'Literature 5'} {[<missing>]} {[<missing>]} {[<missing>]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing>]} {[<missing> ]} Columns 10 through 13 {'Current Collec...'} {'Coating Cathode'} {'Tab Copper'} {'Tab Aluminium'} {[ 0.0500]} {[ 0.0400]} {[ 0.2000]} {[ 0.1000]} {[ 0.0500]} {[ 0.0400]} {[ 0.2000]} {[ 0.1000]} {[ 0.0500]} {[ 0.0400]} {[ 0.2000]} {[ 0.1000]} {[ 0.0500]} {[ 0.0400]} {[ 0.2000]} {[ 0.1000]} {[ 0.0500]} {[ 0.0400]} {[ 0.2000]} {[ 0.1000]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]} {[<missing> ]}
So I used three different independent methods to confirm the type of the data in column C. All of them told me that the content of column C is text. Which means that READTABLE is correctly importing text as text. Which is the expected, documented, desired behavior of READTABLE. Which means that the problem lies with the file data, not with READTABLE.
The best place to fix this import is to fix the incorrectly defined file data. First change the cell format to "general" and then use the approach given under "Convert a column of text to numbers" here:

Sign in to comment.

 Accepted Answer

filename = 'Auswertung - Kopie.xlsx';
opts = detectImportOptions(filename);
opts = setvartype(opts,2:numel(opts.VariableTypes),'double');
opts.VariableNamingRule = 'preserve';
T = readtable(filename,opts);
cats = T{:,1};
% data = T{:,2:end};
data = T{:,[4 6:8 10:end]};
idx = all(isnan(data),2);
cats(idx) = [];
data(idx,:) = [];
hb = bar(data.','FaceColor','flat');
colors = [0 0.4 0.7; 0.5 0.5 0.5; 1 0.5 0; 0.6 0 0; 0 0.6 0];
set(hb,{'CData'},num2cell(colors,2));
mean_data = mean(data,1,'omitnan');
N = size(data,2);
yd = [mean_data([1 1],:); NaN(1,N)];
xd = 0.5*[-1; 1; NaN]+(1:N);
hl = line(xd(:),yd(:),'Color','k','LineStyle',':','LineWidth',2);
legend([hb hl],[cats; {'Averages'}],'Location','best')
set(gca(), ...
'YScale','log', ...
'XLim',[0 N+1], ...
'XTick',1:N, ...
...'XTickLabels',T.Properties.VariableNames(2:end))
'XTickLabels',T.Properties.VariableNames([4 6:8 10:end]))

More Answers (2)

warning('off','MATLAB:table:ModifiedAndSavedVarNames'); % turn off annoying nag message
tA=readtable('Auswertung - Kopie.xlsx','MissingRule','omitrow'); % read, don't bring missing rows
tA.Cell=categorical(tA.Cell); % convert types
tA.Diameter=str2double(tA.Diameter); % ditto, w/o the opt object
head(tA) % look at what's we gots...
Cell Length Diameter Can AnodeTotal CurrentCollectorAnode CoatingAnode Separator CathodeTotal CurrentCollectorCathode CoatingCathode TabCopper TabAluminium ______ ______ ________ ___ __________ _____________________ ____________ _________ ____________ _______________________ ______________ _________ ____________ Blau 65 26 0.8 0.11 0.07 0.02 0.02 0.11 0.05 0.04 0.2 0.1 Grau 65 26 0.8 0.11 0.07 0.02 0.02 0.11 0.05 0.04 0.2 0.1 Orange 65 18 0.8 0.11 0.07 0.02 0.02 0.11 0.05 0.04 0.2 0.1 Rot 65 26 0.8 0.11 0.07 0.02 0.02 0.11 0.05 0.04 0.2 0.1 Grün 65 26 0.8 0.11 0.07 0.02 0.02 0.11 0.05 0.04 0.2 0.1
ixBar=[4 6 8:width(tA)]; % variables wanted in bar()
x=(tA.Properties.VariableNames(ixBar)); % create an x vector for labelling
hB=bar(x,tA{:,ixBar}.','Grouped'); % pull the columns; transpose for grouping
hL=legend(tA.Cell,'location','northeast'); % label 'em...
ylim([0 1])
Not terribly interesting given all the data are the same by color (whatever that represents), but the idea should be clear.
Left as "exercise for Student" to add a row to the table that is the mean for each... :)
dpb
dpb on 7 Aug 2024
Edited: dpb on 7 Aug 2024
opt=detectImportOptions('yourfile.xlsx'); % base import options object
opt.SelectedVariableNames=opt.SelectedVariableNames([1 3:end]); % don't read the second column
opt=setvaropts(opt,'DecimalSeparator',','); % set comma as decimal
tT=readtable('yourfile.xlsx',opt);
MATLAB may need a little additional help given it doesn't have all of Excel built in...thank goodnes!!! :)
Above would be a start; you may want to add some additional details and it's certainly easy enough to then compute means or whatever statistics desired...the extra lines may be something else to deal with; not sure what, if anything, might be behind the graphic; if there are just extra labels in the first column for "in case", then the 'MissingRule' may be useful to not read those in...
See detectImportOptions and friends for all the details, examples, etc., ...

7 Comments

"MATLAB may need a little additional help given it doesn't have all of Excel built in..."
Help with what exactly?
XLSX files created by MS Excel use decimal points internally. What radix is displayed is purely an artifact of the locale settings and is not saved in the file:
I've experienced the issue before, @Stephen23 when reading a file from another place with a local US locale setting, although it has been a number of years ago (maybe even during the time of xlsread, I don't know for sure, anymore). But, if it is always stored in decimal divider, then you'd not think @Florian would have asked the Q? about the commas; but duly noted that having the file itself and the code used is only way to determine just exactly what might be happening.
Then again, mayhaps he saved to a csv file instead??? :)
thanks, @dpb
I raise it to Q once I know the error isn´t me haha
may you also have an idea hoq I can group the values and give them a specific colour? (each first bar is the data of one cell)
See bar for grouped data; it will automagically color the bars by group; it presumes columns in the 2D input array are the groups while rows is the number of groups. If read as a table; use "{}" addressing to return an array to pass...
Again, as said before, attach the file if you want/need additional specific details...
thanks, attached the file
so do I need an input to the bar function as this example?
y = [2 2 3; 2 5 6; 2 8 9; 2 11 12];
bar(y)
than I am assuming I need a loop for creating the groups?
"assuming I need a loop for creating the groups?"
Not likely, no...
type Auswertung.m
% Daten aus der Excel-Datei einlesen data = readtable("Auswertung.xlsx"); % Kategorien und Werte zuweisen category = data.Cell; values = [data.Can; data.CurrentCollectorAnode; data.CoatingAnode; data.Separator; data.CurrentCollectorCathode; data.CoatingCathode; data.TabCopper; data.TabAluminium]; % Balkendiagramm erstellen bar(values); % Achsenbeschriftungen setzen xlabel('categories'); ylabel('values'); % Y-Achse logarithmisch skalieren set(gca, 'YScale', 'log'); % Titel des Diagramms % title('Balkendiagramm');
But, it's the Excel file we need...

Sign in to comment.

Products

Release

R2022a

Tags

Asked:

on 7 Aug 2024

Moved:

on 9 Aug 2024

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!