How do I get my MATLAB editor to read UTF-8 characters? UTF-8 characters in blank squares in editors, but in the command window and workspace works fine.

139 views (last 30 days)
I have a project, which is commented in UTF-8 characters.
I tried changing the system locale on my Windows 10, however MATLAB editor is not recognizing UTF-8 characters(in blank squares). I'm not sure what to do here.
If I open the same .m file in text editor, it works fine.
How do I get my MATLAB editor to read UTF-8? Thank you
feature('DefaultCharacterSet')
feature('locale')
ans =
UTF-8
ans =
ctype: 'en_US.windows-1252'
collate: 'en_US.windows-1252'
time: 'en_US.windows-1252'
numeric: 'en_US_POSIX.windows-1252'
monetary: 'en_US.windows-1252'
messages: 'en_US.windows-1252'
encoding: 'windows-1252'
terminalEncoding: 'windows-949'
jvmEncoding: 'Cp1252'
status: 'MathWorks locale management system initialized.'
warning: 'System locale setting, ko_KR, is different from user locale setti…'
  3 Comments
Hongyu Shi
Hongyu Shi on 1 May 2016
I'm having the same issue for a long time. It seems that sometimes it can handle Chinese characters when I change my system into Chinese default language. But in English system they just becomes a lot of question marks.
Keith Hooks
Keith Hooks on 30 Aug 2016
I'm having the same problem. It seems that editing the lcdata.xml file to change the encoding of en_US has no effect at all. It stays windows-1252 no matter what I change it to in the file.

Sign in to comment.

Accepted Answer

Jinghao Lei
Jinghao Lei on 20 Oct 2016
I have a very tricky way to solve this problem. And it seems works. In my case, (windows matlab 2016b x64)
feature('locale')
always output below even I have modified lcdata.xml
ctype: 'zh_CN.GBK'
...
so, I delete this in lcdata.xml (in codeset)
<encoding name="GBK">
<encoding_alias name="936">
</encoding>
then I change following
<encoding name="UTF-8">
<encoding_alias name="utf8"/>
</encoding>
to
<encoding name="UTF-8">
<encoding_alias name="utf8"/>
<encoding_alias name="GBK"/>
</encoding>
The point is cheat matlab GBK is just alias of utf8
  9 Comments
Isidora Timkov-Glumac
Isidora Timkov-Glumac on 14 Sep 2023
@Jinghao Lei I have the same version of MatLab as you, but my ctype is 'en_US_POSIX.US-ASCII'. How should I modify the lcdata file so that this hack works for me? Thank you!
Simon Diehl
Simon Diehl on 30 Nov 2023
This solution worked for me as welll. I'm using Matlab R2023b on Windows Server 2019. My encoding was windows-1252.
This is not necessary on newer operating systems. I don't experience this problem with R2023b on Windows 10.
My steps where:
  1. Go to Program Files\MATLAB\R2023b\bin
  2. Rename lcdata.xml to lcdata_old.xml
  3. Copy lcdata_utf8.xml and rename it to lcdata.xml
  4. Open the file and go to section codeset <!-- Codeset entry -->
  5. Comment out <encoding name="windows-1252" ...
  6. Go to <encoding name="UTF-8"> and add the alias <encoding_alias name="1252">
The final sections look like this:
<!-- <encoding name="windows-1252" jvm_encoding="Cp1252">
<encoding_alias name="1252"/>
</encoding> -->
<encoding name="UTF-8">
<encoding_alias name="utf8"/>
<encoding_alias name="1252"/>
</encoding>
And the result is:
feature("locale")
ans =
struct with fields:
ctype: 'de_DE.UTF-8'
collate: 'de_DE.UTF-8'
time: 'de_DE.UTF-8'
numeric: 'en_US_POSIX.UTF-8'
monetary: 'de_DE.UTF-8'
messages: 'de_DE.UTF-8'
encoding: 'UTF-8'
terminalEncoding: 'IBM850'
jvmEncoding: 'UTF-8'
status: 'MathWorks locale management system initialized.'
warning: ''
The editor now works as expected. Thank you very much for the solution @Jinghao Lei

Sign in to comment.

More Answers (1)

Michael Cappello
Michael Cappello on 31 Oct 2017
Edited: Rik on 30 Mar 2023
% read in the file
fID = fopen(filename, 'r', 'n', 'UTF-8');
bytes = fread(fID);
fclose(fID);
The data read from the file can then be converted into Unicode characters, like so:
unic = native2unicode(bytes, 'UTF-8');
if you want, clear the Carriage Returns, set the Line Feeds to a space
unic(unic == 10) = []; unic(unic == 13) = ' ';
disp(unic'); % display the Unicode text

Categories

Find more on Toolbox Distribution in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!