Convert Unicode character representation to numeric bytes


bytes = unicode2native(unicodestr)
bytes = unicode2native(unicodestr,encoding)


bytes = unicode2native(unicodestr) converts the input Unicode® character representation, unicodestr, to the user default encoding, and returns the result as a uint8 vector, bytes. unicodestr can be a character vector or a string scalar. The output vector bytes has the same general array shape as the unicodestr input. To save the output of unicode2native to a file, use the fwrite function.

bytes = unicode2native(unicodestr,encoding) converts unicodestr to the character encoding scheme specified by encoding. The input argument encoding must be either empty ('') or a name or alias for an encoding scheme, such as 'UTF-8', 'latin1', 'US-ASCII', or 'Shift_JIS'. If encoding is unspecified or empty (''), the default encoding scheme is used. encoding can be a character vector or a string scalar.
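As a minimal sketch of how the choice of encoding changes the returned bytes, the following converts one short character vector with two different schemes and then reverses one conversion with native2unicode. The sample text and the variable names (str, utf8Bytes, latin1Bytes, roundTrip) are illustrative assumptions, not part of the original page.

```matlab
% Hypothetical sample text. The accented character U+00E9 is built with
% char(233) so the example does not depend on the source file's encoding.
str = ['caf' char(233)];

% The same four characters encode to different byte sequences.
utf8Bytes   = unicode2native(str, 'UTF-8');       % 99 97 102 195 169
latin1Bytes = unicode2native(str, 'ISO-8859-1');  % 99 97 102 233

% Decoding with the matching scheme recovers the original characters.
roundTrip = native2unicode(utf8Bytes, 'UTF-8');
disp(isequal(roundTrip, str))
```

In UTF-8 the accented character occupies two bytes, so the number of bytes returned can exceed the number of characters in the input.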


This example begins with two character vectors containing Unicode character representations. It assumes that str1 contains text in a Western European language and that str2 contains Japanese text. The example writes both character vectors into the same file, using the ISO-8859-1 character encoding scheme for the first character vector and the Shift-JIS encoding scheme for the second character vector. The example uses unicode2native to convert str1 and str2 to the appropriate encoding schemes.

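fid = fopen('mixed.txt', 'w');
% Encode the Western European text as ISO-8859-1 and write the raw bytes.
bytes1 = unicode2native(str1, 'ISO-8859-1');
fwrite(fid, bytes1, 'uint8');
% Encode the Japanese text as Shift_JIS and append it to the same file.
bytes2 = unicode2native(str2, 'Shift_JIS');
fwrite(fid, bytes2, 'uint8');
% Close the file so that the data is flushed to disk.
fclose(fid);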
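A file written this way carries no marker showing where one encoding ends and the next begins, so reading it back requires knowing the byte count of each segment. The following self-contained sketch illustrates one way to do that, under stated assumptions: the sample values of str1 and str2, the temporary file name, and the variables n1, raw, back1, and back2 are all hypothetical, and it assumes the Shift_JIS encoding is available on the platform.

```matlab
% Hypothetical sample inputs, built from code points for portability.
str1 = ['caf' char(233)];        % Western European text
str2 = char([26085 26412]);      % Japanese text

bytes1 = unicode2native(str1, 'ISO-8859-1');
bytes2 = unicode2native(str2, 'Shift_JIS');
n1 = numel(bytes1);              % remember where the first segment ends

% Write both segments to a temporary file (name is an assumption).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, bytes1, 'uint8');
fwrite(fid, bytes2, 'uint8');
fclose(fid);

% Read the raw bytes back as a uint8 row vector.
fid = fopen(fname, 'r');
raw = fread(fid, [1 Inf], 'uint8=>uint8');
fclose(fid);
delete(fname);

% Decode each segment with the encoding it was written in.
back1 = native2unicode(raw(1:n1), 'ISO-8859-1');
back2 = native2unicode(raw(n1+1:end), 'Shift_JIS');
disp(isequal(back1, str1) && isequal(back2, str2))
```

Storing the segment length (here n1) alongside the data, or using a single encoding such as UTF-8 for the whole file, avoids having to guess the boundary later.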

Introduced before R2006a