@dburyak in this case he's talking about encoding, and we usually encode strings as UTF-8
When exactly do we encode a string, and which string exactly? Well, in any case, for whatever purpose, unfortunately java (at least before JDK 18) doesn't use utf-8 as the default anywhere. Source code files (MyClass.java) don't have to be utf-8, though fortunately in practice virtually all of them are. Again, repeating myself, the jvm holds strings in memory as utf-16. For all the classes in the jdk that convert bytes=>string or back, you can specify the charset as a parameter. The overloads that don't accept a charset use the system default, which is basically whatever your OS configuration is (unless it's overridden explicitly with some java sys prop). Not utf-8. I clearly remember a production bug investigation around a year ago related to this "nice" (not) behavior of "use the system default charset by default". We deployed a new release and started having issues with some specific clients. It turned out there was an "O with two dots" (or something from the French alphabet, can't remember exactly) in the name. After further investigation it turned out that in the new release we had changed the base docker image for all of our microservices, and that base image for some weird reason had some exotic charset configured as the system default. Our microservices picked it up and used it everywhere for converting bytes=>strings and back. So, no, utf-8 is not used as the default in whatever aspect of java we are talking about. Maybe there's some other place where utf-8 is used by default in the jvm, but I have no idea what other part it could be
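To make that concrete, here's a minimal sketch of the difference between passing a charset explicitly and relying on the default. The class name CharsetDemo and the sample name "Jörg" are made up for illustration, they are not from the actual incident. The no-arg `new String(bytes)` is the kind of overload that silently picks up the default charset; note that on JDK 18 and later that default is UTF-8 unless it is overridden, so the last line may look fine there.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    // Encodes with an explicit charset, so the result does not depend on host configuration.
    public static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes bytes with whatever charset the caller passes in.
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "Jörg", written with a unicode escape so the source file encoding doesn't matter.
        String name = "J\u00f6rg";
        byte[] utf8 = encodeUtf8(name);

        System.out.println("default charset:        " + Charset.defaultCharset());
        // Same charset on both sides: the name survives the round trip.
        System.out.println("utf-8 round trip:       " + decode(utf8, StandardCharsets.UTF_8));
        // Wrong charset on the decoding side: the two utf-8 bytes of the umlaut become two characters.
        System.out.println("decoded as ISO-8859-1:  " + decode(utf8, StandardCharsets.ISO_8859_1));
        // No charset argument: the result depends on the default charset of the JVM it runs on.
        System.out.println("decoded with default:   " + new String(utf8));
    }
}
```

Running it with a different default (for example by changing the container locale) only changes the last line, which is roughly what bit us: code that always passes StandardCharsets.UTF_8 behaves the same on every image.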