Tag Archives: utf8

Inserting arbitrary Unicode characters into kwrite

While you can normally insert arbitrary Unicode characters into any X11 application using Ctrl-Shift-u and four hex digits, it doesn’t work in kwrite or kate. Instead, you have to press F7 to switch to the command line and type in the char command followed by the character’s decimal code. For example, to get the degree symbol (Unicode: U+00B0) you’d type in ‘char 176’ (176 being 0xB0 converted to decimal).
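The hex-to-decimal step can be sketched in a few lines of Java. The class and method names here are made up for illustration; the only thing taken from the post is the ‘char 176’ form of the command.

```java
class CodePointToDecimal {

    // Hypothetical helper: turns the hex digits of a code point (e.g. "00B0")
    // into the decimal number that is typed after 'char'.
    static int toDecimal(String hexCodePoint) {
        return Integer.parseInt(hexCodePoint, 16);
    }

    public static void main(String[] args) {
        // U+00B0 is the degree symbol used as the example above.
        System.out.println("char " + toDecimal("00B0")); // prints "char 176"
    }
}
```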

Specifying file encoding when writing DOM Documents

Assuming we have a fully parsed org.w3c.dom.Document: just using LSSerializer’s writeToString method without specifying any encoding will by default produce a (rather impractical) UTF-16 encoded XML document. Unfortunately, specifying an encoding isn’t trivial. Here are two solutions that don’t require any third-party libraries:

1. Using org.w3c.dom.ls.LSOutput
2. Using javax.xml.transform.Transformer
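Both approaches can be sketched roughly as follows. This is a minimal sketch, not necessarily the code from the original post: the class name, the helper methods and the sample document are invented for illustration, and it assumes the DOM implementation behind the document supports the "LS" feature (the one bundled with the JDK does).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

class DomEncodingSketch {

    // Hypothetical helper: builds a tiny document so the sketch is self-contained.
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            root.setTextContent("\u00B0"); // degree symbol, a non-ASCII character
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Solution 1: set the encoding on an LSOutput and let the LSSerializer write to it.
    static byte[] writeWithLSOutput(Document doc) {
        // Assumption: the implementation supports "LS" 3.0; otherwise getFeature returns null.
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        LSOutput output = ls.createLSOutput();
        output.setEncoding("UTF-8");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        output.setByteStream(bytes);
        serializer.write(doc, output);
        return bytes.toByteArray();
    }

    // Solution 2: set the encoding as an output property of a Transformer.
    static byte[] writeWithTransformer(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(bytes));
            return bytes.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = sampleDocument();
        System.out.println(new String(writeWithLSOutput(doc), StandardCharsets.UTF_8));
        System.out.println(new String(writeWithTransformer(doc), StandardCharsets.UTF_8));
    }
}
```

Writing to a byte stream rather than a String is deliberate in both variants: the encoding only takes effect once characters are turned into bytes. For a real file, a FileOutputStream can take the place of the ByteArrayOutputStream.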

FileWriter, XML and UTF-8

Deep down in the Java API (http://java.sun.com/javase/6/docs/api/java/io/FileWriter.html): “Convenience class for writing character files. The constructors of this class assume that the default character encoding and the default byte-buffer size are acceptable. To specify these values yourself, construct an OutputStreamWriter on a FileOutputStream.” So, if you want to write your XML document to a file, for the love of god, don’t use the…
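The Javadoc’s advice, an OutputStreamWriter on a FileOutputStream with an explicit charset, might look like the sketch below. The class and method names are made up for illustration, and the temp-file round trip is only there to show which bytes end up on disk.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8WriterSketch {

    // Writes text to a file as UTF-8, independent of the platform default encoding.
    static void writeUtf8(Path file, String text) {
        try (Writer writer = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(file.toFile()), StandardCharsets.UTF_8))) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical demo helper: writes to a temp file and returns the raw bytes.
    static byte[] roundTrip(String text) {
        try {
            Path file = Files.createTempFile("utf8-sketch", ".xml");
            try {
                writeUtf8(file, text);
                return Files.readAllBytes(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The degree symbol U+00B0 takes two bytes in UTF-8 (0xC2 0xB0).
        System.out.println(roundTrip("\u00B0").length); // prints 2
    }
}
```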

Convert filenames from iso-8859-1 to utf-8

Just as you can convert entire files from one charset to another, you can convert the filenames. For example, convmv can recursively convert the names of all files in the current directory from the iso-8859-1 charset to utf-8. Well, not exactly. To actually rename the files you need the --notest flag. Otherwise convmv will perform a dry run without any changes.
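To see what such a conversion means at the byte level, here is a small Java sketch. It is not how convmv itself is implemented, and the class and method names are invented; it only shows the re-encoding applied to a single name.

```java
import java.nio.charset.StandardCharsets;

class FilenameReencodeSketch {

    // Interprets the raw bytes of a filename as iso-8859-1 and re-encodes them as utf-8.
    static byte[] latin1ToUtf8(byte[] latin1Name) {
        String name = new String(latin1Name, StandardCharsets.ISO_8859_1);
        return name.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xE4 is 'ä' in iso-8859-1; in utf-8 it becomes the two bytes 0xC3 0xA4.
        byte[] utf8 = latin1ToUtf8(new byte[] {(byte) 0xE4});
        System.out.println(utf8.length); // prints 2
    }
}
```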