I have files that are encoded either in ANSI (Windows-1252) or in UTF-8 (with or without a BOM). How can I convert them from their current encoding to UTF-8 in Java, the way text editors such as Notepad++ can?
You can open an input stream on the original file, read the data, and then write it to an output stream that uses UTF-8 encoding.
Example of creating the output stream:
BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file),"UTF8"));
efluvio (June 26th, 2011)
Thank you. I've found that when specifying the encoding for the output, it is also necessary to specify it for the input:
BufferedReader input = new BufferedReader(new InputStreamReader(new FileInputStream(file), encType));
The encType can be obtained from org.mozilla.universalchardet.UniversalDetector, which is part of juniversalchardet (a Java port of universalchardet, hosted on Google Project Hosting).
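If encType is null (no encoding was detected), don't specify an encoding for the input; the reader then falls back to the platform default charset.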
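Putting the pieces of this thread together, the whole conversion might look roughly like the sketch below. It uses only the standard library and assumes encType has already been obtained elsewhere (for example from the detector mentioned above). The class name Utf8Converter and the method convert are made up for illustration. The BOM handling is an addition not discussed above: Java's UTF-8 decoder keeps a leading BOM as the character U+FEFF, so the sketch drops it to produce UTF-8 without a BOM.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; not part of any library mentioned in this thread.
class Utf8Converter {

    /**
     * Re-encodes source as UTF-8 (without BOM) into target.
     * encType is the detected charset name, or null if nothing was detected.
     * source and target must be different files.
     */
    static void convert(Path source, Path target, String encType) throws IOException {
        // With no detected encoding, fall back to the platform default,
        // which is what InputStreamReader does when no charset is given.
        Charset inputCharset = (encType != null)
                ? Charset.forName(encType)
                : Charset.defaultCharset();

        try (BufferedReader in = new BufferedReader(
                     new InputStreamReader(Files.newInputStream(source), inputCharset));
             BufferedWriter out = new BufferedWriter(
                     new OutputStreamWriter(Files.newOutputStream(target), StandardCharsets.UTF_8))) {
            char[] buffer = new char[8192];
            boolean first = true;
            int n;
            while ((n = in.read(buffer)) != -1) {
                int offset = 0;
                // A BOM, if present, shows up as U+FEFF at the very start; skip it.
                if (first && n > 0 && buffer[0] == '\uFEFF') {
                    offset = 1;
                }
                first = false;
                out.write(buffer, offset, n - offset);
            }
        }
    }
}
```

A call such as Utf8Converter.convert(Paths.get("in.txt"), Paths.get("out.txt"), encType) would then write the converted copy. Writing to a separate target file is deliberate: opening the same file for output while it is still being read would truncate it.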