CharCopy

CharCopy is a small but powerful utility that copies text files and converts between character encodings.

Version: build 2002-175
Language: English
Requirements: Java 1.1
Start command:
java CharCopy parameters
Download: CharCopy.class (5 kb)

Description

The Java runtimes have good support for different character encodings, but with the runtime alone, the end user can't use these facilities. That's why I created CharCopy, a small command-line utility that works like the copy command, with the important difference that it works with characters, not bytes, and that you can give the target file a different encoding than the source file.

Some notes about characters and bytes: CharCopy works with characters, not bytes. Copy plain text files only! CharCopy reports the number of characters copied; this can differ from the number of bytes copied. The target file may end up a different size than the source file, but the number of characters stays the same.
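To illustrate why the character count and the byte count can differ, here is a minimal sketch that encodes the same four-character string in three encodings. It is not part of CharCopy; the class name ByteCountDemo and the helper encodedLength are made up for this example, and it uses java.nio.charset.StandardCharsets, which needs a much newer runtime (Java 7 or later) than CharCopy itself.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of CharCopy.
public class ByteCountDemo {

    // Returns the number of bytes needed to store the text in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "caf" followed by e with acute accent (U+00E9): 4 characters.
        String text = "caf\u00e9";

        System.out.println("characters:       " + text.length());
        System.out.println("ISO-8859-1 bytes: " + encodedLength(text, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8 bytes:      " + encodedLength(text, StandardCharsets.UTF_8));
        System.out.println("UTF-16BE bytes:   " + encodedLength(text, StandardCharsets.UTF_16BE));
    }
}
```

The string has 4 characters in every case, but it takes 4 bytes in ISO-8859-1, 5 bytes in UTF-8 (the accented letter needs two bytes) and 8 bytes in UTF-16BE.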

Every Java runtime should (in fact, must) support at least these encodings: US-ASCII, ISO-8859-1, UTF-8, UTF-16BE, UTF-16LE, UTF-16
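If you want to check what your own runtime supports, a small sketch like the following can help. It is not part of CharCopy; the class name RequiredCharsets and the method allSupported are made up for this example, and it relies on java.nio.charset.Charset, which only exists in Java 1.4 and later.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of CharCopy.
public class RequiredCharsets {

    // The encodings listed above.
    static final String[] REQUIRED = {
        "US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16"
    };

    // Returns true if the running Java runtime supports every listed encoding.
    static boolean allSupported() {
        for (String name : REQUIRED) {
            if (!Charset.isSupported(name)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Print one line per encoding, then the platform's default encoding.
        for (String name : REQUIRED) {
            System.out.println(name + ": " + (Charset.isSupported(name) ? "supported" : "NOT supported"));
        }
        System.out.println("Platform default: " + Charset.defaultCharset().name());
    }
}
```

Most runtimes support many more encodings than these six; Charset.availableCharsets() returns the full list.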

Example of use

To display the usage syntax, type:

java CharCopy

A source file is required. A source encoding is optional (the default is the platform's default encoding). A target file is optional (if no target is given, the output goes to standard output, which normally is the screen). A target encoding is optional (the default is the platform's default encoding).

Here are some examples. To copy the contents of iso.txt with ISO-8859-1 encoding to utf.txt with UTF-8 encoding:

java CharCopy iso.txt:8859-1 utf.txt:UTF-8
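For the curious, a conversion like this can be sketched in a few lines of Java using the standard classes InputStreamReader and OutputStreamWriter. This is only an illustration of the technique under my own assumptions, not CharCopy's actual source; the class name ConvertSketch and the method copy are made up, and the sketch uses the standard encoding name ISO-8859-1 and in-memory streams instead of files so that it is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

// Hypothetical sketch, not the real CharCopy source.
public class ConvertSketch {

    // Decodes bytes from 'in' using sourceEncoding, encodes them to 'out'
    // using targetEncoding, and returns the number of chars copied.
    static long copy(InputStream in, String sourceEncoding,
                     OutputStream out, String targetEncoding) throws IOException {
        Reader reader = new InputStreamReader(in, sourceEncoding);
        Writer writer = new OutputStreamWriter(out, targetEncoding);
        char[] buffer = new char[4096];
        long total = 0;
        int n;
        while ((n = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, n);
            total += n;
        }
        writer.flush(); // push any buffered characters through the encoder
        return total;
    }

    public static void main(String[] args) throws IOException {
        // "caf" plus e with acute accent, stored as ISO-8859-1: 4 bytes.
        byte[] isoBytes = {'c', 'a', 'f', (byte) 0xE9};
        ByteArrayOutputStream utfBytes = new ByteArrayOutputStream();

        long chars = copy(new ByteArrayInputStream(isoBytes), "ISO-8859-1", utfBytes, "UTF-8");

        System.out.println(chars + " characters, " + isoBytes.length
                + " bytes in, " + utfBytes.size() + " bytes out");
    }
}
```

Run as is, this prints "4 characters, 4 bytes in, 5 bytes out". To work on files, the byte array streams would be replaced with FileInputStream and FileOutputStream.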

To copy the contents of native.txt with the platform's default encoding to iso.txt with ISO-8859-1 encoding:

java CharCopy native.txt iso.txt:8859-1

To read utf.txt with UTF-8 encoding and display the contents on the screen (with the platform's default encoding):

java CharCopy utf.txt:UTF-8

The following two lines should have the same effect. However, the second one, which relies on the shell to redirect standard output, probably doesn't work on all platforms.

java CharCopy utf.txt:UTF-8 iso.txt:8859-1
java CharCopy utf.txt:UTF-8 :8859-1 > iso.txt

The silly filenames above (iso.txt, utf.txt, native.txt) are only there to make the examples easier to follow; they have nothing to do with the functionality.