$ iconv simplified_chinese_input.txt -f utf8 -t gb2312 | iconv -f gb2312 -t big5 | iconv -f big5 -t utf8 -o traditional_chinese_output.txt
This actually involves three steps:
1. First, convert the text file from UTF-8 to GB2312 (Simplified Chinese).
2. Then, convert the GB2312 text to Big5 (Traditional Chinese).
3. Finally, convert the Big5 text back to UTF-8.
In fact, you can reverse the order to convert Traditional Chinese to Simplified Chinese:
$ iconv traditional_chinese_input.txt -f utf8 -t big5 | iconv -f big5 -t gb2312 | iconv -f gb2312 -t utf8 -o simplified_chinese_output.txt
If you need to do this many times, you can save it as a shell script named S2T.sh with the following content:
#!/bin/sh
# $1 = input file (Simplified Chinese, UTF-8), $2 = output file (Traditional Chinese, UTF-8)
iconv "$1" -f utf8 -t gb2312 | iconv -f gb2312 -t big5 | iconv -f big5 -t utf8 -o "$2"
Then, make it executable:
$ chmod u+x S2T.sh
Finally, run it with:
$ ./S2T.sh input.txt output.txt
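Since the reverse direction only swaps the two middle encodings, both directions can share one helper. The following is only a rough sketch, not a tested tool: the function name convert_chinese and the direction keywords s2t and t2s are made up for illustration, and it writes the result with a shell redirect instead of iconv's -o option.

```shell
# convert_chinese DIRECTION INPUT OUTPUT
# DIRECTION is "s2t" (Simplified to Traditional) or "t2s" (the reverse).
# The function name and the direction keywords are hypothetical.
convert_chinese() {
    if [ "$#" -ne 3 ]; then
        echo "usage: convert_chinese s2t|t2s INPUT OUTPUT" >&2
        return 2
    fi
    case "$1" in
        s2t) first=gb2312; second=big5 ;;
        t2s) first=big5; second=gb2312 ;;
        *) echo "unknown direction: $1" >&2; return 2 ;;
    esac
    # Same three-stage pipeline as above; only the middle encodings swap.
    # The return status is that of the last iconv in the pipeline.
    iconv -f utf8 -t "$first" "$2" |
        iconv -f "$first" -t "$second" |
        iconv -f "$second" -t utf8 > "$3"
}
```

After pasting the function into a script or sourcing it, it could be called as convert_chinese s2t input.txt output.txt, or with t2s for the other direction.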
Hope it helps.
Using iconv to do the job is error-prone; for example, "非常强烈" will become "非常烈".