Mp3 Book Helper

Transformation and Character Sets

The transformation directive <T> can be used to convert tags or file names from one code page to another:

Transformation usage examples (Russian):

Suppose you have a file on Windows XP whose name was created on Windows 98:

×òî äåëàòü.mp3 $F<T><1251> result: Что делать.mp3

þÔÏ ÄÅÌÁÔØ.mp3 $F<T><20866> result: Что делать.mp3

Что делать.mp3 $F<T><1251, 0> result: ×òî äåëàòü.mp3

Что делать.mp3 $F<T><rus2lat> result: Chto delat'.mp3
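
The code page examples above can be reproduced outside Mp3 Book Helper. The following Python lines are only a sketch of the idea, not the program's own implementation. They assume that the garbled names are Cyrillic bytes being displayed with the Western code page 1252, and that the second argument 0 in <1251, 0> stands for the system default code page (taken to be 1252 here). The function name reinterpret and its parameters are hypothetical.

```python
def reinterpret(text: str, decode_as: str, encoded_as: str = "cp1252") -> str:
    """Re-read text whose bytes were decoded with the wrong code page.

    encoded_as: the code page the text is currently (wrongly) shown in.
    decode_as:  the code page the bytes should be read in instead.
    """
    raw = text.encode(encoded_as)   # recover the underlying bytes
    return raw.decode(decode_as)    # read them with the intended code page


# Roughly what $F<T><1251> does in the first example:
print(reinterpret("×òî äåëàòü.mp3", "cp1251"))

# Roughly what $F<T><20866> does (20866 is KOI8-R):
print(reinterpret("þÔÏ ÄÅÌÁÔØ.mp3", "koi8_r"))

# Roughly what $F<T><1251, 0> does, assuming 0 means code page 1252:
print(reinterpret("Что делать.mp3", "cp1252", encoded_as="cp1251"))
```

The first two calls print Что делать.mp3, and the third prints the garbled form again, which shows that the conversion is reversible as long as every byte has a character in both code pages.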


The result of the <T> directive is a Unicode string. If you save this string in non-Unicode tags, it is converted to single-byte characters using the character set selected in "View" -> "Encoding".

Also, Mp3BH does not save tags in Unicode format when "Unicode Tags Only When Necessary" is checked in the options and the text can be represented in the single-byte character set selected in "View" -> "Encoding".

On Windows 98, the character set used to display strings can be changed in the "View" -> "Font" dialog. On XP this setting has no effect unless you disable Unicode support in the options.
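
The "only when necessary" rule can be pictured as a simple test: try to encode the text in the selected single-byte character set and fall back to Unicode only if that fails. The Python lines below are a minimal sketch of that test, not Mp3BH's actual code. The function name needs_unicode is hypothetical, and the codec names stand in for whatever is selected in "View" -> "Encoding".

```python
def needs_unicode(text: str, codec: str) -> bool:
    """Return True if text cannot be stored losslessly in the given codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return True
    return False


# Cyrillic text fits into Windows Cyrillic (1251) but not into Western (1252).
print(needs_unicode("Что делать", "cp1251"))
print(needs_unicode("Что делать", "cp1252"))

# Transliterated text is plain ASCII and fits into either one.
print(needs_unicode("Chto delat'", "cp1252"))

# Forcing the text into a character set that lacks its letters loses data:
# each unmappable character is replaced by a question mark.
print("Что делать".encode("cp1252", errors="replace"))
```

The last line illustrates why the selected encoding matters when a Unicode string is written to a non-Unicode tag: characters outside the chosen character set cannot survive the conversion.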

The following list of Windows code pages and character sets is provided for reference.

Code Page Character Set Name IANA ID
37 IBM EBCDIC (US-Canada) IBM037
290 IBM Extended English Katakana IBM290
300 IBM Japanese Character Sets  
420 IBM EBCDIC (Arabic)  
437 MS-DOS United States IBM437
500 EBCDIC "500V1" IBM500
708 Arabic (ASMO 708) ISO-8859-6
709 Arabic (ASMO 449+, BCON V4) ASMO_449
710 MS-DOS Arabic (Transparent Arabic)  
711 Arabic (Nafitha Enhanced)  
720 Arabic (Transparent ASMO)  
737 MS-DOS Greek (formerly 437G)  
775 MS-DOS Baltic IBM775
833 IBM Hangul Extended Single-Byte  
834 Korean Host Double-byte  
835 IBM Traditional Chinese Character Sets  
836 Simplified Chinese Single-Byte  
837 IBM Simplified Chinese Character Sets  
850 MS-DOS Multilingual (Latin I) IBM850
852 MS-DOS Slavic (Latin II) IBM852
855 IBM Cyrillic (primarily Russian) IBM855
857 IBM Turkish IBM857
860 MS-DOS Portuguese IBM860
861 MS-DOS Icelandic IBM861
862 MS-DOS Hebrew IBM862
863 MS-DOS Canadian-French IBM863
864 MS-DOS Arabic IBM864
865 MS-DOS Nordic IBM865
866 MS-DOS Russian, Cyrillic IBM866
869 IBM Modern Greek IBM869
870 IBM EBCDIC Latin-2 IBM870
874 Thai  
875 IBM Greek EBCDIC IBM423
932 Japanese  
936 Chinese (China, Singapore) GB_2312-80
949 Korean (Wansung) KS_C_5601-1987
950 Chinese (Taiwan) Big5
1026 IBM Turkish EBCDIC IBM1026
1027 IBM Extended Lowercase English  
1200 Universal Alphabet (Unicode) (ISO-10646-UCS-2)
1201 Universal Alphabet (Unicode) (ISO-10646-UCS-2)
1250 Windows Central European windows-1250
1251 Windows Cyrillic windows-1251
1252 Windows Western European/US windows-1252
1253 Windows Greek windows-1253
1254 Windows Turkish windows-1254
1255 Windows Hebrew windows-1255
1256 Windows Arabic windows-1256
1257 Windows Baltic windows-1257
1258 Windows Vietnamese windows-1258
1361 Windows Korean KS_C_5601-1992
10000 Macintosh Roman  
10001 Macintosh Japanese  
10002 Macintosh Chinese  
10003 Macintosh Korean  
10004 Macintosh Arabic  
10005 Macintosh Hebrew  
10006 Macintosh Greek 1  
10007 Macintosh Cyrillic  
10008 Macintosh Chinese Simplified  
10010 Macintosh Romanian  
10017 Macintosh Ukrainian  
10029 Macintosh Latin 2  
10079 Macintosh Icelandic  
10081 Macintosh Turkish  
10082 Macintosh Croatian  
20105 IA5 IRV DIN_66003
20106 IA5 (German) DIN_66003
20107 IA5 (Swedish) SEN_850200_B
20108 IA5 (Norwegian) NS_4551-1
20261 T.61 T.61-8bit
20269 ISO-6937  
20273 IBM EBCDIC Germany IBM273
20277 IBM EBCDIC Denmark/Norway IBM277
20278 IBM EBCDIC Finland/Sweden IBM278
20280 IBM EBCDIC Italy IBM280
20284 IBM EBCDIC Latin America/Spain IBM284
20285 IBM EBCDIC United Kingdom IBM285
20290 IBM EBCDIC Japanese IBM290
20297 IBM EBCDIC France IBM297
20420 IBM EBCDIC Arabic IBM420
20423 IBM EBCDIC Greek IBM423
20833 Korean (IBM EBCDIC?)  
20838 IBM EBCDIC Thai IBM-Thai
20866 Russian - KOI8-R KOI8-R
20871 IBM EBCDIC Icelandic IBM871
20880 IBM EBCDIC Cyrillic IBM880
20905 IBM EBCDIC Turkish IBM905
21025 IBM EBCDIC Cyrillic  
21027 Japanese  
21866 Ukrainian - KOI8-RU KOI8-U
28591 ISO 8859-1 Western ISO_8859-1:1987
28592 ISO 8859-2 Eastern Europe ISO_8859-2:1987
28593 ISO 8859-3 Turkish ISO_8859-3:1988
28594 ISO 8859-4 Baltic ISO_8859-4:1988
28595 ISO 8859-5 Cyrillic ISO_8859-5:1988
28596 ISO 8859-6 Arabic ISO_8859-6:1987
28597 ISO 8859-7 Greek ISO_8859-7:1987
28598 ISO 8859-8 Hebrew ISO_8859-8:1988
28599 ISO 8859-9 ISO_8859-9:1989
50220 Japanese (JIS) ISO-2022-JP
50221 Japanese (JIS) ISO-2022-JP
50222 Japanese (JIS) ISO-2022-JP
50225 Korean ISO-2022-KR
50932 Japanese (autodetect)  
50949 Korean (autodetect)  
51932 Japanese (EUC) EUC-JP
51949 Korean (EUC) EUC-KR
52936 Simplified Chinese HZ-GB-2312
65000 Unicode UTF-7 UTF-7
65001 Unicode UTF-8 UTF-8
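
When experimenting with these code pages in a script, the numeric identifiers have to be mapped to the codec names of the language in use. The table below is a small, hand-picked sketch for Python that covers only a few of the Cyrillic-related entries from the list. The dictionary CODEPAGE_TO_CODEC and the function codec_for are hypothetical names, and the mapping is an assumption based on Python's standard codec names, not something defined by Mp3 Book Helper.

```python
import codecs

# Windows code page number -> Python codec name (partial, illustrative only).
CODEPAGE_TO_CODEC = {
    866: "cp866",            # MS-DOS Russian, Cyrillic
    1251: "cp1251",          # Windows Cyrillic
    1252: "cp1252",          # Windows Western European/US
    10007: "mac_cyrillic",   # Macintosh Cyrillic
    20866: "koi8_r",         # Russian - KOI8-R
    21866: "koi8_u",         # Ukrainian KOI8
    28595: "iso8859_5",      # ISO 8859-5 Cyrillic
    65001: "utf_8",          # Unicode UTF-8
}


def codec_for(codepage: int) -> codecs.CodecInfo:
    """Look up the Python codec for a Windows code page number."""
    return codecs.lookup(CODEPAGE_TO_CODEC[codepage])


for cp in sorted(CODEPAGE_TO_CODEC):
    print(cp, codec_for(cp).name)
```

A code page that is missing from the dictionary raises KeyError, which is deliberate in this sketch: guessing a codec for an unknown number would silently produce garbled text.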

Released under the GNU/GPL license