Mp3 Book Helper
The Transformation directive <T> can be used to convert between code pages in tags or file names:
<T><#1, #2> Converts CodePage#1 to CodePage#2.
Example: <T><20866, 1251> KOI8-R -> Cyrillic Windows
<T><#1> Converts single-byte characters to CodePage#1.
<T><view> Converts single-byte characters to the code page selected in "View" -> "Encoding".
<T><rus2lat>, <T><lat2rus> Transliteration between Cyrillic and Latin.
Transliterates ANSI Cyrillic (Windows-1251) text into readable Latin letter combinations, known as 'Runglish' or 'Volapyuk'; the reverse directive converts 'Runglish' text back to its Cyrillic equivalent.
<T><file_name> Transformation using a configuration file.
See the example file transl_rus2lat.txt in the Mp3BH installation directory. If the file name contains no path, the <T> directive searches the Mp3BH installation directory first, then the current directory.
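The rus2lat direction can be pictured as a per-character lookup. The sketch below is illustrative Python, not Mp3BH's implementation: the `RUS2LAT` table and the `rus2lat` function are hypothetical names, and the letter combinations are one common 'Runglish' convention that may differ from the mappings shipped in transl_rus2lat.txt.

```python
# Hypothetical mapping; Mp3BH's own table (transl_rus2lat.txt) may differ.
RUS2LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "ch", "ш": "sh", "щ": "sch",
    "ъ": "", "ы": "y", "ь": "'", "э": "e", "ю": "yu", "я": "ya",
}


def rus2lat(text: str) -> str:
    """Replace each Cyrillic letter with a Latin combination."""
    out = []
    for ch in text:
        lat = RUS2LAT.get(ch.lower())
        if lat is None:
            out.append(ch)                # not a mapped letter: copy unchanged
        elif ch.isupper():
            out.append(lat.capitalize())  # "ch" -> "Ch" for a capital letter
        else:
            out.append(lat)
    return "".join(out)


print(rus2lat("Что делать.mp3"))  # Chto delat'.mp3
```

The reverse direction is harder because several Latin combinations overlap (for example "sch" versus "s" followed by "ch"), so a real lat2rus table has to match the longest combination first.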
Transformation usage examples (Russian):
You have a file on Windows XP that was created on Windows 98:
×òî äåëàòü.mp3 $F<T><1251> result: Что делать.mp3
þÔÏ ÄÅÌÁÔØ.mp3 $F<T><20866> result: Что делать.mp3
Что делать.mp3 $F<T><1251, 0> result: ×òî äåëàòü.mp3
Что делать.mp3 $F<T><rus2lat> result: Chto delat'.mp3
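The code page examples above can be reproduced outside Mp3BH. The following Python sketch assumes the garbled names were displayed using a Western single-byte code page (approximated here by Python's latin-1 codec, which maps every byte to a character) and that Python's cp1251 and koi8_r codecs correspond to Windows code pages 1251 and 20866. The helper `reinterpret` is a hypothetical name, not part of Mp3BH.

```python
def reinterpret(text: str, shown_as: str, actual: str) -> str:
    """Recover the bytes behind mis-decoded text and decode them again.

    shown_as: codec that was used to display the bytes
    actual:   codec the bytes should be read with
    """
    return text.encode(shown_as).decode(actual)


# Bytes written as Windows-1251 but displayed as Western text
print(reinterpret("×òî äåëàòü.mp3", "latin-1", "cp1251"))  # Что делать.mp3
# Bytes written as KOI8-R but displayed as Western text
print(reinterpret("þÔÏ ÄÅÌÁÔØ.mp3", "latin-1", "koi8_r"))  # Что делать.mp3
# The reverse: show Windows-1251 bytes as Western text
print(reinterpret("Что делать.mp3", "cp1251", "latin-1"))  # ×òî äåëàòü.mp3
```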
N.B.
The result of the <T> directive is a Unicode string. If you save this string in non-Unicode tags, it is converted to single-byte characters using the character set selected in "View" -> "Encoding".
Also, Mp3BH does not save tags in Unicode format when "Unicode Tags Only When Necessary" is checked in the options and the text can be represented in the single-byte character set selected in "View" -> "Encoding".
On Windows 98, the character set used to display strings can be changed in the 'View' -> 'Font' dialog. On XP this setting has no effect unless you have disabled Unicode support in the options.
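The "Unicode Tags Only When Necessary" rule amounts to checking whether the text survives encoding into the selected single-byte character set. A minimal Python sketch of that check, assuming the selected encoding is Windows-1251 (Python codec cp1251); `fits_single_byte` is a hypothetical helper, not an Mp3BH function:

```python
def fits_single_byte(text: str, codec: str = "cp1251") -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


print(fits_single_byte("Что делать"))  # True: plain Cyrillic fits cp1251
print(fits_single_byte("Что 日本語"))   # False: a Unicode tag would be needed
```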
The following list of Windows code pages/character sets is provided for reference.
| CodePage | Char Set Name | IANA ID |
| 37 | IBM EBCDIC (US-Canada) | IBM037 |
| 290 | IBM Extended English Katakana | IBM290 |
| 300 | IBM Japanese Character Sets | |
| 420 | IBM EBCDIC (Arabic) | |
| 437 | MS-DOS United States | IBM437 |
| 500 | EBCDIC "500V1" | IBM500 |
| 708 | Arabic (ASMO 708) | ISO-8859-6 |
| 709 | Arabic (ASMO 449+, BCON V4) | ASMO_449 |
| 710 | MS-DOS Arabic (Transparent Arabic) | |
| 711 | Arabic (Nafitha Enhanced) | |
| 720 | Arabic (Transparent ASMO) | |
| 737 | MS-DOS Greek (formerly 437G) | |
| 775 | MS-DOS Baltic | IBM775 |
| 833 | IBM Hangul Extended Single-Byte | |
| 834 | Korean Host Double-byte | |
| 835 | IBM Traditional Chinese Character Sets | |
| 836 | Simplified Chinese Single-Byte | |
| 837 | IBM Simplified Chinese Character Sets | |
| 850 | MS-DOS Multilingual (Latin I) | IBM850 |
| 852 | MS-DOS Slavic (Latin II) | IBM852 |
| 855 | IBM Cyrillic (primarily Russian) | IBM855 |
| 857 | IBM Turkish | IBM857 |
| 860 | MS-DOS Portuguese | IBM860 |
| 861 | MS-DOS Icelandic | IBM861 |
| 862 | MS-DOS Hebrew | IBM862 |
| 863 | MS-DOS Canadian-French | IBM863 |
| 864 | MS-DOS Arabic | IBM864 |
| 865 | MS-DOS Nordic | IBM865 |
| 866 | MS-DOS Russian, Cyrillic | IBM866 |
| 869 | IBM Modern Greek | IBM869 |
| 870 | IBM EBCDIC Latin-2 | IBM870 |
| 874 | Thai | |
| 875 | IBM Greek EBCDIC | IBM423 |
| 932 | Japanese | |
| 936 | Chinese (China, Singapore) | GB_2312-80 |
| 949 | Korean (Wansung) | KS_C_5601-1987 |
| 950 | Chinese (Taiwan) | Big5 |
| 1026 | IBM Turkish EBCDIC | IBM1026 |
| 1027 | IBM Extended Lowercase English | |
| 1200 | Universal Alphabet (Unicode) | (ISO-10646-UCS-2) |
| 1201 | Universal Alphabet (Unicode) | (ISO-10646-UCS-2) |
| 1250 | Windows Central European | windows-1250 |
| 1251 | Windows Cyrillic | windows-1251 |
| 1252 | Windows Western European/US | windows-1252 |
| 1253 | Windows Greek | windows-1253 |
| 1254 | Windows Turkish | windows-1254 |
| 1255 | Windows Hebrew | windows-1255 |
| 1256 | Windows Arabic | windows-1256 |
| 1257 | Windows Baltic | windows-1257 |
| 1258 | Windows Vietnamese | windows-1258 |
| 1361 | Windows Korean | KS_C_5601-1992 |
| 10000 | Macintosh Roman | |
| 10001 | Macintosh Japanese | |
| 10002 | Macintosh Chinese | |
| 10003 | Macintosh Korean | |
| 10004 | Macintosh Arabic | |
| 10005 | Macintosh Hebrew | |
| 10006 | Macintosh Greek 1 | |
| 10007 | Macintosh Cyrillic | |
| 10008 | Macintosh Chinese Simplified | |
| 10010 | Macintosh Romanian | |
| 10017 | Macintosh Ukrainian | |
| 10029 | Macintosh Latin 2 | |
| 10079 | Macintosh Icelandic | |
| 10081 | Macintosh Turkish | |
| 10082 | Macintosh Croatian | |
| 20105 | IA5 IRV | DIN_66003 |
| 20106 | IA6 (German) | DIN_66003 |
| 20107 | IA6 (Swedish) | SEN_850200_B |
| 20108 | IA6 (Norwegian) | NS_4551-1 |
| 20261 | T.61 | T.61-8bit |
| 20269 | ISO-6937 | |
| 20273 | IBM EBCDIC Germany | IBM273 |
| 20277 | IBM EBCDIC Denmark/Norway | IBM277 |
| 20278 | IBM EBCDIC Finland/Sweden | IBM278 |
| 20280 | IBM EBCDIC Italy | IBM280 |
| 20284 | IBM EBCDIC Latin America/Spain | IBM284 |
| 20285 | IBM EBCDIC United Kingdom | IBM285 |
| 20290 | IBM EBCDIC Japanese | IBM290 |
| 20297 | IBM EBCDIC France | IBM297 |
| 20420 | IBM EBCDIC Arabic | IBM420 |
| 20423 | IBM EBCDIC Greek | IBM423 |
| 20833 | Korean (IBM EBCDIC?) | |
| 20838 | IBM EBCDIC Thai | IBM-Thai |
| 20866 | Russian - KOI8-R | KOI8-R |
| 20871 | IBM EBCDIC Icelandic | IBM871 |
| 20880 | IBM EBCDIC Cyrillic | IBM880 |
| 20905 | IBM EBCDIC Turkish | IBM905 |
| 21025 | IBM EBCDIC Cyrillic | |
| 21027 | Japanese | |
| 21866 | Ukrainian - KOI8-RU | KOI8-U |
| 28591 | ISO 8859-1 Western | ISO_8859-1:1987 |
| 28592 | ISO 8859-2 Eastern Europe | ISO_8859-2:1987 |
| 28593 | ISO 8859-3 Turkish | ISO_8859-3:1988 |
| 28594 | ISO 8859-4 Baltic | ISO_8859-4:1988 |
| 28595 | ISO 8859-5 Cyrillic | ISO_8859-5:1988 |
| 28596 | ISO 8859-6 Arabic | ISO_8859-6:1987 |
| 28597 | ISO 8859-7 Greek | ISO_8859-7:1987 |
| 28598 | ISO 8859-8 Hebrew | ISO_8859-8:1988 |
| 28599 | ISO 8859-9 | ISO_8859-9:1989 |
| 50220 | Japanese (JIS) | ISO-2022-JP |
| 50221 | Japanese (JIS) | ISO-2022-JP |
| 50222 | Japanese (JIS) | ISO-2022-JP |
| 50225 | Korean | ISO-2022-KR |
| 50932 | Japanese (autodetect) | |
| 50949 | Korean (autodetect) | |
| 51932 | Japanese (EUC) | EUC-JP |
| 51949 | Korean (EUC) | EUC-KR |
| 52936 | Simplified Chinese | HZ-GB-2312 |
| 65000 | Unicode UTF-7 | UTF-7 |
| 65001 | Unicode UTF-8 | UTF-8 |
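When these code page numbers are used in a script, they usually have to be translated into codec names first. The sketch below maps a few rows of the table to Python standard-library codec names; the `CODEPAGE_TO_CODEC` dictionary and the `decode_bytes` helper are illustrative names, and only a handful of mostly Cyrillic-related code pages are covered.

```python
# Windows code page number -> Python codec name (partial, illustrative)
CODEPAGE_TO_CODEC = {
    866: "cp866",
    1251: "cp1251",
    1252: "cp1252",
    10007: "mac_cyrillic",
    20866: "koi8_r",
    21866: "koi8_u",
    28595: "iso8859_5",
    65001: "utf_8",
}


def decode_bytes(data: bytes, codepage: int) -> str:
    """Decode raw bytes using the codec registered for a code page number."""
    codec = CODEPAGE_TO_CODEC[codepage]  # KeyError for code pages not listed
    return data.decode(codec)


raw = "Что делать".encode("koi8_r")
print(decode_bytes(raw, 20866))  # Что делать
```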
http://mp3BookHelper.sourceforge.net released under the GNU/GPL license