Added new section 'Locale Related Issues' to Chapter 2, 'Important Information', thanks to Alexander Patrakov for contributing the text for this page

git-svn-id: svn://svn.linuxfromscratch.org/BLFS/trunk/BOOK@5498 af4574ff-66df-0310-9fd7-8a98e5e911e0
Randy McMurchy 2005-12-29 03:55:45 +00:00
parent 5254d12fef
commit 9c90b1bee9
5 changed files with 161 additions and 15 deletions

@@ -5,11 +5,11 @@
%general-entities;
<!ENTITY unzip-download-http "http://www.mirrorservice.org/sites/ftp.info-zip.org/pub/infozip/src/unzip552.tar.gz">
<!ENTITY unzip-download-ftp "ftp://ftp.info-zip.org/pub/infozip/src/unzip552.tar.gz">
<!ENTITY unzip-md5sum "9d23919999d6eac9217d1f41472034a9">
<!ENTITY unzip-size "1.1 MB">
<!ENTITY unzip-buildsize "7.2 MB">
<!ENTITY unzip-time "0.09 SBU">
<!ENTITY unzip-download-ftp "ftp://ftp.info-zip.org/pub/infozip/src/unzip552.tar.gz">
<!ENTITY unzip-md5sum "9d23919999d6eac9217d1f41472034a9">
<!ENTITY unzip-size "1.1 MB">
<!ENTITY unzip-buildsize "7.2 MB">
<!ENTITY unzip-time "0.1 SBU">
]>
<sect1 id="unzip" xreflabel="UnZip-&unzip-version;">
@@ -34,11 +34,18 @@
<title>Introduction to UnZip</title>
<para>The <application>UnZip</application> package contains
<filename>ZIP</filename> extraction utilities. These are useful for extracting
files from <filename>ZIP</filename> archives. <filename>ZIP</filename>
archives are created with <application>PKZIP</application> or
<application>Info-ZIP</application> utilities primarily in a DOS
environment. </para>
<filename>ZIP</filename> extraction utilities. These are useful for
extracting files from <filename>ZIP</filename> archives.
<filename>ZIP</filename> archives are created with
<application>PKZIP</application> or <application>Info-ZIP</application>
utilities, primarily in a DOS environment.</para>
<caution>
<para>The <application>UnZip</application> package has some
locale-related issues. For a full explanation of the issues and some
possible solutions, see the <xref linkend="locale-unzip"/> section of the
<xref linkend="locale-issues"/> page.</para>
</caution>
<bridgehead renderas="sect3">Package Information</bridgehead>
<itemizedlist spacing="compact">
@@ -117,8 +124,8 @@ cp -v -d libunzip.so* /usr/lib</userinput></screen>
<command>make list</command> command.</para>
<para><command>make ... linux_shlibz</command>: Build shared
<filename>libunzip</filename> and link <application>UnZip</application> against
it and <application>zlib</application>.</para>
<filename>libunzip</filename> and link <application>UnZip</application>
against it and <application>zlib</application>.</para>
</sect2>
@@ -168,8 +175,8 @@ cp -v -d libunzip.so* /usr/lib</userinput></screen>
<term><command>unzipfsx</command></term>
<listitem>
<para>is a self-extracting stub that can be prepended to a
<filename>ZIP</filename> archive. Files in this format allow the recipient to
decompress the archive without installing
<filename>ZIP</filename> archive. Files in this format allow the
recipient to decompress the archive without installing
<application>UnZip</application>.</para>
<indexterm zone="unzip unzipfsx">
<primary sortas="b-unzipfsx">unzipfsx</primary>

@@ -19,7 +19,7 @@
<xi:include xmlns:xi="http://www.w3.org/2003/XInclude" href="position.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2003/XInclude" href="patches.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2003/XInclude" href="bootscripts.xml"/>
<!-- <xi:include xmlns:xi="http://www.w3.org/2003/XInclude" href="locale-issues.xml"/> -->
<xi:include xmlns:xi="http://www.w3.org/2003/XInclude" href="locale-issues.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2003/XInclude" href="beyond.xml"/>
</chapter>

@@ -0,0 +1,122 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
<!ENTITY % general-entities SYSTEM "../../general.ent">
%general-entities;
]>
<sect1 id="locale-issues" xreflabel="Locale Related Issues">
<?dbhtml filename="locale-issues.html"?>
<sect1info>
<othername>$LastChangedBy:$</othername>
<date>$Date:$</date>
</sect1info>
<title>Locale Related Issues</title>
<para>This page contains information about locale related problems and
issues. In this paragraph you'll find a generic overview of things that can
come up when configuring your system for various locales. The previous
sentence and the remainder of this paragraph must still be
revised/completed.</para>
<sect2>
<title>Package Specific Locale Issues</title>
<para>For package-specific issues, find the relevant package in the list
below and follow the link to view the available information. If a package
is not listed here, no locale-specific issues or problems are known for
that package.</para>
<itemizedlist>
<title>List of Packages with Locale Related Issues</title>
<listitem>
<para><xref linkend="locale-unzip"/></para>
</listitem>
</itemizedlist>
<sect3 id="locale-unzip" xreflabel="UnZip-&unzip-version;">
<title><xref linkend="unzip"/></title>
<note>
<para>Use of <application>UnZip</application> in the
<application>JDK</application>, <application>Mozilla</application>,
<application>DocBook</application> or any other BLFS installation
instructions is not a problem, as these applications never use
<application>UnZip</application> to extract a file with non-ASCII
characters in its name.</para>
</note>
<para>The <application>UnZip</application> package assumes that filenames
stored in the ZIP archives created on non-Unix systems are encoded in
CP850, and that they should be converted to ISO-8859-1 when writing files
onto the filesystem. Such assumptions are not always valid. In fact,
inside the ZIP archive, filenames are encoded in the DOS codepage that is
in use in the relevant country, and the filenames on disk should be in
the locale encoding. In MS Windows, the OemToChar() C function (from
<filename>User32.DLL</filename>) does the correct conversion (which is
indeed the conversion from CP850 to a superset of ISO-8859-1 if MS
Windows is set up to use the US English language), but there is no
equivalent in Linux.</para>
<para>When using <command>unzip</command> to unpack a ZIP archive
containing non-ASCII filenames, the filenames are damaged because
<command>unzip</command> uses improper conversion when any of
<replaceable>[SOMETHING NEEDS TO BE PUT HERE AS THE SENTENCE WAS
INCOMPLETE]</replaceable>. For example, in the ru_RU.KOI8-R locale,
conversion of filenames from CP866 to KOI8-R is required, but conversion
from CP850 to ISO-8859-1 is done, which produces filenames consisting of
undecipherable characters instead of words (the closest equivalent
understandable example for English-only users is rot13). There are
several ways around this limitation:</para>
<para>1) For unpacking ZIP archives with filenames containing non-ASCII
characters, use <ulink url="http://www.winzip.com/">WinZip</ulink> running
under the <ulink url="http://www.winehq.com/">Wine</ulink> Windows
compatibility layer.</para>
<para>2) After running <command>unzip</command>, repair the damaged
filenames using the <command>convmv</command> tool
(<ulink url="http://j3e.de/linux/convmv/"/>). The following is an example
for the ru_RU.KOI8-R locale:</para>
<blockquote>
<para>Step 1. Undo the conversion done by
<command>unzip</command>:</para>
<screen><userinput>convmv -f iso-8859-1 -t cp850 -r --nosmart --notest \
<replaceable>[/path/to/unzipped/files]</replaceable></userinput></screen>
<para>Step 2. Do the correct conversion instead:</para>
<screen><userinput>convmv -f cp866 -t koi8-r -r --nosmart --notest \
<replaceable>[/path/to/unzipped/files]</replaceable></userinput></screen>
</blockquote>
<para>3) Apply this patch to unzip:
<ulink url="https://bugzilla.altlinux.ru/attachment.cgi?id=532"/></para>
<para>The patch lets you specify the assumed filename encoding inside the
ZIP archive with the <option>-O charset_name</option> option and the
on-disk filename encoding with the <option>-I charset_name</option>
option. By default, the on-disk filename encoding is the locale encoding,
and the encoding inside the ZIP archive is guessed from a built-in table
based on the locale encoding. For US English users, this still means that
<command>unzip</command> converts from CP850 to ISO-8859-1 by default.</para>
<para>Caveat: this method works only with 8-bit locale encodings, not
with UTF-8. Attempting to use a patched <command>unzip</command> in UTF-8
locales may result in a segmentation fault and is probably a security
risk.</para>
</sect3>
</sect2>
</sect1>
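
The filename damage and the two-step convmv repair described in the locale-unzip section above can be reproduced on plain byte strings. The following is a minimal sketch, not the code unzip or convmv actually runs: it assumes Python's built-in cp850, cp866, iso-8859-1 and koi8_r codecs stand in for the conversions named in the text, and the helper names and the sample filename are hypothetical. The sample uses only lowercase Cyrillic letters, whose CP866 bytes all map cleanly through both tables; other characters may not round-trip this way.

```python
# Sketch of the CP866 / KOI8-R filename mix-up using Python's built-in codecs.
# Helper names and the sample filename are illustrative assumptions.


def unzip_default_conversion(raw: bytes) -> bytes:
    """Mimic the conversion the text attributes to unzip: CP850 to ISO-8859-1."""
    return raw.decode("cp850").encode("iso-8859-1")


def undo_unzip_conversion(damaged: bytes) -> bytes:
    """Step 1 of the repair: reverse the wrong conversion (ISO-8859-1 to CP850)."""
    return damaged.decode("iso-8859-1").encode("cp850")


def convert_for_locale(raw: bytes) -> bytes:
    """Step 2 of the repair: the conversion ru_RU.KOI8-R needs (CP866 to KOI8-R)."""
    return raw.decode("cp866").encode("koi8_r")


# "privet" (Russian for "hello"), written with escapes to keep the source ASCII.
name = "\u043f\u0440\u0438\u0432\u0435\u0442"

in_archive = name.encode("cp866")               # bytes as stored in the ZIP archive
on_disk = unzip_default_conversion(in_archive)  # bytes written to the filesystem
repaired = convert_for_locale(undo_unzip_conversion(on_disk))

# Compare what a KOI8-R locale would display with the intended name.
print("readable before repair:", on_disk.decode("koi8_r") == name)
print("readable after repair: ", repaired.decode("koi8_r") == name)
```

Step 1 works here only because every character produced by the wrong conversion exists in both CP850 and ISO-8859-1, so the original archive bytes can be recovered exactly before the correct CP866 to KOI8-R conversion is applied.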

@@ -41,6 +41,17 @@
-->
<listitem>
<para>December 29th, 2005</para>
<itemizedlist>
<listitem>
<para>[randy] - Added new section 'Locale Related Issues' to Chapter
2, 'Important Information', thanks to Alexander Patrakov for
contributing the text for this page.</para>
</listitem>
</itemizedlist>
</listitem>
<listitem>
<para>December 28th, 2005</para>
<itemizedlist>

@@ -67,6 +67,12 @@
<emphasis>Tushar Teredesai</emphasis>.</para>
</listitem>
<listitem>
<para>Chapter 02: Locale Related Issues:
<emphasis>Alexander Patrakov</emphasis> and
<emphasis>Randy McMurchy</emphasis>.</para>
</listitem>
<listitem>
<para>Chapter 03: /etc/inputrc:
<emphasis>Chris Lynn</emphasis>.</para>