Shortened line lengths in the compressdoc script

git-svn-id: svn://svn.linuxfromscratch.org/BLFS/trunk/BOOK@3271 af4574ff-66df-0310-9fd7-8a98e5e911e0
Randy McMurchy 2005-01-12 23:53:33 +00:00
parent 68637f8c17
commit 4ee1c44284
2 changed files with 111 additions and 82 deletions
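
Since this commit wraps long shell lines, it helps to recall which wraps bash accepts. A line ending in `&&`, `||` or `|` continues on the next line by itself, while a plain argument list needs a trailing backslash. A minimal sketch, assuming a hypothetical `show_args` helper that is not part of compressdoc:

```shell
#!/bin/bash
# Hypothetical helper (not part of compressdoc): prints the number of
# arguments it received.
show_args () {
  echo "$#"
}

# A trailing '&&' leaves the command list open, so no backslash is needed.
wrapped_and=$( [ "foo" = "foo" ] &&
  echo "yes" || echo "no" )

# A plain argument list must end with a backslash to continue; without
# it, the second line would be run as a separate command.
wrapped_args=$( show_args one two \
  three four )

echo "$wrapped_and $wrapped_args"
```

Without the backslash in the second case, `show_args` would receive only two arguments and the shell would then try to execute `three` as a command.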


@ -26,7 +26,8 @@ who wrote what.</para>
lcms-1.14 and GIMP-2.2.2.</para></listitem>
<listitem><para>January 12th, 2005 [randy]: Moved OpenSSL instructions from
Chapter 8 to Chapter 4, suggested by Torsten Vollmann.</para></listitem>
Chapter 8 to Chapter 4, suggested by Torsten Vollmann; shortened line lengths
in the compressdoc script.</para></listitem>
<listitem><para>January 11th, 2005 [randy]: Moved libgtkhtml, GNOME-Doc-Utils
and Yelp from GNOME-Addons to GNOME-Core; added Cdrtools to Nautilus-CD-Burner


@ -5,28 +5,31 @@
%general-entities;
]>
<sect1 id="postlfs-config-compressdoc" xreflabel="compressdoc">
<sect1 id="compressdoc" xreflabel="compressdoc">
<sect1info>
<othername>$LastChangedBy$</othername>
<date>$Date$</date>
</sect1info>
<?dbhtml filename="compressdoc.html"?>
<title>Compressing man and info pages</title>
<indexterm zone="compressdoc">
<primary sortas="b-compressdoc">compressdoc</primary></indexterm>
<para>Man and info reader programs can transparently process gzip'ed or
bzip2'ed pages, a feature you can use to free some disk space while keeping
your documentation available. However, things are not that simple; man
directories tend to contain links&mdash;hard and symbolic&mdash;which defeat simple
ideas like recursively calling <command>gzip</command> on them. A better way
to go is to use the script below.
directories tend to contain links&mdash;hard and symbolic&mdash;which defeat
simple ideas like recursively calling <command>gzip</command> on them. A
better way to go is to use the script below.
</para>
<screen><userinput><command>cat &gt; /usr/sbin/compressdoc &lt;&lt; "EOF"</command>
#!/bin/bash
# VERSION: 20040320.0026
# VERSION: 20050112.0027
#
# Compress (with bzip2 or gzip) all man pages in a hierarchy and
# update symlinks - By Marc Heerdink &lt;marc @ koelkast.net&gt;
#
# Modified to be able to gzip or bzip2 files as an option and to deal
# with all symlinks properly by Mark Hymers &lt;markh @ linuxfromscratch.org&gt;
#
@ -35,19 +38,24 @@ to go is to use the script below.
# to allow for changing hard-links into soft- ones, to specify the
# compression level, to parse the man.conf for all occurrences of MANPATH,
# to allow for a backup, to allow to keep the newest version of a page.
# Modified 20040330 by Tushar Teredesai to replace $0 by the name of the script.
#
# Modified 20040330 by Tushar Teredesai to replace $0 by the name of the
# script.
# (Note: It is assumed that the script is in the user's PATH)
#
# Modified 20050112 by Randy McMurchy to shorten line lengths and
# correct grammar errors.
#
# TODO:
# - choose a default compress method to be based on the available
# tool : gzip or bzip2;
# - offer an option to automagically choose the best compression method
# on a per page basis (eg. check which ofgzip/bzip2/whatever is the
# most effective, page per page);
# - when a MANPATH env var exists, use this instead of /etc/man.conf
# (useful for users to (de)compress their man pages;
# - offer an option to restore a previous backup;
# - add other compression engines (compress, zip, etc?). Needed?
# - choose a default compress method to be based on the available
# tool : gzip or bzip2;
# - offer an option to automagically choose the best compression
# method on a per page basis (e.g. check which of
# gzip/bzip2/whatever is the most effective, page per page);
# - when a MANPATH env var exists, use this instead of /etc/man.conf
# (useful for users to (de)compress their man pages);
# - offer an option to restore a previous backup;
# - add other compression engines (compress, zip, etc?). Needed?
# Funny enough, this function prints some help.
function help ()
@ -65,73 +73,76 @@ Where comp_method is one of :
--decompress, -d
Decompress the man pages.
--backup Specify a .tar backup shall be done for every directories.
In case a backup already exists, it is saved as .tar.old prior
to making the new backup. If an .tar.old backup exist, it is
removed prior to saving the backup.
--backup Specify a .tar backup shall be done for all directories.
In case a backup already exists, it is saved as .tar.old
prior to making the new backup. If a .tar.old backup
exists, it is removed prior to saving the backup.
In backup mode, no other action is performed.
And where options are :
-1 to -9, --fast, --best
The compression level, as accepted by gzip and bzip2. When not
specified, uses the default compression level for the given
method (-6 for gzip, and -9 for bzip2). Not used when in backup
or decompress modes.
The compression level, as accepted by gzip and bzip2.
When not specified, uses the default compression level
for the given method (-6 for gzip, and -9 for bzip2).
Not used when in backup or decompress modes.
--force, -F Force (re-)compression, even if the previous one was the same
method. Useful when changing the compression ratio. By default,
a page will not be re-compressed if it ends with the same suffix
as the method adds (.bz2 for bzip2, .gz for gzip).
--force, -F Force (re-)compression, even if the previous one was
the same method. Useful when changing the compression
ratio. By default, a page will not be re-compressed if
it ends with the same suffix as the method adds
(.bz2 for bzip2, .gz for gzip).
--soft, -S Change hard-links into soft-links. Use with _caution_ as the
first encountered file will be used as a reference. Not used
when in backup mode.
--soft, -S Change hard-links into soft-links. Use with _caution_
as the first encountered file will be used as a
reference. Not used when in backup mode.
--hard, -H Change soft-links into hard-links. Not used when in backup mode.
--hard, -H Change soft-links into hard-links. Not used when in
backup mode.
--conf=dir, --conf dir
Specify the location of man.conf. Defaults to /etc.
--verbose, -v Verbose mode, print the name of the directory being processed.
Double the flag to turn it even more verbose, and to print the
name of the file being processed.
--verbose, -v Verbose mode, print the name of the directory being
processed. Double the flag to turn it even more verbose,
and to print the name of the file being processed.
--fake, -f Fakes it. Print the actual parameters compressdoc will use.
dirs A list of space-separated _absolute_ pathname to the man
directories.
When empty, and only then, parse ${MAN_CONF}/man.conf for all
occurrences of MANPATH.
dirs A list of space-separated _absolute_ pathnames to the
man directories. When empty, and only then, parse
${MAN_CONF}/man.conf for all occurrences of MANPATH.
Note about compression
Note about compression:
There has been a discussion on blfs-support about compression ratios of
both gzip and bzip2 on man pages, taking into account the hosting fs,
the architecture, etc... Overall, the conclusion was that gzip
was much efficient on 'small' files, and bzip2 on 'big' files, small and
big being very dependent on the content of the files.
was much more efficient on 'small' files, and bzip2 on 'big' files,
small and big being very dependent on the content of the files.
See the original post from Mickael A. Peters, titled "Bootable Utility CD",
and dated 20030409.1816(+0200), and subsequent posts:
See the original post from Mickael A. Peters, titled
"Bootable Utility CD", dated 20030409.1816(+0200), and subsequent posts:
http://linuxfromscratch.org/pipermail/blfs-support/2003-April/038817.html
On my system (x86, ext3), man pages were 35564kiB before compression. gzip -9
compressed them down to 20372kiB (57.28%), bzip2 -9 got down to 19812kiB
(55.71%). That is a 1.57% gain in space. YMMV.
On my system (x86, ext3), man pages were 35564KB before compression.
gzip -9 compressed them down to 20372KB (57.28%), bzip2 -9 got down to
19812KB (55.71%). That is a 1.57% gain in space. YMMV.
What was not taken into consideration was the decompression speed. But
does it make sense to? You gain fast access with uncompressed man
pages, or you gain space at the expense of a slight overhead in time.
Well, my P4-2.5GHz does not even let me notice this... :-)
What was not taken into consideration was the decompression speed. But does
it make sense to? You gain fast access with uncompressed man pages, or you
gain space at the expense of a slight overhead in time. Well, my P4-2.5GHz
does not even let me notice this... :-)
EOT
) | less
}
# This function checks that the man page is unique amongst bzip2'd, gzip'd and
# uncompressed versions.
# This function checks that the man page is unique amongst bzip2'd,
# gzip'd and uncompressed versions.
# $1 the directory in which the file resides
# $2 the file name for the man page
# Returns 0 (true) if the file is the latest and must be taken care of, and 1
# (false) if the file is not the latest (and has therefore been deleted).
# Returns 0 (true) if the file is the latest and must be taken care of,
# and 1 (false) if the file is not the latest (and has therefore been
# deleted).
function check_unique ()
{
# NB. When there are hard-links to this file, these are
@ -147,7 +158,8 @@ function check_unique ()
BZ_FILE="$BASENAME".bz2
# Look for, and keep, the most recent one
LATEST=`(cd "$DIR"; ls -1rt "${BASENAME}" "${GZ_FILE}" "${BZ_FILE}" 2&gt;/dev/null | tail -n 1)`
LATEST=`(cd "$DIR"; ls -1rt "${BASENAME}" "${GZ_FILE}" "${BZ_FILE}" \
2&gt;/dev/null | tail -n 1)`
for i in "${BASENAME}" "${GZ_FILE}" "${BZ_FILE}"; do
[ "$LATEST" != "$i" ] &amp;&amp; rm -f "$DIR"/"$i"
done
@ -161,9 +173,10 @@ function check_unique ()
# Name of the script
MY_NAME=`basename $0`
# OK, parse the command-line for arguments, and initialize to some sensible
# state, that is : don't change links state, parse /etc/man.conf, be most
# silent, search man.conf in /etc, and don't force (re-)compression.
# OK, parse the command-line for arguments, and initialize to some
# sensible state, that is: don't change links state, parse
# /etc/man.conf, be most silent, search man.conf in /etc, and don't
# force (re-)compression.
COMP_METHOD=
COMP_SUF=
COMP_LVL=
@ -269,7 +282,8 @@ case $VERBOSE_LVL in
;;
esac
# Note: on my machine, 'man --path' gives /usr/share/man twice, once with a trailing '/', once without.
# Note: on my machine, 'man --path' gives /usr/share/man twice, once
# with a trailing '/', once without.
if [ -z "$MAN_DIR" ]; then
MAN_DIR=`man --path -C "$MAN_CONF"/man.conf \
| sed 's/:/\\n/g' \
@ -301,9 +315,11 @@ if [ "$FAKE" != "no" ]; then
[ "foo$FORCE_OPT" = "foo-F" ] &amp;&amp; echo "yes" || echo "no"
echo "man.conf is.......: ${MAN_CONF}/man.conf"
echo -n "Hard-links........: "
[ "foo$LN_OPT" = "foo-S" ] &amp;&amp; echo "convert to soft-links" || echo "leave as is"
[ "foo$LN_OPT" = "foo-S" ] &amp;&amp;
echo "convert to soft-links" || echo "leave as is"
echo -n "Soft-links........: "
[ "foo$LN_OPT" = "foo-H" ] &amp;&amp; echo "convert to hard-links" || echo "leave as is"
[ "foo$LN_OPT" = "foo-H" ] &amp;&amp;
echo "convert to hard-links" || echo "leave as is"
echo "Backup............: $BACKUP"
echo "Faking (yes!).....: $FAKE"
echo "Directories.......: $MAN_DIR"
@ -324,7 +340,8 @@ if [ "$BACKUP" = "yes" ]; then
DIR_NAME=`basename "${DIR}"`
echo "Backing up $DIR..." &gt; $DEST_FD0
[ -f "${DIR_NAME}.tar.old" ] &amp;&amp; rm -f "${DIR_NAME}.tar.old"
[ -f "${DIR_NAME}.tar" ] &amp;&amp; mv "${DIR_NAME}.tar" "${DIR_NAME}.tar.old"
[ -f "${DIR_NAME}.tar" ] &amp;&amp;
mv "${DIR_NAME}.tar" "${DIR_NAME}.tar.old"
tar cfv "${DIR_NAME}.tar" "${DIR_NAME}" &gt; $DEST_FD1
done
exit 0
@ -340,21 +357,24 @@ for DIR in $MAN_DIR; do
if [ "foo$FILE" = "foo*" ]; then continue; fi
# Fixes the case when hard-links see their compression scheme change
# (from not compressed to compressed, or from bz2 to gz, or from gz to bz2)
# Also fixes the case when multiple version of the page are present, which
# are either compressed or not.
# (from not compressed to compressed, or from bz2 to gz, or from gz
# to bz2)
# Also fixes the case when multiple versions of the page are present,
# which are either compressed or not.
if [ ! -L "$FILE" -a ! -e "$FILE" ]; then continue; fi
# Do not compress whatis files
if [ "$FILE" = "whatis" ]; then continue; fi
if [ -d "$FILE" ]; then
cd "${MEM_DIR}" # Go back to where we ran "$0", in case "$0"=="./compressdoc" ...
cd "${MEM_DIR}" # Go back to where we ran "$0",
# in case "$0"=="./compressdoc" ...
# We are going recursive to that directory
echo "-&gt; Entering ${DIR}/${FILE}..." &gt; $DEST_FD0
# I need not pass --conf, as I specify the directory to work on
# But I need exit in case of error
"$MY_NAME" ${COMP_METHOD} ${COMP_LVL} ${LN_OPT} ${VERBOSE_OPT} ${FORCE_OPT} "${DIR}/${FILE}" || exit 1
"$MY_NAME" ${COMP_METHOD} ${COMP_LVL} ${LN_OPT} ${VERBOSE_OPT} \
${FORCE_OPT} "${DIR}/${FILE}" || exit 1
echo "&lt;- Leaving ${DIR}/${FILE}." &gt; $DEST_FD1
cd "$DIR" # Needed for the next iteration of the loop
@ -364,7 +384,8 @@ for DIR in $MAN_DIR; do
# Check if the file is already compressed with the specified method
BASE_FILE=`basename "$FILE" .gz`
BASE_FILE=`basename "$BASE_FILE" .bz2`
if [ "${FILE}" = "${BASE_FILE}${COMP_SUF}" -a "foo${FORCE_OPT}" = "foo" ]; then continue; fi
if [ "${FILE}" = "${BASE_FILE}${COMP_SUF}" \
-a "foo${FORCE_OPT}" = "foo" ]; then continue; fi
# If we have a symlink
if [ -h "$FILE" ]; then
@ -378,7 +399,8 @@ for DIR in $MAN_DIR; do
esac
if [ ! "$EXT" = "none" ]; then
LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 | tr -d " " | sed s/\.$EXT$//`
LINK=`ls -l "$FILE" | cut -d "&gt;" -f2 \
| tr -d " " | sed s/\.$EXT$//`
NEWNAME=`echo "$FILE" | sed s/\.$EXT$//`
mv "$FILE" "$NEWNAME"
FILE="$NEWNAME"
@ -400,8 +422,9 @@ for DIR in $MAN_DIR; do
elif [ -f "$FILE" ]; then
# Take care of hard-links: build the list of files hard-linked
# to the one we are {de,}compressing.
# NB. This is not optimum has the file will eventually be compressed
# as many times it has hard-links. But for now, that's the safe way.
# NB. This is not optimal, as the file will eventually be
# compressed as many times as it has hard-links. But for now,
# that's the safe way.
inode=`ls -li "$FILE" | awk '{print $1}'`
HLINKS=`find . \! -name "$FILE" -inum $inode`
@ -450,20 +473,23 @@ for DIR in $MAN_DIR; do
# Keep the hard-link a hard- one
ln "${FILE}$COMP_SUF" "${NEWFILE}$COMP_SUF"
fi
chmod 644 "${NEWFILE}$COMP_SUF" # Really work only for hard-links. Harmless for soft-links
# Really works only for hard-links. Harmless for soft-links
chmod 644 "${NEWFILE}$COMP_SUF"
done
fi
else
# There is a problem when we get neither a symlink nor a plain file
# Obviously, we shall never ever come here... :-(
echo "Whaooo... \"${DIR}/${FILE}\" is neither a symlink nor a plain file. Please check:"
# There is a problem when we get neither a symlink nor a plain
# file. Obviously, we shall never ever come here... :-(
echo -n "Whaooo... \"${DIR}/${FILE}\" is neither a symlink "
echo "nor a plain file. Please check:"
ls -l "${DIR}/${FILE}"
exit 1
fi
fi
done # for FILE
done # for DIR
<command>EOF
chmod 755 /usr/sbin/compressdoc</command></userinput></screen>
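
The `check_unique` function in the script above keeps only the most recent of the plain, gzip'd and bzip2'd variants of a page. The idea can be sketched in isolation as follows; `keep_newest`, the scratch directory and the file names are illustrative assumptions, not taken from compressdoc:

```shell
#!/bin/bash
# Sketch of the "keep the newest variant" idea. keep_newest is a
# hypothetical name, not a function from compressdoc.
keep_newest () {
  dir="$1"
  base="$2"
  # ls -1rt lists oldest first, so the last line is the newest file.
  latest=$(cd "$dir" && ls -1rt "$base" "$base.gz" "$base.bz2" \
             2>/dev/null | tail -n 1)
  # Remove every variant except the newest one.
  for f in "$base" "$base.gz" "$base.bz2"; do
    if [ "$latest" != "$f" ]; then
      rm -f "$dir/$f"
    fi
  done
  echo "$latest"
}

# Demo in a scratch directory, with explicit timestamps so that the
# .gz variant is the newest of the three.
tmp=$(mktemp -d)
touch -t 200401010000 "$tmp/ls.1"
touch -t 200501010000 "$tmp/ls.1.gz"
touch -t 200301010000 "$tmp/ls.1.bz2"
newest=$(keep_newest "$tmp" ls.1)
remaining=$(ls "$tmp")
rm -rf "$tmp"
echo "kept: $newest"
```

Sorting by modification time rather than preferring one suffix means a freshly installed uncompressed page replaces an older compressed copy.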
@ -474,12 +500,14 @@ comprehensive help about what the script is able to do.</para>
<para> Don't forget that a few programs, like the <application>X</application>
Window System and <application>XEmacs</application> also install their
documentation in non standard places (such as <filename class="directory">
/usr/X11R6/man</filename>, etc...). Be sure to add these locations to the
file <filename>/etc/man.conf</filename>, as a
<envar>MANPATH</envar>=<replaceable>/path</replaceable> section.</para>
<para> Example:</para><screen><userinput>
...
documentation in non-standard places (such as
<filename class="directory">/usr/X11R6/man</filename>, etc...). Be sure to add
these locations to the file <filename>/etc/man.conf</filename>, as a
<envar>MANPATH</envar>=<replaceable>[/path]</replaceable> section.</para>
<para> Example:</para>
<screen><userinput> ...
MANPATH=/usr/share/man
MANPATH=/usr/local/man
MANPATH=/usr/X11R6/man