<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % general-entities SYSTEM "../../general.ent">
%general-entities;
]>
<sect1 id="unpacking">
<?dbhtml filename="notes-on-building.html"?>
<title>Notes on Building Software</title>
<para>Those people who have built an LFS system may be aware
of the general principles of downloading and unpacking software. Some
of that information is repeated here for those new to building
their own software.</para>
<para>Each set of installation instructions contains a URL from which you
can download the package. The patches, however, are stored on the LFS
servers and are available via HTTP. These are referenced as needed in the
installation instructions.</para>
<para>While you can keep the source files anywhere you like, we assume that
you have unpacked the package and changed into the directory created by the
unpacking process (the source directory). We also assume you have
uncompressed any required patches and they are in the directory
immediately above the source directory.</para>
<para>We cannot emphasize strongly enough that you should start from a
<emphasis>clean source tree</emphasis> each time. This means that if
you have had an error during configuration or compilation, it's usually
best to delete the source tree and
re-unpack it <emphasis>before</emphasis> trying again. This obviously
doesn't apply if you're an advanced user used to hacking
<filename>Makefile</filename>s and C code, but if in doubt, start from a
clean tree.</para>
<sect2>
<title>Building Software as an Unprivileged (non-root) User</title>
<para>The golden rule of Unix System Administration is to use your
superpowers only when necessary. Hence, BLFS recommends that you
build software as an unprivileged user and only become the
<systemitem class='username'>root</systemitem> user when installing the
software. This philosophy is followed in all the packages in this book.
Unless otherwise specified, all instructions should be executed as an
unprivileged user. The book will advise you on instructions that need
<systemitem class='username'>root</systemitem> privileges.</para>
</sect2>
<sect2>
<title>Unpacking the Software</title>
<para>If a file is in <filename class='extension'>.tar</filename> format
and compressed, it is unpacked by running one of the following
commands:</para>
<screen><userinput>tar -xvf filename.tar.gz
tar -xvf filename.tgz
tar -xvf filename.tar.Z
tar -xvf filename.tar.bz2</userinput></screen>
<note>
<para>You may omit the <option>v</option> parameter in the commands
shown above and below if you wish to suppress the verbose listing of all
the files in the archive as they are extracted. This can help speed up the
extraction as well as make any errors produced during the extraction
more obvious to you.</para>
</note>
<para>You can also use a slightly different method:</para>
<screen><userinput>bzcat filename.tar.bz2 | tar -xv</userinput></screen>
<para>
Finally, sometimes we have a compressed patch file in
<filename class='extension'>.patch.gz</filename> or
<filename class='extension'>.patch.bz2</filename> format.
The best way to apply the patch is to pipe the output of the
decompressor to the <command>patch</command> utility. For example:
</para>
<screen><userinput>gzip -cd ../patchname.patch.gz | patch -p1</userinput></screen>
<para>
Or for a patch compressed with <command>bzip2</command>:
</para>
<screen><userinput>bzcat ../patchname.patch.bz2 | patch -p1</userinput></screen>
</sect2>
<sect2>
<title>Verifying File Integrity</title>
<para>To verify that the downloaded file is genuine and complete,
many package maintainers also distribute md5sums of the files. To verify the
md5sum of the downloaded files, download both the file and the
corresponding md5sum file to the same directory (preferably from different
on-line locations), and (assuming <filename>file.md5sum</filename> is the
md5sum file downloaded) run the following command:</para>
<screen><userinput>md5sum -c file.md5sum</userinput></screen>
<para>If there are any errors, they will be reported. Note that the BLFS
book includes md5sums for all the source files also. To use the BLFS
supplied md5sums, you can create a <filename>file.md5sum</filename> (place
the md5sum data and the exact name of the downloaded file on the same
line of a file, separated by white space) and run the command shown above.
Alternately, simply run the command shown below and compare the output
to the md5sum data shown in the BLFS book.</para>
<screen><userinput>md5sum <replaceable>&lt;name_of_downloaded_file&gt;</replaceable></userinput></screen>
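<para>
For example, to construct <filename>file.md5sum</filename> from the md5sum
printed in the book (a generic sketch; replace the placeholders with the
real values), you can run:
</para>
<screen><userinput>echo "<replaceable>&lt;md5sum-from-the-book&gt;</replaceable>  <replaceable>&lt;name_of_downloaded_file&gt;</replaceable>" &gt; file.md5sum &amp;&amp;
md5sum -c file.md5sum</userinput></screen>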
<para>MD5 is not cryptographically secure, so the md5sums are only
provided for detecting unmalicious changes to the file content, for
example an error or truncation introduced during network transfer, or
a <quote>stealth</quote> update to the package from the upstream
(updating the content of a released tarball instead of properly making
a new release).</para>
<para>There is no <quote>100%</quote> secure way to verify
the authenticity of the source files. Assuming the upstream is managing
their website correctly (the private key is not leaked and the domain is
not hijacked), and the trust anchors have been set up correctly using
<xref linkend="make-ca"/> on the BLFS system, we can reasonably trust
download URLs to the upstream official website
<emphasis role="bold">with the https protocol</emphasis>. Note that the
BLFS book itself is published on a website with https, so you should
already have some confidence in the https protocol or you wouldn't trust
the book content.</para>
<para>If the package is downloaded from an unofficial location (for
example a local mirror), checksums generated by cryptographically secure
digest algorithms (for example SHA256) can be used to verify the
authenticity of the package. Download the checksum file from the upstream
<emphasis role="bold">official</emphasis> website (or somewhere
<emphasis role="bold">you can trust</emphasis>) and compare the
checksum of the package from the unofficial location with it. For example,
a SHA256 checksum can be checked with the command:</para>
<note>
<para>If the checksum and the package are downloaded from the same
untrusted location, you gain no security by verifying
the package against the checksum. An attacker can fake the checksum as
easily as compromising the package itself.</para>
</note>
<screen><userinput>sha256sum -c <replaceable>file</replaceable>.sha256sum</userinput></screen>
<para>If <xref linkend="gnupg2"/> is installed, you can also verify the
authenticity of the package with a GPG signature. Import the upstream GPG
public key with:</para>
<screen><userinput>gpg --recv-key <replaceable>keyID</replaceable></userinput></screen>
<para><replaceable>keyID</replaceable> should be replaced with the key ID
from somewhere <emphasis role="bold">you can trust</emphasis> (for
example, copy it from the upstream official website using https). Now
you can verify the signature with:</para>
<screen><userinput>gpg --verify <replaceable>file</replaceable>.sig <replaceable>file</replaceable></userinput></screen>
<para>The advantage of a <application>GnuPG</application> signature is
that, once you have imported a public key which can be trusted, you can
download both the package and its signature from the same unofficial
location and verify them with the public key. So you won't need to
connect to the official upstream website to retrieve a checksum for each
new release. You only need to update the public key if it's expired or
revoked.
</para>
</sect2>
<sect2>
<title>Creating Log Files During Installation</title>
<para>For larger packages, it is convenient to create log files instead of
staring at the screen hoping to catch a particular error or warning. Log
files are also useful for debugging and keeping records. The following
command allows you to create an installation log. Replace
<replaceable>&lt;command&gt;</replaceable> with the command you intend to execute.</para>
<screen><userinput>( <replaceable>&lt;command&gt;</replaceable> 2&gt;&amp;1 | tee compile.log &amp;&amp; exit $PIPESTATUS )</userinput></screen>
<para><option>2&gt;&amp;1</option> redirects error messages to the same
location as standard output. The <command>tee</command> command allows
viewing of the output while logging the results to a file. The parentheses
around the command run the entire command in a subshell and finally the
<command>exit $PIPESTATUS</command> command ensures the result of the
<replaceable>&lt;command&gt;</replaceable> is returned as the result and not the
result of the <command>tee</command> command.</para>
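<para>
For example, to log a straightforward compilation (a generic illustration,
not tied to any particular package):
</para>
<screen><userinput>( make 2&gt;&amp;1 | tee compile.log &amp;&amp; exit $PIPESTATUS )</userinput></screen>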
</sect2>
<sect2 id="parallel-builds" xreflabel="Using Multiple Processors">
<title>Using Multiple Processors</title>
<para>For many modern systems with multiple processors (or cores) the
compilation time for a package can be reduced by performing a
<quote>parallel make</quote>, either by setting an environment variable or
by telling the <command>make</command> program to simultaneously execute
multiple jobs.</para>
<para>For instance, an Intel Core i9-13900K CPU contains 8 performance
(P) cores and 16 efficiency (E) cores, and the P cores support SMT
(Simultaneous MultiThreading, also known as
<quote>Hyper-Threading</quote>) so each P core can run two threads
simultaneously and the Linux kernel will treat each P core as two
logical cores. As a result, there are 32 logical cores in total.
To utilize all these logical cores running <command>make</command>, we
can set an environment variable to tell <command>make</command> to
run 32 jobs simultaneously:</para>
<screen><userinput>export MAKEFLAGS='-j32'</userinput></screen>
<para>or just build with:</para>
<screen><userinput>make -j32</userinput></screen>
<para>
If you have applied the optional <command>sed</command> when building
<application>ninja</application> in LFS, you can use:
</para>
<screen><userinput>export NINJAJOBS=32</userinput></screen>
<para>
when a package uses <command>ninja</command>, or just:
</para>
<screen><userinput>ninja -j32</userinput></screen>
<para>
If you are not sure about the number of logical cores, run the
<command>nproc</command> command.
</para>
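<para>
For example, to simply use every logical core that
<command>nproc</command> reports (a common convenience, shown here as a
sketch):
</para>
<screen><userinput>export MAKEFLAGS="-j$(nproc)"</userinput></screen>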
<para>
For <command>make</command>, the default number of jobs is 1. But
for <command>ninja</command>, the default number of jobs is N + 2 if
the number of logical cores N is greater than 2; or N + 1 if
N is 1 or 2. The reason to use a number of jobs slightly greater
than the number of logical cores is to keep all logical
processors busy even if some jobs are performing I/O operations.
</para>
<para>
Note that the <option>-j</option> switch only limits the parallel
jobs started by <command>make</command> or <command>ninja</command>,
but each job may still spawn its own processes or threads. For
example, <command>ld.gold</command> will use multiple threads for
linking, and the test suites of some packages can spawn multiple
threads for testing thread safety properties. There is no generic way
for the build system to know the number of processes or threads spawned
by a job. So generally we should not consider the value passed with
<option>-j</option> a hard limit on the number of logical cores to
use. Read <xref linkend='build-in-cgroup'/> if you want to set such
a hard limit.
</para>
<para>Generally the number of processes should not greatly exceed the
number of logical cores supported by the CPU. To list the processors on your
system, issue: <userinput>grep processor /proc/cpuinfo</userinput>.
</para>
<para>In some cases, using multiple processes may result in a race
condition where the success of the build depends on the order of the
commands run by the <command>make</command> program. For instance, if an
executable needs File A and File B, attempting to link the program before
one of the dependent components is available will result in a failure.
This condition usually arises because the upstream developer has not
properly designated all the prerequisites needed to accomplish a step in the
Makefile.</para>
<para>If this occurs, the best way to proceed is to drop back to a
single processor build. Adding <option>-j1</option> to a make command
will override the similar setting in the <envar>MAKEFLAGS</envar>
environment variable.</para>
<important>
<para>
Another problem may occur with modern CPUs, which have a lot of cores.
Each job started consumes memory, and if the sum of the needed
memory for each job exceeds the available memory, you may encounter
either an OOM (Out of Memory) kernel interrupt or intense swapping
that will slow the build beyond reasonable limits.
</para>
<para>
Some compilations with <command>g++</command> may consume up to 2.5 GB
of memory, so to be safe, you should restrict the number of jobs
to (Total Memory in GB)/2.5, at least for big packages such as LLVM,
WebKitGtk, QtWebEngine, or LibreOffice. A sketch of computing such a
limit is shown after this note.
</para>
</important>
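<para>
As a rough illustration of that rule (a sketch only; it assumes
<filename>/proc/meminfo</filename> reports
<computeroutput>MemTotal</computeroutput> in kB, rounds the result down,
and uses at least one job):
</para>
<screen><userinput>export MAKEFLAGS="-j$(awk '/MemTotal/ { j = int($2 / (2.5*1024*1024)); print (j &gt; 0 ? j : 1) }' /proc/meminfo)"</userinput></screen>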
</sect2>
<sect2 id="build-in-cgroup">
<title>Using Linux Control Groups to Limit Resource Usage</title>
<para>
Sometimes we want to limit the resource usage when we build a
package. For example, when we have 8 logical cores, we may want
to use only 6 cores for building the package and reserve another
2 cores for playing a movie. The Linux kernel provides a feature
called control groups (cgroup) for such a need.
</para>
<para>
Enable control groups in the kernel configuration, then rebuild the
kernel and reboot if necessary:
</para>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
href="cgroup-kernel.xml"/>
<!-- We need cgroup2 mounted at /sys/fs/cgroup. It's done by
systemd itself in LFS systemd, mountvirtfs script in LFS sysv. -->
<para revision='systemd'>
Ensure <xref linkend='systemd'/> and <xref linkend='shadow'/> have
been rebuilt with <xref linkend='linux-pam'/> support (if you are
interacting via a SSH or graphical session, also ensure the
<xref linkend='openssh'/> server or the desktop manager has been
built with <xref linkend='linux-pam'/>). As the &root; user, create
a configuration file to allow resource control without &root;
privilege, and instruct <command>systemd</command> to reload the
configuration:
</para>
<screen revision="systemd" role="nodump"><userinput>mkdir -pv /etc/systemd/system/user@.service.d &amp;&amp;
cat &gt; /etc/systemd/system/user@.service.d/delegate.conf &lt;&lt; EOF &amp;&amp;
<literal>[Service]
Delegate=memory cpuset</literal>
systemctl daemon-reload</userinput></screen>
<para revision='systemd'>
Then logout and login again. Now to run <command>make -j5</command>
with the first 4 logical cores and 8 GB of system memory, issue:
</para>
<screen revision="systemd" role="nodump"><userinput>systemctl --user start dbus &amp;&amp;
systemd-run --user --pty --pipe --wait -G -d \
-p MemoryHigh=8G \
-p AllowedCPUs=0-3 \
make -j5</userinput></screen>
<para revision='sysv'>
Ensure <xref linkend='sudo'/> is installed. To run
<command>make -j5</command> with the first 4 logical cores and 8 GB
of system memory, issue:
</para>
<!-- "\EOF" because we expect $$ to be expanded by the "bash -e"
shell, not the current shell.
TODO: can we use elogind to delegate the controllers (like
systemd) to avoid relying on sudo? -->
<screen revision="sysv" role="nodump"><userinput>bash -e &lt;&lt; \EOF
sudo mkdir /sys/fs/cgroup/$$
sudo sh -c \
"echo +memory +cpuset > /sys/fs/cgroup/cgroup.subtree_control"
sudo sh -c \
"echo 0-3 > /sys/fs/cgroup/$$/cpuset.cpus"
sudo sh -c \
"echo $(bc -e '8*2^30') > /sys/fs/cgroup/$$/memory.high"
(
sudo sh -c "echo $BASHPID > /sys/fs/cgroup/$$/cgroup.procs"
exec make -j5
)
sudo rmdir /sys/fs/cgroup/$$
EOF</userinput></screen>
<para>
With
<phrase revision='systemd'>
<parameter>MemoryHigh=8G</parameter>
</phrase>
<phrase revision='sysv'>
<literal>8589934592</literal> (the output of
<userinput>bc -e '8*2^30'</userinput>, 2^30 represents
2<superscript>30</superscript>, i.e. a Gigabyte) in the
<filename>memory.high</filename> entry
</phrase>, a soft limit of memory usage is set.
If the processes in the cgroup (<command>make</command> and all the
descendants of it) use more than 8 GB of system memory in total,
the kernel will throttle down the processes and try to reclaim the
system memory from them. But they can still use more than 8 GB of
system memory. If you want to make a hard limit instead, replace
<phrase revision='systemd'>
<parameter>MemoryHigh</parameter> with
<parameter>MemoryMax</parameter>.
</phrase>
<phrase revision='sysv'>
<filename>memory.high</filename> with
<filename>memory.max</filename>.
</phrase>
But doing so will cause the processes to be killed if 8 GB is not
enough for them.
</para>
<para>
<phrase revision='systemd'>
<parameter>AllowedCPUs=0-3</parameter>
</phrase>
<phrase revision='sysv'>
<literal>0-3</literal> in the <filename>cpuset.cpus</filename>
entry
</phrase> makes the kernel only run the processes in the cgroup on
the logical cores with numbers 0, 1, 2, or 3. You may need to
adjust this setting based on the mapping between the logical cores and the
physical cores. For example, with an Intel Core i9-13900K CPU,
the logical cores 0, 2, 4, ..., 14 are mapped to the first threads of
the eight physical P cores, the logical cores 1, 3, 5, ..., 15 are
mapped to the second threads of the physical P cores, and the logical
cores 16, 17, ..., 31 are mapped to the 16 physical E cores. So if
we want to use four threads from four different P cores, we need to
specify <literal>0,2,4,6</literal> instead of <literal>0-3</literal>.
Note that the other CPU models may use a different mapping scheme.
If you are not sure about the mapping between the logical cores
and the physical cores, run <command>grep -E '^processor|^core'
/proc/cpuinfo</command> which will output logical core IDs in the
<computeroutput>processor</computeroutput> lines, and physical core
IDs in the <computeroutput>core id</computeroutput> lines.
</para>
<para>
When the <command>nproc</command> or <command>ninja</command> command
runs in a cgroup, it will use the number of logical cores assigned to
the cgroup as the <quote>system logical core count</quote>. For
example, in a cgroup with logical cores 0-3 assigned,
<command>nproc</command> will print
<computeroutput>4</computeroutput>, and <command>ninja</command>
will run 6 (4 + 2) jobs simultaneously if no <option>-j</option>
setting is explicitly given.
</para>
<para revision="systemd">
Read the man pages <ulink role='man'
url='&man;systemd-run.1'>systemd-run(1)</ulink> and
<ulink role='man'
url='&man;systemd.resource-control.5'>systemd.resource-control(5)</ulink>
for the detailed explanation of parameters in the command.
</para>
<para revision="sysv">
Read the <filename>Documentation/admin-guide/cgroup-v2.rst</filename>
file in the Linux kernel source tree for the detailed explanation of
<systemitem class="filesystem">cgroup2</systemitem> pseudo file
system entries referred to in the command.
</para>
</sect2>
<sect2 id="automating-builds" xreflabel="Automated Building Procedures">
<title>Automated Building Procedures</title>
<para>There are times when automating the building of a package can come in
handy. Everyone has their own reasons for wanting to automate building,
and everyone goes about it in their own way. Creating
<filename>Makefile</filename>s, <application>Bash</application> scripts,
<application>Perl</application> scripts or simply a list of commands used
to cut and paste are just some of the methods you can use to automate
building BLFS packages. Detailing how and providing examples of the many
ways you can automate the building of packages is beyond the scope of this
section. This section will expose you to using file redirection and the
<command>yes</command> command to help provide ideas on how to automate
your builds.</para>
<bridgehead renderas="sect3">File Redirection to Automate Input</bridgehead>
<para>You will find times throughout your BLFS journey when you will come
across a package that has a command prompting you for information. This
information might be configuration details, a directory path, or a response
to a license agreement. This can present a challenge to automate the
building of that package. Occasionally, you will be prompted for different
information in a series of questions. One method to automate this type of
scenario requires putting the desired responses in a file and using
redirection so that the program uses the data in the file as the answers to
the questions.</para>
<!-- outdated
<para>Building the <application>CUPS</application> package is a good
example of how redirecting a file as input to prompts can help you automate
the build. If you run the test suite, you are asked to respond to a series
of questions regarding the type of test to run and if you have any
auxiliary programs the test can use. You can create a file with your
responses, one response per line, and use a command similar to the
one shown below to automate running the test suite:</para>
<screen><userinput>make check &lt; ../cups-1.1.23-testsuite_parms</userinput></screen>
-->
<para>Redirecting a file in this way effectively makes the program use the
responses in the file as the input to the questions. Occasionally you may
end up doing a bit of trial and error to determine the exact format of your
input file, but once figured out and documented you can use this to automate
building the package.</para>
<bridgehead renderas="sect3">Using <command>yes</command> to Automate
Input</bridgehead>
<para>Sometimes you will only need to provide one response, or provide the
same response to many prompts. For these instances, the
<command>yes</command> command works really well. The
<command>yes</command> command can be used to provide a response (the same
one) to one or more instances of questions. It can be used to simulate
pressing just the <keycap>Enter</keycap> key, entering the
<keycap>Y</keycap> key or entering a string of text. Perhaps the easiest
way to show its use is in an example.</para>
<para>First, create a short <application>Bash</application> script by
entering the following commands:</para>
<screen><userinput>cat &gt; blfs-yes-test1 &lt;&lt; "EOF"
<literal>#!/bin/bash
echo -n -e "\n\nPlease type something (or nothing) and press Enter ---> "
read A_STRING
if test "$A_STRING" = ""; then A_STRING="Just the Enter key was pressed"
else A_STRING="You entered '$A_STRING'"
fi
echo -e "\n\n$A_STRING\n\n"</literal>
EOF
chmod 755 blfs-yes-test1</userinput></screen>
<para>Now run the script by issuing <command>./blfs-yes-test1</command> from
the command line. It will wait for a response, which can be anything (or
nothing) followed by the <keycap>Enter</keycap> key. After entering
something, the result will be echoed to the screen. Now use the
<command>yes</command> command to automate the entering of a
response:</para>
<screen><userinput>yes | ./blfs-yes-test1</userinput></screen>
<para>Notice that piping <command>yes</command> by itself to the script
results in <keycap>y</keycap> being passed to the script. Now try it with a
string of text:</para>
<screen><userinput>yes 'This is some text' | ./blfs-yes-test1</userinput></screen>
<para>The exact string was used as the response to the script. Finally,
try it using an empty (null) string:</para>
<screen><userinput>yes '' | ./blfs-yes-test1</userinput></screen>
<para>Notice this results in passing just the press of the
<keycap>Enter</keycap> key to the script. This is useful for times when the
default answer to the prompt is sufficient. This syntax is used in the
<xref linkend="net-tools-automate-example"/> instructions to accept all the
defaults to the many prompts during the configuration step. You may now
remove the test script, if desired.</para>
<bridgehead renderas="sect3">File Redirection to Automate Output</bridgehead>
<para>Automating the building of some packages, especially those
that require you to read a license agreement one page at a time, requires
using a method that avoids having to press a key to display each page.
Redirecting the output to a file can be used in these instances to assist
with the automation. The previous section on this page touched on creating
log files of the build output. The redirection method shown there used the
<command>tee</command> command to redirect output to a file while also
displaying the output to the screen. Here, the output will only be sent to
a file.</para>
<para>Again, the easiest way to demonstrate the technique is to show an
example. First, issue the command:</para>
<screen><userinput>ls -l /usr/bin | less</userinput></screen>
<para>Of course, you'll be required to view the output one page at a time
because the <command>less</command> filter was used. Now try the same
command, but this time redirect the output to a file. The special file
<filename>/dev/null</filename> can be used instead of the filename shown,
but you will have no log file to examine:</para>
<screen><userinput>ls -l /usr/bin | less &gt; redirect_test.log 2&gt;&amp;1</userinput></screen>
<para>Notice that this time the command immediately returned to the shell
prompt without having to page through the output. You may now remove the
log file.</para>
<para>The last example will use the <command>yes</command> command in
combination with output redirection to bypass having to page through the
output and then provide a <keycap>y</keycap> to a prompt. This technique
could be used in instances when otherwise you would have to page through
the output of a file (such as a license agreement) and then answer the
question of <quote>do you accept the above?</quote>. For this example,
another short <application>Bash</application> script is required:</para>
<screen><userinput>cat &gt; blfs-yes-test2 &lt;&lt; "EOF"
<literal>#!/bin/bash
ls -l /usr/bin | less
echo -n -e "\n\nDid you enjoy reading this? (y,n) "
read A_STRING
if test "$A_STRING" = "y"; then A_STRING="You entered the 'y' key"
else A_STRING="You did NOT enter the 'y' key"
fi
echo -e "\n\n$A_STRING\n\n"</literal>
EOF
chmod 755 blfs-yes-test2</userinput></screen>
<para>This script can be used to simulate a program that requires you to
read a license agreement, then respond appropriately to accept the
agreement before the program will install anything. First, run the script
without any automation techniques by issuing
<command>./blfs-yes-test2</command>.</para>
<para>Now issue the following command which uses two automation techniques,
making it suitable for use in an automated build script:</para>
<screen><userinput>yes | ./blfs-yes-test2 &gt; blfs-yes-test2.log 2&gt;&amp;1</userinput></screen>
<para>If desired, issue <command>tail blfs-yes-test2.log</command> to see
the end of the paged output, and confirmation that <keycap>y</keycap> was
passed through to the script. Once satisfied that it works as it should,
you may remove the script and log file.</para>
<para>Finally, keep in mind that there are many ways to automate and/or
script the build commands. There is not a single <quote>correct</quote> way
to do it. Your imagination is the only limit.</para>
</sect2>
<sect2>
<title>Dependencies</title>
<para>For each package described, BLFS lists the known dependencies.
These are listed under several headings, whose meaning is as follows:</para>
<itemizedlist>
<listitem>
<para><emphasis>Required</emphasis> means that the target package
cannot be correctly built without the dependency having first been
installed, except if the dependency is said to be
<quote>runtime</quote>, which means the target package can be built but
cannot function without it.</para>
<para>
Note that a target package can start to <quote>function</quote>
in many subtle ways: an installed configuration file can make the
init system, cron daemon, or bus daemon run a program
automatically; another package using the target package as a
dependency can run a program from the target package during its
own build; and the configuration sections in the BLFS book
may also run a program from a just installed package. So if
you are installing the target package without a
<emphasis>Required (runtime)</emphasis> dependency installed,
you should install the dependency as soon as possible after the
installation of the target package.
</para>
</listitem>
<listitem>
<para><emphasis>Recommended</emphasis> means that BLFS strongly
suggests that this package be installed first (except if said to be
<quote>runtime</quote>, see below) for a clean and trouble-free
build, without issues either during the build process or at
run-time. The instructions in the book assume these packages are
installed. Some changes or workarounds may be required if these
packages are not installed. If a recommended dependency is said
to be <quote>runtime</quote>, it means that BLFS strongly suggests
that this dependency be installed before using the package, to
get full functionality.</para>
</listitem>
<listitem>
<para><emphasis>Optional</emphasis> means that this package might be
installed for added functionality. Often BLFS will describe the
dependency to explain the added functionality that will result.
An optional dependency may be automatically picked up by the target
package if the dependency is installed, but other optional
dependencies may need additional configuration options to be enabled
when the target package is built. Such additional options are
often documented in the BLFS book. If an optional dependency is
said to be <quote>runtime</quote>, it means you may install
the dependency after installing the target package to support some
optional features of the target package if you need these
features.</para>
<para>An optional dependency may be outside of BLFS. If you need such
an <emphasis>external</emphasis> optional dependency for some
features, read <xref linkend='beyond'/> for general
hints about installing a package not covered by BLFS.</para>
</listitem>
</itemizedlist>
</sect2>
<sect2 id="package_updates">
<title>Using the Most Current Package Sources</title>
<para>On occasion you may run into a situation in the book when a package
will not build or work properly. Though the Editors attempt to ensure
that every package in the book builds and works properly, sometimes a
package has been overlooked or was not tested with this particular version
of BLFS.</para>
<para>If you discover that a package will not build or work properly, you
should see if there is a more current version of the package. Typically
this means you go to the maintainer's web site and download the most current
tarball and attempt to build the package. If you cannot determine the
maintainer's web site by looking at the download URLs, use Google and query
the package's name. For example, in the Google search bar type:
'package_name download' (omit the quotes) or something similar. Sometimes
typing: 'package_name home page' will result in you finding the
maintainer's web site.</para>
</sect2>
<sect2 id="stripping">
<title>Stripping One More Time</title>
<para>
In LFS, stripping of debugging symbols and unneeded symbol table
entries was discussed a couple of times. When building BLFS packages,
there are generally no special instructions that discuss stripping
again. Stripping can be done while installing a package, or
afterwards.
</para>
<bridgehead renderas="sect3" id="stripping-install">Stripping while Installing a Package</bridgehead>
<para>
There are several ways to strip executables installed by a
package. They depend on the build system used (see <link
linkend="buildsystems">the section about build systems</link> below),
so only some
generalities can be listed here:
</para>
<note>
<para>
The following methods, using a feature of the build system
(autotools, meson, or cmake), will not strip static libraries if any
are installed. Fortunately there are not too many static libraries
in BLFS, and a static library can always be stripped safely by
running <command>strip --strip-unneeded</command> on it manually.
</para>
</note>
<itemizedlist>
<listitem>
<para>
The packages using autotools usually have an
<parameter>install-strip</parameter> target in their generated
<filename>Makefile</filename> files. So installing stripped
executables is just a matter of using
<command>make install-strip</command> instead of
<command>make install</command>.
</para>
</listitem>
<listitem>
<para>
The packages using the meson build system can accept
<parameter>-Dstrip=true</parameter> when running
<command>meson</command>. If you forgot to add this option
when running <command>meson</command>, you can also run
<command>meson install --strip</command> instead of
<command>ninja install</command>.
</para>
</listitem>
<listitem>
<para>
<command>cmake</command> generates
<parameter>install/strip</parameter> targets for both the
<parameter>Unix Makefiles</parameter> and
<parameter>Ninja</parameter> generators (the default is
<parameter>Unix Makefiles</parameter> on Linux). So just run
<command>make install/strip</command> or
<command>ninja install/strip</command> instead of the
<command>install</command> counterparts.
</para>
</listitem>
<listitem>
<para>
Removing (or not generating) debug symbols can also be
achieved by removing the
<parameter>-g&lt;something&gt;</parameter> options
in the C/C++ compiler invocations. How to do that is very specific
to each package, and it does not remove unneeded symbol table
entries. So it will not be explained in detail here. See also the
paragraphs about optimization below.
</para>
</listitem>
</itemizedlist>
<bridgehead renderas="sect3" id="stripping-installed">Stripping Installed Executables</bridgehead>
<para>
The <command>strip</command> utility changes files in place, which may
break anything using the file if it is loaded in memory. Note that if a file is
in use but just removed from the disk (i.e. not overwritten nor
modified), this is not a problem since the kernel can use
<quote>deleted</quote> files. Look at <filename>/proc/*/maps</filename>
and it is likely that you'll see some <emphasis>(deleted)</emphasis>
entries. The <command>mv</command> command in the script below just removes
the destination file from the directory but does not touch its content, so
that it satisfies the condition for the kernel to use the old (deleted) file.
But this approach can detach hard links into duplicated copies,
causing bloat, which is obviously unwanted since we are stripping to
reduce the system size. If two files in the same file system share the
same inode number, they are hard links to each other and we should
reconstruct the link. The script below is just an example.
It should be run as the &root; user:
</para>
<screen><userinput>cat &gt; /usr/sbin/strip-all.sh &lt;&lt; "EOF"
<literal>#!/usr/bin/bash
if [ $EUID -ne 0 ]; then
echo "Need to be root"
exit 1
fi
last_fs_inode=
last_file=
# List the candidate files, then sort by file system and inode number
# so that hard links to the same file appear on adjacent lines.
{ find /usr/lib -type f -name '*.so*' ! -name '*dbg'
find /usr/lib -type f -name '*.a'
find /usr/{bin,sbin,libexec} -type f
} | xargs stat -c '%m %i %n' | sort | while read fs inode file; do
# Skip files which are not ELF objects, or are already stripped.
if ! readelf -h $file >/dev/null 2>&amp;1; then continue; fi
if file $file | grep --quiet --invert-match 'not stripped'; then continue; fi
# If this file is a hard link to the previous (already stripped) file,
# recreate the link instead of stripping it again.
if [ "$fs $inode" = "$last_fs_inode" ]; then
ln -f $last_file $file;
continue;
fi
# Strip a copy, then replace the original with mv so any running
# process keeps using the old (deleted) file.
cp --preserve $file ${file}.tmp
strip --strip-unneeded ${file}.tmp
mv ${file}.tmp $file
last_fs_inode="$fs $inode"
last_file=$file
done</literal>
EOF
chmod 744 /usr/sbin/strip-all.sh</userinput></screen>
<para>
If you install programs in other directories such as <filename
class="directory">/opt</filename> or <filename
class="directory">/usr/local</filename>, you may want to strip the files
there too. Just add other directories to scan in the compound list of
<command>find</command> commands between the braces.
</para>
<para>
For more information on stripping, see <ulink
url="https://www.technovelty.org/linux/stripping-shared-libraries.html"/>.
</para>
</sect2>
<!--
<sect2 id="libtool">
<title>Libtool files</title>
<para>
One of the side effects of packages that use Autotools, including
libtool, is that they create many files with an .la extension. These
files are not needed in an LFS environment. If there are conflicts with
pkgconfig entries, they can actually prevent successful builds. You
may want to consider removing these files periodically:
</para>
<screen><userinput>find /lib /usr/lib -not -path "*Image*" -a -name \*.la -delete</userinput></screen>
<para>
The above command removes all .la files with the exception of those that
have <quote>Image</quote> or <quote>openldap</quote> as a part of the
path. These .la files are used by the ImageMagick and openldap programs,
respectively. There may be other exceptions by packages not in BLFS.
</para>
</sect2>
-->
<sect2 id="buildsystems">
<title>Working with different build systems</title>
<para>
There are now three different build systems in common use for
converting C or C++ source code into compiled programs or
libraries, and their details (particularly, finding out about available
options and their default values) differ. It may be easiest to understand
the issues caused by some choices (typically slow execution or
unexpected use of, or omission of, optimizations) by starting with
the <envar>CFLAGS</envar>, <envar>CXXFLAGS</envar>, and
<envar>LDFLAGS</envar> environment variables. There are also some
programs which use Rust.
</para>
<para>
Most LFS and BLFS builders are probably aware of the basics of
<envar>CFLAGS</envar> and <envar>CXXFLAGS</envar> for altering how a
program is compiled. Typically, some form of optimization is used by
upstream developers (<option>-O2</option> or <option>-O3</option>),
sometimes with the creation of debug symbols (<option>-g</option>),
as defaults.
</para>
<para>
If there are contradictory flags (e.g. multiple different
<option>-O</option> values),
the <emphasis>last</emphasis> value will be used. Sometimes this means
that flags specified in environment variables will be picked up before
values hardcoded in the Makefile, and therefore ignored. For example,
where a user specifies <option>-O2</option> and that is followed by
<option>-O3</option> the build will use <option>-O3</option>.
</para>
<para>
There are various other things which can be passed in CFLAGS or
CXXFLAGS, such as allowing use of the instruction set extensions
available with a specific microarchitecture (e.g.
<option>-march=amdfam10</option> or <option>-march=native</option>),
tuning the generated code for a specific microarchitecture (e.g.
<option>-mtune=tigerlake</option> or <option>-mtune=native</option>;
if <option>-mtune=</option> is not used, the microarchitecture from
the <option>-march=</option> setting will be used), or specifying a
specific standard for C or C++ (<option>-std=c++17</option> for
example). But one thing which has now come to light is that
programmers might include debug assertions in their code, expecting
them to be disabled in releases by using <option>-DNDEBUG</option>.
Specifically, if <xref linkend="mesa"/> is built with these
assertions enabled, some activities such as loading levels of games
can take extremely long times, even on high-class video cards.
</para>
<bridgehead renderas="sect3" id="autotools-info">Autotools with Make</bridgehead>
<para>
This combination is often described as <quote>CMMI</quote>
(configure, make, make install) and is used here to also cover
the few packages which have a configure script that is not
generated by autotools.
</para>
<para>
Sometimes running <command>./configure --help</command> will produce
useful options about switches which might be used. At other times,
after looking at the output from configure you may need to look
at the details of the script to find out what it was actually searching
for.
</para>
<para>
Many configure scripts will pick up any CFLAGS or CXXFLAGS from the
environment, but CMMI packages vary about how these will be mixed with
any flags which would otherwise be used (<emphasis>variously</emphasis>:
ignored, used to replace the programmer's suggestion, used before the
programmer's suggestion, or used after the programmer's suggestion).
</para>
<para>
In most CMMI packages, running <command>make</command> will list
each command and run it, interspersed with any warnings. But some
packages try to be <quote>silent</quote> and only show which file
they are compiling or linking instead of showing the command line.
If you need to inspect the command, either because of an error, or
just to see what options and flags are being used, adding
<option>V=1</option> to the make invocation may help.
</para>
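<para>
A typical CMMI sequence therefore looks like the sketch below (a generic
illustration; the <option>--prefix</option> value and the usefulness of
<option>V=1</option> depend on the package), followed by
<command>make install</command> as the &root; user:
</para>
<screen><userinput>./configure --prefix=/usr &amp;&amp;
make V=1</userinput></screen>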
<bridgehead renderas="sect3" id="cmake-info">CMake</bridgehead>
<para>
CMake works in a very different way, and it has two backends which
can be used on BLFS: <command>make</command> and
<command>ninja</command>. The default backend is make, but
ninja can be faster on large packages with multiple processors. To
use ninja, specify <option>-G Ninja</option> in the cmake command.
However, there are some packages which create fatal errors in their
ninja files but build successfully using the default of Unix
Makefiles.
</para>
<para>
The hardest part of using CMake is knowing what options you might wish
to specify. The only way to get a list of what the package knows about
is to run <command>cmake -LAH</command> and look at the output for that
default configuration.
</para>
<para>
Perhaps the most-important thing about CMake is that it has a variety
of CMAKE_BUILD_TYPE values, and these affect the flags. The default
is that this is not set and no flags are generated. Any
<envar>CFLAGS</envar> or <envar>CXXFLAGS</envar> in the environment
will be used. If the programmer has coded any debug assertions,
those will be enabled unless -DNDEBUG is used. The following
CMAKE_BUILD_TYPE values will generate the flags shown, and these
will come <emphasis>after</emphasis> any flags in the environment
and therefore take precedence.
</para>
<informaltable align="center">
<tgroup cols="2">
<colspec colnum="1" align="center"/>
<colspec colnum="2" align="center"/>
<thead>
<row><entry>Value</entry><entry>Flags</entry></row>
</thead>
<tbody>
<row>
<entry>Debug</entry><entry><option>-g</option></entry>
</row>
<row>
<entry>Release</entry><entry><option>-O3 -DNDEBUG</option></entry>
</row>
<row>
<entry>RelWithDebInfo</entry><entry><option>-O2 -g -DNDEBUG</option></entry>
</row>
<row>
<entry>MinSizeRel</entry><entry><option>-Os -DNDEBUG</option></entry>
</row>
</tbody>
</tgroup>
</informaltable>
<para>
CMake tries to produce quiet builds. To see the details of the commands
which are being run, use <command>make VERBOSE=1</command> or
<command>ninja -v</command>.
</para>
<para>
By default, CMake treats file installation differently from the other
build systems: if a file already exists and is not newer than a file
that would overwrite it, then the file is not installed. This may be
a problem if a user wants to record which file belongs to a package,
either using <envar>LD_PRELOAD</envar>, or by listing files newer
than a timestamp. The default can be changed by setting the variable
<envar>CMAKE_INSTALL_ALWAYS</envar> to 1 in the
<emphasis>environment</emphasis>, for example by
<command>export</command>'ing it.
</para>
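<para>
Putting this together, a typical out-of-tree CMake configuration might
look like the following sketch (the variables shown are standard CMake
options, not specific to any package):
</para>
<screen><userinput>mkdir build &amp;&amp;
cd    build &amp;&amp;

cmake -DCMAKE_INSTALL_PREFIX=/usr \
      -DCMAKE_BUILD_TYPE=Release  \
      -G Ninja .. &amp;&amp;
ninja</userinput></screen>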
<bridgehead renderas="sect3" id="meson-info">Meson</bridgehead>
<para>
Meson has some similarities to CMake, but many differences. To get
details of the defines that you may wish to change you can look at
<filename>meson_options.txt</filename> which is usually in the
top-level directory.
</para>
<para>
If you have already configured the package by running
<command>meson</command> and now wish to change one or more settings,
you can either remove the build directory, recreate it, and use the
altered options, or within the build directory run <command>meson
configure</command>, e.g. to set an option:
</para>
<screen><userinput>meson configure -D&lt;some_option&gt;=true</userinput></screen>
<para>
If you do that, the file <filename>meson-private/cmd_line.txt</filename>
will show the <emphasis>last</emphasis> commands which were used.
</para>
<para>
Meson provides the following buildtype values, and the flags they enable
come <emphasis>after</emphasis> any flags supplied in the environment and
therefore take precedence.
</para>
<itemizedlist>
<listitem>
<para>plain : no added flags. This is for distributors to supply their
own <envar>CFLAGS</envar>, <envar>CXXFLAGS</envar> and
<envar>LDFLAGS</envar>. There is no obvious reason to use
this in BLFS.</para>
</listitem>
<listitem>
<para>debug : <option>-g</option> - this is the default if
nothing is specified in either <filename>meson.build</filename>
or the command line. However it results in large and slow binaries,
so we should override it in BLFS.</para>
</listitem>
<listitem>
<para>debugoptimized : <option>-O2 -g</option> - this is the
default specified in <filename>meson.build</filename> of some
packages.</para>
</listitem>
<listitem>
<para>release : <option>-O3</option> (occasionally a package will
force <option>-O2</option> here) - this is the buildtype we use
for most packages with the Meson build system in BLFS; a typical
invocation is sketched after this list.</para>
</listitem>
</itemizedlist>
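<para>
As mentioned in the list above, a typical Meson configuration in BLFS
looks like the following sketch (the options shown are standard Meson
ones; adjust the buildtype as discussed above):
</para>
<screen><userinput>mkdir build &amp;&amp;
cd    build &amp;&amp;

meson setup --prefix=/usr --buildtype=release .. &amp;&amp;
ninja</userinput></screen>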
<!-- From https://mesonbuild.com/Builtin-options.html#core-options:
b_ndebug: Default value = false, Possible values are
true, false, if-release. Some packages sets it to if-release
so we mistakenly believed if-release had been the default. -->
<para>
The <option>-DNDEBUG</option> flag is implied by the release
buildtype for some packages (for example <xref linkend='mesa'/>).
It can also be provided explicitly by passing
<option>-Db_ndebug=true</option>.
</para>
<para>
To see the details of the commands which are being run in a package using
meson, use <command>ninja -v</command>.
</para>
<bridgehead renderas="sect3" id="rust-info">Rustc and Cargo</bridgehead>
<para>
Most released rustc programs are provided as crates (source tarballs)
which will query a server to check current versions of dependencies
and then download them as necessary. These packages are built using
<command>cargo build --release</command>. In theory, you can manipulate
<envar>RUSTFLAGS</envar> to change the optimize-level (the default for
<option>--release</option> is 3, i.e.
<option>-Copt-level=3</option>, like <option>-O3</option>) or to
force it to build for the machine it is being compiled on, using
<option>-Ctarget-cpu=native</option>, but in practice this seems to
make no significant difference.
</para>
<para>
If you are compiling a standalone Rust program (as an unpackaged
<filename class='extension'>.rs</filename> file) by running
<command>rustc</command> directly, you should specify
<option>-O</option> (the abbreviation of
<option>-Copt-level=2</option>) or <option>-Copt-level=3</option>
otherwise it will do an unoptimized compile and run
<emphasis>much</emphasis> slower. If you are compiling the program
for debugging, replace the <option>-O</option> or
<option>-Copt-level=</option> options with <option>-g</option> to
produce an unoptimized program with debug info.
</para>
<para>
Like <command>ninja</command>, by default <command>cargo</command>
uses all logical cores. This can often be worked around,
either by exporting
<envar>CARGO_BUILD_JOBS=<replaceable>&lt;N&gt;</replaceable></envar>
or passing
<option>--jobs <replaceable>&lt;N&gt;</replaceable></option> to
<command>cargo</command>.
For compiling rustc itself, specifying
<option>--jobs <replaceable>&lt;N&gt;</replaceable></option> for
invocations of <command>x.py</command>
(together with the <envar>CARGO_BUILD_JOBS</envar> environment
variable, which looks like a <quote>belt and braces</quote>
approach but seems to be necessary) mostly works. The exception is
running the tests when building rustc, some of them will
nevertheless use all online CPUs, at least as of rustc-1.42.0.
</para>
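<para>
For example, to restrict a release build of a crate to four jobs (a
sketch; choose a value appropriate for your machine):
</para>
<screen><userinput>cargo build --release --jobs 4</userinput></screen>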
</sect2>
<sect2 id="optimizations">
<title>Optimizing the build</title>
<para>
Many people will prefer to optimize compiles as they see fit, by providing
<envar>CFLAGS</envar> or <envar>CXXFLAGS</envar>. For an
introduction to the options available with gcc and g++ see <ulink
url="https://gcc.gnu.org/onlinedocs/gcc-&gcc-version;/gcc/Optimize-Options.html"/>.
The same content can also be found in <command>info gcc</command>.
</para>
<para>
Some packages default to <option>-O2 -g</option>, others to
<option>-O3 -g</option>, and if <envar>CFLAGS</envar> or
<envar>CXXFLAGS</envar> are supplied they might be added to the
package's defaults, replace the package's defaults, or even be
ignored. There are details on some desktop packages which were
mostly current in April 2019 at
<ulink url="https://www.linuxfromscratch.org/~ken/tuning/"/> - in
particular, <filename>README.txt</filename>,
<filename>tuning-1-packages-and-notes.txt</filename>, and
<filename>tuning-notes-2B.txt</filename>. The particular thing to
remember is that if you want to try some of the more interesting
flags you may need to force verbose builds to confirm what is being
used.
</para>
<para>
Clearly, if you are optimizing your own program you can spend time to
profile it and perhaps recode some of it if it is too slow. But for
building a whole system that approach is impractical. In general,
<option>-O3</option> usually produces faster programs than
<option>-O2</option>. Specifying
<option>-march=native</option> is also beneficial, but means that
you cannot move the binaries to an incompatible machine - this can
also apply to newer machines, not just to older machines. For
example, programs compiled for <literal>amdfam10</literal> run on
old Phenoms, Kaveris, and Ryzens, but programs compiled for a
Kaveri will not run on a Ryzen because certain op-codes are not
present. Similarly, if you build for a Haswell, not everything will
run on a SandyBridge.
</para>
<note>
<para>
Be careful that the name of a <option>-march</option> setting
does not always match the baseline of the microarchitecture
with the same name. For example, the Skylake-based Intel Celeron
processors do not support AVX at all, but
<option>-march=skylake</option> assumes AVX and even AVX2.
</para>
</note>
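<para>
For example, to optimize aggressively for the local machine only (a
sketch; remember the portability caveats above), you might set:
</para>
<screen><userinput>export CFLAGS="-O3 -march=native" &amp;&amp;
export CXXFLAGS="$CFLAGS"</userinput></screen>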
<para>
When a shared library is built by GCC, a feature named
<quote>semantic interposition</quote> is enabled by default. When
the shared library refers to a symbol name with external linkage
and default visibility, if the symbol exists in both the shared
library and the main executable, semantic interposition guarantees
the symbol in the main executable is always used. This feature
was invented in an attempt to make the behavior of linking a shared
library and linking a static library as similar as possible. Today
only a small number of packages still depend on semantic
interposition, but the feature is still on by default in GCC,
causing many optimizations to be disabled for shared libraries because
they conflict with semantic interposition. The
<option>-fno-semantic-interposition</option> option can be passed
to <command>gcc</command> or <command>g++</command> to disable
semantic interposition and enable more optimizations for shared
libraries. This option is used as the default by some packages
(for example <xref linkend='python3'/>), and it is also the default
of Clang.
</para>
<para>
There are also various other options which some people claim are
beneficial. At worst, you get to recompile and test, and then
discover that in your usage the options do not provide a benefit.
</para>
<para>
If building Perl or Python modules,
in general the <envar>CFLAGS</envar> and <envar>CXXFLAGS</envar>
used are those which were used by those <quote>parent</quote>
packages.
</para>
<para>
For <envar>LDFLAGS</envar>, there are three options that can be used
for optimization. They are quite safe to use, and the build
systems of some packages use some of these options as the default.
</para>
<para>
With <option>-Wl,-O1</option>, the linker will
optimize the hash table to speed up the dynamic linking.
Note that <option>-Wl,-O1</option> is completely unrelated to the
compiler optimization flag <option>-O1</option>.
</para>
<para>
With <option>-Wl,--as-needed</option>, the linker will disregard
unnecessary <option>-l<replaceable>foo</replaceable></option> options
from the command line, i. e. the shared library <systemitem
class='library'>lib<replaceable>foo</replaceable></systemitem>
will only be linked if a symbol in <systemitem
class='library'>lib<replaceable>foo</replaceable></systemitem> is
really referred to from the executable or shared library being linked.
This can sometimes mitigate the <quote>excessive dependencies on
shared libraries</quote> issues caused by
<application>libtool</application>.
</para>
<para>
With <option>-Wl,-z,pack-relative-relocs</option>, the linker
generates a more compacted form of the relative relocation entries
for PIEs and shared libraries. It reduces the size of the linked
PIE or shared library, and speeds up the loading of the PIE or
shared library.
</para>
<para>
The <option>-Wl,</option> prefix is necessary because although the
variable is named <envar>LDFLAGS</envar>, its content is actually
passed to <command>gcc</command> (or <command>g++</command>,
<command>clang</command>, etc.) during the link stage, not directly
passed to <command>ld</command>.
</para>
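<para>
A sketch combining the three options discussed above (it assumes a
recent Binutils and Glibc, as in current LFS, for
<option>-z pack-relative-relocs</option>):
</para>
<screen><userinput>export LDFLAGS="-Wl,-O1 -Wl,--as-needed -Wl,-z,pack-relative-relocs"</userinput></screen>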
</sect2>
<sect2 id="hardening">
<title>Options for hardening the build</title>
<para>
Even on desktop systems, there are still a lot of exploitable
vulnerabilities. For many of these, the attack comes via javascript
in a browser. Often, a series of vulnerabilities are used to gain
access to data (or sometimes to pwn, i.e. own, the machine and
install rootkits). Most commercial distros will apply various
hardening measures.
</para>
<para>
In the past, there was Hardened LFS where gcc (a much older version)
was forced to use hardening (with options to turn some of it off on a
per-package basis). The current LFS and BLFS books are carrying
forward a part of its spirit by enabling PIE
(<option>-fPIE -pie</option>) and SSP
(<option>-fstack-protector-strong</option>) as the defaults
for GCC and clang. What is being covered here is different - first
you have to make sure that the package is indeed using your added
flags and not overriding them.
</para>
<para>
For hardening options which are reasonably cheap, there is some
discussion in the 'tuning' link above (occasionally, one or more
of these options might be inappropriate for a package). These
options are <option>-D_FORTIFY_SOURCE=2</option>
(or <option>-D_FORTIFY_SOURCE=3</option> which is more secure but
with a larger performance overhead) and
(for C++) <option>-D_GLIBCXX_ASSERTIONS</option>. On modern
machines these should only have a little impact on how fast things
run, and often they will not be noticeable.
</para>
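<para>
For example, to add these relatively cheap hardening options to your own
flags (a sketch; drop any option that a particular package rejects, and
note that <option>-D_FORTIFY_SOURCE</option> requires optimization to be
enabled):
</para>
<screen><userinput>export CFLAGS="-O2 -D_FORTIFY_SOURCE=2" &amp;&amp;
export CXXFLAGS="-O2 -D_FORTIFY_SOURCE=2 -D_GLIBCXX_ASSERTIONS"</userinput></screen>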
<para>
The main distros use much more, such as RELRO (Relocation Read Only)
and perhaps <option>-fstack-clash-protection</option>. You may also
encounter the so-called <quote>userspace retpoline</quote>
(<option>-mindirect-branch=thunk</option> etc.) which
is the equivalent of the spectre mitigations applied to the linux
kernel in late 2018. The kernel mitigations caused a lot of complaints
about lost performance; if you have a production server, you might wish
to consider testing that, along with the other available options, to
see if performance is still sufficient.
</para>
<para>
Whilst gcc has many hardening options, clang/LLVM's strengths lie
elsewhere. Some options which gcc provides are said to be less effective
in clang/LLVM.
</para>
</sect2>
</sect1>