Damjan Jovanovic on Tue, 5 Oct 2010 10:24:25 +0200 (SAST)



[GLUG-chat] Some thoughts on packaging and why desktop Linux is a failure


Hi

Of all the operating systems I've used, Linux has always seemed to me
to hold more potential than any other. Yet this potential is often
misused, abused, overvalued, misunderstood, underused, or ignored,
resulting in what is at best a second-rate system, and at worst a
hackers-only club.

The example I want to discuss this time is this: a few years back Ian
Murdock blogged about how package management is "the single biggest
advancement Linux has brought to the industry"
(http://ianmurdock.com/solaris/how-package-management-changed-everything/).

I am of the exact opposite opinion:

1. Binary incompatibility

Binary compatibility on Linux is just sad.

The original a.out binary format was bad, so Linux moved to ELF. But
ELF is not much better.

With ELF, symbol lookup in dynamic libraries is "process scoped",
meaning the same symbol names in independent libraries that happen to
be loaded into the same process will clash (eg. 2 versions of libpng
get loaded into the same process, both define "png_info_init" ->
memory corruption -> crash). Windows and MacOS use "library scoped"
symbol lookup to avoid this.
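
To make the failure mode concrete, here is a minimal sketch (the
library names, file names and output strings are invented for
illustration) of how process-scoped symbol lookup lets one definition
silently hijack another library's internal calls:

    /* png12.c - stand-in for "libpng 1.2" (all names here are illustrative):
     *   gcc -shared -fPIC -o libpng12demo.so png12.c */
    #include <stdio.h>
    void png_info_init(void) { puts("png 1.2 layout"); }
    void png12_user(void)    { png_info_init(); }  /* library-internal call */

    /* png14.c - stand-in for "libpng 1.4", same public symbol name:
     *   gcc -shared -fPIC -o libpng14demo.so png14.c */
    #include <stdio.h>
    void png_info_init(void) { puts("png 1.4 layout"); }
    void png14_user(void)    { png_info_init(); }  /* library-internal call */

    /* main.c - links against both libraries:
     *   gcc -o clash main.c -L. -lpng12demo -lpng14demo
     *   LD_LIBRARY_PATH=. ./clash
     * With ELF's default process-scoped lookup, BOTH internal calls bind to
     * the first png_info_init in load order, so png14_user() also prints
     * "png 1.2 layout". With the real libraries, whose structures differ,
     * that same silent mis-binding is memory corruption. */
    void png12_user(void);
    void png14_user(void);
    int main(void) { png12_user(); png14_user(); return 0; }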

Software development is fundamentally distributed, and nobody controls
what names people use in their library symbols. So when different
libraries with the same symbol come together, or even different
versions of the same library get loaded into the same process even
indirectly, this causes unpredictable crashes. A few real world
examples:
* Mozilla.org's official Firefox builds, written in C++ against
libstdc++.so.5, use GTK+ (written in C), which in turn loads the
Simple Chinese Input Method, written in C++ against libstdc++.so.6 ->
memory corruption -> undebuggable crash.
* libgnutls can replace the OpenSSL APIs by redefining them with its
own implementation; if the real OpenSSL is also loaded, even
indirectly, the same API and symbol names resolve against a different
ABI -> memory corruption -> undebuggable crash
(http://www.linux-archive.org/fedora-development/148626-libgnutls-openssl-real-openssl-conflict.html).

There still isn't any general solution for this. There's the
GNU-libc-specific RTLD_DEEPBIND flag to dlopen(), but that only helps
code that does its own run-time dynamic linking, which most code
doesn't. There's also symbol versioning as a possible solution, but
that's an obscure ELF-only feature that in practice only libc uses.
Michael Meeks sent a patch adding a -Bdirect option to ld
(http://sourceware.org/ml/binutils/2005-10/msg00437.html) which would
have fixed this, but Ulrich Drepper rejected it for no good reason;
Solaris already uses and benefits from that feature.
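
For what it's worth, this is roughly what the glibc-specific band-aid
looks like when you do control the dlopen() call yourself (a sketch;
"./plugin.so" is a placeholder path):

    /* loader.c - loading a plugin so it prefers its own symbols over
     * same-named symbols already in the process (glibc >= 2.3.4 only):
     *   gcc -o loader loader.c -ldl */
    #define _GNU_SOURCE            /* required to expose RTLD_DEEPBIND */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        void *handle = dlopen("./plugin.so",
                              RTLD_NOW | RTLD_LOCAL | RTLD_DEEPBIND);
        if (!handle) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return 1;
        }
        /* ... dlsym() the plugin's entry points and use them ... */
        dlclose(handle);
        return 0;
    }

But as said above, this only helps code that does its own run-time
loading; it does nothing for the ordinary shared library dependencies
resolved at program start, and it isn't portable beyond glibc.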

Distros have had to work around this issue in common cases; eg. Ubuntu
apparently applies manual symbol versioning to its various libdb
packages.

The only conclusion that can be drawn is that ELF dynamic linking, as
it is today, is fundamentally broken and should not be relied on in a
desktop environment where you want to run anything from anywhere.

Of course binary incompatibility is not just a matter of how ELF
works. Other problems are numerous:
* many libraries don't preserve API/ABI compatibility between versions
* few/no libraries support "compile on new, run on old". GTK for
instance uses macros that invisibly lock you into >= the version you
compiled against, so you have to compile against a very old version to
be able to run against any later version.
* libc has a number of different options it can be compiled with, each
of which alters the API/ABI needed by the application (eg.
thread-local locales). In practice Linux distributions don't agree on
options to compile libc with, meaning you can and do get applications
compiled on one distro that won't run on another.
* ELF dynamic linking is very slow (Windows OpenOffice running on Wine
starts up faster than native Linux OpenOffice), so various hacks like
DT_GNU_HASH were added to speed it up; these in turn broke
compatibility between distros (eg. a while back pcsx2 compiled "Linux
binaries" on Fedora, and Ubuntu users couldn't run them).
* It is near-impossible to make ELF-based "portable apps" that can be
run off a CD or a USB stick and store all their files on it, or
applications that can be installed anywhere after compile-time,
because ELF has hardcoded library search paths, and there's no easy
way for an executable to look up its own path and load resources
relative to it. Windows has GetModuleFileName(); the closest we have
is the /proc/self/exe symlink, which exists only on Linux and works
only for executables. A library has to - this one is really good -
take the memory address of something inside itself, like a static
variable, search the address ranges in /proc/self/maps for that
address, and then take the path listed for that range: all exotic,
highly obscure, little-known, Linux-only tricks that work only if
/proc is mounted (see the sketch after this list). On Windows you can
usually choose where to install something; on Linux, since nobody
knows or cares about this, most software isn't relocatable after
compile-time, and thus distro packages virtually always need to be
installed into /, as *root*!
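
Here is a sketch of just the executable half of that dance - the
Linux-only /proc/self/exe trick an application needs to find its own
install prefix, something Windows gives you with one
GetModuleFileName() call (the "share" subdirectory is an invented
layout):

    /* whereami.c - a "relocatable" executable finding its own location via
     * the Linux-only /proc/self/exe symlink, so it can load resources
     * relative to itself. Fails if /proc isn't mounted; useless for
     * libraries, which need the /proc/self/maps trick described above. */
    #include <limits.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        char path[PATH_MAX];
        char *slash;
        ssize_t len = readlink("/proc/self/exe", path, sizeof(path) - 1);
        if (len < 0) {
            perror("readlink(/proc/self/exe)");  /* eg. /proc not mounted */
            return 1;
        }
        path[len] = '\0';

        slash = strrchr(path, '/');              /* strip the program name */
        if (slash)
            *slash = '\0';
        printf("resources live under: %s/share\n", path);
        return 0;
    }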

2. Rigid dependency lock-in

Loose coupling is a virtue in software engineering: we have exactly
the opposite. Binary compatibility problems from the above point lead
to the only logical solution: we can't (more like won't) solve the
problem globally so let's solve it locally, ie. things can work but
you have to compile against the known-good environment, which locks
you into its ABI, and others can then in turn compile against you.
This leads to tight coupling - software components that shouldn't
really depend on each other are tightly bound by the compiled binary
interfaces.

A distribution is thus built top-down, like a pyramid, where things on
top depend on exact versions of things beneath. Nudge just one stone
at the bottom, eg. downgrade the libc version, and the entire distro
needs to be recompiled.

Newer versions of apps thus link to newer versions of libraries,
lockstepping together. Great when you want the latest version of
everything, terrible when you want to backport. Eg. try installing the
Ubuntu OpenOffice 3.2 .deb on Ubuntu 6.06 from 2006 - you'd have to
replace half the libraries; yet install OpenOffice 3.2 on Windows
2000, six years older still, and it just works.

3. Centralized repository

We had problems with binary compatibility, so we solved the problem
locally by compiling specific versions against each other, and now we
have a bunch of interconnected packages that work with each other but
probably won't quite work anywhere else: a "repository".

Repositories only support a single version of a package. Users needing
a different version are stuck. I recently played a game of "Scorched
3D" - I couldn't connect to the server; it turned out my version was
wrong. The repository didn't have the newer version, the website
didn't provide .deb files, and I wasn't in the mood to compile from
source. I installed and played the Windows version of the game on
Wine. This is the sad state of the art of software installation on
Linux today: if it's in the repository, it's quicker and easier to
install than on any other operating system, but if it's not, you're in
for it. And this hasn't changed since the early 90's. A good quote I
read somewhere is that "It's easier to distribute our [open source]
stuff on Windows than on Linux".

As for there being 30000 packages for Ubuntu, a lot of software I use
isn't packaged, and 30000 is still less than a third of the
applications on Sourceforge alone.

Packaging scripts are usually written and managed by people who
didn't develop the software being packaged, and even if they're
initially correct, they get outdated, and then cause bugs themselves,
like this one I found in Ubuntu's OpenOffice package:
https://bugs.launchpad.net/bugs/479973. This bug affects all Ubuntu
and probably all Debian users and it's been completely ignored for
years. Bugs in distro Wine packages are legendary: packages used to
put wineserver in /etc/init.d because it sounds like a server (it's
actually automatically started by wine and runs as the current user),
and Fedora and OpenSuSE both recently shipped completely broken 64-bit
Wine on x86-64 architectures. Many common packages are broken if you
dig a little, eg. the help file in k3b on Ubuntu 10.04 doesn't open
because it needs some KDE service which isn't a dependency of the
package, but manually installing that service then gives an error
that the help path doesn't exist.

4. Particular distribution

Before a package even makes it into the repository proper come the
distro-specific hacks and patches to make it conform to the distro's
view of the world - or the decision to reject it outright: changes to
artwork,
translations, integration with the distro bug tracker, addition of
files, removal of "badly behaving" files or tools (eg. odbc-config
removed from the Debian and Ubuntu unixodbc packages because it
outputs paths in a way that isn't "Debian enough"), or even complete
rejection
of the package from the repository based entirely on a political
ideology, something even Microsoft never did at the peak of its reign
(eg. autopackage was rejected, as was a build of Gaim just for using
an autopackage tool that would "defeat Debian's automated dependency
tracking", as was the OpenWatcom compiler for not being licensed
freely enough, and I'm sure there are many more).

By controlling access to the repository - the only easy way to install
software - distros have an effective weapon against software they do
not like. Who are they to dictate what end users should be able to
install and run? Distro control is everything that freedom - the whole
point of open source - is not. You might say there are legitimate
reasons why distros can't include something in their repository - eg.
OpenWatcom had licensing issues that would affect its hosting in a
repository - but that's also an argument against the existence of the
repository in the first place.

Packages made by distros deviate from the official software in
unpredictable ways. Recall that serious security issue from only a
year or so back, caused by Debian excluding the OpenSSH blacklist
from its package.

Distros often take in all bug reports, even for software they didn't
write (since they patch it), but never propagate these upstream. I was
shocked when I went onto Fedora's bug tracker and found many valid bug
reports for Wine that hadn't made it into Wine's bug tracker. I even
found and fixed some bugs in Wine by looking through these
distro-specific bug trackers for interesting bugs, but there are more
than 300 distros - I can't visit them all...

There's no predicting the API/ABI/semantics of a piece of software
when a distro provides it. The Eclipse SWT developers had an issue
where they work around a bug in certain versions of GTK by checking
the GTK version at run time and enabling the workaround for the broken
versions - but Debian backported the GTK fix into older GTK versions,
leaving them stuck with a "if (GTK_version < X && distro != debian)"
situation - imagine if they had to test for all 300+ distros in their
code. Of course even that is not enough - each distro could change the
package any way it likes at any time. Distro packaging thus only works
if distros package absolutely everything - which they don't, by a long
way, and never will.
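
The pattern SWT is forced into looks roughly like this (a sketch in C
rather than SWT's Java; the 2.18.3 threshold is invented) - and it is
exactly the kind of check a distro backport silently invalidates:

    /* versioncheck.c - enable a workaround based on the GTK+ version found
     * at run time. The problem: if a distro backports the fix into an older
     * GTK+, the version number stays low, the workaround stays switched on,
     * and now the workaround itself is the bug.
     *   gcc -o versioncheck versioncheck.c `pkg-config --cflags --libs gtk+-2.0` */
    #include <gtk/gtk.h>
    #include <stdio.h>

    static gboolean need_workaround(void)
    {
        /* gtk_check_version() returns NULL if the running GTK+ is at least
         * the requested version, otherwise a human-readable error string.
         * It only inspects the library's version, so no gtk_init() needed. */
        return gtk_check_version(2, 18, 3) != NULL;
    }

    int main(void)
    {
        if (need_workaround())
            printf("GTK+ < 2.18.3: enabling workaround - wrongly, if the "
                   "fix was backported\n");
        else
            printf("GTK+ >= 2.18.3: no workaround needed\n");
        return 0;
    }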

5. 3rd party software is foreign

This is the end result of the previous problems, and the biggest
problem in and of itself.

Applications all come from your distro's repository, right? Because
there is virtually zero infrastructure to support third party,
unpackaged applications. Even forgetting binary compatibility
problems, there is no standard way to install an application
(autopackage, klik, zeroinstall, binary tarball, source tarball,
self-extracting shell script, binary file, package: is it compatible
with your distro+distro version?), no standard install location (/opt,
/usr, /usr/local, ~/.local, anywhere), no standard automatic update
service, no standard way to check/specify/request dependencies, no
standard init.d services or even standard file format for new services
(thank you Upstart), no standard fonts, and generally a very shaky
foundation to stand on. Distros apparently fantasize that all software
is in their repository, where it generally works with the distro's own
infrastructure.

It doesn't help that we have more than one desktop environment.
Luckily freedesktop.org has standardized some basic desktop functions,
but many things are not standardized yet: thumbnailing, URI handling,
default file open associations, MIME type actions, shared settings,
keyring, and file browser right click menu integration, just to pick a
few.

Sadly the difficulty of developing applications independently of the
whole distro repository system means that it generally isn't done.
People either don't develop a "Linux version", target only a single
distro version (eg. RHEL 4), or add onto the great lumbering beast of
the distro repository, thus keeping the fundamentally broken trash
alive due to sheer momentum rather than competitiveness and technical
excellence. Distro repositories thus act as a deterrent to improving
these problems.

ISVs have been complaining for years about the difficulty of
developing for the desktop (http://www.kegel.com/osdl/da05.html). And
every open source project is ultimately an ISV: some like Eclipse and
Inkscape even want to be seen that way. So these problems bite them
all, some harder than others. Eclipse published their problems with
the Linux desktop at http://vektor.ca/osdl-meeting3.txt, and while
some of these issues have been fixed since 2005, the fundamental
problems haven't changed.

http://linuxhaters.blogspot.com/2010/04/year-later.html hits the nail
on the head: "The big players in the linux desktop are the
distributions. They distribute stuff. They're somewhat ok at
distributing their own stuff. But they're supremely awesome at making
it hard for third parties to distribute their stuff. It's the
distributions (and more precisely, the fact that there are so many of
them) that make the actual distributing hard. Go figure."

***

A Linux distribution is thus not a solid, stable platform - it is
merely a snapshot of a bunch of packages in their current state,
compiled against each other. A Linux desktop in a distribution is not
a general application environment - it is a fixed-function appliance
that works for a predefined set of functions, but if you change it
even a little, it will break in unpredictable ways. The whole fixed
rigid repository structure is perfect for servers - and totally wrong
for desktops.

Case in point: it's easier to make an entire Linux distro for a
special purpose application (eg. Mythbuntu, Trixbox, Untangle, EBox)
than to make a software package that will install and run on any
version of any distro. It is easier to make software to target all
Windows versions since 95 than to target all the versions of a single
distro over the same time period.

Software and frameworks that do work well on a wide range of new and
old distros have to go to unbelievable lengths. Java, for example,
still has code that emulates connect() on UDP sockets for Linux 2.2
kernels, which didn't support it properly; its selector framework uses
the blazingly fast epoll on 2.6 kernels and falls back to poll on
older kernels; it has manually written workarounds for quirks in each
window manager; it uses delicate dlopen() logic to defensively load
external functionality; and its C code often contains copies of
definitions from system header files rather than including the dodgy
native headers. The image I/O libraries all read and write PNG, GIF,
BMP, and
WBMP using 100% Java code instead of the unreliable native libraries;
only JPEG is loaded using a Java-private known-good libjpeg. Java
emulates alpha blending on GTK which doesn't have it by drawing each
widget on a black and then a white background and then mixing the
results. And Java has to ship private copies of some libraries it uses
instead of linking to the dodgy ones distros ship (or might lack).
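
To illustrate the "delicate dlopen() logic": instead of taking a hard
link-time dependency on whatever the distro happens to ship, you probe
for the library at run time and fall back gracefully. A sketch, with
an invented library and function name:

    /* defensive.c - probe for an optional library at run time instead of
     * adding a hard DT_NEEDED dependency on whatever the distro ships.
     * "libfancy.so.1" and "fancy_do_thing" are invented names.
     *   gcc -o defensive defensive.c -ldl */
    #include <dlfcn.h>
    #include <stdio.h>

    typedef int (*fancy_fn)(int);

    int main(void)
    {
        void *lib = dlopen("libfancy.so.1", RTLD_NOW | RTLD_LOCAL);
        fancy_fn do_thing = NULL;

        if (lib)
            do_thing = (fancy_fn)dlsym(lib, "fancy_do_thing");

        if (do_thing)
            printf("fancy path: %d\n", do_thing(42));
        else
            printf("libfancy missing or incompatible - using the built-in "
                   "fallback\n");

        if (lib)
            dlclose(lib);
        return 0;
    }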

The insanity of the current system has even resulted in people dissing
Java for not linking to the native libraries
(http://fnords.wordpress.com/2010/09/24/the-real-problem-with-java-in-linux-distros/)
instead of understanding why it can't. And over time the
each-distro-for-itself approach has resulted in 64-bit support being
done differently by each distro (/lib32 vs /lib vs /lib64 all differ
from distro to distro).

Open source people look to innovation and improvement in
*distributions* (eg. the next Ubuntu), *desktops* (eg. KDE 4, Gnome
3), and *frameworks* (eg. Mono) whereas it's always been about the
*applications*.

Microsoft is now reporting record profits
(http://mybroadband.co.za/news/business/13978-Microsoft-reports-record-high-sales.html).
Mac OS, not Linux, has become the alternative OS - all you hear in the
media is hype about the latest Apple products.

So, are you waiting for Gnome 3 to revolutionize Linux? Are you
waiting for Ubuntu 10.10 to make everything work? At most, these are
going to make minor, cosmetic changes (menus, themes, windicators, the
"Ubuntu font"). Looking for the "right distro" for you? They're all
the same, they use the same upstream software, just packaged, patched
and themed in various ways for no purpose other than to be different
on that distro.

As predicted in point 1, portable apps on Linux are a disaster:
http://www.linuxinsider.com/story/70921.html

And then some wonder why the "year of desktop Linux" is still not here...

Regards
Damjan Jovanovic
(ducks and hides ;-)