How much code do you need?
(and can you debloat the rest?)
[Benoit Baudry](

Summer 2019

50th anniversary of Apollo 11
- The Apollo 11 [AGC](
- An iconic, [small]( program
Small programs after 1969?
* today we write little code ourselves and reuse a lot
* SQLite driver
* microcontroller code
* Commodore 64
Small Programs after 1969
- The [line mode browser]( was [pretty small](
- Competition code, e.g.,
  - [Flip dots with feelings in 1021 bytes](
  - [A tiny C program](
- Programs for _very_ small things
Application code is big
* [Firefox]( is big
* The [Java Virtual Machine]( is big
* [ffmpeg]( is big (and very [customizable]( :)
* [GAFAs are big](
Why is code growth a problem?
* code growth is an issue for code quality
* big code is difficult to maintain
* code growth can be a result of making the code easier to maintain
* bandwidth issues when the code is downloaded
* more difficult to understand and thus more difficult to contribute to
* I'd add psychological aspects: developing new code is more fun than maintaining it, so the code grows (because many work on development), but few, if any, want to maintain it
* building the project (compile, test, package) becomes a problem for large code bases (Cf. the AMA session of yesterday ;-) )
Why is code growth a problem?
* [Wikipedia's JavaScript initialisation on a budget](
* [Stripping dependency bloat in VictoriaMetrics Docker image](
* [Removing Kode](
* [Reduce attack surfaces](
Why does code size grow?
* refactoring can make code size grow
* reusing libraries that have a larger API than the one we actually need
* adding new features
* adding new platform support
* obfuscation / making your code look complicated
* keep all versions live / backward compatibility
* code cloning
* we don't remove old code that we don't use anymore (just in case :)
* some managers expect code growth, it's a sign of progress :)
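
The point about reusing libraries with a larger API than needed can be illustrated with a rough measurement. The sketch below is only an illustration, not a debloating tool: it uses Python's standard `json` module as a stand-in for "a library", and the `client_code` snippet and the `used_attributes` helper are hypothetical names made up for this example.

```python
import ast
import json


def used_attributes(source, module_name):
    """Collect attribute names accessed as `module_name.<attr>` in the source."""
    used = set()
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id == module_name):
            used.add(node.attr)
    return used


# Hypothetical client code that reuses a small part of the library.
client_code = """
import json
data = json.loads('{"a": 1}')
print(json.dumps(data))
"""

# Public names exposed by the module (a crude approximation of "the API").
public_api = {name for name in dir(json) if not name.startswith("_")}
used = used_attributes(client_code, "json")

print(f"used {len(used)} of {len(public_api)} public names: {sorted(used)}")
```

The client only touches `loads` and `dumps`, yet the whole module ships with it. Real debloating has to reason about what those two functions reach internally, which this sketch does not attempt.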
Code debloating techniques
* [Cimplifier: Automatically Debloating Containers]( ESEC/FSE 2017.
* [Binary Control-Flow Trimming]( CCS 2019.
* [Is Static Analysis Able to Identify Unnecessary Source Code?]( TOSEM 2020.
* [Slimium: Debloating the Chromium Browser with Feature Subsetting]( CCS 2020.
[A Comprehensive Study of Bloated Dependencies in the Maven Ecosystem](
* Intuition: package managers and automated builds encourage software reuse and introduce bloated dependencies
[A Comprehensive Study of Bloated Dependencies in the Maven Ecosystem](
* Let's look at one [build file](, for the [jxls library](
* as well as a [build file for a dependency of jxls](
[A Comprehensive Study of Bloated Dependencies in the Maven Ecosystem](
* 9K artefacts and 700K dependencies
* 75% of dependencies are bloated
* Developers care
  * removed 131 dependencies in 30 projects
  * experiments at SAP and Ericsson ongoing
* [DepClean Maven dependency debloating tool](
* To appear in EMSE journal, 2020.
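
To make the notion of a bloated dependency concrete, here is a minimal sketch under stated assumptions. It is not how DepClean is implemented. It assumes we already know (a) which artifacts the build tool resolves onto the classpath and (b) which artifacts each piece of code actually references. The artifact names, the `references` map and the `find_bloated` function are all hypothetical.

```python
from collections import deque


def find_bloated(resolved, references, root="app"):
    """Return the resolved dependencies never reached by actual code references.

    resolved:   set of artifact names the build tool puts on the classpath
    references: maps an artifact to the artifacts whose code it really uses
    """
    used = set()
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for target in references.get(current, ()):
            if target in resolved and target not in used:
                used.add(target)
                queue.append(target)
    return resolved - used


# Hypothetical project: four resolved dependencies, two of them reachable.
resolved = {"lib-a", "lib-b", "lib-c", "lib-d"}
references = {
    "app": {"lib-a"},      # the application code only calls into lib-a
    "lib-a": {"lib-b"},    # lib-a in turn really uses lib-b
    "lib-c": {"lib-d"},    # lib-c uses lib-d, but nothing uses lib-c
}

bloated = find_bloated(resolved, references)
print(sorted(bloated))  # ['lib-c', 'lib-d']
```

In this toy example `lib-c` and `lib-d` are declared or pulled in transitively but never reached from the application, so they could be removed from the build without changing behavior.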
* There is lots of code bloat
  * from libc to Chrome
  * caused by reuse, feature creep, usage, etc.
* Software developers care
  * for security
  * for performance
* It is a relevant research topic
  * that is hard
  * that matters
# Thank you!
This work is a collaboration with [César Soto-Valero](, [Thomas Durieux](, [Nicolas Harrand](, [Martin Monperrus](, at the [KTH Royal Institute of Technology]( and is supported by the [WASP program](
More reads
* [Living review on code debloat](
* [Removing code not covered in production](
* [Shrinking a Kotlin binary by 99.2%](
* [unikernels](