How much code do you need?
===

(and can you debloat the rest?)

[Benoit Baudry](

---

![winter](

---

Summer 2019
---

![summer]( =1000x580)

---

Summer 2019
---

![apollo](

---

50th anniversary of Apollo 11
---

- The Apollo 11 [AGC](
- An iconic, [small]( program

---

Small programs after 1969?
---

* today we don't write much code, we reuse a lot
* SQLite driver
* microcontroller code
* Commodore 64

---

Small Programs after 1969
---

- The [line mode browser]( was [pretty small](
- Competition code, e.g.,
  - [Flip dots with feelings in 1021 bytes](
  - [A tiny C program](
- Programs for _very_ small things

---

Application code is big
---

* [Firefox]( is big
* The [Java Virtual Machine]( is big
* [ffmpeg]( is big (and very [customizable]( :)
* [GAFAs are big](

---

Why is code growth a problem?
---

* code growth is an issue for code quality
* big code is difficult to maintain
* code growth can be a result of making the code easier to maintain
* bandwidth issues when the code is downloaded
* more difficult to understand and thus more difficult to contribute to
* I'd add psychological aspects: developing new code is more fun than maintaining it... so it grows (because many work on development), but nobody (or few) want to maintain it
* building the project (compile, test, package) becomes a problem for large code bases (cf. the AMA session of yesterday ;-) )

---

Why is code growth a problem?
---

* [Wikipedia's JavaScript initialisation on a budget](
* [Stripping dependency bloat in VictoriaMetrics Docker image](
* [Removing Kode](
* [Reduce attack surfaces](

---

Why does code size grow?
---

* refactoring can make code size grow
* reusing libraries that have a larger API than the one we actually need
* adding new features
* adding new platform support
* obfuscation / making your code look complicated
* keeping all versions live / backward compatibility
* code cloning
* we don't remove old code that we don't use anymore (just in case :)
* some managers expect code growth, it's a sign of progress :)

---

Code debloating techniques
---

* [Cimplifier: Automatically Debloating Containers]( ESEC/FSE 2017.
* [Binary Control-Flow Trimming]( CCS 2019.
* [Is Static Analysis Able to Identify Unnecessary Source Code?]( TOSEM 2020.
* [Slimium: Debloating the Chromium Browser with Feature Subsetting]( CCS 2020.

---

[A Comprehensive Study of Bloated Dependencies in the Maven Ecosystem](
---

* Intuition: package managers and automated builds encourage software reuse and introduce bloated dependencies

---

[A Comprehensive Study of Bloated Dependencies in the Maven Ecosystem](
---

* Let's look at one [build file](, for the [jxls library](
* as well as a [build file for a dependency of jxls](

---

[A Comprehensive Study of Bloated Dependencies in the Maven Ecosystem](
---

* 9K artefacts and 700K dependencies
* 75% of dependencies are bloated
* Developers care
  * removed 131 dependencies in 30 projects
  * experiments at SAP and Ericsson ongoing
* [DepClean Maven dependency debloating tool](
* To appear in EMSE journal, 2020.

---

Conclusion
---

* There is lots of code bloat
  * from libc to Chrome
  * caused by reuse, feature creep, usage, etc.
* Software developers care
  * for security
  * for performance
* It is a relevant research topic
  * that is hard
  * that matters

---

# Thank you!
This work is a collaboration with [César Soto-Valero](, [Thomas Durieux](, [Nicolas Harrand](, [Martin Monperrus](, at the [KTH Royal Institute of Technology]( and is supported by the [WASP program](

---

More reads
---

* [Living review on code debloat](
* [Removing code not covered in production](
* [Shrinking a Kotlin binary by 99.2%](
* [unikernels](
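
---

To make the bloated-dependency idea from the Maven study slides a bit more concrete, here is a minimal Python sketch. It is not how DepClean is implemented. It assumes the dependency tree is already available as a dictionary, and that some separate analysis has produced the set of artifacts the project actually uses, directly or indirectly. The names `GRAPH`, `reachable`, `bloated_dependencies`, and every artifact name are hypothetical.

```python
from collections import deque

# Hypothetical dependency graph: artifact -> artifacts it declares.
GRAPH = {
    "app": ["json-lib", "pdf-lib"],
    "json-lib": ["text-utils"],
    "pdf-lib": ["font-lib"],
}


def reachable(graph, roots):
    """Return every artifact reachable from `roots`, roots included."""
    seen = set()
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        # Artifacts with no declared dependencies are simply absent from the graph.
        queue.extend(graph.get(node, ()))
    return seen


def bloated_dependencies(graph, project, used):
    """List artifacts in the project's dependency tree that are never used.

    `used` is assumed to come from a separate usage analysis; computing it
    is the hard part and is out of scope for this sketch.
    """
    tree = reachable(graph, graph.get(project, ()))
    return sorted(tree - set(used))


if __name__ == "__main__":
    # Assume the usage analysis found that only the JSON side is exercised.
    print(bloated_dependencies(GRAPH, "app", {"json-lib", "text-utils"}))
    # prints ['font-lib', 'pdf-lib']
```

In this toy example, `pdf-lib` is declared directly and `font-lib` comes in transitively, yet neither is used, so both are flagged as candidates for removal.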