# profiler-cli walkthrough: resource usage Linux fat AAR build
Companion to `resource-usage-linux-fat-aar-build.md`. Reproduces the same findings using `profiler-cli` with annotated output.
---
## Load the profile and get an overview
```
profiler-cli profile info
```
```
Name: Firefox Nightly 143.0a1 – x86_64-pc-linux-gnu
Platform: x86_64-pc-linux-gnu
This profile contains 1 threads across 1 processes.
Top processes and threads by CPU usage:
p-0: mach [pid 0] [ts-0 → end] - 0.000ms
t-0: [tid 0] - 0.000ms
CPU activity over time:
No significant activity.
```
One thread, zero CPU samples, zero CPU activity. This is a marker-only profile: all of the data lives in its 45,476 markers, not in sampled stacks. The profile is 1285 seconds (~21 minutes) long.
---
## Find the top-level build phases
```
profiler-cli thread select t-0
profiler-cli thread markers --category Phases
```
```
Markers in thread t-0 () — 15 markers (filtered from 45476)
By Name:
Phase 15 markers (interval: min=161.06ms, avg=86.27s, max=1023.64s)
Examples: m-16 ✗ (1023.64s), m-17 ✗ (95.86s), m-18 ✗ (70.53s)
By Category:
Phases 15 markers (100.0%)
```
15 phase markers span the full build, dominated by one 1023s outlier. The top five by duration:
```
profiler-cli marker info m-16 # compile
profiler-cli marker info m-17 # android-archive-geckoview
profiler-cli marker info m-18 # export
profiler-cli marker info m-19 # configure
profiler-cli marker info m-20 # buildsymbols
```
```
Marker m-16: Phase - compile
Time: 112.48s - 1136.12s (1023.64s)
Fields:
CPU Time: 17h3m
CPU Percent: 45.4%
Marker m-17: Phase - android-archive-geckoview
Time: 1148.30s - 1244.16s (95.86s)
Fields:
CPU Time: 1h36m
CPU Percent: 18.5%
Marker m-18: Phase - export
Time: 41.91s - 112.44s (70.53s)
Fields:
CPU Time: 1h10m
CPU Percent: 12.2%
Marker m-19: Phase - configure
Time: 5.83ms - 35.97s (35.97s)
Fields:
CPU Time: 35m49s
CPU Percent: 1.3%
Marker m-20: Phase - buildsymbols
Time: 1244.37s - 1274.98s (30.61s)
Fields:
CPU Time: 30m32s
CPU Percent: 3.7%
```
The five major phases in order: configure (36s) → export (70.5s) → compile (1023.6s) → android-archive-geckoview (95.9s) → buildsymbols (30.6s). The compile phase accounts for about 80% of total wall time (1023.6s of 1285s), so that is where to focus.
The compile phase reports 17h3m of CPU time at 45.4% average utilization. On a machine with ~60 logical cores, that is roughly 27 cores busy on average (0.454 × 60 ≈ 27). The machine is well-utilized, so the bottleneck is not idle time or poor parallelism; there is simply a lot of work to do.
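The core estimate above is simple arithmetic on two figures from the text. A minimal sketch, assuming the ~60 logical cores quoted above is accurate for this build machine:

```python
# Figures quoted in the text for the compile phase (marker m-16).
cpu_percent = 45.4     # average CPU utilization reported for the phase
logical_cores = 60     # approximate core count; an assumption taken from the text

# Average number of busy cores = utilization fraction x core count.
busy_cores = cpu_percent / 100 * logical_cores
print(f"{busy_cores:.1f} cores busy on average")
```

If the real core count differs, only `logical_cores` needs to change.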
The remaining 10 phase markers are all under 10 seconds: android-stage-package (9.97s), package (5.71s), android-fat-aar-artifact (5.50s), teardown (5.06s), upload (4.82s), package-generated-sources (3.79s), misc (1.82s), pre-export (0.31s), libs (0.25s), tools (0.16s).
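As a cross-check, the 15 phase durations listed above can be summed to reproduce the `avg=86.27s` summary figure and the compile share. This is only a sketch: the values are copied by hand from the output in this section, and the dictionary layout is illustrative rather than anything `profiler-cli` emits.

```python
# Phase durations in seconds, copied from the marker output and the list above.
phases = {
    "compile": 1023.64,
    "android-archive-geckoview": 95.86,
    "export": 70.53,
    "configure": 35.97,
    "buildsymbols": 30.61,
    "android-stage-package": 9.97,
    "package": 5.71,
    "android-fat-aar-artifact": 5.50,
    "teardown": 5.06,
    "upload": 4.82,
    "package-generated-sources": 3.79,
    "misc": 1.82,
    "pre-export": 0.31,
    "libs": 0.25,
    "tools": 0.16,
}
profile_length_s = 1285.0  # total profile length stated earlier

average = sum(phases.values()) / len(phases)
compile_share = phases["compile"] / profile_length_s

print(f"{len(phases)} phases, average {average:.2f}s")
print(f"compile share of wall time: {compile_share:.1%}")
```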
---
## Survey the compile phase task types
Zoom into the compile marker and survey what task categories appear inside it:
```
profiler-cli zoom push m-16
profiler-cli thread markers --category Tasks --min-duration 60000
```
```
[Thread: t-0 () | View: ts-6→ts-w (1023.64s) | Full: 1285.00s]
Markers in thread t-0 () — 29 markers (filtered from 45476)
By Name:
Object 15 markers avg=85.78s max=147.72s ← C++ object files
Examples: m-41 ✗ (147.72s), m-42 ✗ (121.33s), m-43 ✗ (100.18s)
RustCrate 8 markers avg=186.22s max=396.86s ← Rust crate compilations
Examples: m-86 ✗ (396.86s), m-87 ✗ (280.19s), m-88 ✗ (222.09s)
Gradle 3 markers avg=78.06s max=94.82s
Examples: m-56 ✗ (94.82s), m-57 ✗ (70.14s), m-58 ✗ (69.22s)
file_generate 1 markers avg=70.36s max=70.36s
Examples: m-31 ✗ (70.36s)
Rust 1 markers avg=839.50s max=839.50s ← libgkrust.a link step
Examples: m-81 ✗ (839.50s)
dumpsymbols 1 markers avg=62.89s max=62.89s
Examples: m-101 ✗ (62.89s)
```
This gives an at-a-glance taxonomy: Rust crate compilations and C++ objects are the two dominant task categories by count and individual duration. The single `Rust` marker at 839.5s is the `libgkrust.a` link step, which covers 839.5s of the 1023.6s compile window.
```
profiler-cli zoom pop
```
---
## Rust library link steps
```
profiler-cli thread markers --search Rust --min-duration 10000
```
```
By Name:
RustCrate 58 markers max=396.86s
Examples: m-86 ✗ (396.86s), m-87 ✗ (280.19s), m-88 ✗ (222.09s)
Rust 5 markers max=839.50s
Examples: m-81 ✗ (839.50s), m-82 ✗ (44.27s), m-83 ✗ (40.65s)
```
The `Rust` markers are per-library (linking steps), and `RustCrate` markers are per-crate (compilation). Inspecting the top `Rust` markers:
```
profiler-cli marker info m-81 # libgkrust.a
profiler-cli marker info m-82 # libjsrust.a
profiler-cli marker info m-83 # libminidump_analyzer_export.a
profiler-cli marker info m-84 # libcrash_helper_server.a
profiler-cli marker info m-85 # http3server
```
```
Marker m-81: Rust - Rust
Time: 130.66s - 970.16s (839.50s) ← nearly the entire compile phase
Fields:
Description: libgkrust.a
Marker m-82: Rust - Rust
Time: 970.16s - 1014.43s (44.27s)
Fields:
Description: libjsrust.a
Marker m-83: Rust - Rust
Time: 1094.03s - 1134.68s (40.65s)
Fields:
Description: libminidump_analyzer_export.a
Marker m-84: Rust - Rust
Time: 1064.48s - 1094.03s (29.55s)
Fields:
Description: libcrash_helper_server.a
Marker m-85: Rust - Rust
Time: 1014.43s - 1042.99s (28.56s)
Fields:
Description: http3server
```
`libgkrust.a` alone spans 130.66s to 970.16s (839.5s). The next largest library, `libjsrust.a` at 44.3s, is roughly 19 times shorter, and the rest are smaller still.
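To compare markers without reading each block by eye, the `marker info` text can be parsed into records. The sketch below assumes the output layout shown above (a `Marker` line, a `Time:` line in `ms` or `s`, and a `Description:` field); `parse_markers` and `to_seconds` are hypothetical helpers, not part of `profiler-cli`.

```python
import re

# Sample text copied from the `marker info` output above.
SAMPLE = """\
Marker m-81: Rust - Rust
Time: 130.66s - 970.16s (839.50s)
Fields:
Description: libgkrust.a
Marker m-82: Rust - Rust
Time: 970.16s - 1014.43s (44.27s)
Fields:
Description: libjsrust.a
Marker m-83: Rust - Rust
Time: 1094.03s - 1134.68s (40.65s)
Fields:
Description: libminidump_analyzer_export.a
Marker m-84: Rust - Rust
Time: 1064.48s - 1094.03s (29.55s)
Fields:
Description: libcrash_helper_server.a
Marker m-85: Rust - Rust
Time: 1014.43s - 1042.99s (28.56s)
Fields:
Description: http3server
"""

# Matches "Time: <start> - <end> (<duration>)", each value in ms or s.
TIME_RE = re.compile(r"Time: ([\d.]+)(ms|s) - ([\d.]+)(ms|s) \(([\d.]+)(ms|s)\)")


def to_seconds(value, unit):
    """Convert a numeric string with an 'ms' or 's' unit to seconds."""
    return float(value) / 1000 if unit == "ms" else float(value)


def parse_markers(text):
    """Turn `marker info` text into a list of dicts, one per marker."""
    markers, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Marker "):
            current = {"id": line.split(":")[0].split()[1]}
            markers.append(current)
        elif current is not None and (m := TIME_RE.match(line)):
            current["start"] = to_seconds(m[1], m[2])
            current["end"] = to_seconds(m[3], m[4])
            current["duration"] = to_seconds(m[5], m[6])
        elif current is not None and line.startswith("Description: "):
            current["description"] = line[len("Description: "):]
    return markers


libs = sorted(parse_markers(SAMPLE), key=lambda m: m["duration"], reverse=True)
ratio = libs[0]["duration"] / libs[1]["duration"]
print(f'{libs[0]["description"]} is {ratio:.0f}x longer than {libs[1]["description"]}')
```

The same parser should work on the other `marker info` blocks in this walkthrough, since they share the layout, though any format differences in other `profiler-cli` versions would need adjusting for.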
---
## Individual Rust crate compilations
```
profiler-cli thread markers --search RustCrate --min-duration 60000
```
```
By Name:
RustCrate 8 markers max=396.86s
Examples: m-86 ✗ (396.86s), m-87 ✗ (280.19s), m-88 ✗ (222.09s)
```
```
profiler-cli marker info m-86 # gkrust
profiler-cli marker info m-87 # firefox-on-glean
profiler-cli marker info m-88 # webrender
profiler-cli marker info m-89 # style
profiler-cli marker info m-90 # swgl
profiler-cli marker info m-140 # wgpu-core
profiler-cli marker info m-141 # naga
profiler-cli marker info m-142 # geckoservo
```
```
Marker m-86: RustCrate - RustCrate
Time: 566.69s - 963.55s (396.86s)
Fields:
Description: gkrust v0.1.0
Marker m-87: RustCrate - RustCrate
Time: 249.25s - 529.44s (280.19s)
Fields:
Description: firefox-on-glean v0.1.0
Marker m-88: RustCrate - RustCrate
Time: 344.58s - 566.67s (222.09s)
Fields:
Description: webrender v0.62.0
Marker m-89: RustCrate - RustCrate
Time: 213.37s - 414.23s (200.86s)
Fields:
Description: style v0.0.1
Marker m-90: RustCrate - RustCrate
Time: 181.78s - 344.19s (162.41s)
Fields:
Description: swgl v0.1.0 build script (run)
Marker m-140: RustCrate - RustCrate
Time: 202.86s - 295.18s (92.32s)
Fields:
Description: wgpu-core v26.0.0
Marker m-141: RustCrate - RustCrate
Time: 181.84s - 254.50s (72.66s)
Fields:
Description: naga v26.0.0
Marker m-142: RustCrate - RustCrate
Time: 263.62s - 325.98s (62.36s)
Fields:
Description: geckoservo v0.0.1
```
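These crates run in parallel (their durations sum to about 1490s, well over the 839.5s wall time of `libgkrust.a`), but the Rust dependency graph still forces serial bottlenecks at the top. For example, `gkrust` does not start until 566.69s, immediately after `webrender` finishes at 566.67s, and then runs alone for 397s. There is nothing to optimize within these crates without reducing what needs to be compiled.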
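The overlap can be quantified from the intervals above. This sketch, with start and end times copied by hand from the `marker info` output, sums the crate durations, finds the peak number of these eight crates compiling at once, and measures the gap between `webrender` finishing and `gkrust` starting. It covers only the eight crates over 60s, not the full build.

```python
# (start_s, end_s) for the eight RustCrate markers longer than 60s.
crates = {
    "gkrust": (566.69, 963.55),
    "firefox-on-glean": (249.25, 529.44),
    "webrender": (344.58, 566.67),
    "style": (213.37, 414.23),
    "swgl build script": (181.78, 344.19),
    "wgpu-core": (202.86, 295.18),
    "naga": (181.84, 254.50),
    "geckoservo": (263.62, 325.98),
}
libgkrust_wall_s = 839.50  # duration of the libgkrust.a marker (m-81)

# Total crate time; a value above the wall time implies parallel execution.
summed = sum(end - start for start, end in crates.values())

# Sweep line: +1 at each start, -1 at each end, track the running maximum.
events = sorted(
    [(start, 1) for start, _ in crates.values()]
    + [(end, -1) for _, end in crates.values()]
)
running = peak = 0
for _, delta in events:
    running += delta
    peak = max(peak, running)

# Gap between webrender finishing and gkrust starting.
gap = crates["gkrust"][0] - crates["webrender"][1]

print(f"summed crate time: {summed:.2f}s vs {libgkrust_wall_s:.2f}s wall")
print(f"peak concurrency among these crates: {peak}")
print(f"webrender -> gkrust gap: {gap:.2f}s")
```

A near-zero gap is consistent with `gkrust` waiting on `webrender`, but the timing alone does not prove the dependency.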
---
## C++ object files
```
profiler-cli thread markers --search Object --min-duration 60000
```
```
By Name:
Object 15 markers avg=85.78s max=147.72s
Examples: m-41 ✗ (147.72s), m-42 ✗ (121.33s), m-43 ✗ (100.18s)
```
```
profiler-cli marker info m-41 # rlbox.wasm.o
profiler-cli marker info m-42 # Unified_cpp_dom_canvas3.o
profiler-cli marker info m-43 # UnifiedBindings27.o
profiler-cli marker info m-44 # Unified_cpp_gfx_harfbuzz_src0.o
profiler-cli marker info m-45 # Unified_cpp_dom_media2.o
```
```
Marker m-41: Object - Object
Time: 452.93s - 600.65s (147.72s)
Fields:
Description: rlbox.wasm.o
Marker m-42: Object - Object
Time: 239.26s - 360.59s (121.33s)
Fields:
Description: Unified_cpp_dom_canvas3.o
Marker m-43: Object - Object
Time: 390.26s - 490.44s (100.18s)
Fields:
Description: UnifiedBindings27.o
Marker m-44: Object - Object
Time: 360.51s - 456.08s (95.57s)
Fields:
Description: Unified_cpp_gfx_harfbuzz_src0.o
Marker m-45: Object - Object
Time: 328.46s - 417.49s (89.03s)
Fields:
Description: Unified_cpp_dom_media2.o
```
The largest C++ objects fall entirely within the `libgkrust.a` window (130s-970s) and overlap with the Rust crate compilations. They are not on the critical path.
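The containment claim can be checked directly against the intervals above. A small sketch, using times copied by hand from the output in this section:

```python
# libgkrust.a marker (m-81) start and end, in seconds.
libgkrust_window = (130.66, 970.16)

# (start_s, end_s) for the five longest Object markers.
objects = {
    "rlbox.wasm.o": (452.93, 600.65),
    "Unified_cpp_dom_canvas3.o": (239.26, 360.59),
    "UnifiedBindings27.o": (390.26, 490.44),
    "Unified_cpp_gfx_harfbuzz_src0.o": (360.51, 456.08),
    "Unified_cpp_dom_media2.o": (328.46, 417.49),
}

# An object is inside the window if it starts and ends within it.
inside = {
    name: libgkrust_window[0] <= start and end <= libgkrust_window[1]
    for name, (start, end) in objects.items()
}
all_inside = all(inside.values())

for name, ok in inside.items():
    print(f"{name}: {'inside' if ok else 'outside'} libgkrust.a window")
```

Containment shows these objects ran concurrently with the Rust build; it supports, but does not by itself establish, that they are off the critical path.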
---
## android-archive-geckoview: Gradle packaging
```
profiler-cli marker info m-56
```
```
Marker m-56: Gradle - Gradle
Time: 1148.88s - 1243.69s (94.82s)
Fields:
Description: geckoview:assembleDebug
```
The `android-archive-geckoview` phase (95.9s) is almost entirely `geckoview:assembleDebug` (94.8s). That is the Gradle packaging step and is expected overhead.
---
## Wasted symbolication work
```
profiler-cli thread markers --search dumpsymbols --min-duration 60000
```
```
By Name:
dumpsymbols 1 marker 62.89s
Examples: m-101 ✗ (62.89s)
```
```
profiler-cli marker info m-101
```
```
Marker m-101: dumpsymbols - dumpsymbols
Type: Text
Category: Tasks
Time: 971.31s - 1034.19s (62.89s)
Fields:
Description: libxul.so_syms.track
```
`libxul.so_syms.track` takes 62.9s to generate 1.2 to 2.4 GB of crashreporter symbols that are immediately discarded. The final AAR uses binaries from the upstream tasks it depends on, not the ones compiled here, so this symbolication work is entirely wasted.
---
```
profiler-cli stop
```