An exposition of the research questions behind 5uGUE
5uGUE is an LLM-driven agent that autonomously fuzzes closed-source 5G baseband firmware (the modem inside a phone) to discover crashes. This site lets reviewers inspect the evidence behind each research question, down to every individual experimental run.
We study three axes, each a research question: (RQ1) the effect of the agent's capability version, (RQ2) the choice of underlying LLM, and (RQ3) generalisation across different phone modems. Within each research question one variable changes while the others are held fixed. We repeat each experiment five times under identical configurations and analyze the aggregated results across all runs to account for variability introduced by stochastic LLM outputs and fuzzing processes. Click any experiment to expand its five individual runs, then click a run to go all the way down the rabbit hole: its turn-by-turn timeline, every test the agent generated (with the actual exploit code), and the crash signatures it triggered.
Note — unclassified crashes and the crash-count breakdown
In run 863ef044 (RQ3, UNISOC UD710), test
863ef044-53 (mac_sch_ud710_pdsch_v81) triggered a crash
but produced no crash-signature string, so the pipeline could
not classify it as publicly-known, non-public, or novel. The agent re-ran the
same exploit as 863ef044-54 (mac_sch_ud710_v81_recheck),
which surfaced a fully-novel crash.
The consequence is a bookkeeping gap: the crash is detected and
counts toward total_crashes, but with no signature string it can be
neither classified (publicly-known / non-public / novel) nor
deduplicated into the unique-crash count — so it is absent from
the category breakdown. For this run the three categories sum to one fewer than
the total (32 vs 33).
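The bookkeeping gap described above can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual code: the `CrashRecord` type, the `summarize` function, and the toy test IDs and signature strings are all hypothetical. It also assumes the category breakdown is counted per crash, which is what the 32 vs 33 comparison suggests.

```python
from dataclasses import dataclass
from typing import Optional

CATEGORIES = ("publicly-known", "non-public", "novel")


@dataclass
class CrashRecord:
    test_id: str               # hypothetical identifier, e.g. "<run>-<n>"
    signature: Optional[str]   # None when no crash-signature string was produced
    category: Optional[str]    # one of CATEGORIES, or None if unclassifiable


def summarize(crashes):
    """Count crashes the way the note describes (assumed, not the real pipeline)."""
    breakdown = {name: 0 for name in CATEGORIES}
    unique_signatures = set()
    unclassified = []
    for crash in crashes:
        if not crash.signature:
            # Detected, so it still counts toward the total, but it can be
            # neither classified nor deduplicated without a signature string.
            unclassified.append(crash.test_id)
            continue
        unique_signatures.add(crash.signature)
        breakdown[crash.category] += 1
    return {
        "total_crashes": len(crashes),
        "breakdown": breakdown,
        "unique_crashes": len(unique_signatures),
        "unclassified": unclassified,
    }


# Toy data: four crashes, one of which has no signature string.
example = [
    CrashRecord("t-1", "SIG_A", "novel"),
    CrashRecord("t-2", "SIG_A", "novel"),
    CrashRecord("t-3", "SIG_B", "publicly-known"),
    CrashRecord("t-4", None, None),
]
summary = summarize(example)
gap = summary["total_crashes"] - sum(summary["breakdown"].values())
```

In the toy data the total is four while the categories sum to three, so `gap` equals the number of unclassified tests. This mirrors the one-crash difference reported for run 863ef044.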
This is not unique to 863ef044. The same discrepancy
(total_crashes ≠ publicly-known + non-public + novel) appears
in four runs, which together contain five tests
whose crash-signature string is missing. Each is flagged inline in its
run's drill-down with an "unclassified crash"
badge; the full list:
863ef044-53: RQ3, UNISOC UD710 (mac_sch_ud710_pdsch_v81).
2cefbe6f-1: RQ2, Kimi K2.6 (mac_sch_mtk_rrc_setup_crash_v8).
7105d3d1-36 and 7105d3d1-98: RQ3, MediaTek Dimensity 9000 (run off by 2).
2e28f2ea-46: RQ3, MediaTek Dimensity 9000.
7105d3d1 and 2e28f2ea have no per-run detail bundled
locally (their drill-down lives on the storage server), so their inline badges
appear only when the site is built from the database; the same kind of
total-vs-breakdown discrepancy is present in all four runs. The builder also emits a consistency
warning for each of these runs at build time.
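A build-time consistency check of this kind could look roughly like the sketch below. The function name `check_run`, its parameters, and the warning text are assumptions made for illustration, and the numbers in the usage line are placeholders rather than results from any real run.

```python
import warnings


def check_run(run_id, total_crashes, publicly_known, non_public, novel):
    """Warn when a run's category breakdown does not add up to its total.

    Returns the gap (total minus classified), which is 0 for a consistent run.
    """
    classified = publicly_known + non_public + novel
    gap = total_crashes - classified
    if gap != 0:
        warnings.warn(
            f"run {run_id}: total_crashes={total_crashes} but categories "
            f"sum to {classified} (off by {gap})"
        )
    return gap


# Placeholder numbers, chosen only so the breakdown falls one short of the total.
example_gap = check_run("example-run", total_crashes=10,
                        publicly_known=4, non_public=3, novel=2)
```

Returning the gap as well as warning lets the caller decide whether to attach an "unclassified crash" badge or simply log the inconsistency.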