5uGUE Research-Question Dashboard

An exposition of the research questions behind 5uGUE

5uGUE is an LLM-driven agent that autonomously fuzzes closed-source 5G baseband firmware (the modem inside a phone) to discover crashes. This site lets reviewers inspect the evidence behind each research question, down to every individual experimental run.

Research Question: one scientific question (one axis of experimentation)
Experiment: one configuration variant
Runs: 5 repeated fuzzing sessions per experiment; reported numbers are the mean

We study three axes, each a research question: (RQ1) the effect of the agent's capability version, (RQ2) the choice of underlying LLM, and (RQ3) generalisation across different phone modems. Within each research question one variable changes while the others are held fixed. We repeat each experiment five times under identical configurations and analyze the aggregated results across all runs to account for variability introduced by stochastic LLM outputs and fuzzing processes. Click any experiment to expand its five individual runs, then click a run to go all the way down the rabbit hole: its turn-by-turn timeline, every test the agent generated (with the actual exploit code), and the crash signatures it triggered.
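As a rough illustration of the aggregation described above, the sketch below averages one per-run metric over an experiment's five runs. Only the field name total_crashes comes from this page; the Run record, the run identifiers, and the sample counts are hypothetical placeholders, not results from the study. The standard deviation is an extra added here to show run-to-run spread; the page itself only states that the mean is reported.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Run:
    run_id: str         # hypothetical identifier, not a real run ID
    total_crashes: int  # crashes detected during this fuzzing session


def summarize(runs):
    """Return (mean, sample standard deviation) of total_crashes across runs."""
    counts = [r.total_crashes for r in runs]
    return mean(counts), stdev(counts)


# Placeholder counts for one experiment's five runs (illustrative only).
runs = [Run(f"run-{i}", c) for i, c in enumerate([10, 12, 9, 11, 13])]
avg, spread = summarize(runs)
print(f"mean={avg:.1f} stdev={spread:.2f}")
```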

Note — unclassified crashes and the crash-count breakdown

In run 863ef044 (RQ3, UNISOC UD710), test 863ef044-53 (mac_sch_ud710_pdsch_v81) triggered a crash but produced no crash-signature string, so the pipeline could not classify it as publicly-known, non-public, or novel. The agent re-ran the same exploit as 863ef044-54 (mac_sch_ud710_v81_recheck), which surfaced a fully-novel crash.

The consequence is a bookkeeping gap: the crash is detected and counts toward total_crashes, but with no signature string it can be neither classified (publicly-known / non-public / novel) nor deduplicated into the unique-crash count — so it is absent from the category breakdown. For this run the three categories sum to one fewer than the total (32 vs 33).
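The gap can be checked mechanically. The following is a minimal sketch of such a check, not the site builder's actual code: the function names and the dictionary layout are assumptions, and the per-category split in the example is invented so that it sums to 32. Only the totals (32 classified, 33 detected) come from the run described above.

```python
def unclassified_crashes(total_crashes, breakdown):
    """Return how many detected crashes are missing from the category breakdown."""
    classified = sum(breakdown.values())
    gap = total_crashes - classified
    if gap < 0:
        # More classified crashes than detected ones would be a different bug.
        raise ValueError("category breakdown exceeds total_crashes")
    return gap


def consistency_warning(run_id, total_crashes, breakdown):
    """Return a warning string for an inconsistent run, or None if it adds up."""
    gap = unclassified_crashes(total_crashes, breakdown)
    if gap == 0:
        return None
    return f"run {run_id}: {gap} crash(es) counted in total but not classified"


# Hypothetical split; only the sum (32) and the total (33) are from the page.
breakdown = {"publicly_known": 20, "non_public": 8, "novel": 4}
message = consistency_warning("863ef044", 33, breakdown)
print(message)
```

A run whose categories add up to its total yields no message, so the same check can be applied to every run and only the inconsistent ones are reported.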

This is not unique to 863ef044. The same discrepancy (total_crashes ≠ publicly-known + non-public + novel) appears in four runs, which together contain five tests whose crash-signature string is missing. Each such test is flagged inline in its run's drill-down with an unclassified crash badge; the full list:

7105d3d1 and 2e28f2ea have no per-run detail bundled locally (their drill-down lives on the storage server), so their inline badges only appear when the site is built from the database; the total-vs-breakdown discrepancy is identical in all four runs. The builder also emits a consistency warning for each of these runs at build time.