How I Found 3 Router 0-Days and Built an AI-Assisted Firmware Emulation Platform

Consumer router research is rarely blocked by theory. It is blocked by bring-up.

This project started with three previously unreported findings in a TP-Link router. One issue lived in the USB stack, one in the TDDP factory/test plane, and one in the admin plane. Those are worth their own posts, and they will get them.

But the part I want to focus on first is the infrastructure that made the research practical: using rootless Podman, QEMU user-mode, proot, and AI-assisted automation to take a vendor firmware image from dead artifact to live management plane inside WSL.

Because in embedded work, the lab is often the difference between “interesting notes” and “repeatable research.”

Disclosure note: I have submitted the vulnerabilities to the vendor. I am intentionally withholding exploit details, packet formats, and weaponized reproduction steps until I hear back from their security team and can follow the appropriate reporting guidelines. Each exploit will get its own writeup after coordinated disclosure. I will also publish a separate follow-up on the broader automated reverse engineering and vulnerability research pipeline behind this work.

The Real Story In This First Post

The headline is the three findings.

The useful engineering story is what happened around them:

how to spin up vendor router firmware without a full hardware lab
why rootless Podman ended up being the right reset boundary
why QEMU user-mode plus proot was faster to operationalize than full system emulation
what actually breaks when router firmware meets a normal developer machine
how AI drastically shortened the iteration loop once the problem was framed correctly
how a repeatable patch-and-verify pipeline turned one-off fixes into a lab I could rebuild from scratch

This was less about “AI found bugs for me” and more about “AI helped me get through the painful firmware bring-up work fast enough that deeper reverse engineering became easy to sustain.”

Why Router Firmware Bring-Up Is Annoying

If you have not done this before, bringing up embedded userspace from a vendor firmware image sounds like it should be straightforward:

unzip the firmware
extract the root filesystem
run it under QEMU
browse to the web UI

That is not how it goes.

In practice, this firmware expected a real board and a real SoC environment:

switch hardware
Wi-Fi hardware
thermal and LED control paths
board-specific binaries
hotplug state
init ordering that assumes a real boot sequence
ubus objects and config side effects created by services that do not exist yet

If you try to brute-force that manually, you do not spend your day doing reverse engineering. You spend your day dealing with:

symlinks that did not survive extraction
shell scripts failing because expected devices are missing
Lua code blowing up on nil ubus objects
init scripts hanging forever
management paths refusing to load because a completely unrelated board daemon never started
weird differences between Linux-native filesystems and Windows-backed mounts in WSL

I wanted something better:

a clean reset boundary
a way to re-create the lab from scratch every time
a mostly intact userspace boot
enough web-management functionality to validate control-plane behavior
a place where AI could help with triage, patching, and regression testing instead of just generating text

The Architecture

The final setup looks like this:

Windows host for workspace and tooling
Ubuntu in WSL as the Linux control plane
rootless Podman as the container runtime
QEMU user-mode for ARM userland execution
proot to present the extracted firmware rootfs as the process root
a Python orchestration script to extract, patch, and run the firmware reproducibly

I deliberately chose Podman over Docker Engine because this is exactly the kind of workflow where daemonless, rootless containers are a better fit:

easier to reason about in WSL
less ceremony for reset-and-retry loops
good enough Docker compatibility for familiar tooling
a safer boundary when I know init scripts are going to do weird things

QEMU user-mode plus proot was the other important decision. I did not need full machine emulation to start getting value. What I needed was a fast path to run the vendor ARM userspace, start the web stack, and validate management functionality.
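
As a rough illustration of that fast path, the sketch below assembles a proot command line that runs a guest command through QEMU user-mode. The helper name build_proot_argv, the bind list, and the example rootfs path are assumptions for illustration, not the lab's actual script. The -r, -q, -w, and -b flags are standard proot options.

```python
from pathlib import Path


def build_proot_argv(rootfs, command, qemu="qemu-arm-static",
                     binds=("/dev", "/proc", "/sys")):
    """Return an argv list that runs `command` with `rootfs` as the process root."""
    argv = [
        "proot",
        "-r", str(Path(rootfs)),  # guest root filesystem
        "-q", qemu,               # run guest binaries through this emulator
        "-w", "/",                # working directory inside the guest
    ]
    for path in binds:
        argv += ["-b", path]      # expose a host path inside the guest
    return argv + list(command)


# Hypothetical rootfs location; this only prints the command, it does not run it.
print(" ".join(build_proot_argv("/var/lib/lab/rootfs", ["/bin/sh", "-c", "uname -m"])))
```

Building the argv as a list rather than a shell string keeps quoting out of the picture, which helps once service names and paths start coming from configuration.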

Rebuilding The Firmware Lab

The platform is reproducible from a fresh WSL install:

cd /mnt/e/REBinaries/TPLINK/emulation
chmod +x ./setup-wsl-podman.sh
./setup-wsl-podman.sh

That bootstrap gives me a repeatable WSL base with:

podman
podman-compose
podman-docker
qemu-user-static
proot
squashfs-tools
uidmap
slirp4netns
fuse-overlayfs
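
A quick preflight check can confirm the bootstrap actually landed before the lab starts. This is a minimal sketch. The executable names are what I would expect those Ubuntu packages to provide (for example, qemu-arm-static from qemu-user-static and unsquashfs from squashfs-tools), so treat the list as an assumption and adjust it for your distribution.

```python
import shutil

# Executables the packages above are expected to provide on Ubuntu
# (assumed names; verify them on your own system).
REQUIRED_TOOLS = [
    "podman", "podman-compose", "qemu-arm-static", "proot",
    "unsquashfs", "newuidmap", "slirp4netns", "fuse-overlayfs",
]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]
```

Calling missing_tools() with no arguments checks the real PATH. The which parameter exists only so the lookup can be replaced in tests.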

Then the lab comes up with:

cd /mnt/e/REBinaries/TPLINK/emulation
./run.sh up -d

And the first thing I care about after boot is whether the management plane is actually alive:

curl -fsS http://127.0.0.1:8080/webpages/login.html >/dev/null
curl -fsS 'http://127.0.0.1:8080/cgi-bin/luci/;stok=/login?form=login' \
  -X POST \
  -d 'operation=read'
curl -fsS 'http://127.0.0.1:8080/cgi-bin/luci/;stok=/domain_login?form=dlogin' \
  -X POST \
  -d 'operation=read'

Those requests succeeding is only the visible result. The real work was making the environment survive long enough to serve them.
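
The same three checks can be scripted so they run after every rebuild. The sketch below mirrors the curl commands using only the Python standard library. The function names and the "any 2xx status counts as alive" rule are my assumptions, not something the firmware defines.

```python
import urllib.error
import urllib.parse
import urllib.request

# (method, path, form fields) mirroring the three curl checks.
CHECKS = [
    ("GET", "/webpages/login.html", None),
    ("POST", "/cgi-bin/luci/;stok=/login?form=login", {"operation": "read"}),
    ("POST", "/cgi-bin/luci/;stok=/domain_login?form=dlogin", {"operation": "read"}),
]


def check_endpoint(base_url, method, path, fields=None, timeout=5.0):
    """Return the HTTP status code, or None if the request could not complete."""
    data = urllib.parse.urlencode(fields).encode() if fields else None
    request = urllib.request.Request(base_url + path, data=data, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code   # the server answered, but with an error status
    except OSError:
        return None       # connection refused, timeout, DNS failure


def management_plane_alive(base_url, checks=CHECKS):
    """Map each path to True when it answered with a 2xx status."""
    results = {}
    for method, path, fields in checks:
        status = check_endpoint(base_url, method, path, fields)
        results[path] = status is not None and 200 <= status < 300
    return results
```

Against the lab, management_plane_alive("http://127.0.0.1:8080") would return one boolean per endpoint, which is easy to assert on in a regression script.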

The Pain Points That Actually Mattered

The project became interesting once I stopped thinking in terms of “make it boot” and started thinking in terms of “remove every failure that does not matter to the research question.”

1. Re-extract The Firmware Inside Linux

If the firmware was extracted outside Linux, or onto a Windows-backed mount, symlinks and filesystem semantics became unreliable. So the first rule became: always re-extract the SquashFS inside Linux.

That sounds trivial, but it removed a huge amount of noise. Once symlinks are wrong, everything downstream becomes misleading.
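
One way to catch a bad extraction early is to audit the symlinks before booting anything. The following is a minimal sketch. The function name is hypothetical, and it checks only one level of indirection, resolving absolute targets against the firmware root rather than the host.

```python
import os


def audit_symlinks(rootfs):
    """Return (total, dangling) where dangling lists (link, target) pairs."""
    rootfs = os.path.abspath(rootfs)
    total, dangling = 0, []
    for dirpath, dirnames, filenames in os.walk(rootfs):
        for name in dirnames + filenames:
            link = os.path.join(dirpath, name)
            if not os.path.islink(link):
                continue
            total += 1
            target = os.readlink(link)
            if target.startswith("/"):
                # Absolute targets are relative to the firmware root, not the host.
                resolved = os.path.join(rootfs, target.lstrip("/"))
            else:
                resolved = os.path.join(dirpath, target)
            if not os.path.lexists(resolved):
                dangling.append((link, target))
    return total, dangling
```

A total of zero is the loudest signal: a firmware rootfs normally contains many symlinks, so none at all suggests they were flattened into regular files during extraction. Some dangling entries are expected, since firmware often links to paths that only exist at runtime.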

2. Separate Stock And Emulated Root Filesystems

I did not want the original extraction mutated beyond recognition. The lab keeps two trees:

a stock extraction
a patched emulation rootfs

That gave me a clean diff boundary between vendor behavior and lab-only changes, which matters a lot when you are trying to avoid confusing emulation fixes with real vulnerability logic.
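
That diff boundary can be made explicit with a small comparison of the two trees. This sketch is an assumption about how such a check might look, not the lab's actual tooling. It fingerprints regular files by SHA-256, compares symlinks by target, and skips device nodes and other special files.

```python
import hashlib
import os


def snapshot(root):
    """Map each relative path to a fingerprint, without following symlinks."""
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            if os.path.islink(path):
                entries[rel] = "link:" + os.readlink(path)
            elif os.path.isfile(path):
                with open(path, "rb") as handle:
                    entries[rel] = "file:" + hashlib.sha256(handle.read()).hexdigest()
    return entries


def lab_only_changes(stock, patched):
    """Return (kind, path) tuples, sorted by path, for everything that differs."""
    before, after = snapshot(stock), snapshot(patched)
    changes = [("removed", p) for p in before.keys() - after.keys()]
    changes += [("added", p) for p in after.keys() - before.keys()]
    changes += [("modified", p) for p in before.keys() & after.keys()
                if before[p] != after[p]]
    return sorted(changes, key=lambda item: item[1])
```

Keeping the output as a plain sorted list makes it easy to commit alongside the patches, so any unexpected change to the emulation rootfs shows up in review.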

3. Stub Board-Specific Binaries

Some services were never going to behave correctly without real hardware. Rather than letting those crash every boot, I replaced the obvious board-only paths with safe stubs or idle daemons.

That included hardware-specific utilities around Wi-Fi, thermal handling, switching, and platform glue.

This is a key mindset shift: I did not need every daemon to be “correct.” I needed obviously irrelevant failures to stop destroying the management plane.
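
A stub in this sense can be as small as a shell script that exits successfully, or one that idles so a supervisor sees a live process. The sketch below shows one possible way to install such stubs into the patched rootfs. The helper name and the stub bodies are illustrative assumptions, and which binaries to replace is a judgment call this code does not make.

```python
import os

# Lab-only stub bodies; both run under the firmware's own /bin/sh.
STUB_EXIT = "#!/bin/sh\n# lab-only stub: report success and do nothing\nexit 0\n"
STUB_IDLE = (
    "#!/bin/sh\n"
    "# lab-only stub: stay alive so a supervisor sees a running daemon\n"
    "while :; do sleep 3600; done\n"
)


def install_stub(rootfs, rel_path, body=STUB_EXIT):
    """Replace rootfs/rel_path with a shell stub and mark it executable."""
    target = os.path.join(rootfs, rel_path.lstrip("/"))
    os.makedirs(os.path.dirname(target), exist_ok=True)
    if os.path.lexists(target):
        os.remove(target)  # drop the vendor binary or symlink, never write through it
    with open(target, "w", newline="\n") as handle:
        handle.write(body)
    os.chmod(target, 0o755)
    return target
```

Removing the existing entry first matters when the vendor path is a symlink: writing through it would overwrite whatever it points to instead of the path being stubbed.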

4. Harden Lua And ubus Paths

A lot of router management code assumes that every daemon, every ubus object, and every config artifact exists. In emulation, that is rarely true on the first try.

I patched several LuCI and ubus-adjacent paths so missing services would degrade cleanly instead of exploding the web UI. That meant nil-handling, synthetic fallbacks, and making missing objects non-fatal in the places where the management plane clearly could survive without them.

That work is not glamorous, but it is exactly what turns firmware emulation from “almost boots” into something you can actually test.

5. Move Runtime State Off The Windows Bind Mount

One of the more subtle bugs only showed up once I put the lab inside a rootless Podman container. The repo bind mount itself was fine, but using that Windows-backed path as the live emulation rootfs triggered ugly runtime failures in the ARM shell path.

In other words: the code could live on the Windows-backed mount, but the live emulated rootfs should not.

The fix was to keep the source tree bind-mounted, but move the extracted and patched rootfs into a Linux-native named volume managed by Podman.

That one change removed a whole class of “Bad address” (EFAULT) failures that looked like firmware instability but were actually environment instability.
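
A guard like the following can refuse to start when the live rootfs sits on a Windows-backed path. It is a sketch under assumptions: WSL 2 typically reports Windows drives with filesystem type 9p and WSL 1 with drvfs, but confirm that against your own /proc/mounts. It also ignores the octal escapes /proc/mounts uses for spaces in mount points.

```python
# Assumption: filesystem types WSL uses for Windows drives.
WINDOWS_BACKED_FSTYPES = {"9p", "drvfs"}


def mount_for(path, mounts_text):
    """Return (mount_point, fstype) of the longest mount point containing path."""
    best = ("", "")
    probe = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same mount point override an earlier one.
        if probe.startswith(prefix) and len(mount_point) >= len(best[0]):
            best = (mount_point, fstype)
    return best


def is_windows_backed(path, mounts_text):
    """True when path lives on a filesystem type assumed to be Windows-backed."""
    return mount_for(path, mounts_text)[1] in WINDOWS_BACKED_FSTYPES
```

In practice the second argument would be the contents of /proc/mounts, read inside the same container that runs the emulated rootfs.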

6. Time-Bound Hung Init Scripts

Even after the obvious hardware services were masked, some init scripts still hung. dnsmasq was a good example. It could wedge long enough to prevent the handoff to uhttpd even though most of the stack was already usable.

So I changed the runtime strategy:

start a broad userspace stack by default
mask known hardware- or kernel-dependent services
enforce timeouts on rc.d service start
keep failures non-fatal unless they block the management plane directly

That gave me a full profile that is realistic enough to be useful now, while still leaving room to unmask more of the original stack later.
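
The timeout strategy can be sketched with subprocess.run and a per-service deadline. The function name, the default timeout, and the blocking parameter are assumptions for illustration. One caveat: subprocess.run kills only the direct child on timeout, so daemons an init script has already forked would need separate cleanup.

```python
import subprocess


def start_services(commands, timeout=20.0, blocking=()):
    """Run each start command with a deadline; return a name -> outcome map."""
    outcomes = {}
    for name, argv in commands.items():
        try:
            done = subprocess.run(
                argv,
                timeout=timeout,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            outcomes[name] = "ok" if done.returncode == 0 else f"exit {done.returncode}"
        except subprocess.TimeoutExpired:
            outcomes[name] = "timeout"  # run() has already killed the child
        except OSError as exc:
            outcomes[name] = f"error: {exc.strerror}"
        # Failures stay non-fatal unless the service gates the management plane.
        if outcomes[name] != "ok" and name in blocking:
            raise RuntimeError(f"{name} blocks the management plane: {outcomes[name]}")
    return outcomes
```

Returning the outcome map instead of raising on every failure matches the rule above: record everything, but stop only for services listed in blocking.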

Useful Runtime Modes

For a smaller, debugging-first boot:

TPLINK_PROFILE=core ./run.sh up

To push closer to the original stack:

TPLINK_UNMASK_SERVICES=network,firewall,nat ./run.sh up

To force more services off while isolating a crash:

TPLINK_MASK_SERVICES=miniupnpd,minidlna ./run.sh up

To reset generated state:

./run.sh clean

Those knobs matter because firmware bring-up is not all-or-nothing. The right question is usually:

What is the smallest believable runtime that still lets me validate the behavior I care about?
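
To make the interaction between those knobs concrete, here is one possible resolution order, sketched in Python. The environment variable names come from the commands above. The default mask, the contents of the core profile, and the rule that an unmask request wins over a mask are all assumptions for illustration, not the lab's actual behavior.

```python
import os

# Hypothetical defaults for illustration only.
DEFAULT_MASKED = {"network", "firewall", "nat"}
CORE_SERVICES = {"ubusd", "uhttpd"}


def split_csv(value):
    """Parse a comma-separated list into a set, ignoring blanks."""
    return {item.strip() for item in value.split(",") if item.strip()}


def resolve_services(all_services, env=os.environ):
    """Return the sorted list of services to start for this run."""
    profile = env.get("TPLINK_PROFILE", "full")
    selected = set(all_services)
    if profile == "core":
        selected &= CORE_SERVICES
    masked = DEFAULT_MASKED | split_csv(env.get("TPLINK_MASK_SERVICES", ""))
    masked -= split_csv(env.get("TPLINK_UNMASK_SERVICES", ""))  # unmask wins
    return sorted(selected - masked)
```

Resolving everything into one explicit list per run also gives the boot log something precise to record, which makes later comparisons between runs much easier.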

Where AI Actually Helped

This was not “AI magically did reverse engineering for me.” The useful part was much more operational and, honestly, more valuable.

Once the objective was clear, AI dramatically shortened the ugliest parts of the cycle:

compare boot attempts and highlight the next likely blocker
trace service startup paths through shell, Lua, and config faster than I would manually
generate or refine stubs for obviously non-essential hardware paths
suggest safe emulation-only guards around missing ubus and config dependencies
turn one-off failures into a repeatable patch-and-verify loop
keep wrappers, bootstrap scripts, and regression checks in sync while the lab evolved
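
The first item on that list, comparing boot attempts, does not need a model to get started. A plain diff of failure lines already narrows what to look at, and the result is a compact input for AI-assisted triage. The sketch below assumes a set of failure keywords and normalizes numbers so PIDs and timestamps do not create false differences. Both choices are mine and would need tuning against real logs.

```python
import re

# Assumed failure markers; tune them to what the firmware actually prints.
FAILURE_PATTERN = re.compile(
    r"(?i)\b(error|failed|segfault|not found|bad address|timeout)\b"
)
VOLATILE = re.compile(r"\b\d+\b")  # PIDs, timestamps, and counters vary per run


def failure_lines(log_text):
    """Return normalized failure lines in first-seen order, without duplicates."""
    seen = {}
    for line in log_text.splitlines():
        if FAILURE_PATTERN.search(line):
            seen.setdefault(VOLATILE.sub("N", line.strip()), None)
    return list(seen)


def compare_boots(previous_log, current_log):
    """Return (new_failures, fixed_failures) between two boot attempts."""
    before = failure_lines(previous_log)
    after = failure_lines(current_log)
    new = [line for line in after if line not in before]
    fixed = [line for line in before if line not in after]
    return new, fixed
```

The first entry in new_failures is usually the most useful one to investigate, since later failures are often consequences of it.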

The human work still mattered most in the places that always matter:

deciding whether a failure was a real vulnerability, a genuine code path, or just emulation noise
understanding native parser and command-execution paths
choosing which services to stub, which to patch, and which to leave alone
recognizing when the answer required actual reversing instead of another automation layer

The difference in pace was real. Problems that normally become multi-day bring-up slogs collapsed into a much tighter loop because AI could help with the boring-but-necessary infrastructure work while I stayed focused on the parts that needed judgment.

Why Podman And QEMU Made This Click

There are lots of ways to attack firmware emulation, but this combination struck a good balance for me:

Podman gave me a reproducible, rootless reset boundary.
QEMU user-mode let me run the vendor ARM userspace quickly.
proot let me treat the extracted rootfs as the process root without building a full machine model.
WSL made it easy to mix Windows-side tooling with a Linux-native runtime path.

That stack let me go from firmware blob to reproducible management-plane testing without getting trapped in full-system emulation work before I had earned it.

Why This Matters For Autonomous Testing And AI-Driven RE

Once the firmware stack is reproducible, everything else changes.

Now I have a platform where I can:

re-run web-management tests after each patch
validate service startup and crash behavior
hook targeted fuzzing or grammar-driven testing into reachable services
script endpoint checks as regressions
use AI to summarize deltas between runs and propose next experiments

That is the bridge from ad hoc firmware triage to something much closer to autonomous testing and AI-driven reverse engineering.

The vulnerabilities are the reason I went in.

The emulation platform is the thing that compounds.

What Comes Next

There are two follow-ups I already know I want to publish.

First, once I hear back from the vendor and can follow proper disclosure guidelines, I will publish separate posts for the three vulnerabilities:

the USB stack issue
the TDDP factory/test plane issue
the admin bypass issue

Second, I am going to write up the larger automated reverse engineering (RE) and vulnerability research (VR) pipeline behind this work: how I am using reproducible emulation, targeted instrumentation, and AI-assisted analysis to move much faster on firmware-heavy assessments without giving up rigor.

That second piece is probably the more important one.

The bugs got my attention.

Podman, QEMU, and AI-assisted bring-up are what turned the research into a platform.