Releases: CelestialCreator/pocket-pi
v0.4.0 — UI automation + baked control skill
v0.4.0 — agent has the phone and the screen
Pocket Pi's first release where the agent can actually drive other apps. On top of v0.3's phone surface (notifications, intents both directions, share-sheet, camera, mic, location, clipboard, deep-link inbox), v0.4 adds a full UI-automation surface and the playbook to use it.
What's new
19 new /ui/* tools for the agent, mirroring an AccessibilityService vendored from KarryViber/orb-eye (MIT). The agent can now:
- Read any app's element tree (`pocket_pi_ui_screen`, `pocket_pi_ui_find`)
- Dispatch taps, swipes, long-presses, scrolls, multi-finger gestures (`pocket_pi_ui_tap`, `_click`, `_swipe`, `_scroll`, `_long_press`, `_gesture`)
- Set text on focused fields (`pocket_pi_ui_type`)
- Take screenshots (Android 11+, `pocket_pi_ui_screenshot`)
- Fire global actions: back / home / recents / notifications shade / quick settings (`pocket_pi_ui_global`)
- Buffer system notifications + long-poll a window-change + notification event channel for proactive behaviour (`pocket_pi_ui_notifications`, `pocket_pi_ui_poll_events`)
All routed through the same bearer-token-gated localhost bridge as v0.3 — no companion APK, no root, no shell setup.
Baked-in pocket-pi-android-control skill at ~/.pi/agent/skills/. Shipping the tools alone wasn't enough — flash-tier models would call pocket_pi_intent_send once, hit a wall, and give up. The skill anchors the fallback chain (deep link → generic intent → launch by package + UI drive → verify) with six worked examples and anti-patterns. With it loaded, the same models that bailed yesterday now complete full chains.
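The fallback chain the skill anchors can be pictured as an ordered list of strategies, each followed by a verification step. The sketch below is illustrative only: `run_chain`, the strategy callables, and `verify` are hypothetical names, not part of the shipped skill or tool surface.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical types: a strategy attempts the action and returns True if the
# call itself went through; verify() checks that the goal state was reached.
Strategy = Tuple[str, Callable[[], bool]]


def run_chain(strategies: Sequence[Strategy], verify: Callable[[], bool]) -> Optional[str]:
    """Try each strategy in order; return the name of the first one that verifies."""
    for name, attempt in strategies:
        try:
            if attempt() and verify():
                return name
        except Exception:
            # A failed strategy is not fatal: fall through to the next one.
            continue
    return None
```

In this picture the chain would be passed as deep link first, generic intent second, and launch by package plus UI drive last, so one failed intent call no longer ends the task.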
One-time setup on first install
Android forbids programmatic enablement of AccessibilityService. The app's onboarding pane deep-links straight to Settings → Accessibility → Pocket Pi — tap "Use Pocket Pi", accept the system consent dialog, and the pane self-dismisses. UI tools fail gracefully with 403 until the toggle is on.
Validated on emulator
- Fresh install → bootstrap → AccessibilityPane → Settings deep-link → toggle → consent → service bound (singleton non-null within 2 s of the poll)
- All 19 `/ui/*` endpoints return ok end-to-end (info, screen, find, tap, click, type, swipe, scroll, long_press, gesture, global, wait, notifications, screenshot, events/poll)
- Driving Settings → Battery navigation via find + tap → verified by screencap
- Proactive notification reactor: agent calls `ui_poll_events` (long-poll), notify fires from a side channel, agent unblocks within ~100 ms and reads the title/body back
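The proactive reactor in the last item is essentially a loop around a blocking poll call. A minimal sketch, assuming a hypothetical `poll_events` callable that stands in for `pocket_pi_ui_poll_events`; the event shape (`type`, `title`, `body` keys) is also an assumption, since the notes do not document it.

```python
from typing import Callable, Dict, Iterator, List

Event = Dict[str, str]


def notification_events(
    poll_events: Callable[[float], List[Event]],
    timeout_s: float = 30.0,
    max_polls: int = 10,
) -> Iterator[Event]:
    """Yield notification events from a long-poll source.

    poll_events(timeout_s) is assumed to block until events arrive or the
    timeout elapses, returning an empty list on timeout.
    """
    for _ in range(max_polls):
        for event in poll_events(timeout_s):
            # Window-change events share the channel; skip them here.
            if event.get("type") == "notification":
                yield event
```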
What's next (v0.5)
- Working shell-session tab in the dashboard (`node-pty` android-arm64 prebuild still missing; stub no-ops on tab open)
- Background location escalation when a real use case lands
- Custom-prefix bootstrap so `applicationId` can move off `com.termux`
Credits
- orb-eye — KarryViber for the AccessibilityService that powers v0.4's UI surface. Vendored under MIT with attribution preserved in source.
- Pi coding agent — Mario Zechner + earendil-works for the runtime that powers every chat turn.
- pi-agent-dashboard + pi-anthropic-messages — BlackBelt Technology for the WebView chat UI and the Anthropic OAuth bridge.
- Termux — for the Linux-on-Android runtime.
Install
Sideload pocket-pi-v0.4.0.apk (68 MB, arm64-v8a). Allow "install from this source" on first run. First launch downloads/extracts the bundled runtime (~30 s on modern phones).
v0.3.1 — dashboard 0.5.3 + apt-retry fix
Bumps the bundled chat UI from pi-agent-dashboard@0.4.6 to 0.5.3 and fixes a postinstall bug that masked critical apt packages on flaky mirrors. Backwards-compatible with v0.3.0; users on v0.3.0 should re-run setup or wipe app data to get the postinstall fix.
What's new
Bigger dashboard
- Pinned `@blackbelt-technology/pi-agent-dashboard@0.5.3` (was 0.4.6). 7 minor/patch versions of upstream improvements:
  - Node 25 crash safety net: `unhandledRejection` + `uncaughtException` handlers in `cli.ts` so a misbehaving plugin can't kill the dashboard under Node 25's fatal-by-default async-fault policy.
  - `tsx` → `jiti` consolidation. The dashboard's spawn shim resolves jiti from pi's tree and re-execs node with `--import <jiti>`; we no longer hand-roll the loader.
  - Chat markdown: agent can inline local images and LaTeX math (`$$...$$`).
  - `/skill:foo` invocations render as collapsible cards instead of dumping the expanded body into the chat bubble.
  - Slash-command `:` ↔ `-` aliases: `/skill-foo` and `/skill:foo` both work.
  - Provider catalogue auto-mirrors pi's full provider list (DeepSeek, Fireworks, Cerebras, HuggingFace, Vertex, Bedrock, etc.); it was a hardcoded 8-item subset.
  - Tunnel watchdog: auto-recycles stale zrok tunnels (toggle off by default in this build since we don't bundle zrok).
- `PiBridge` simplified to invoke `pi-dashboard start` directly via the npm-installed shim, instead of `node --import tsx-loader cli.ts start`.
Postinstall robustness
- apt-retry second pass. The original postinstall ran `xargs -a packages.txt pkg install -y 2>/dev/null || true` once and only retried `nodejs` if `npm` was missing. When a Termux mirror flaked mid-batch, packages like `git`, `python`, `ripgrep` silently never installed, breaking `pi-anthropic-messages` (which needs `git clone`) and leaving the dashboard with a "1 required missing" amber banner. Now a second pass identifies what didn't land (via `command -v` + `dpkg -s`) and re-runs just the missing set.
- Legacy `@mariozechner/pi-coding-agent` scope dropped from `npm-packages.txt`. The canonical `@earendil-works/pi-coding-agent@0.74+` supersedes it and ships jiti; the old scope tops at 0.73.x and its `bin/pi` symlink collides with the new scope. Upgrading from v0.3.0? Postinstall explicitly runs `npm uninstall -g @mariozechner/pi-coding-agent` first.
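The second-pass idea (check what actually landed, then retry only the gap) can be sketched as follows. This is not the shipped postinstall script, which is shell; the function names and the package-to-command mapping are hypothetical, and the two checks mirror `command -v` and `dpkg -s` via `shutil.which` and a `dpkg -s` subprocess call.

```python
import shutil
import subprocess
from typing import Callable, Dict, Iterable, List


def dpkg_installed(package: str) -> bool:
    """True if `dpkg -s <package>` exits 0, i.e. dpkg has a status record for it."""
    try:
        result = subprocess.run(
            ["dpkg", "-s", package],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        return False  # dpkg itself is not on PATH
    return result.returncode == 0


def missing_packages(
    packages: Iterable[str],
    commands: Dict[str, str],
    has_command: Callable[[str], bool] = lambda c: shutil.which(c) is not None,
    is_installed: Callable[[str], bool] = dpkg_installed,
) -> List[str]:
    """Return packages that neither expose their command nor show up in dpkg."""
    missing = []
    for pkg in packages:
        cmd = commands.get(pkg)
        if cmd is not None and has_command(cmd):
            continue
        if is_installed(pkg):
            continue
        missing.append(pkg)
    return missing


def second_pass(packages, commands, install: Callable[[List[str]], None], **checks) -> List[str]:
    """Re-run the installer for just the packages the first pass dropped."""
    missing = missing_packages(packages, commands, **checks)
    if missing:
        install(missing)
    return missing
```

Passing the installer and both checks as parameters keeps the retry logic testable without touching a real package manager.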
Diagnostic noise suppression
- `mkdir -p $HOME/.pi-dashboard` so the dashboard's "Managed install not created" warning stops firing. Pocket Pi never uses that path; everything lives under `$PREFIX` via `tool-overrides.json`.
- Pre-seed `~/.pi/dashboard/config.json` with `tunnel.enabled=false`. Zrok-watchdog noise gone; re-enable in Settings → General → Tunnel if you actually want tunneling.
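A pre-seed of this kind can be written so that it only supplies a default and never overrides a later user choice. The sketch below assumes `tunnel.enabled=false` maps to a nested JSON object (`{"tunnel": {"enabled": false}}`); the actual layout of `config.json` is not given in these notes, and `preseed_tunnel_default` is a hypothetical name.

```python
import json
from pathlib import Path


def preseed_tunnel_default(config_path: Path) -> dict:
    """Set tunnel.enabled to false only if it is not already configured."""
    config = {}
    if config_path.exists():
        try:
            config = json.loads(config_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            config = {}  # treat an unreadable file as empty
    if not isinstance(config, dict):
        config = {}
    tunnel = config.get("tunnel")
    if not isinstance(tunnel, dict):
        tunnel = {}
    # setdefault keeps an explicit user setting (e.g. re-enabled in Settings).
    tunnel.setdefault("enabled", False)
    config["tunnel"] = tunnel
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```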
Install
- Download `pocket-pi-v0.3.1.apk` (aarch64 only, ~39 MB) below.
- `adb install -r pocket-pi-v0.3.1.apk` or sideload normally.
- First-launch postinstall is ~5–10 min on Wi-Fi. Grant Camera / Mic / Location / Notifications when prompted.
Upgrading from v0.3.0: if you want the apt-retry fix to take effect, either:
- Force-stop com.termux + clear app data + relaunch — full fresh postinstall. Cleanest. Or
- After install, open the app, wait 15 s for the recovery UI to surface, tap Re-run setup — refreshes the postinstall script from the new APK and runs it. Faster but might leave some legacy state.
Not yet done
Carried over from v0.3.0:
- UI automation: deferred to v0.4. The plan is to vendor the droidrun-portal AccessibilityService when ready.
- Background location: foreground only.
- Other OAuth providers (Gemini CLI, Codex, Copilot, Antigravity): Sign-In completes but no Pi-side protocol bridge is bundled. Use the API-key path.
- Working shell-session tab: `node-pty` has no android-arm64 prebuild.
v0.3.0 — phone-surface bridge for the agent
The agent now has the phone. A localhost HTTP server (127.0.0.1:9998) gated by a per-launch bearer token exposes the device's surface to Pi — without any companion APK.
What's new
- Notifications: `termux_notify(title, content, priority)`.
- Outgoing Android intents: `pocket_pi_intent_send` (generic), `pocket_pi_intent_open_url`, `pocket_pi_intent_dial`, `pocket_pi_intent_settings`. The agent can open Wi-Fi/Bluetooth/permission screens, dial a number, view a URL, fire any `Intent` action.
- Incoming intents: share-target ("Share to Pocket Pi" for `text/*` + `image/*`) and `pi://agent/...` deep-link intent-filters. Payloads queue into `$HOME/.pi/agent/inbox/<ts>-<rand>.json` and the agent drains via `pocket_pi_inbox_list` / `pocket_pi_inbox_pop`.
- Location: `termux_location` (gps / network / fused, foreground), via a transient `LocationFgService` so the privacy indicator only shows during a fix.
- Camera: `termux_camera_photo` (front/back), CameraX-backed `CameraFgService`. Output lands in `~/.pi/agent/captures/<ts>.jpg`.
- Microphone: `pocket_pi_mic_record` (1–300 s, AAC/.m4a), via `MicFgService`.
- Clipboard + battery + toast + TTS + share-sheet: `termux_clipboard_{get,set}`, `termux_battery_status`, `termux_toast`, `termux_tts_speak`, `termux_share`.
- `pocket-pi-api` shell shim in `$PREFIX/bin/`: `pocket-pi-api notify '{"title":"…","content":"…"}'`, `pocket-pi-api camera/photo '{"camera":"back"}'`, etc., from any Termux session inside the app.
- Dashboard tool overrides for `node`, `npm`, `git`, `ps`, `pgrep` (Settings → Tools now shows them as ✓ via override).
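The inbox queue described above (one `<ts>-<rand>.json` file per incoming payload, drained oldest first) can be modelled in a few lines. This is an illustrative re-implementation, not the code behind `pocket_pi_inbox_list` / `pocket_pi_inbox_pop`; it assumes `<ts>` is a decimal integer timestamp and that each file holds one JSON object.

```python
import json
from pathlib import Path
from typing import List, Optional


def _ts_prefix(path: Path) -> str:
    """Part of the filename before the first '-', expected to be the timestamp."""
    return path.stem.split("-", 1)[0]


def inbox_list(inbox: Path) -> List[Path]:
    """Return queued payload files, oldest first, by the <ts> filename prefix."""
    files = [p for p in inbox.glob("*.json") if _ts_prefix(p).isdigit()]
    # Sort numerically so that e.g. 90 comes before 200.
    return sorted(files, key=lambda p: int(_ts_prefix(p)))


def inbox_pop(inbox: Path) -> Optional[dict]:
    """Read and delete the oldest payload; None when the inbox is empty."""
    files = inbox_list(inbox)
    if not files:
        return None
    oldest = files[0]
    payload = json.loads(oldest.read_text(encoding="utf-8"))
    oldest.unlink()
    return payload
```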
Install
- Download `pocket-pi-v0.3.0.apk` (aarch64 only, ~68 MB) below.
- Sideload: tap the APK on the phone (allow install from unknown sources) or `adb install pocket-pi-v0.3.0.apk`.
- Open the app. First launch runs the bootstrap (~5–10 min on Wi-Fi). Grant Camera / Mic / Location / Notifications when prompted.
- When the dashboard loads, tap its ⚙ → Providers → add at least one provider (Claude Pro/Max OAuth or any API key). Then chat.
Security model
- The HTTP API binds `127.0.0.1` only, so it is not exposed off-device.
- A 32-byte bearer token is regenerated on every service start and written to `$PREFIX/etc/pocket-pi/api-token` (mode 0600, same UID as Termux).
- Apps that don't share Pocket Pi's UID can't read the token; off-device callers can't reach `127.0.0.1`.
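From inside the app's Termux environment, a caller would read the token file and attach it to each request. The sketch below only builds the request and does not send it. It assumes the bridge reads a standard `Authorization: Bearer` header, accepts JSON bodies, and stores the token as text; the notes say only that the API is bearer-token-gated, and any endpoint path passed in (such as `/notify`) is hypothetical.

```python
import urllib.request
from pathlib import Path

API_BASE = "http://127.0.0.1:9998"  # address and port from these release notes


def read_token(prefix: str) -> str:
    """Read the per-launch token from <prefix>/etc/pocket-pi/api-token."""
    token_file = Path(prefix) / "etc" / "pocket-pi" / "api-token"
    return token_file.read_text(encoding="utf-8").strip()


def build_request(path: str, token: str, body: bytes = b"{}") -> urllib.request.Request:
    """Build (but do not send) a POST carrying the token as a Bearer credential."""
    return urllib.request.Request(
        API_BASE + path,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed header scheme
            "Content-Type": "application/json",
        },
    )
```

Because the token is regenerated on every service start, a long-running caller would need to re-read the file after a restart rather than cache the value.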
Not yet done
- UI automation — deferred to v0.4. Plan is to vendor droidrun/droidrun-portal's Kotlin AccessibilityService so the agent can read screens + dispatch taps/swipes against other apps. Irreducible UX cost: the user has to manually enable Accessibility in Settings (Android forbids a runtime dialog).
- Background location ("Allow all the time") — foreground only this release.
- Other OAuth providers end-to-end (Gemini CLI, Codex, Copilot, Antigravity) — Sign-In completes but no Pi-side protocol bridge is bundled. Use the API-key path.
- Working shell-session tab in the dashboard: `node-pty` has no android-arm64 prebuild; chat/files/tasks work, terminal tab will fail.
Credits
Pocket Pi is just packaging. The actual agent engine + chat UI are someone else's work:
- Pi coding agent by Mario Zechner / earendil-works
- pi-agent-dashboard + pi-anthropic-messages by BlackBelt Technology
- Termux — the Linux-on-Android runtime that lets us ship Node + Python inside one APK without root.
Pocket Pi v0.2.1
Patch release. Removes the defaultProvider=nvidia preseed so fresh installs prompt the user to pick a provider through the dashboard's native settings instead of landing on NVIDIA with no key behind it. No API keys (or placeholders for any provider) ship in the APK.
Install
- Download: pocket-pi-v0.2.1.apk (40 MB, aarch64 only)
- Sideload or `adb install pocket-pi-v0.2.1.apk`.
Providers — what works
See the README provider matrix for the authoritative table. Short version:
- OAuth, end-to-end: Anthropic (Claude Pro/Max). Uses pi-anthropic-messages as the protocol bridge. Sign-In opens the device's default browser via an `xdg-open` shim → Android `ACTION_VIEW`. Cross-device manual paste of the callback URL also works.
- OAuth Sign-In completes but unusable (no Pi-side bridge bundled): Google Gemini CLI, ChatGPT Plus/Pro (Codex), GitHub Copilot, Antigravity. Use the API-key path for these vendors instead.
- API key paste — fully working: OpenAI, Anthropic API, Google Gemini (AI Studio key), Mistral, Groq, xAI, Z.ai, OpenRouter, NVIDIA NIM.
Changes vs v0.2.0
- Drop `defaultProvider`/`defaultModel` preseeding in postinstall — settings.json starts empty; dashboard prompts for provider on first launch.
- Drop the empty `~/.config/nvidia/api-key` placeholder from the bootstrap HOME skel.
- README: open with credits to the upstream projects (Pi by Mario Zechner / earendil-works, pi-agent-dashboard + pi-anthropic-messages by BlackBelt Technology, Termux), honest provider matrix, drop "team testers" framing.
Carry-over from v0.2.0
- Chat UI: pi-agent-dashboard. Slash commands, model switcher, session history, native settings cog.
- Compose UI: bootstrap splash + recovery only (inline Restart Pi / Re-run setup after 15s stall).
Known limits
- Shell-session tab inside the dashboard is stubbed (node-pty has no android-arm64 prebuild).
- `applicationId` still `com.termux` (avoids a Docker-based bootstrap rebuild).
- Old Android WebView builds (< Chrome 120) may render a blank dashboard. Real devices auto-update; emulator system images can lag.
Credits
Pocket Pi is packaging. The agent engine is Pi by Mario Zechner (now at earendil-works). The chat UI is pi-agent-dashboard by BlackBelt Technology. Linux-on-Android runtime is Termux.
Pocket Pi v0.2.0
Pocket Pi v0.2.0 — POC, shippable
Drop-in single-APK Pi coding agent for Android. The chat UI is now pi-agent-dashboard (slash commands, session history, model switcher, native settings all built in). NVIDIA NIM is pre-seeded so first chat is one tap away.
Install
- Download: pocket-pi-v0.2.0.apk (40 MB)
- Sideload (allow unknown sources) or `adb install pocket-pi-v0.2.0.apk`.
- First launch runs the bootstrap (3–5 min on Wi-Fi).
What's new vs v0.1
- Chat UI: pi-agent-dashboard replaces the prior pi-mobile PWA. Built-in slash commands, model switcher, native settings cog.
- Provider config now lives in the dashboard. Anthropic Claude Pro/Max OAuth, ChatGPT, GitHub Copilot, Gemini CLI all surfaced; API-key paste for OpenAI / Anthropic / Mistral / Groq / Gemini.
- `xdg-open` shim in postinstall so OAuth Sign-In opens the device's default browser (Chrome) via `am ACTION_VIEW`. Cross-device manual paste of the callback URL also works.
- pi-anthropic-messages bundled for proper tool-call rendering on Claude OAuth providers.
- Compose UI reduced to bootstrap splash + recovery (inline Restart Pi / Re-run setup buttons after a 15s stall).
Known limits
- Shell-session tab inside the dashboard is stubbed (node-pty has no android-arm64 prebuild).
- `applicationId` is still `com.termux` (avoids a Docker-based bootstrap rebuild).
- Old Android WebView builds (< Chrome 120) may render a blank dashboard. Real devices auto-update; emulator system images can lag.
Notes
Same fast-tracked POC. Whether to invest in productizing (custom prefix bootstrap, real applicationId, Play Store) vs. building a proper native Android client is what this POC is meant to inform.
Pocket Pi v0.1.3 (POC)
Changes
- New launcher icon — replaces the Greek π with the pi.dev brand mark (the tetris-style "P" + "i" dot from https://pi.dev/logo-auto.svg).
- Resilient npm install — if a single npm package fails (e.g. one with a broken native build), postinstall logs a WARN and keeps going instead of aborting the whole bootstrap. Found while spiking the pi-agent-dashboard alternative UI; this fix is independently useful.
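The resilient-install behaviour amounts to installing packages one at a time and recording failures instead of stopping. A minimal sketch of that pattern follows; the real postinstall is a shell script, and `install_all` and `npm_install_global` are hypothetical names.

```python
import subprocess
import sys
from typing import Callable, Iterable, List


def npm_install_global(package: str) -> bool:
    """Install one package globally; True when npm exits with code 0."""
    try:
        return subprocess.run(["npm", "install", "-g", package]).returncode == 0
    except FileNotFoundError:
        return False  # npm is not on PATH


def install_all(
    packages: Iterable[str],
    install: Callable[[str], bool] = npm_install_global,
) -> List[str]:
    """Install packages one at a time, warning on failure instead of aborting."""
    failed = []
    for pkg in packages:
        if not install(pkg):
            print(f"WARN: npm install failed for {pkg}; continuing", file=sys.stderr)
            failed.append(pkg)
    return failed
```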
Install
Sideload over your existing install — your keys, selected model, and chat history are preserved on-device.
Pocket Pi v0.1.2 (POC)
Fixes
- Provider chip shows up for every key you've saved. Before, a provider only appeared in the chip row if it was in `models.json`. If you saved an OpenRouter key on v0.1.0 (or earlier) it never made it into the picker. v0.1.2 reconciles on every sheet open: saved key → chip in the row.
- Add any model. Pi's predefined per-provider list isn't exhaustive (OpenRouter alone has hundreds). v0.1.2 adds a Custom model section in the Config sheet: paste a model id (e.g. `qwen/qwen-2.5-coder-32b-instruct`), give it a friendly name, tap Add model, then Use this model.
Install
Sideload over your existing v0.1.x — your keys and selected model are preserved on-device.
Pocket Pi v0.1.1 (POC)
Fixes
- Chat now actually replies. Pi 0.74 reads `defaultProvider` / `defaultModel` from `settings.json`; without them the agent had nothing to call, so sending a message looked like it did nothing. v0.1.1 ships a first-run default (NVIDIA Qwen3 Coder 480B) and an in-app picker so anyone can switch.
- Active model picker in the Config sheet (⚙ icon): provider chips + per-model rows + "Use this model" button. Reads what's available from `models.json`; writes the active selection to `settings.json` and restarts Pi.
models.json; writes the active selection tosettings.jsonand restarts Pi. - Saving an API key now also seeds the active model if none is set yet, so the "save key + send hi" flow works on a fresh install.
Install
Sideload the APK below. If you already have v0.1.0, just install over it — your keys and configs are preserved. After install, open the app and tap ⚙ to pick a model.
Pocket Pi v0.1.0 (POC)
First POC build for team testing.
Install
- Download `pocket-pi-v0.1.0.apk` below.
- Sideload on an aarch64 Android phone (`adb install pocket-pi-v0.1.0.apk`, or tap the file on the device after enabling install-from-unknown-sources).
- Open the app. First launch takes 3–5 min on Wi-Fi (bootstrap + npm install).
- Once you see "Send a message to your Pi agent", tap the ⚙ at the top right.
- Paste at least one API key (NVIDIA NIM is free at https://build.nvidia.com/, OpenRouter / OpenAI / Anthropic / Groq also wired up).
- Tap Save keys → Restart Pi.
- If any extensions are missing, tap Re-run setup (idempotent, safe).
Status
- ✓ All 8 Pi extensions register, 17 tools online
- ✓ NVIDIA NIM + OpenRouter tested end-to-end
- ✓ Recovery UI if pi-webserver doesn't bind within 15s
Known limitations
- Slash commands (`/session`, `/clear`, `/model`, …) currently fall through to the LLM as plain text; planned for the next iteration.
- Chat history doesn't persist across tab switches yet.
- `applicationId` is still `com.termux` (requires a custom bootstrap rebuild to change).
- A cosmetic ghost-icon row may render at the top of the chat on some Android WebView builds.
Feedback wanted
Whether to keep iterating on this Termux-fork-in-an-APK approach or rewrite as a proper native Android client that talks to Pi over the network — that's what this POC is meant to inform.