Sees the real phone screen
Phone Use SDK
Hardware-Level Phone Access for AI Agents
Build agents that can observe, tap, swipe, type, and run workflows on real mobile apps. Browser-use for real phones, without emulators or app APIs.
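import phone_use

session = phone_use.connect("lumi-dock")
screen = session.observe_screen()
target = screen.find("Confirm appointment")
# Ask for human approval before the tap.
session.confirm_action("Tap confirmation button?")
session.tap(target)
session.type("Friday works. Sending calendar invite.")
# `memory` stands for the agent's own memory store (not defined in this snippet).
memory.write({
    "type": "todo",
    "source": session.screenshot(),  # screenshot kept as source evidence
    "trace": session.action_log(),   # log of the actions taken in this session
})
No emulator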
Real phone, no API, no app-specific integration.
Lumi is for agent builders who need the messy, useful phone reality: visible screens, source traces, and human-in-the-loop control.
Positions for real taps and swipes
Physical interaction with the screen
Stable alignment for common phones
Session state, voice, and local controls
SDK capabilities
Primitive actions, auditable workflows.
Lumi asks before sensitive or irreversible actions.
Human in the loop
Confirmation is part of the interface.
Agents should ask before risky actions, keep an action log, and expose screenshots as source evidence.
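As a rough illustration of that pattern, the sketch below gates risky actions behind a human prompt and records every attempt alongside a screenshot reference. It is not the Phone Use SDK: `GuardedRunner`, `LoggedAction`, `RISKY_ACTIONS`, and the action names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical action names treated as risky; not taken from the SDK.
RISKY_ACTIONS = {"tap_send", "tap_pay", "delete"}


@dataclass
class LoggedAction:
    name: str
    approved: bool
    executed: bool
    screenshot_ref: str  # pointer to the screenshot that justified the action


@dataclass
class GuardedRunner:
    # ask: shows a prompt to the human and returns True (allow) or False (deny)
    ask: Callable[[str], bool]
    log: List[LoggedAction] = field(default_factory=list)

    def run(self, name: str, action: Callable[[], None], screenshot_ref: str) -> bool:
        """Run `action`, asking first if `name` is risky; always log the attempt."""
        needs_approval = name in RISKY_ACTIONS
        approved = self.ask(f"Allow '{name}'?") if needs_approval else True
        if approved:
            action()
        self.log.append(LoggedAction(name, approved, approved, screenshot_ref))
        return approved


# Usage: a human who declines every prompt.
performed: List[str] = []
runner = GuardedRunner(ask=lambda prompt: False)
runner.run("scroll", lambda: performed.append("scroll"), "shot-001.png")
runner.run("tap_send", lambda: performed.append("tap_send"), "shot-002.png")
```

In this sketch the declined action is still written to the log, so the trace shows what the agent wanted to do as well as what it actually did.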
Turn Hidden Phone Promises Into Actionable Tasks
Lumi sees the screen and taps the message thread
Promised quote due Friday -> Todo, contact, reminder
A Relationship Inbox for the People Who Matter
Lumi checks visible context with approval
Waiting reply, next meeting, promised follow-up
Turn Phone Fragments Into an AI-Ready Knowledge Base
Lumi groups fragments from the real phone
Tagged sources with editable memory cards
Developer kit
Join the Phone Use SDK Waitlist.
Mobile QA agents that operate real apps
Personal automation with human approval
PKM capture from screenshots and chats
Sales follow-up from phone-native context