Not every agent loop supports MCP. If your agent can call CLI commands, it can use TapKit. The CLI gives you the same gestures, screenshots, and device controls — just as shell commands instead of tool calls.

Install

npm install -g tapkit

Authenticate

tapkit auth login
This opens your browser to sign in. The token is cached locally for future sessions.

Quick check

tapkit status      # verify auth + phone connection
tapkit phones      # list connected phones
tapkit screenshot  # grab the current screen
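
The quick check can be turned into a preflight step that an agent script runs before sending any gestures. This is a minimal sketch, not part of TapKit itself: it assumes `tapkit status` exits non-zero when auth or the phone connection is broken, and `TAPKIT_BIN` and `tapkit_preflight` are hypothetical names introduced here.

```shell
#!/usr/bin/env bash
# Hypothetical override so the binary can be swapped out (e.g. in tests).
TAPKIT_BIN="${TAPKIT_BIN:-tapkit}"

# Return 0 and print "ready" only if the CLI exists and `status` succeeds.
# Assumption: `tapkit status` exits non-zero on auth or connection problems.
tapkit_preflight() {
  if ! command -v "$TAPKIT_BIN" >/dev/null 2>&1; then
    echo "tapkit not found on PATH" >&2
    return 127
  fi
  if ! "$TAPKIT_BIN" status >/dev/null 2>&1; then
    echo "tapkit status failed; try: tapkit auth login" >&2
    return 1
  fi
  echo "ready"
}
```

Calling `tapkit_preflight || exit 1` at the top of a script stops the run before any gesture is attempted.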

Available commands

Gestures

| Command | Description |
| --- | --- |
| `tapkit tap <x> <y>` | Tap at coordinates |
| `tapkit double-tap <x> <y>` | Double tap |
| `tapkit hold <x> <y>` | Long press |
| `tapkit swipe <direction>` | Swipe up/down/left/right |
| `tapkit drag <x1> <y1> <x2> <y2>` | Drag between points |
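
Scrolling a long list usually takes several swipes in a row. The sketch below wraps `tapkit swipe` in a loop. The `swipe_n` helper, the `TAPKIT_BIN` variable, and the pause between gestures are assumptions made for illustration; the CLI may or may not need a delay between commands.

```shell
#!/usr/bin/env bash
TAPKIT_BIN="${TAPKIT_BIN:-tapkit}"   # hypothetical override for testing

# swipe_n <direction> [count] [pause_seconds]
# Repeats `tapkit swipe <direction>`, stopping at the first failure.
swipe_n() {
  local direction="$1" count="${2:-1}" pause="${3:-0.5}"
  case "$direction" in
    up|down|left|right) ;;
    *) echo "unknown direction: $direction" >&2; return 2 ;;
  esac
  local i=0
  while [ "$i" -lt "$count" ]; do
    "$TAPKIT_BIN" swipe "$direction" || return 1
    sleep "$pause"   # assumed settle time so the UI can finish animating
    i=$((i + 1))
  done
}
```

For example, `swipe_n up 3` issues three upward swipes with the default pause.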

Input

| Command | Description |
| --- | --- |
| `tapkit type <text>` | Type into focused field |
| `tapkit escape` | Dismiss keyboard or alert |
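
Since `tapkit type` targets the focused field, a common pattern is to tap the field first, type, then dismiss the keyboard. The following is a sketch under assumptions: `fill_field` and `TAPKIT_BIN` are hypothetical names, the coordinates come from the caller, and the text is passed as a single quoted argument on the assumption that the CLI expects it that way.

```shell
#!/usr/bin/env bash
TAPKIT_BIN="${TAPKIT_BIN:-tapkit}"   # hypothetical override for testing

# fill_field <x> <y> <text...>
# Taps a field to focus it, types the text, then dismisses the keyboard.
fill_field() {
  if [ "$#" -lt 3 ]; then
    echo "usage: fill_field <x> <y> <text...>" >&2
    return 2
  fi
  local x="$1" y="$2"
  shift 2
  "$TAPKIT_BIN" tap "$x" "$y" || return 1
  # "$*" joins the remaining words with spaces into one <text> argument.
  "$TAPKIT_BIN" type "$*" || return 1
  "$TAPKIT_BIN" escape
}
```

A call such as `fill_field 120 300 hello world` sends the text `hello world` as one argument, so spaces survive shell word splitting.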

Device

| Command | Description |
| --- | --- |
| `tapkit home` | Press home button |
| `tapkit open <app>` | Launch app by name |
| `tapkit spotlight [query]` | Open Spotlight search |
| `tapkit siri` | Activate Siri |
| `tapkit lock` / `tapkit unlock` | Lock or unlock device |
| `tapkit volume up` / `tapkit volume down` | Volume control |
| `tapkit shortcut <name>` | Run a Shortcut |

Screenshots

| Command | Description |
| --- | --- |
| `tapkit screenshot` | Capture phone screen |
| `tapkit screenshot --open` | Capture and open in Preview |
| `tapkit screenshot --base64` | Output as base64 (for LLMs) |
| `tapkit screenshot --llm` | Output optimized for LLM consumption |
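
When an agent needs the image on disk rather than inline, the base64 output can be decoded with the standard `base64` tool. This sketch assumes `--base64` writes raw base64 to stdout with no data-URI prefix or extra log lines, which the source does not confirm. `save_screenshot` and `TAPKIT_BIN` are hypothetical names, and the image format is left to the caller's choice of filename.

```shell
#!/usr/bin/env bash
TAPKIT_BIN="${TAPKIT_BIN:-tapkit}"   # hypothetical override for testing

# save_screenshot <file>
# Decodes the base64 screenshot into <file>; fails if the result is empty.
save_screenshot() {
  local out="$1"
  if [ -z "$out" ]; then
    echo "usage: save_screenshot <file>" >&2
    return 2
  fi
  # Assumption: stdout contains only base64 text.
  "$TAPKIT_BIN" screenshot --base64 | base64 --decode > "$out" || return 1
  # An empty file means the capture produced no data.
  [ -s "$out" ]
}
```

The final size check matters because a pipeline reports only the exit status of its last command by default, so a failed capture could otherwise go unnoticed.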

Phone management

| Command | Description |
| --- | --- |
| `tapkit phones` | List connected phones |
| `tapkit phone set <name>` | Set active phone |
| `tapkit phone` | Show active phone + status |
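
With more than one phone connected, a script can pin the active phone before doing anything else. A minimal sketch, assuming `tapkit phone set` exits non-zero when the name does not match a connected phone; `use_phone` and `TAPKIT_BIN` are hypothetical names.

```shell
#!/usr/bin/env bash
TAPKIT_BIN="${TAPKIT_BIN:-tapkit}"   # hypothetical override for testing

# use_phone <name>
# Sets the active phone, then prints the active phone and its status.
use_phone() {
  if [ -z "$1" ]; then
    echo "usage: use_phone <name>" >&2
    return 2
  fi
  "$TAPKIT_BIN" phone set "$1" || return 1
  "$TAPKIT_BIN" phone
}
```

Quoting the name, as in `use_phone "My iPhone"`, keeps names with spaces intact.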

How it works

The CLI talks to the TapKit Mac app running on your machine. The Mac app executes commands on the connected iPhone. You don’t need a server — everything runs locally. For the full CLI reference including configuration and setup, see the CLI tab.
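
Because every action is an ordinary shell command, an agent loop can route model-chosen actions through one small dispatcher. The sketch below only forwards subcommands that appear in the tables on this page and rejects anything else. `run_action` and `TAPKIT_BIN` are hypothetical names, and the allowlist is an illustrative safety choice rather than something TapKit requires.

```shell
#!/usr/bin/env bash
TAPKIT_BIN="${TAPKIT_BIN:-tapkit}"   # hypothetical override for testing

# run_action <subcommand> [args...]
# Forwards the action to the CLI only if the subcommand is on the allowlist.
run_action() {
  case "$1" in
    tap|double-tap|hold|swipe|drag|type|escape|home|open|spotlight|siri|lock|unlock|volume|shortcut|screenshot) ;;
    *) echo "refusing unknown action: ${1:-<none>}" >&2; return 2 ;;
  esac
  # "$@" preserves each argument exactly as the caller quoted it.
  "$TAPKIT_BIN" "$@"
}
```

An agent that emits `tap 200 640` would be run as `run_action tap 200 640`, while an unexpected command such as `rm` is refused before anything executes.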