What is TapKit CLI?

TapKit CLI is one of three ways to use TapKit — alongside the Python SDK and REST API. It gives you direct iPhone control from your terminal.
tapkit tap 200 400 --phone "iPhone 15 Pro"
It’s a native macOS binary with zero runtime dependencies — no Python, no Node.js, no Docker. Install it and go.
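The tap command in the example above can also be driven from a script. The following is a minimal sketch that shells out to the CLI rather than using the Python SDK. The helper names `tap_command` and `tap` are hypothetical and not part of TapKit, and the only invocation assumed is the one shown above (`tapkit tap X Y --phone NAME`).

```python
import shlex
import shutil
import subprocess


def tap_command(x: int, y: int, phone: str) -> list[str]:
    """Build the argv for the `tapkit tap` invocation shown above.

    Hypothetical helper: it only mirrors the documented example command.
    """
    return ["tapkit", "tap", str(x), str(y), "--phone", phone]


def tap(x: int, y: int, phone: str) -> bool:
    """Run the tap if a `tapkit` binary is on PATH.

    Returns True when the command ran and exited with status 0,
    False when the binary is missing or the command failed.
    """
    if shutil.which("tapkit") is None:
        return False
    result = subprocess.run(tap_command(x, y, phone), check=False)
    return result.returncode == 0


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe to try
    # without a phone connected.
    print(shlex.join(tap_command(200, 400, "iPhone 15 Pro")))
```

Passing the arguments as a list avoids shell quoting problems with device names that contain spaces, such as "iPhone 15 Pro".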

What you can do

  • Gestures: tap, double-tap, hold, swipe, drag — all by coordinates
  • Screenshots: capture the screen, open it locally, or pipe as base64 to an LLM
  • App control: open apps, type text, use Spotlight, Siri, and shortcuts
  • Device control: home, lock/unlock, volume, rotate
  • Skills: install Markdown-based knowledge files that teach AI agents how to use specific apps
  • Agent integration: works with Claude Code, Codex, OpenClaw, Cursor, Windsurf, and 35+ other agents
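To illustrate what passing a screenshot to an LLM as base64 involves, here is a minimal sketch of the encoding step only. It assumes the screenshot bytes are already in hand and are PNG (an assumption; the format is not stated here). The function name `png_to_data_url` is hypothetical, and no TapKit screenshot command or flag is assumed.

```python
import base64


def png_to_data_url(png_bytes: bytes) -> str:
    """Encode raw image bytes as a base64 data URL.

    Assumes the bytes are a PNG image; the MIME type is hard-coded
    on that assumption.
    """
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


if __name__ == "__main__":
    # Stand-in bytes for demonstration; a real screenshot would be read
    # from a file or from the CLI's output.
    sample = b"\x89PNG\r\n\x1a\n"
    print(png_to_data_url(sample))
```

Base64 keeps the image as plain ASCII text, which is why it can travel inside a JSON request body or through a pipe without corruption.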

What makes TapKit unique

  1. Real iPhones, not browsers. TapKit controls physical devices, not simulators or emulators, so agents work with real app stores, real push notifications, and real biometrics.
  2. Skills that teach agents. To an agent, a mobile app is an opaque screenshot, not a DOM. Skills are Markdown files that map out an app’s entire UI so agents know where things are and how to interact with them.

Next steps

Installation

Install the CLI via curl or Homebrew.

Quick Start

Your first screenshot-tap-verify workflow.