A session is the core concept in TapKit. It represents a complete agent-controlled phone interaction — from the moment an AI agent connects to a phone until the task is done.

What is a session?

Creating a session spins up a full AI agent instance that connects to a phone and executes tasks autonomously. This is more than an API connection: it is a running agent with vision, reasoning, and action capabilities. When you create a session, TapKit:
  1. Assigns a phone to the session
  2. Spins up an AI agent instance (currently Claude Code; Codex support is coming)
  3. Connects the agent to the phone
  4. Runs the agent on the task you described
  5. Ends the session when the task is complete (or the session times out)

The agent loop

The agent follows a screenshot-reason-act loop:
  1. Screenshot — capture the current phone screen
  2. Reason — send the screenshot to the AI model, which decides what to do next
  3. Act — execute the chosen action (tap, swipe, type, etc.)
  4. Repeat — take another screenshot and continue until the task is done

This is not scripted automation. The agent reasons about what it sees on screen and decides its next action in real time. It handles pop-ups, loading states, and unexpected UI changes because it makes visual decisions rather than following a fixed script.
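
The loop above can be sketched in a few lines of Python. This is a conceptual illustration only, not TapKit's implementation: the real loop runs inside the hosted agent. The `Action` type, the `"done"` action kind, and the three callables are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str         # hypothetical kinds: "tap", "swipe", "type", or "done"
    detail: str = ""  # free-form description of the target or text


def run_agent_loop(
    take_screenshot: Callable[[], bytes],
    decide: Callable[[bytes], Action],
    perform: Callable[[Action], None],
    max_turns: int = 50,
) -> int:
    """Run screenshot -> reason -> act until the model reports "done".

    Returns the number of turns taken, counting the final "done" turn.
    """
    for turn in range(1, max_turns + 1):
        screen = take_screenshot()   # 1. Screenshot
        action = decide(screen)      # 2. Reason
        if action.kind == "done":
            return turn
        perform(action)              # 3. Act, then 4. Repeat
    raise TimeoutError(f"task not finished after {max_turns} turns")
```

Because the decision is made fresh from each screenshot, an unexpected pop-up simply produces a different action on that turn instead of breaking a fixed sequence of steps.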

Session lifecycle

Create → Agent connects → Task execution → Session ends

Status     Description
creating   Session created, spinning up agent instance
running    Agent is connected and executing the task
paused     Session is paused, can be resumed
completed  Task finished successfully
failed     Task encountered an error
killed     Session was manually stopped
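
Client code often needs to know whether a status is final. A minimal sketch, assuming the status strings arrive exactly as listed in the table; the `SessionStatus` enum and `is_terminal` helper are hypothetical names, not part of any TapKit SDK.

```python
from enum import Enum


class SessionStatus(str, Enum):
    CREATING = "creating"
    RUNNING = "running"
    PAUSED = "paused"
    COMPLETED = "completed"
    FAILED = "failed"
    KILLED = "killed"


# Statuses after which the session no longer changes.
TERMINAL = frozenset(
    {SessionStatus.COMPLETED, SessionStatus.FAILED, SessionStatus.KILLED}
)


def is_terminal(status: str) -> bool:
    """Return True if the session has ended; raises ValueError on unknown statuses."""
    return SessionStatus(status) in TERMINAL
```

Raising on an unknown status is a deliberate choice in this sketch: it surfaces new statuses early instead of silently treating them as still running.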

Session parameters

When creating a session, you provide:
Parameter    Type    Description
phone_id     string  Which phone to use (required)
instruction  string  Natural language description of what to do (required)
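
As a rough sketch, a create request carrying these two parameters could be built as follows. Only `phone_id` and `instruction` come from this page. The base URL is a placeholder, and the `POST /v1/sessions` path, the JSON body, and the bearer-token header are assumptions modeled on the other endpoints; check the API reference for the actual values.

```python
import json
import urllib.request

API_BASE = "https://api.example.com"  # placeholder, not the real TapKit host


def build_create_request(
    phone_id: str, instruction: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a request that would create a session."""
    if not phone_id or not instruction:
        raise ValueError("phone_id and instruction are both required")
    body = json.dumps({"phone_id": phone_id, "instruction": instruction})
    return urllib.request.Request(
        f"{API_BASE}/v1/sessions",  # assumed path
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The instruction is plain natural language, for example "Open Settings and turn on Wi-Fi".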

Monitoring a session

While a session is active, you can:
  • Poll status: GET /v1/sessions/{session_id} returns current status, cost, duration, and turn count
  • Stream events: GET /v1/sessions/{session_id}/events returns real-time agent events (text, tool use, thinking)
  • Send messages: POST /v1/sessions/{session_id}/message sends a message to the running agent
  • Pause/resume: POST /v1/sessions/{session_id}/pause and POST /v1/sessions/{session_id}/resume control execution
  • Stop: POST /v1/sessions/{session_id}/stop ends the session early
  • Watch live: the Mac app shows the phone screen in real time during the session
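
Polling can be sketched as a small loop that stops once the session reaches a final status. The endpoint paths match the list above. The `status` key in the response, the polling interval, and the injected `fetch_status` callable (which would perform the GET) are assumptions made so the example runs without a network connection.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed", "killed"}


def session_url(base: str, session_id: str, suffix: str = "") -> str:
    """Build a session endpoint URL, e.g. suffix="/events" or "/pause"."""
    return f"{base.rstrip('/')}/v1/sessions/{session_id}{suffix}"


def wait_for_session(
    fetch_status: Callable[[str], dict],
    session_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the session ends and return the last response."""
    deadline = time.monotonic() + timeout_s
    while True:
        session = fetch_status(session_id)  # GET /v1/sessions/{session_id}
        if session["status"] in TERMINAL_STATUSES:
            return session
        if time.monotonic() >= deadline:
            raise TimeoutError(f"session {session_id} still {session['status']}")
        sleep(interval_s)
```

A paused session is not final, so this loop keeps waiting through pauses; callers that pause sessions for long periods may want a larger `timeout_s`.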

Session results

When a session ends, the response includes:
  • Status: completed, failed, or killed
  • Cost: session cost in USD (cost_usd)
  • Duration: total session time in milliseconds (duration_ms)
  • Turns: number of agent turns (num_turns)
  • Error: error message if the session failed
  • Timestamps: created_at, started_at, completed_at
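
The result fields can be read into a small typed structure. A minimal sketch: `cost_usd`, `duration_ms`, and `num_turns` are the field names given above, while the `status` and `error` key names and the flat JSON shape are assumptions. The timestamps are left out because their format is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionResult:
    status: str
    cost_usd: float
    duration_ms: int
    num_turns: int
    error: Optional[str] = None  # assumed key name; absent unless the session failed

    @property
    def duration_minutes(self) -> float:
        """Session time converted from milliseconds to minutes."""
        return self.duration_ms / 60_000


def parse_result(payload: dict) -> SessionResult:
    """Convert a decoded JSON response into a SessionResult."""
    return SessionResult(
        status=payload["status"],
        cost_usd=float(payload["cost_usd"]),
        duration_ms=int(payload["duration_ms"]),
        num_turns=int(payload["num_turns"]),
        error=payload.get("error"),
    )
```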

Billing and usage

Sessions consume plan time. Your plan includes a set amount of session minutes per month.
  • Session time is calculated from session start to end
  • Monitor usage in the dashboard under Sessions
  • Overages beyond your plan’s included minutes are billed separately
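
For a rough estimate of where you stand against your plan, session durations can be summed and compared with the included minutes. This sketch assumes time is counted exactly as `duration_ms` with no rounding, which this page does not confirm. The `usage_summary` helper and the numbers in the usage note are hypothetical; the dashboard remains the authoritative figure.

```python
def usage_summary(durations_ms: list[int], included_minutes: float) -> dict:
    """Summarize session minutes used against a plan's monthly allowance."""
    used = sum(durations_ms) / 60_000  # milliseconds -> minutes
    return {
        "used_minutes": used,
        "remaining_minutes": max(included_minutes - used, 0.0),
        "overage_minutes": max(used - included_minutes, 0.0),
    }
```

For instance, two sessions of 90,000 ms and 30,000 ms add up to 2 minutes; against a hypothetical 1-minute allowance that leaves 0 minutes remaining and 1 minute of overage.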

View your sessions

Check session history and usage in the dashboard.