Sessions - TapKit

A session is the core concept in TapKit. It represents a complete agent-controlled phone interaction — from the moment an AI agent connects to a phone until the task is done.

What is a session?

Creating a session spins up a full AI agent instance that connects to a phone and executes tasks autonomously. This isn’t just an API connection: it’s a running agent with vision, reasoning, and action capabilities. When you create a session, the following happens:

TapKit assigns a phone to the session
TapKit spins up an AI agent instance (currently Claude Code; Codex support is coming)
TapKit connects the agent to the phone
The agent executes the task you described
The session ends when the task is complete (or times out)

The agent loop

The agent follows a screenshot-reason-act loop:

Screenshot — capture the current phone screen
Reason — send the screenshot to the AI model, which decides what to do next
Act — execute the chosen action (tap, swipe, type, etc.)
Repeat — take another screenshot and continue until the task is done

This is not scripted automation. The agent reasons about what it sees on screen and decides its next action in real time. It handles pop-ups, loading states, and unexpected UI changes because it’s making visual decisions, not following a fixed script.
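The loop described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not TapKit's implementation: the `Action` type, the `"done"` marker, the `max_turns` cap, and the three callables are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str         # hypothetical: "tap", "swipe", "type", or "done"
    detail: str = ""  # hypothetical free-form payload for the action


def run_agent_loop(
    take_screenshot: Callable[[], bytes],
    decide: Callable[[bytes, str], Action],
    execute: Callable[[Action], None],
    instruction: str,
    max_turns: int = 50,
) -> int:
    """Run screenshot -> reason -> act until the model reports "done".

    Returns the number of turns taken, counting the final "done" turn.
    """
    for turn in range(1, max_turns + 1):
        screenshot = take_screenshot()            # Screenshot
        action = decide(screenshot, instruction)  # Reason
        if action.kind == "done":
            return turn
        execute(action)                           # Act
    raise TimeoutError(f"task not finished after {max_turns} turns")
```

Because every turn starts from a fresh screenshot, an unexpected pop-up simply becomes part of the next input to `decide`, which is why no fixed script is needed.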

Session lifecycle

Create → Agent connects → Task execution → Session ends

Status	Description
`creating`	Session created, spinning up agent instance
`running`	Agent is connected and executing the task
`paused`	Session is paused, can be resumed
`completed`	Task finished successfully
`failed`	Task encountered an error
`killed`	Session was manually stopped
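Client code often needs to know whether a status is final. A minimal sketch, using the status strings from the table above; treating `completed`, `failed`, and `killed` as the final states is an assumption based on the statuses reported in session results, and the `SessionStatus` and `is_terminal` names are hypothetical.

```python
from enum import Enum


class SessionStatus(str, Enum):
    CREATING = "creating"
    RUNNING = "running"
    PAUSED = "paused"
    COMPLETED = "completed"
    FAILED = "failed"
    KILLED = "killed"


# Assumption: these three statuses are final and never change again.
TERMINAL = {SessionStatus.COMPLETED, SessionStatus.FAILED, SessionStatus.KILLED}


def is_terminal(status: str) -> bool:
    """Return True if the session has ended; raise ValueError on an unknown status."""
    return SessionStatus(status) in TERMINAL
```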

Session parameters

When creating a session, you provide:

Parameter	Type	Description
`phone_id`	`string`	Which phone to use (required)
`instruction`	`string`	Natural language description of what to do (required)
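As a rough sketch, a creation request carrying these two parameters might be built like this with Python's standard library. Only `phone_id` and `instruction` come from the table above. The host name, the `POST /v1/sessions` path (inferred from the monitoring endpoints), and the bearer-token header are assumptions; check the API reference for the real values.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not TapKit's real one


def build_create_request(phone_id: str, instruction: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a session-creation request."""
    if not phone_id or not instruction:
        raise ValueError("phone_id and instruction are both required")
    body = json.dumps({"phone_id": phone_id, "instruction": instruction}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/sessions",  # assumed creation endpoint
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without network access.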

Monitoring a session

While a session is active, you can:

Poll status — GET /v1/sessions/{session_id} returns current status, cost, duration, and turn count
Stream events — GET /v1/sessions/{session_id}/events returns real-time agent events (text, tool use, thinking)
Send messages — POST /v1/sessions/{session_id}/message sends a message to the running agent
Pause/resume — POST /v1/sessions/{session_id}/pause and /resume to control execution
Stop — POST /v1/sessions/{session_id}/stop to end the session early
Watch live — the Mac app shows the phone screen in real-time during the session
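Polling is the simplest of these options to script. The sketch below waits for a session to end by repeatedly calling a caller-supplied `fetch_session` function, which is expected to perform `GET /v1/sessions/{session_id}` and return the decoded JSON. The `"status"` key in that JSON, the polling interval, and the timeout are assumptions for illustration.

```python
import time
from typing import Callable

# Assumption: these statuses mean the session will not change again.
TERMINAL_STATUSES = {"completed", "failed", "killed"}


def wait_for_session(
    fetch_session: Callable[[str], dict],
    session_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the session reaches a final status, then return its last payload."""
    deadline = time.monotonic() + timeout_s
    while True:
        session = fetch_session(session_id)
        if session["status"] in TERMINAL_STATUSES:
            return session
        if time.monotonic() >= deadline:
            raise TimeoutError(f"session {session_id} still {session['status']} after {timeout_s}s")
        sleep(interval_s)
```

Injecting `fetch_session` and `sleep` keeps the loop independent of any particular HTTP client and makes it easy to test. For lower latency, the events stream is likely a better fit than polling.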

Session results

When a session ends, the response includes:

Status — completed, failed, or killed
Cost — session cost in USD (cost_usd)
Duration — total session time in milliseconds (duration_ms)
Turns — number of agent turns (num_turns)
Error — error message if the session failed
Timestamps — created_at, started_at, completed_at
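A small typed wrapper can make these fields easier to work with. The sketch below uses the documented field names `cost_usd`, `duration_ms`, and `num_turns`; the `status` and `error` key names are assumptions, and the `SessionResult` class itself is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionResult:
    status: str
    cost_usd: float
    duration_ms: int
    num_turns: int
    error: Optional[str] = None

    @classmethod
    def from_dict(cls, data: dict) -> "SessionResult":
        """Build a result from a decoded JSON response."""
        return cls(
            status=data["status"],            # assumed key name
            cost_usd=float(data["cost_usd"]),
            duration_ms=int(data["duration_ms"]),
            num_turns=int(data["num_turns"]),
            error=data.get("error"),          # assumed key name; absent on success
        )

    @property
    def duration_seconds(self) -> float:
        return self.duration_ms / 1000

    @property
    def succeeded(self) -> bool:
        return self.status == "completed"
```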

Billing and usage

Sessions consume plan time. Your plan includes a set amount of session minutes per month.

Session time is calculated from session start to end
Monitor usage in the dashboard under Sessions
Overages beyond your plan’s included minutes are billed separately
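To estimate overage from your own records, you can total session durations and compare them with your plan's allowance. This sketch assumes billed time equals each session's `duration_ms` with no per-session rounding, which may not match how TapKit actually meters usage; the function name is hypothetical.

```python
from typing import Iterable


def overage_minutes(durations_ms: Iterable[int], included_minutes: float) -> float:
    """Return minutes used beyond the plan allowance (0.0 if within it)."""
    used_minutes = sum(durations_ms) / 60_000  # milliseconds -> minutes
    return max(0.0, used_minutes - included_minutes)
```

For example, two sessions of 10 and 20 minutes against a 20-minute allowance give `overage_minutes([600_000, 1_200_000], 20)`, which returns `10.0`.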

View your sessions

Check session history and usage in the dashboard.
