Tap and Hold by Description
Long press an element using natural language description
POST
Long press an element on screen by describing it in natural language. The endpoint uses vision AI to find and hold the described element.
Request
Path Parameters
| Parameter | Type | Description |
|---|---|---|
| phone_id | string | The phone identifier |
Query Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| async | boolean | false | Return immediately with a job ID |
Request Body
| Field | Type | Default | Description |
|---|---|---|---|
| selector | string | required | Natural language description of the element to hold |
| duration_ms | integer | 1000 | How long to hold, in milliseconds |
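
The parameters above can be assembled into a request as in the following minimal sketch. It uses only the Python standard library and builds the request without sending it. The helper name `build_hold_request` and its `endpoint_url` argument are hypothetical: pass the full URL of this endpoint for your `phone_id`. Encoding the `async` flag as the strings `true`/`false` is an assumption, and authentication headers are omitted.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_hold_request(endpoint_url, selector, duration_ms=1000, run_async=False):
    """Build (but do not send) a POST request for this endpoint.

    endpoint_url is a placeholder supplied by the caller; this helper
    is illustrative and not part of any official SDK.
    """
    if not selector:
        raise ValueError("selector is required")
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")

    # Query parameter from the table above; "true"/"false" encoding is assumed.
    query = urlencode({"async": "true" if run_async else "false"})

    # Request body fields from the table above, serialized as JSON.
    body = json.dumps({"selector": selector, "duration_ms": duration_ms}).encode("utf-8")

    return Request(
        f"{endpoint_url}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate makes the payload easy to inspect before any device action is triggered.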
Response
Synchronous
Asynchronous
Examples
Hold to Delete App
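
As an illustration of this scenario, a request body might look like the one built below. The selector wording is a hypothetical example, not text taken from the API reference, and `duration_ms` is set to the documented default.

```python
import json

# Hypothetical selector text; duration_ms is the documented default of 1000.
payload = {
    "selector": "the Notes app icon on the home screen",
    "duration_ms": 1000,
}

# Serialize the body as JSON, ready to send in the POST request.
body = json.dumps(payload)
print(body)
```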
SDK Usage
The Python SDK provides this through the `hold()` method when it is called with a string argument.
Related Endpoints
- Tap and Hold - Long press at specific coordinates
- Tap by Description - Single tap using natural language