POST /phones/{phone_id}/pinch/select
Pinch by Description
Pinch, zoom, or rotate on an element described in natural language. Uses vision AI to find the element and perform the gesture.

Request

curl -X POST https://api.tapkit.ai/phones/{phone_id}/pinch/select \
  -H "X-API-Key: TK_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"selector": "the map", "action": "pinch_out"}'

Path Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| phone_id | string | The phone identifier |

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| async | boolean | false | Return immediately with job ID |
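
As a rough sketch of how the path and query parameters combine, the Python below assembles the request with the standard library only and does not send it. The helper name `build_pinch_select_request` is hypothetical, and sending the flag as the literal `async=true` is an assumption: the reference only states that the parameter is a boolean.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.tapkit.ai"


def build_pinch_select_request(phone_id, selector, action, api_key, run_async=False):
    """Assemble (but do not send) a POST request for pinch/select.

    Hypothetical helper for illustration; not part of any official SDK.
    """
    # Percent-encode the phone ID so it is safe to embed in the path.
    path = f"/phones/{urllib.parse.quote(phone_id, safe='')}/pinch/select"
    url = BASE_URL + path
    if run_async:
        # Assumption: the boolean query parameter is sent as "true".
        url += "?" + urllib.parse.urlencode({"async": "true"})
    body = json.dumps({"selector": selector, "action": action}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )


req = build_pinch_select_request(
    "abc123", "the map", "pinch_out", "TK_your_api_key", run_async=True
)
print(req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; that step is left out here so the sketch runs without network access or a real key.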

Request Body

{
  "selector": "the map",
  "action": "pinch_out"
}

| Field | Type | Description |
| --- | --- | --- |
| selector | string | Natural language description of the element |
| action | string | "pinch_in", "pinch_out", "rotate_cw", or "rotate_ccw" |

Actions

| Action | Effect |
| --- | --- |
| pinch_out | Zoom in (fingers apart) |
| pinch_in | Zoom out (fingers together) |
| rotate_cw | Rotate clockwise |
| rotate_ccw | Rotate counter-clockwise |
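
Because `action` accepts only the four values listed above, a client may want to reject typos before making a request. The following is a minimal sketch of such a check; the function name `make_pinch_select_body` is hypothetical, and rejecting an empty selector is an assumption rather than documented API behavior.

```python
# The four action values listed in the Actions table.
VALID_ACTIONS = {"pinch_in", "pinch_out", "rotate_cw", "rotate_ccw"}


def make_pinch_select_body(selector, action):
    """Return the request body as a dict, rejecting unknown actions.

    Hypothetical client-side helper; the API may validate differently.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(
            f"unknown action {action!r}; expected one of {sorted(VALID_ACTIONS)}"
        )
    # Assumption: an empty or whitespace-only description is never useful.
    if not selector.strip():
        raise ValueError("selector must be a non-empty description")
    return {"selector": selector, "action": action}


body = make_pinch_select_body("the map", "pinch_out")
print(body)
```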

Response

Synchronous

{
  "id": "job_abc123",
  "status": "completed",
  "result": {},
  "created_at": "2024-01-15T10:30:00Z",
  "completed_at": "2024-01-15T10:30:02Z"
}

Asynchronous

{
  "job_id": "job_abc123"
}
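
The two example payloads use different keys for the job identifier (`id` in the synchronous response, `job_id` in the asynchronous one). A client that handles both modes could normalize them along the lines below. This is a sketch based only on the sample payloads shown here; the helper names are hypothetical, and the full set of `status` values is not documented in this section.

```python
def extract_job_id(response):
    """Return the job ID from either response shape shown above."""
    # Asynchronous responses carry "job_id"; synchronous ones carry "id".
    if "job_id" in response:
        return response["job_id"]
    return response["id"]


def is_finished(response):
    """Report whether the response describes a completed job."""
    # Only the synchronous shape includes "status"; an asynchronous
    # response has none, so it is treated as not yet finished.
    return response.get("status") == "completed"


sync_response = {
    "id": "job_abc123",
    "status": "completed",
    "result": {},
    "created_at": "2024-01-15T10:30:00Z",
    "completed_at": "2024-01-15T10:30:02Z",
}
async_response = {"job_id": "job_abc123"}

print(extract_job_id(sync_response), is_finished(sync_response))
print(extract_job_id(async_response), is_finished(async_response))
```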

Examples

Zoom In on a Map

curl -X POST https://api.tapkit.ai/phones/abc123/pinch/select \
  -H "X-API-Key: TK_..." \
  -H "Content-Type: application/json" \
  -d '{"selector": "the map view", "action": "pinch_out"}'

SDK Usage

The Python SDK exposes this endpoint through the pinch() method when it is given a string selector:
phone.pinch("the map", "pinch_out")      # Zoom in
phone.pinch("the photo", "pinch_in")     # Zoom out
phone.pinch("the image", "rotate_cw")    # Rotate clockwise

Related

  • Pinch - Pinch at specific coordinates
  • Double Tap - Alternative zoom method