Most gesture methods accept either an (x, y) coordinate tuple or a natural-language description of the target. When you pass a string, TapKit uses vision AI to locate the described element and performs the gesture on it. pan() and hold_and_drag() accept coordinates only.
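The distinction between the two target forms can be sketched as a small helper. This is only an illustration: classify_target is a hypothetical name, not part of TapKit, and how TapKit tells the two forms apart internally is not described here.

```python
from typing import Tuple, Union

# A target is either an (x, y) pair or a natural-language description.
Target = Union[Tuple[int, int], str]


def classify_target(target: Target) -> str:
    """Hypothetical helper: return "description" for a string, "coordinates" for an (x, y) pair."""
    if isinstance(target, str):
        if not target.strip():
            raise ValueError("description must not be empty")
        return "description"
    if (
        isinstance(target, tuple)
        and len(target) == 2
        and all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in target)
    ):
        return "coordinates"
    raise TypeError(f"expected an (x, y) tuple or a string, got {target!r}")
```

Running a check like this before calling a gesture method catches malformed targets, such as a three-element tuple, before anything is sent to the device.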

Taps

# Coordinates
phone.tap((100, 200))
phone.double_tap((100, 200))
phone.hold((100, 200), duration_ms=1000)

# Natural language (vision AI)
phone.tap("the blue Submit button")
phone.double_tap("the photo to zoom in")
phone.hold("the app icon to delete")

| Method | Parameters | Description |
|---|---|---|
| tap(target) | coordinates or description | Single tap |
| double_tap(target) | coordinates or description | Double tap |
| hold(target, duration_ms=1000) | coordinates or description, hold time (ms) | Long press |

Swipes

# Coordinates
phone.flick((500, 1000), direction="up")
phone.pan((500, 1000), direction="up", duration_ms=500)

# Natural language (vision AI) - flick only
phone.flick("the message list", "down")
phone.flick("the photo gallery", "left")

| Method | Parameters | Selector support | Description |
|---|---|---|---|
| flick(target, direction) | coordinates or description, direction | Yes | Quick swipe gesture |
| pan(point, direction, duration_ms=500) | coordinates only | No | Controlled scroll |

Directions: "up", "down", "left", "right"
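Because pan() takes coordinates only while flick() takes either form, a thin wrapper can enforce the rule before a call is made. The sketch below is an assumption-laden illustration: scroll and SwipeRecorder are hypothetical names, and SwipeRecorder is a stand-in that only records calls so the example runs without a device.

```python
DIRECTIONS = ("up", "down", "left", "right")


class SwipeRecorder:
    """Stand-in for a TapKit phone object (hypothetical); it only records calls."""

    def __init__(self):
        self.calls = []

    def flick(self, target, direction):
        self.calls.append(("flick", target, direction))

    def pan(self, point, direction, duration_ms=500):
        self.calls.append(("pan", point, direction, duration_ms))


def scroll(phone, target, direction, controlled=False, duration_ms=500):
    """Hypothetical wrapper: flick by default, pan when a controlled scroll is requested."""
    if direction not in DIRECTIONS:
        raise ValueError(f"direction must be one of {DIRECTIONS}, got {direction!r}")
    if controlled:
        # pan() has no selector support, so reject descriptions up front.
        if isinstance(target, str):
            raise TypeError("pan() accepts coordinates only, not a description")
        phone.pan(target, direction=direction, duration_ms=duration_ms)
    else:
        phone.flick(target, direction)


phone = SwipeRecorder()
scroll(phone, "the message list", "down")
scroll(phone, (500, 1000), "up", controlled=True)
```

With a real phone object in place of the recorder, the same wrapper would forward to flick() and pan() as documented in the table above.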

Drag

# Coordinates
phone.drag((100, 200), (300, 400))
phone.hold_and_drag((100, 200), (300, 400), hold_duration_ms=500)

# Natural language (vision AI) - drag only
phone.drag("the slider handle", "the right end of the slider")
phone.drag("the file icon", "the trash folder")

| Method | Parameters | Selector support | Description |
|---|---|---|---|
| drag(from_target, to_target) | both coordinates or both descriptions | Yes | Drag between points |
| hold_and_drag(from_point, to_point, hold_duration_ms=500) | coordinates only | No | Hold before dragging |

For drag() with selectors, both arguments must be strings. You cannot mix coordinates and descriptions.
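The no-mixing rule for drag() can be checked before the call is made. The following is a minimal sketch: checked_drag and DragRecorder are hypothetical names, and how TapKit itself reacts to mixed arguments is not stated here, so the TypeError is this example's own choice.

```python
class DragRecorder:
    """Stand-in for a TapKit phone object (hypothetical); it only records drag calls."""

    def __init__(self):
        self.calls = []

    def drag(self, from_target, to_target):
        self.calls.append((from_target, to_target))


def checked_drag(phone, from_target, to_target):
    """Hypothetical guard: forward to drag() only if both targets use the same form."""
    both_descriptions = isinstance(from_target, str) and isinstance(to_target, str)
    both_points = isinstance(from_target, tuple) and isinstance(to_target, tuple)
    if not (both_descriptions or both_points):
        raise TypeError(
            "drag() needs two coordinate pairs or two descriptions, "
            f"got {from_target!r} and {to_target!r}"
        )
    phone.drag(from_target, to_target)


phone = DragRecorder()
checked_drag(phone, (100, 200), (300, 400))
checked_drag(phone, "the file icon", "the trash folder")
```

A call such as checked_drag(phone, (100, 200), "the trash folder") raises TypeError before any gesture is attempted.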

Pinch

# Coordinates
phone.pinch((500, 500), action="pinch_out")

# Natural language (vision AI)
phone.pinch("the map", "pinch_out")
phone.pinch("the photo", "pinch_in")

| Method | Parameters | Description |
|---|---|---|
| pinch(target, action) | coordinates or description, action | Pinch/zoom/rotate gesture |

| Action | Effect |
|---|---|
| pinch_out | Zoom in (fingers apart) |
| pinch_in | Zoom out (fingers together) |
| rotate_cw | Rotate clockwise |
| rotate_ccw | Rotate counter-clockwise |
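Since the action is passed as a plain string, a typo such as "pinch-out" is easy to make. One way to catch it early is to keep the four documented values in an enum. This is a sketch only: PinchAction and pinch_effect are hypothetical names, not TapKit types.

```python
from enum import Enum


class PinchAction(str, Enum):
    """The four action strings listed in the table above (hypothetical wrapper type)."""

    PINCH_OUT = "pinch_out"
    PINCH_IN = "pinch_in"
    ROTATE_CW = "rotate_cw"
    ROTATE_CCW = "rotate_ccw"


EFFECTS = {
    PinchAction.PINCH_OUT: "Zoom in (fingers apart)",
    PinchAction.PINCH_IN: "Zoom out (fingers together)",
    PinchAction.ROTATE_CW: "Rotate clockwise",
    PinchAction.ROTATE_CCW: "Rotate counter-clockwise",
}


def pinch_effect(action: str) -> str:
    """Look up the documented effect; PinchAction(...) raises ValueError for unknown strings."""
    return EFFECTS[PinchAction(action)]
```

When calling pinch(), passing PinchAction.PINCH_OUT.value yields the plain string "pinch_out" that the examples above use.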

Tips for Selectors

  • Be specific: "the blue Submit button" works better than "button"
  • Include visual details: color, position, text content
  • Target clearly visible, distinct UI elements; selectors work best on these
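The tips above can be applied mechanically by assembling a description from the details you know about the element. The helper below is a hypothetical convenience, not a TapKit function; it simply joins the pieces into a phrase like the ones used in the examples on this page.

```python
from typing import Optional


def describe(
    element: str,
    color: Optional[str] = None,
    text: Optional[str] = None,
    position: Optional[str] = None,
) -> str:
    """Hypothetical helper: build a specific selector string from visual details."""
    words = ["the"]
    if color:
        words.append(color)
    if text:
        words.append(text)
    words.append(element)
    if position:
        words.append(position)
    return " ".join(words)


selector = describe("button", color="blue", text="Submit")
# selector == "the blue Submit button"
```

The resulting string can be passed wherever a description is accepted, for example as the target of tap().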