Quickstart
Begin development with the General Agents API.
Create and export an API key
To access the General Agents API, create an API key in the platform's API Keys dashboard, then set it as an environment variable:
export GENERALAGENTS_API_KEY=your-api-key
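When calling the API from Python, it can help to fail early if the key was never exported. The following is a minimal sketch under that assumption: load_api_key is a hypothetical helper, not part of the generalagents package, and only the GENERALAGENTS_API_KEY variable name comes from the step above.

```python
import os


def load_api_key(env=None):
    """Return the API key from the environment, or raise a clear error.

    load_api_key is a hypothetical helper for illustration only; it is not
    part of the generalagents package.
    """
    # Allow a dict to be passed in (handy for tests); default to the real environment.
    env = os.environ if env is None else env
    key = env.get("GENERALAGENTS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GENERALAGENTS_API_KEY is not set; export it before running."
        )
    return key
```

Calling load_api_key() with no arguments reads the variable exported in the shell.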
Make a call to REST API
Make a test call to the /v1/control/predict endpoint. It takes a screenshot and the agent's previous actions, and returns the next action to execute:
curl -X POST "https://api.generalagents.com/v1/control/predict" \
-H "Authorization: Bearer $GENERALAGENTS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"instruction": "Open hello world! Wikipedia page",
"model": "ace-control-small",
"image_url": "https://storage.googleapis.com/general-agents-screenshots/wiki.png",
"previous_actions": [{"kind": "left_click", "coordinate": {"x": 500, "y": 600}}]
}'
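The same request can be sketched in Python using only the standard library. This is an illustration, not the official client: build_predict_request and send are hypothetical helper names, the JSON fields mirror the curl call above, and treating the response body as JSON is an assumption (the actual schema is documented in the API Reference).

```python
import json
import urllib.request

API_URL = "https://api.generalagents.com/v1/control/predict"


def build_predict_request(api_key, instruction, image_url,
                          previous_actions=None, model="ace-control-small"):
    """Build (but do not send) the same POST request as the curl example.

    Hypothetical helper for illustration; field names are copied from the
    curl payload.
    """
    payload = {
        "instruction": instruction,
        "model": model,
        "image_url": image_url,
        "previous_actions": previous_actions or [],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the body, assuming the API returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())
```

Building the request separately from sending it makes the payload easy to inspect before any network call. To run it, pass the exported key and the values from the curl example to build_predict_request, then pass the result to send.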
For this test, the image is passed as a publicly available URL. In actual usage, you will most likely want to pass it as a base64-encoded data URI:
...
"image_url": "data:image/png;base64,iVBORw0KGgoAAA..."
...
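A data URI of this form can be produced from raw image bytes with the standard library. This is a minimal sketch: to_data_uri is a hypothetical helper, and the MIME type you pass should match the actual format of the image.

```python
import base64


def to_data_uri(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a base64 data URI.

    to_data_uri is a hypothetical helper for illustration; mime_type must
    describe the real format of image_bytes.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

For a screenshot saved on disk, read the file in binary mode (for example with pathlib.Path.read_bytes) and pass the bytes to to_data_uri.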
Run a “hello world” example with computer control
To run a full example, you need to execute the actions returned by the agent. You can use the generalagents client library to do that. Create a Python file hello.py:
from generalagents import Agent
from generalagents.macos import Computer

agent = Agent(model="ace-small")
computer = Computer()

instruction = "Open hello world! Wikipedia page"
session = agent.start(instruction)
observation = computer.observe()  # initial observation before any action

for _ in range(20):  # max actions
    action = session.plan(observation)  # ask the agent for the next action
    if action.kind == "stop":  # the agent signals that it is done
        break
    observation = computer.execute(action)  # run the action, get a new observation
Run it:
uv run --with generalagents hello.py
Add General Agents API client to your project
The General Agents API client is a regular Python package. You can install it into your project using uv, pip, or another Python project manager.
uv add generalagents
Clients for other programming languages coming soon!
API Reference
For full documentation of the REST API, see the API Reference.