Python Runtime Sandbox for Gemini Computer Use Agent
This example implements a simple Python server in a sandbox container that can run the computer-use-preview agent.
It includes a FastAPI server that can execute browser tasks and a Python script to test it (tester.py).
Setup
To run the agent, you need to provide a Gemini API key. You can do this by setting the GEMINI_API_KEY environment variable.
export GEMINI_API_KEY="YOUR_GEMINI_API_KEY"
You also need to install the required Python libraries:
pip install kubernetes requests
Prerequisites
The runtime is based on the computer-use-preview repository. You can clone it yourself, or let the Docker build clone it automatically:
git clone https://github.com/google-gemini/computer-use-preview
Running the Docker Test Script (run-test-docker.sh)
The run-test-docker.sh script provides a way to build and test the sandboxed agent locally using Docker. It supports both non-interactive and interactive modes.
Flags
- --interactive: Runs the script in interactive mode.
- --nobuild: Skips the Docker image build step.
Non-Interactive Mode (Default)
By default, the script runs in non-interactive mode. It will:
- Build the Docker image (unless --nobuild is specified).
- Start the container in the background.
- Run the tester.py script, which sends a predefined query to the agent running in the container.
- Stop and remove the container.
To run in non-interactive mode:
./run-test-docker.sh
The tester.py script acts as a client for the Python API server. It sends a query to the /agent endpoint and prints the standard output, standard error, and exit code from the response.
Usage:
python tester.py [ip] [port]
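The client behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual tester.py: it uses the standard library's urllib instead of requests to stay self-contained, the helper names (build_agent_request, run_query, main) and the default address 127.0.0.1:8080 are hypothetical, and the JSON field names are taken from the AgentQuery and AgentResponse models described in this document.

```python
import json
import os
import urllib.request


def build_agent_request(ip: str, port: int, query: str, api_key: str) -> urllib.request.Request:
    """Build a POST request for the /agent endpoint with a JSON body."""
    body = json.dumps({"query": query, "api_key": api_key}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{ip}:{port}/agent",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(ip: str, port: int, query: str) -> dict:
    """Send the query to the server and return the decoded JSON response."""
    # Assumes GEMINI_API_KEY was exported as described in Setup.
    api_key = os.environ["GEMINI_API_KEY"]
    request = build_agent_request(ip, port, query, api_key)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def main(argv: list) -> None:
    """Mirror the `python tester.py [ip] [port]` usage line."""
    # Hypothetical defaults; the real script may use different values.
    ip = argv[1] if len(argv) > 1 else "127.0.0.1"
    port = int(argv[2]) if len(argv) > 2 else 8080
    result = run_query(ip, port, "Go to Google and type 'Hello World' into the search bar")
    print("stdout:", result["stdout"])
    print("stderr:", result["stderr"])
    print("exit code:", result["exit_code"])
```

Calling main(sys.argv) from a script entry point would reproduce the command-line usage. Keeping request construction separate from the network call makes the payload easy to inspect without a running container.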
Interactive Mode
You can run the script in interactive mode by using the --interactive flag. This is useful for running custom queries against the agent inside the container. In this mode, the script will:
- Build the Docker image (unless --nobuild is specified).
- Start the container in the background.
- Execute a sample query (python computer-use-preview/main.py --query "Go to Google and type 'Hello World' into the search bar") inside the container, attaching your terminal to it.
To run in interactive mode:
./run-test-docker.sh --interactive
To run in interactive mode without rebuilding the image:
./run-test-docker.sh --nobuild --interactive
Python Classes in main.py
The main.py file defines the following Pydantic models to ensure type-safe data for the API endpoints:
AgentQuery
This class models the request body for the /agent endpoint.
- query: str: The natural language query for the browser agent to execute.
- api_key: str: The Gemini API key for authenticating with the agent.
AgentResponse
This class models the response body for the /agent endpoint.
- stdout: str: The standard output from the agent execution.
- stderr: str: The standard error from the agent execution.
- exit_code: int: The exit code of the agent execution.
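Based on the fields listed above, the two models might look roughly like this. It is a sketch reconstructed from the field descriptions, assuming the pydantic package is installed; the actual definitions in main.py may carry extra validation or defaults. The is_valid_query helper is hypothetical and only demonstrates how Pydantic rejects an incomplete request body.

```python
from pydantic import BaseModel, ValidationError


class AgentQuery(BaseModel):
    """Request body for the /agent endpoint."""

    query: str    # natural language query for the browser agent
    api_key: str  # Gemini API key used by the agent


class AgentResponse(BaseModel):
    """Response body for the /agent endpoint."""

    stdout: str     # standard output from the agent execution
    stderr: str     # standard error from the agent execution
    exit_code: int  # exit code of the agent execution


def is_valid_query(payload: dict) -> bool:
    """Hypothetical helper: report whether a payload satisfies AgentQuery."""
    try:
        AgentQuery(**payload)
    except ValidationError:
        return False
    return True
```

When models like these are used as FastAPI parameter and return types, a request missing query or api_key is rejected before the handler runs.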
Testing on a local kind cluster using agent-sandbox
To test the sandbox on a local kind cluster, you can use the run-test-kind.sh script. This script automates the entire process of setting up a local Kubernetes cluster, deploying the sandbox, and running the integration tests.
This script will:
- Create a kind cluster (if it doesn’t exist).
- Build and deploy the agent sandbox controller to the cluster.
- Build the Python runtime sandbox image.
- Load the image into the kind cluster.
- Deploy the sandbox and run the test_computer_use_extension.py integration tests.
- Clean up all the resources.
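Two of the steps above, creating the cluster only if it is missing and loading the image into it, can be illustrated with a short sketch. This is not the content of run-test-kind.sh: the function names and any cluster or image name passed in are hypothetical. The only kind commands used are kind get clusters, kind create cluster --name, and kind load docker-image.

```python
import subprocess
from typing import List


def list_kind_clusters() -> List[str]:
    """Return the names of existing kind clusters (requires the kind CLI)."""
    result = subprocess.run(
        ["kind", "get", "clusters"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


def plan_kind_commands(existing_clusters: List[str], cluster_name: str, image: str) -> List[List[str]]:
    """Plan the kind commands needed to make an image available in a cluster."""
    commands = []
    # Create the cluster only if it does not already exist.
    if cluster_name not in existing_clusters:
        commands.append(["kind", "create", "cluster", "--name", cluster_name])
    # Load the locally built image so pods can use it without a registry.
    commands.append(["kind", "load", "docker-image", image, "--name", cluster_name])
    return commands


def apply_commands(commands: List[List[str]]) -> None:
    """Run each planned command, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Separating the planning step from execution keeps the idempotent "create if it doesn't exist" logic checkable without a kind installation.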
To run the script from the project root:
./examples/gemini-cu-sandbox/run-test-kind.sh