From XPath to Natural Language: AI Browser Automation with Stagehand and OpenAI

Recently, a client asked me to automatically retrieve the latest product prices from Google for a specific item - for example: Segway-Ninebot F3 Pro D 500W.

Price comparison on Google Shopping for a specific item

The Old Way: Fragile and Frustrating

Early in my career, I often used tools such as Selenium and Cypress, and more recently Playwright, for crawling and end-to-end testing.

Navigating and parsing the DOM with querySelector, XPath or CSS selectors is a nightmare - especially if your selector looks like this one:

/html[1]/body[1]/div[1]/div[3]/form[1]/div[1]/div[1]/div[2]/div[4]/div[2]/div[1]/div[1]/ul[1]/li[1]/div[1]/div[2]/div[1] <– Ouch! 🤯

The New Way: AI-Powered Web Automation

Fortunately, there is now a new player in this space: Stagehand by Browserbase - an open-source AI-powered automation framework that combines Playwright with natural language prompting.

Instead of struggling with selectors, you describe actions in natural language - and Stagehand does the rest.

I'll show you how to use Stagehand to create an automated price scraper for Google Shopping in just a few minutes.

Stagehand - The AI Browser Automation Framework.

So let's return to our use case. We want an automated price comparison for a specific product; the Segway-Ninebot F3 Pro D 500W serves as the example.

Disclaimer: This project is intended for demonstration and prototyping purposes only. It is not ready for production and should not be used in live environments.

Setup: Scaffolding the Project

To get started, make sure you have the latest Node.js LTS version installed.

Then scaffold your project with:

npx create-browser-app

During setup, you will be asked for an API key. I used OpenAI, but you can also use a local Ollama instance if you prefer to run things offline.

Once the project is created, install the dependencies:

npm i

In short: after you have chosen a project name, entered the API key for your preferred AI service and selected a local setup, you end up with a ready-to-run project.
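Before running anything, it can help to fail fast when the API key is missing. The following is only a sketch: requireEnv is a hypothetical helper of mine (not part of Stagehand), and the variable name OPENAI_API_KEY is an assumption based on the usual OpenAI convention, so check the generated .env file for the name your project actually uses.

```typescript
// Hypothetical helper, not part of Stagehand: fail fast when a required
// environment variable is missing or empty.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// Possible usage in index.ts (assumption: the key is named OPENAI_API_KEY):
// const apiKey = requireEnv(process.env, "OPENAI_API_KEY");
```

Passing the environment in as a parameter keeps the helper easy to test without touching the real process environment.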

Our high level concept

We want to implement the following process 👇

1. Open Google and accept the cookie consent

Navigate to google.com and accept the consent

2. Search for Segway-Ninebot F3 Pro D 500W

Search for the escooter and click 'Google Search'

3. Go through the Google Shopping results and extract each offer in a pre-defined structured format

Go through the results and extract each offer in a pre-defined structured format

The expected result is a JSON array with one object per offer, each containing the attributes name, price and vendor, which can then be used for comparison.

[
    {
        "name": "Segway Ninebot F3 Pro D (20 km/h, 1200 W, 70 km, Grau, Schwarz, Rot)",
        "price": "CHF 539.95",
        "vendor": "Interdiscount"
    },
    ...
]
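To work with this result in TypeScript, the shape can be written down as a type plus a small runtime check. This is a minimal sketch; the names Offer, isOffer and parseOffers are my own for illustration and do not come from Stagehand.

```typescript
// Illustrative type for one extracted offer (name chosen for this sketch).
interface Offer {
  name: string;
  price: string; // kept as text, e.g. "CHF 539.95"
  vendor: string;
}

// Runtime check that an unknown value has the expected shape.
function isOffer(value: unknown): value is Offer {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.name === "string" &&
    typeof candidate.price === "string" &&
    typeof candidate.vendor === "string"
  );
}

// Parse a JSON string and keep only well-formed offers.
function parseOffers(json: string): Offer[] {
  const data: unknown = JSON.parse(json);
  return Array.isArray(data) ? data.filter(isOffer) : [];
}
```

A check like this is mainly useful if you store the results in a file and read them back later for the comparison.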

Let's build it

Let's come back to Stagehand. It introduces a few powerful primitives:

  • act: Perform actions based on natural language (e.g., click, type, select)
  • observe: Preview what actions are possible
  • extract: Use AI to extract structured data from the DOM

You should now have a Stagehand project with scaffolding and the dependencies installed. So let's get to the code.

Open your index.ts file. Here's how we implement our steps using Stagehand.

1. Open Google and accept the cookie consent

You can use page.goto to navigate to a specific page, in our case google.com.

await page.goto("https://google.com");

Then we accept the cookie consent via act, which executes an action.

await page.act('Click "Accept all" on the cookie banner');

No fiddling with modal dialogs or selectors - just describe the action. 😄
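One caveat: the cookie banner does not necessarily appear in every region or session, so this step is best treated as optional. The sketch below wraps act in a hypothetical helper, tryAct, that reports failure instead of aborting the run. I have not verified how act signals a missing element, so the helper treats both a thrown error and a returned object with success: false as "not done"; both are assumptions. ActCapable is a deliberately minimal stand-in for the page type.

```typescript
// Minimal structural type so the helper works with any object that has an
// act() method; this is a simplification, not Stagehand's full page type.
interface ActCapable {
  act(instruction: string): Promise<unknown>;
}

// Try an optional action; return false instead of failing the whole run.
async function tryAct(page: ActCapable, instruction: string): Promise<boolean> {
  try {
    const result = await page.act(instruction);
    // Assumption: a result object may carry a boolean `success` flag.
    if (typeof result === "object" && result !== null && "success" in result) {
      return (result as { success: unknown }).success === true;
    }
    return true;
  } catch {
    return false; // e.g. no banner was shown in this session
  }
}

// Possible usage:
// await tryAct(page, 'Click "Accept all" on the cookie banner');
```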

2. Enter the search query

Now you can enter your search term in Google.

First, focus the input box via act:

await page.act("Click in the input box");

Then observe and perform the typing:

// Use observe() to plan an action before doing it
const [action] = await page.observe(
    "Type 'Segway-Ninebot F3 Pro D 500W'",
);
await drawObserveOverlay(page, [action]); // Highlight the input box
await page.waitForTimeout(1_000);
await clearOverlays(page); // Remove the highlight before typing
await page.act(action); // Take the action

In the code above, observe prepares the action without executing it. This lets you preview (and, here, highlight) what is about to happen before act carries it out.
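Note that destructuring the first element gives undefined if observe finds nothing, so it can be worth guarding that case. The following sketch shows the observe-then-act pattern with such a guard. observeThenAct and ObserveCapable are hypothetical names, the interface is a simplified stand-in for the real page type, and the assumption that observe returns an empty array when nothing matches is mine.

```typescript
// Simplified structural type; A stands for whatever observe() returns.
interface ObserveCapable<A> {
  observe(instruction: string): Promise<A[]>;
  act(action: A): Promise<unknown>;
}

// Plan with observe(), then run only the first suggestion.
// Throws a descriptive error when nothing matching was found.
async function observeThenAct<A>(
  page: ObserveCapable<A>,
  instruction: string,
): Promise<A> {
  const [action] = await page.observe(instruction);
  if (action === undefined) {
    throw new Error(`observe() found no action for: ${instruction}`);
  }
  await page.act(action);
  return action;
}
```

Returning the planned action makes it easy to log what was actually executed.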

Finally, click the Google Search button:

await page.act("Click 'Google Search'");

After an action that triggers navigation or loads new content, give the browser time to finish before continuing. A short fixed pause is the simplest option:

await page.waitForTimeout(1_000); // wait one second
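A fixed pause is simple, but it is either too long or too short. One alternative is to poll for a condition with an upper time limit. The helper below is a generic sketch: waitUntil is my own name, not a Stagehand or Playwright API, and the condition you pass in is up to you.

```typescript
// Generic polling helper: resolves true as soon as check() returns true,
// or false once timeoutMs has elapsed.
async function waitUntil(
  check: () => boolean | Promise<boolean>,
  timeoutMs = 5_000,
  intervalMs = 100,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await check()) return true;
    if (Date.now() >= deadline) return false;
    // Sleep for one interval before checking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Possible usage (assumption: the results page URL contains "search"):
// const arrived = await waitUntil(() => page.url().includes("search"));
```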

3. Extract Shopping Results as Structured Data

Now you can go through the Google Shopping listings and extract each one in a structured format. The great thing is that you can use a Zod schema to specify the expected result.

// Use extract() to pull structured data from the page.
// Assumes `z` is imported at the top of index.ts: import { z } from "zod";
const { products } = await page.extract({
    instruction:
        "extract all Segway-Ninebot F3 Pro D 500W from the page, including their name, price, vendor",
    schema: z.object({
        products: z.array(
            z.object({
                name: z.string(),
                price: z.string(),
                vendor: z.string(),
            }),
        ),
    }),
});

In our example, we need a name, a price and a vendor, all defined as strings. Of course, you could also define the price as a number. The structure and format are up to you.
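If you keep the price as a string but still want to compare amounts, you can convert it afterwards. The sketch below is one way to do that under narrow assumptions: a three-letter currency code comes first, "." is the decimal separator, and "'" or "," may appear as a thousands separator. Other formats return null. ParsedPrice and parsePrice are illustrative names. (Alternatively, declaring the field as z.number() in the schema leaves the conversion to the model.)

```typescript
interface ParsedPrice {
  currency: string;
  amount: number;
}

// Parse strings such as "CHF 539.95" or "CHF 1'299.00".
// Returns null for anything that does not match the assumed format.
function parsePrice(text: string): ParsedPrice | null {
  const match = /^\s*([A-Z]{3})\s*(\d[\d',]*(?:\.\d+)?)\s*$/.exec(text);
  if (match === null) return null;
  const [, currency, digits] = match;
  if (currency === undefined || digits === undefined) return null;
  // Drop thousands separators before converting to a number.
  const amount = Number(digits.replace(/[',]/g, ""));
  return Number.isFinite(amount) ? { currency, amount } : null;
}
```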

The result is the array products with the extracted offers.

Pretty cool, right??? 😄

Run the Browser Automation

Just run 👇

npm run start

See the example in action

And voilà! After a few seconds, you get the results as a JSON array - without XPath and without selector spaghetti.
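With the array in hand, the actual comparison is a few lines of plain TypeScript. This sketch assumes the price is already a number (extracted that way or converted beforehand); PricedOffer, sortByPrice and cheapest are illustrative names of mine.

```typescript
// Offer with a numeric price (assumption: converted or extracted as number).
interface PricedOffer {
  name: string;
  price: number;
  vendor: string;
}

// Return offers sorted by ascending price without mutating the input.
function sortByPrice(offers: readonly PricedOffer[]): PricedOffer[] {
  return [...offers].sort((a, b) => a.price - b.price);
}

// Cheapest offer, or undefined for an empty list.
function cheapest(offers: readonly PricedOffer[]): PricedOffer | undefined {
  return sortByPrice(offers)[0];
}
```

Copying the array before sorting keeps the original extraction order intact, which is handy if you also want to log the raw results.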

Final thoughts

Stagehand is a fantastic tool for anyone involved in web scraping or browser automation. It makes the process intuitive, robust and - dare I say it - fun.

TL;DR: Use natural language instead of brittle selectors. Let the AI do the heavy lifting.

Have fun automating! 🚀

Resources

You'll find the example project here 👇

GitHub - bitsmuggler/ai-product-price-search: Example Google Shopping Price Scraper with Stagehand + OpenAI