Summarize a webpage

Overview

This recipe demonstrates how to use Airtop to automate the summarization of a webpage. By leveraging Airtop’s cloud browser capabilities, we can extract a concise summary from any webpage using a simple API.

The instructions below will walk through creating a script that connects to Airtop, opens a webpage in a cloud browser session, and retrieves a summary of its content.

The full source code is available on GitHub for TypeScript and Python.

Prerequisites

To get started, ensure you have an Airtop API key (it is needed when you configure your environment below) and Node.js installed on your system.

Getting Started

  1. Clone the repository

    Start by cloning the source code from GitHub:

    NodeJS
    $ git clone https://github.com/airtop-ai/recipe-summarize.git
    $ cd recipe-summarize
  2. Install dependencies

    Run the following command to install the necessary dependencies, including the Airtop SDK:

    NodeJS
    $ npm install
  3. Configure your environment

    You will need to provide your Airtop API key in a .env file. First, copy the provided example .env file:

    $ cp .env.example .env

    Now edit the .env file to add your Airtop API key:

    AIRTOP_API_KEY=<YOUR_API_KEY>
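As a minimal sketch of how a script might pick up this value, the helper below reads AIRTOP_API_KEY from an environment map and fails early with a clear message if the key is missing or still set to the placeholder. The name `readApiKey` is hypothetical and is not part of the recipe or the Airtop SDK. The sketch also assumes the .env file has already been loaded into the process environment by whatever loader the project uses.

```typescript
// Hypothetical helper (not part of the Airtop SDK): validate the API key
// before any network calls are made.
type Env = Record<string, string | undefined>;

function readApiKey(env: Env): string {
  const key = (env["AIRTOP_API_KEY"] ?? "").trim();
  // Treat an empty value or the untouched placeholder as "not configured".
  if (key === "" || key === "<YOUR_API_KEY>") {
    throw new Error("AIRTOP_API_KEY is not set; add it to your .env file");
  }
  return key;
}
```

In a Node.js script this would typically be called as `readApiKey(process.env)`, with the result passed to the client as its apiKey.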

Script Walkthrough

The script (index.ts for TypeScript, summarize.py for Python) performs the following steps:

  1. Initialize the Airtop Client

    First, we initialize the AirtopClient using your provided API key. This client will be used to create browser sessions and interact with the page content.

    NodeJS
    const client = new AirtopClient({
      apiKey: AIRTOP_API_KEY,
    });
  2. Create a Browser Session

    Creating a browser session will allow us to connect to and control a cloud-based browser.

    NodeJS
    const createSessionResponse = await client.sessions.create({
      configuration: {
        timeoutMinutes: 5, // Terminate the session after 5 minutes of inactivity; adjust as needed
      },
    });
    // Keep the session details; later calls pass session.id
    const session = createSessionResponse.data;
  3. Connect to the Browser

    The script opens a new window in the session and navigates to the target URL. This example uses a Wikipedia page, but you can replace it with the URL of your choice.

    NodeJS
    const windowResponse = await client.windows.create(
      session.id,
      { url: TARGET_URL }, // TARGET_URL is defined at the top of the script
    );
  4. Summarize the Content

    Use Airtop to summarize the webpage's content from a natural-language prompt. The pageQuery API takes a prompt that specifies how the summary should be structured.

    Here we instruct Airtop to summarize the content of the page in one paragraph. You can customize this prompt to suit your needs (e.g. by asking for bullet points instead).

    NodeJS
    const windowInfo = await client.windows.getWindowInfo(session.id, windowResponse.data.windowId);
    const contentSummary = await client.windows.pageQuery(session.id, windowInfo.data.windowId, {
      prompt: 'Summarize the content of the page in 1 paragraph',
    });

    // Print the summary to the console or otherwise use it as desired
    console.log(contentSummary.data.modelResponse);
  5. Clean Up

    Finally, the script closes the window and terminates the session.

    NodeJS
    await client.windows.close(session.id, windowInfo.data.windowId);
    await client.sessions.terminate(session.id);
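The steps above can be combined into a single function that still cleans up if the page fails to load or the query throws. The following is a sketch under stated assumptions, not the recipe's actual code. The `CloudBrowserClient` interface is a hypothetical stand-in that covers only the calls used in this walkthrough, and it assumes the session id is exposed as `data.id` on the create response. The name `summarizePage` is also illustrative.

```typescript
// Minimal structural type covering only the calls used in the walkthrough.
// It is a stand-in for illustration, not the Airtop SDK's real typings.
interface CloudBrowserClient {
  sessions: {
    create(req: { configuration: { timeoutMinutes: number } }): Promise<{ data: { id: string } }>;
    terminate(sessionId: string): Promise<unknown>;
  };
  windows: {
    create(sessionId: string, req: { url: string }): Promise<{ data: { windowId: string } }>;
    pageQuery(
      sessionId: string,
      windowId: string,
      req: { prompt: string },
    ): Promise<{ data: { modelResponse: string } }>;
    close(sessionId: string, windowId: string): Promise<unknown>;
  };
}

async function summarizePage(
  client: CloudBrowserClient,
  url: string,
  prompt: string = "Summarize the content of the page in 1 paragraph",
): Promise<string> {
  const created = await client.sessions.create({ configuration: { timeoutMinutes: 5 } });
  const session = created.data;
  let windowId: string | undefined;
  try {
    const opened = await client.windows.create(session.id, { url });
    windowId = opened.data.windowId;
    const summary = await client.windows.pageQuery(session.id, windowId, { prompt });
    return summary.data.modelResponse;
  } finally {
    // Runs on success and on failure, so the cloud session is not left open.
    try {
      if (windowId !== undefined) {
        await client.windows.close(session.id, windowId);
      }
    } finally {
      // Terminate the session even if closing the window failed.
      await client.sessions.terminate(session.id);
    }
  }
}
```

Because the cleanup lives in `finally`, an error from the query still reaches the caller, but only after the window and session have been released. Passing a different prompt as the third argument, for example one asking for bullet points, changes the shape of the summary without touching the rest of the flow.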

Running the Script

To run the script, execute the following command in your terminal:

NodeJS
$ npm run start

Summary

Airtop makes extracting key information from web pages as simple as writing a few lines of code. By combining cloud browser automation with AI summarization, you can efficiently gather and understand content from the websites you point it at.