## Overview

This recipe demonstrates how to use Airtop to automate lead generation for outreach to therapists of any kind. It compiles a list of therapists from the URLs you provide, extracts their contact information, and generates a personalized outreach message for each one. It's a good example of how to combine Airtop with agentic frameworks such as LangChain to build intelligent multi-step workflows.

The instructions below will walk you through the application and how it leverages Airtop and LangGraph capabilities to create the list of leads.

## Demo

A live demo of this recipe is available [here](https://examples.airtop.ai/lead-generation). You can [sign up](https://portal.airtop.ai/) to create an API key for free and try it out yourself!
You will also need an OpenAI API key to run the recipe, which you can get for free [here](https://platform.openai.com/api-keys).

## Prerequisites

To get started, ensure you have:

* Node.js installed on your system.
* PNPM package manager installed. See [here](https://pnpm.io/installation) for installation steps.
* A Node Version Manager (NVM preferably)
* An Airtop API key. You can [get one for free](https://portal.airtop.ai/api-keys).
* An OpenAI API key. You can get one for free [here](https://platform.openai.com/api-keys).

## Getting Started

1. **Clone the repository**

   Start by cloning the source code from [GitHub](https://github.com/airtop-ai/examples-typescript):

   ```bash
   git clone https://github.com/airtop-ai/examples-typescript
   cd examples-typescript/examples/lead-generation
   ```

2. **Install dependencies**

   Run the following command to install the necessary dependencies, including the Airtop SDK:

   ```bash
   pnpm install
   ```

## Running the Script

To run the script, go to the `examples/lead-generation` directory and run the following command in your terminal:

```bash
pnpm run cli
```

## Script walkthrough

The script executes the following tasks in order:

1. **Accepts the API Keys and the URLs to run the recipe**

The script starts by requesting the URLs containing the list of therapists, followed by accepting the Airtop and OpenAI API keys.

```typescript
import { confirm, input } from "@inquirer/prompts";

// Collect URLs
  const urls: string[] = [];
  while (true) {
    const url = await input({
      message:
        "Enter a URL (e.g. https://www.findapsychologist.org/cities/psychologists-in-san-francisco/), or press Enter with empty input to finish:",
    });

    if (!url) break;
    urls.push(url);

    const addMore = await confirm({
      message: "Do you want to add another URL?",
      default: true,
    });

    if (!addMore) break;
  }


// Collect the Airtop API key
  const apiKey = await input({
    message: "Enter your Airtop API key:",
    validate: (value) => {
      if (!value) return "Please enter a valid API key";
      return true;
    },
  });

  // Collect the OpenAI API Key
  const openAiKey = await input({
    message: "Enter your OpenAI API key:",
    validate: (value) => {
      if (!value) return "Please enter a valid API key";

      if (!value.startsWith("sk-")) return "Please enter a valid OpenAI API key";
      return true;
    },
  });
```

2. **Define the graph flow**

Using LangGraph, LangChain's graph framework, we define the flow of the script as follows:

* Step 1: Validate the URLs to determine if they contain a list of therapists
* Step 2: Compile the list of therapists from the URLs, or go to an error handling step if no valid URLs are provided
* Step 3: Enrich the information of each therapist by generating a summary of their profile
* Step 4: Generate a personalized outreach message for each therapist
* Step 5: Generate a CSV file with the compiled information

```typescript
const graphBuilder = new StateGraph(StateAnnotation, ConfigurableAnnotation)
    .addNode(URL_VALIDATOR_NODE_NAME, urlValidatorNode, { ends: [FETCH_THERAPISTS_NODE_NAME, ERROR_HANDLER_NODE_NAME] })
    .addNode(FETCH_THERAPISTS_NODE_NAME, fetchTherapistsNode)
    .addNode(ENRICH_THERAPISTS_NODE_NAME, enrichTherapistNode)
    .addNode(OUTREACH_MESSAGE_NODE_NAME, outreachMessageNode)
    .addNode(CSV_GENERATOR_NODE_NAME, csvGeneratorNode)
    .addNode(ERROR_HANDLER_NODE_NAME, errorHandlerNode);

  // Edges
  graphBuilder.addEdge(START, URL_VALIDATOR_NODE_NAME);

  graphBuilder.addEdge(FETCH_THERAPISTS_NODE_NAME, ENRICH_THERAPISTS_NODE_NAME);
  graphBuilder.addEdge(ENRICH_THERAPISTS_NODE_NAME, OUTREACH_MESSAGE_NODE_NAME);
  graphBuilder.addEdge(OUTREACH_MESSAGE_NODE_NAME, CSV_GENERATOR_NODE_NAME);
  graphBuilder.addEdge(CSV_GENERATOR_NODE_NAME, END);
  graphBuilder.addEdge(ERROR_HANDLER_NODE_NAME, END);

  const graph = graphBuilder.compile();
```
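
Note that no static edge leaves the URL validator: that node returns a `Command` whose `goto` selects the next node, and the `ends` option lists the possible targets. As a rough, dependency-free illustration of the resulting paths (this is not LangGraph code, and the node labels are shortened placeholders rather than the real node-name constants), the flow can be modeled as a lookup table:

```typescript
// Illustrative only: a plain-object model of the graph's edges.
// The labels are shortened placeholders, not the actual node-name constants.
type NodeName = "validator" | "fetch" | "enrich" | "outreach" | "csv" | "error" | "end";

// Static edges, mirroring the addEdge calls.
const staticEdges: Partial<Record<NodeName, NodeName>> = {
  fetch: "enrich",
  enrich: "outreach",
  outreach: "csv",
  csv: "end",
  error: "end",
};

// The validator is the only node that chooses its successor dynamically.
const routeFromValidator = (validUrlCount: number): NodeName =>
  validUrlCount > 0 ? "fetch" : "error";

// Walk the graph from the validator and collect the visited nodes.
const tracePath = (validUrlCount: number): NodeName[] => {
  const path: NodeName[] = ["validator"];
  let current: NodeName = routeFromValidator(validUrlCount);
  while (current !== "end") {
    path.push(current);
    const next = staticEdges[current];
    if (!next) throw new Error(`No edge leaving ${current}`);
    current = next;
  }
  return path;
};

console.log(tracePath(2).join(" -> ")); // validator -> fetch -> enrich -> outreach -> csv
console.log(tracePath(0).join(" -> ")); // validator -> error
```

With at least one valid URL the run passes through every processing node; with none, it goes straight to the error handler and ends.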

3. **Enter the graph**

We run the graph by passing it the URLs as input, along with instances of the Airtop client and LangChain's `ChatOpenAI` as configuration.

```typescript
const result = await graph.invoke(
    {
      urls: graphInputs.map((url) => ({ url: url })),
    },
    {
      configurable: {
        airtopClient: new AirtopClient({ apiKey: config.apiKey }),
        openAiClient: new ChatOpenAI({ apiKey: config.openAiKey }),
      },
    },
  );

  return result;
```

4. **URL Validation Node**

Using Airtop's `pageQuery` API in conjunction with the `batchOperate` functionality, we analyze the URLs in parallel to verify whether each one contains a list of therapists. We then filter out the URLs that do not satisfy this criterion and continue the graph flow using LangGraph's Command API.

```typescript
/**
 * Uses Airtop's PageQuery API to validate a URL to determine if it contains a list of therapists
 * @param state - The graph state containing the URLs to validate
 * @param config - The graph config containing the Airtop client
 * @returns The graph state with the validated URLs and the next node to execute
 */
export const urlValidatorNode = async (
  state: typeof StateAnnotation.State,
  config: RunnableConfig<typeof ConfigurableAnnotation.State>,
) => {
  const log = getLogger().withPrefix("[urlValidatorNode]");
  log.withMetadata({ urls: state.urls }).debug("Validating URLs");

  const airtopClient = config.configurable!.airtopClient;

  const links = state.urls.map((url) => ({ url: url.url })).filter((url) => isUrl(url.url));

  const validateUrl = async (input: BatchOperationInput): Promise<BatchOperationResponse<Url>> => {
    const modelResponse = await airtopClient.windows.pageQuery(input.sessionId, input.windowId, {
      prompt: URL_VALIDATOR_PROMPT,
      configuration: {
        outputSchema: zodToJsonSchema(URL_VALIDATOR_OUTPUT_SCHEMA),
      },
    });

    if (!modelResponse.data.modelResponse || modelResponse.data.modelResponse === "") {
      throw new Error("An error occurred while validating the URL");
    }

    const response = JSON.parse(modelResponse.data.modelResponse) as UrlOutput;

    if (response.error) {
      throw new Error(response.error);
    }

    if (!response.isValid) {
      throw new Error("The URL does not match the criteria");
    }

    return {
      data: { url: input.operationUrl.url, isValid: response.isValid },
    };
  };

  // Intentionally empty: URLs that fail validation are simply left out of the results
  const handleError = async ({ error }: BatchOperationError) => {};

  const validatedUrls = await airtopClient.batchOperate(links, validateUrl, { onError: handleError });

  log.withMetadata({ urls: validatedUrls }).debug("Urls that were validated");

  return new Command({
    update: {
      ...state,
      urls: validatedUrls.filter((url) => url.isValid),
    },
    goto: validatedUrls.filter((url) => url.isValid).length > 0 ? FETCH_THERAPISTS_NODE_NAME : ERROR_HANDLER_NODE_NAME,
  });
};
```
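
The node relies on an `isUrl` helper that is not shown in this walkthrough. A minimal sketch of what such a check might look like, assuming it only needs to accept absolute `http` and `https` URLs (the repository's actual implementation may differ):

```typescript
// Hypothetical stand-in for the `isUrl` helper referenced above.
const isUrl = (value: string): boolean => {
  try {
    const parsed = new URL(value);
    return parsed.protocol === "http:" || parsed.protocol === "https:";
  } catch {
    // `new URL` throws a TypeError when the input cannot be parsed.
    return false;
  }
};

const candidates = [
  "https://www.findapsychologist.org/cities/psychologists-in-san-francisco/",
  "not a url",
  "ftp://example.com/list",
];

// Only the first candidate passes the check.
console.log(candidates.filter(isUrl));
```

Filtering before the batch operation avoids opening a browser window for input that can never load.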

5. **Fetch Therapists Node**

Using Airtop's `pageQuery` API in conjunction with the `batchOperate` functionality, we compile the list of therapists from the validated URLs, extracting each therapist's name, phone number, personal website, and email address (when available on the page).

```typescript
const FETCH_THERAPISTS_PROMPT = `
You are looking at a webpage that contains a list of therapists.
Your task is to try to extract the following information from the webpage:
For each therapist, extract the following information:
- Name
- Email
- Phone
- Personal website or detail page about the therapist in the webpage.
- Source of the webpage
Some of the information may not be available in the webpage, in that case just leave it blank.
For example, if the webpage does not contain any email address, you should leave the email field blank.

For the personal website or detail page about the therapist, you should extract the URL of the website.
Only extract the first 5 therapists in the list.

If you cannot find the information, use the error field to report the problem.
If no errors are found, set the error field to an empty string.
`;

// Name of the fetch therapists node
export const FETCH_THERAPISTS_NODE_NAME = "therapist-fetcher-node";

/**
 * Fetches the therapists from the URLs in the state
 * @param state - The state of the URL validator node.
 * @param config - The graph config containing the Airtop client
 * @returns The updated state of the URL validator node.
 */
export const fetchTherapistsNode = async (
  state: typeof StateAnnotation.State,
  config: RunnableConfig<typeof ConfigurableAnnotation.State>,
) => {
  const log = getLogger().withPrefix("[fetchTherapistsNode]");

  const websiteLinks = state.urls.map((url) => ({ url: url.url }));

  const airtopClient = config.configurable!.airtopClient;

  const fetchTherapists = async (input: BatchOperationInput): Promise<BatchOperationResponse<TherapistState>> => {
    const modelResponse = await airtopClient.windows.pageQuery(input.sessionId, input.windowId, {
      prompt: FETCH_THERAPISTS_PROMPT,
      configuration: {
        outputSchema: THERAPISTS_OUTPUT_JSON_SCHEMA,
      },
    });

    if (!modelResponse.data.modelResponse || modelResponse.data.modelResponse === "") {
      throw new Error("An error occurred while fetching the therapists");
    }

    const response = JSON.parse(modelResponse.data.modelResponse) as z.infer<typeof THERAPISTS_OUTPUT_SCHEMA>;

    if (response.error) {
      throw new Error(response.error);
    }

    return {
      data: { therapists: response.therapists },
    };
  };

  const handleError = async ({ error }: BatchOperationError) => {
    log.withError(error).error("An error occurred while fetching the therapists");
  };

  const results = await airtopClient.batchOperate(websiteLinks, fetchTherapists, { onError: handleError });

  log.withMetadata({ results }).debug("Fetched therapists successfully");

  // Each result holds the therapists found on one URL,
  // so we flatten them into a single list for the state
  return {
    ...state,
    therapists: results.flatMap((result) => result.therapists),
  };
};
```
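
Each entry returned by `batchOperate` in this node corresponds to one URL and carries that page's therapists, so `flatMap` merges them into one list. A standalone sketch of that step, using simplified hypothetical types in place of the repository's state types:

```typescript
// Simplified, hypothetical shapes; the real types live in the example repository.
interface TherapistLead {
  name: string;
  website?: string;
}
interface PageResult {
  therapists: TherapistLead[];
}

// Merge the per-page lists into one flat list of leads.
const mergeResults = (results: PageResult[]): TherapistLead[] =>
  results.flatMap((result) => result.therapists);

const merged = mergeResults([
  { therapists: [{ name: "A. Example" }, { name: "B. Example", website: "https://example.com/b" }] },
  { therapists: [] },
  { therapists: [{ name: "C. Example" }] },
]);

console.log(merged.map((t) => t.name)); // [ 'A. Example', 'B. Example', 'C. Example' ]
```

A page that yields no therapists contributes nothing to the merged list.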

6. **Enrich Therapist Node**

In this node, we run a process similar to the one above to pick up any extra information from the therapist's personal website that might not have been extracted in the previous step, and we also generate a summary of their profile. This summary is used later in the app to generate the personalized outreach message.

```typescript
const ENRICH_THERAPISTS_PROMPT = `
You are looking at a webpage that contains info about a specific therapist.
Your task is to enrich the therapist information with the following information:
- Name
- Email
- Phone
- Personal website of the therapist
- Summary of the therapist's information from the webpage
- Source of the webpage

Some of the information may not be available in the webpage, in that case just leave it blank.
For example, if the webpage does not contain any email address, you should leave the email field blank.

For the personal website of the therapist, you should extract the URL of the website.

If you cannot find the information, use the error field to report the problem.
If no errors are found, set the error field to an empty string.`;

/**
 * Enrich the therapists with the information from the website
 * @param state - The state of the therapist node.
 * @param config - The graph config containing the Airtop client
 * @returns The updated state of the therapist node.
 */
export const enrichTherapistNode = async (
  state: typeof StateAnnotation.State,
  config: RunnableConfig<typeof ConfigurableAnnotation.State>,
) => {
  const log = getLogger().withPrefix("[enrichTherapistNode]");
  log.debug("Enriching therapists");

  const client = config.configurable!.airtopClient;

  const enrichmentInput: BatchOperationUrl[] = state.therapists
    .map((therapist) => {
      if (therapist.website) {
        return {
          url: therapist.website,
          context: { therapist: therapist },
        };
      }
      return null;
    })
    .filter(Boolean) as BatchOperationUrl[];

  const enrichOperation = async (input: BatchOperationInput) => {
    const response = await client.windows.pageQuery(input.sessionId, input.windowId, {
      prompt: ENRICH_THERAPISTS_PROMPT,
      configuration: {
        outputSchema: ENRICHED_THERAPIST_JSON_SCHEMA,
      },
    });

    if (!response.data.modelResponse || response.data.modelResponse === "") {
      throw new Error("An error occurred while enriching the therapist");
    }

    const enrichedTherapist = JSON.parse(response.data.modelResponse) as z.infer<typeof ENRICHED_THERAPIST_SCHEMA>;

    if (enrichedTherapist.error) {
      throw new Error(enrichedTherapist.error);
    }

    return {
      data: enrichedTherapist,
    };
  };

  const handleError = async ({ error }: BatchOperationError) => {
    console.error("An error occurred while enriching the therapist", error);
  };

  const enrichedTherapists = await client.batchOperate(enrichmentInput, enrichOperation, { onError: handleError });

  return {
    ...state,
    therapists: enrichedTherapists,
  };
};
```
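
Because the prompt asks the model to leave unknown fields blank, an enriched record can contain empty strings where the listing page had a value. One possible way to combine the two records, sketched here with a hypothetical `mergeTherapist` helper and a simplified `Lead` type (neither is part of the example repository), is to prefer non-empty enriched values and fall back to the original ones:

```typescript
// Simplified, hypothetical lead shape for illustration.
interface Lead {
  name?: string;
  email?: string;
  phone?: string;
  website?: string;
  summary?: string;
}

// Prefer non-blank enriched values; otherwise keep what the listing page gave us.
const mergeTherapist = (original: Lead, enriched: Lead): Lead => {
  const merged: Lead = { ...original };
  for (const key of Object.keys(enriched) as (keyof Lead)[]) {
    const value = enriched[key];
    if (typeof value === "string" && value.trim() !== "") {
      merged[key] = value;
    }
  }
  return merged;
};

const mergedLead = mergeTherapist(
  { name: "A. Example", phone: "555-0100", website: "https://example.com/a" },
  { name: "A. Example", phone: "", email: "a@example.com", summary: "Works with adults." },
);

// The phone number and website survive; the email and summary are added.
console.log(mergedLead);
```

The original record could be passed to the batch operation through the `context` field that the node already attaches to each URL, although how that context is read back is not shown in this walkthrough.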

7. **Outreach Message Node**

LangChain's LLM integrations make it straightforward to use the OpenAI API. We use the `ChatOpenAI` client to generate a personalized message for each therapist and pair it with the `withStructuredOutput` method so that the response matches the expected schema.

```typescript
const responseSchema = z.object({
  message: z.string().describe("The outreach message for the therapist"),
  error: z.string().optional().describe("Error message if the request cannot be fulfilled"),
});

const outreachMessagePrompt = (therapist: Therapist) => {
  return `
Generate a small outreach message for the following therapist:
${therapist.name}

Use the following information to generate the message:
${therapist.summary}

The message should be a small message that is 100 words or less.
The goal of the message is to connect with the therapist to sell them an app that serves as a 
companion for their practice.
`;
};

/**
 * Adds an outreach message to a therapist using LangChain's ChatOpenAI client
 * @param therapist - The therapist to add the outreach message to.
 * @param openAiClient - The ChatOpenAI instance used to generate the message.
 * @returns The updated therapist with the outreach message.
 */
const addMessageToTherapist = async (therapist: Therapist, openAiClient: ChatOpenAI): Promise<Therapist> => {
  const result = await openAiClient
    .withStructuredOutput(responseSchema)!
    .invoke([
      new SystemMessage("You are an AI assistant that generates outreach messages for therapists."),
      new HumanMessage(outreachMessagePrompt(therapist)),
    ]);

  if (result.message) {
    return {
      ...therapist,
      outreachMessage: result.message,
    };
  }

  return therapist;
};

// Name of the outreach message node
export const OUTREACH_MESSAGE_NODE_NAME = "outreach-message-node";

/**
 * Node that adds an outreach message to each therapist using LangChain's ChatOpenAI client
 * @param state - The state of the therapist node.
 * @param config - The graph config containing the ChatOpenAI client
 * @returns The updated state of the therapist node with the outreach messages.
 */
export const outreachMessageNode = async (
  state: typeof StateAnnotation.State,
  config: RunnableConfig<typeof ConfigurableAnnotation.State>,
) => {
  const log = getLogger().withPrefix("[outreachMessageNode]");
  log.debug("Adding outreach messages to therapists");

  const openAiClient = config.configurable!.openAiClient;

  const therapistsWithOutreachMessage = await Promise.all(
    state.therapists.map((therapist) => addMessageToTherapist(therapist, openAiClient)),
  );

  return {
    ...state,
    therapists: therapistsWithOutreachMessage,
  };
};
```
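
`Promise.all` rejects as soon as any single call fails, so one failed completion would discard every generated message. If that is a concern, a possible variation (an assumption on our part, not what the example does) is to catch errors per therapist and return the record unchanged. The sketch below uses a stand-in `generate` function in place of the OpenAI call, and the `OutreachLead` type is a simplified placeholder:

```typescript
// Simplified, hypothetical lead shape for illustration.
interface OutreachLead {
  name: string;
  outreachMessage?: string;
}

// Stand-in for the model call; any async function returning a message fits.
type MessageGenerator = (lead: OutreachLead) => Promise<string>;

// Attach a message to each lead, keeping leads whose generation failed.
const addMessagesSafely = async (
  leads: OutreachLead[],
  generate: MessageGenerator,
): Promise<OutreachLead[]> =>
  Promise.all(
    leads.map(async (lead) => {
      try {
        return { ...lead, outreachMessage: await generate(lead) };
      } catch (error) {
        console.error(`Could not generate a message for ${lead.name}`, error);
        return lead;
      }
    }),
  );

// Fake generator that fails for one lead, to show the fallback.
const fakeGenerate: MessageGenerator = async (lead) => {
  if (lead.name === "B. Example") throw new Error("simulated failure");
  return `Hello ${lead.name}`;
};

addMessagesSafely([{ name: "A. Example" }, { name: "B. Example" }], fakeGenerate).then((leads) =>
  console.log(leads),
);
```

Leads without a message still reach the CSV step, where the message column for them would need to be handled as missing.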

8. **CSV Generator Node**

In this node, we generate the content of the CSV file with the compiled information of all therapists. In the Live Demo, we provide both the CSV file and a preview of the content.

```typescript
/**
 * LangGraph Node: Generate a CSV file from the therapists state.
 * @param state - The therapists state.
 * @returns The updated state with the CSV file path and content.
 */
export const csvGeneratorNode = async (state: typeof StateAnnotation.State) => {
  const log = getLogger().withPrefix("[csvGeneratorNode]");
  log.debug("Generating CSV file");

  const CSV_FILE_NAME = "lead-generation-results.csv";
  const columns = ["name", "email", "phone", "website", "source", "message"];
  let csvContent = `${columns.join(",")}\n`;

  state.therapists.forEach((therapist) => {
    // Wrap name and message in quotes to properly escape them
    const escapedName = `"${therapist.name?.replace(/"/g, '""')}"`;
    const escapedMessage = `"${therapist.outreachMessage?.replace(/"/g, '""')}"`;

    csvContent += `${escapedName},${therapist.email},${therapist.phone},${therapist.website},${therapist.source},${escapedMessage}\n`;
  });

  fs.writeFileSync(CSV_FILE_NAME, csvContent);

  log.info(`CSV file ${CSV_FILE_NAME} created successfully`);
  log.info(`File location: ${process.cwd()}/${CSV_FILE_NAME}`);

  return {
    ...state,
    csvContent,
    csvPath: CSV_FILE_NAME,
  };
};
```
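
The node above quotes only the name and message columns, and a field that is `undefined` would be written as the literal text `undefined`. A more defensive variation, sketched below with hypothetical `toCsvField` and `toCsvRow` helpers (not part of the example repository), quotes any value containing a comma, quote, or line break and writes missing values as empty fields:

```typescript
// Hypothetical helper: format one value as a CSV field.
const toCsvField = (value: string | undefined | null): string => {
  const text = value ?? "";
  // Quote only when needed, doubling any embedded quotes.
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
};

// Join one record's values into a CSV row.
const toCsvRow = (values: (string | undefined | null)[]): string =>
  values.map(toCsvField).join(",");

console.log(toCsvRow(["Doe, Jane", undefined, "555-0100", 'Said "hi"']));
// "Doe, Jane",,555-0100,"Said ""hi"""
```

Applying the same helper to every column keeps the row aligned with the header even when a website or summary contains commas.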

## Summary

This recipe demonstrates how to use Airtop to automate lead generation for outreach to therapists of any kind. It uses several Airtop and LangChain APIs to scrape different pages, collect and enrich the information they contain, and deliver it in a structured, easy-to-consume format.