Generate leads from a website with therapists

Overview

This recipe demonstrates how to use Airtop to automate the generation of leads for reaching out to therapists of any kind. The recipe compiles a list of therapists from a provided URL, extracts their information and generates personalized outreach messages for each one. It’s a great example of how to combine Airtop with Agentic Frameworks like LangChain to create multi-step intelligent workflows.

The instructions below will walk you through the application and how it leverages Airtop and LangGraph capabilities to create the list of leads.

Demo

A live demo of this recipe is available here. You can sign up to create an API key for free and try it out yourself! You will also need an OpenAI API key to run the recipe, which you can get for free here.

Prerequisites

To get started, ensure you have:

  • Node.js installed on your system.
  • PNPM package manager installed. See here for installation steps.
  • A Node version manager (preferably NVM).
  • An Airtop API key. You can get one for free.
  • An OpenAI API key. You can get one for free here.

Getting Started

  1. Clone the repository

    Start by cloning the source code from GitHub:

    $ git clone https://github.com/airtop-ai/examples-typescript
    $ cd examples-typescript/examples/lead-generation
  2. Install dependencies

    Run the following command to install the necessary dependencies, including the Airtop SDK:

    $ pnpm install

Running the Script

To run the script, go to the examples/lead-generation directory and run the following command in your terminal:

$ pnpm run cli

Script walkthrough

The script executes the following tasks in order:

  1. Accepts the API Keys and the URLs to run the recipe

The script starts by requesting the URLs containing the list of therapists, followed by accepting the Airtop and OpenAI API keys.

    import { confirm, input } from "@inquirer/prompts";

    // Collect URLs
    // Assumed declaration: the excerpt does not show where `urls` is defined
    const urls: string[] = [];

    while (true) {
      const url = await input({
        message:
          "Enter a URL (e.g. https://www.findapsychologist.org/cities/psychologists-in-san-francisco/), or press Enter with empty input to finish:",
      });

      if (!url) break;
      urls.push(url);

      const addMore = await confirm({
        message: "Do you want to add another URL?",
        default: true,
      });

      if (!addMore) break;
    }

    // Collect the Airtop API key
    const apiKey = await input({
      message: "Enter your Airtop API key:",
      validate: (value) => {
        if (!value) return "Please enter a valid API key";
        return true;
      },
    });

    // Collect the OpenAI API key
    const openAiKey = await input({
      message: "Enter your OpenAI API key:",
      validate: (value) => {
        if (!value) return "Please enter a valid API key";

        if (!value.startsWith("sk-")) return "Please enter a valid OpenAI API key";
        return true;
      },
    });
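The URL prompt above accepts any non-empty string, so a malformed URL is only caught later in the graph. As a minimal sketch, a validate callback like the one below could reject obviously invalid input at the prompt. The name validateUrlInput is hypothetical and not part of the recipe's source; it relies only on the standard URL constructor.

```typescript
// Hypothetical validate callback for the URL prompt; not part of the recipe's source.
const validateUrlInput = (value: string): true | string => {
  const trimmed = value.trim();

  // An empty answer means "finish", mirroring the loop's `if (!url) break;` check.
  if (trimmed === "") return true;

  try {
    // The URL constructor throws a TypeError for strings it cannot parse.
    const parsed = new URL(trimmed);
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      return "Please enter an http(s) URL";
    }
    return true;
  } catch {
    return "Please enter a valid URL";
  }
};
```

It could be passed as the validate option of the URL prompt, in the same way the API key prompts pass their own validate callbacks.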
  2. Define the graph flow

Using LangGraph, LangChain’s graph framework, we define the flow of the script as follows:

  • Step 1: Validate the URLs to determine whether they contain a list of therapists
  • Step 2: Compile the list of therapists from the URLs, or go to an error-handling step if no valid URLs are provided
  • Step 3: Enrich the information of each therapist by generating a summary of their profile
  • Step 4: Generate a personalized outreach message for each therapist
  • Step 5: Generate a CSV file with the compiled information

    const graphBuilder = new StateGraph(StateAnnotation, ConfigurableAnnotation)
      // The validator chooses its successor at runtime, so its possible targets are declared in `ends`
      .addNode(URL_VALIDATOR_NODE_NAME, urlValidatorNode, { ends: [FETCH_THERAPISTS_NODE_NAME, ERROR_HANDLER_NODE_NAME] })
      .addNode(FETCH_THERAPISTS_NODE_NAME, fetchTherapistsNode)
      .addNode(ENRICH_THERAPISTS_NODE_NAME, enrichTherapistNode)
      .addNode(OUTREACH_MESSAGE_NODE_NAME, outreachMessageNode)
      .addNode(CSV_GENERATOR_NODE_NAME, csvGeneratorNode)
      .addNode(ERROR_HANDLER_NODE_NAME, errorHandlerNode);

    // Edges
    graphBuilder.addEdge(START, URL_VALIDATOR_NODE_NAME);

    graphBuilder.addEdge(FETCH_THERAPISTS_NODE_NAME, ENRICH_THERAPISTS_NODE_NAME);
    graphBuilder.addEdge(ENRICH_THERAPISTS_NODE_NAME, OUTREACH_MESSAGE_NODE_NAME);
    graphBuilder.addEdge(OUTREACH_MESSAGE_NODE_NAME, CSV_GENERATOR_NODE_NAME);
    graphBuilder.addEdge(CSV_GENERATOR_NODE_NAME, END);
    graphBuilder.addEdge(ERROR_HANDLER_NODE_NAME, END);

    const graph = graphBuilder.compile();
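To make the routing concrete, the following dependency-free sketch traces the order in which the nodes would run for a given number of valid URLs. It is only a model of the graph above, not LangGraph code: the node labels are hypothetical stand-ins for the recipe's constants, and tracePath is an illustrative helper.

```typescript
// Hypothetical labels; the recipe uses constants such as URL_VALIDATOR_NODE_NAME,
// whose actual string values may differ.
type NodeName =
  | "validate-urls"
  | "fetch-therapists"
  | "enrich-therapists"
  | "outreach-message"
  | "generate-csv"
  | "handle-error";

// Static edges, mirroring the addEdge calls. A missing entry models an edge to END.
const nextNode: Partial<Record<NodeName, NodeName>> = {
  "fetch-therapists": "enrich-therapists",
  "enrich-therapists": "outreach-message",
  "outreach-message": "generate-csv",
};

// Returns the order in which nodes would run for a given validation outcome.
const tracePath = (validUrlCount: number): NodeName[] => {
  const path: NodeName[] = ["validate-urls"];

  // The validator picks its successor at runtime (the `goto` of its Command).
  let current: NodeName | undefined = validUrlCount > 0 ? "fetch-therapists" : "handle-error";

  while (current !== undefined) {
    path.push(current);
    current = nextNode[current];
  }
  return path;
};
```

With at least one valid URL the trace walks the full pipeline through CSV generation; with none it goes from validation straight to the error handler.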
  3. Run the graph

We run the graph with the collected URLs as input, passing instances of the Airtop client and LangChain’s ChatOpenAI through the configurable option.

    const result = await graph.invoke(
      {
        urls: graphInputs.map((url) => ({ url: url })),
      },
      {
        configurable: {
          airtopClient: new AirtopClient({ apiKey: config.apiKey }),
          openAiClient: new ChatOpenAI({ apiKey: config.openAiKey }),
        },
      },
    );

    return result;
  4. URL Validation Node

Using Airtop’s pageQuery API together with its batchOperate functionality, we analyze the URLs in parallel to verify whether each one contains a list of therapists. We then filter out the URLs that do not meet this criterion and continue the graph flow using LangGraph’s Command API, which routes to the fetch node if at least one URL is valid and to the error handler otherwise.

    /**
     * Uses Airtop's pageQuery API to validate a URL and determine whether it contains a list of therapists
     * @param state - The graph state containing the URLs to validate
     * @param config - The graph config containing the Airtop client
     * @returns The graph state with the validated URLs and the next node to execute
     */
    export const urlValidatorNode = async (
      state: typeof StateAnnotation.State,
      config: RunnableConfig<typeof ConfigurableAnnotation.State>,
    ) => {
      const log = getLogger().withPrefix("[urlValidatorNode]");
      log.withMetadata({ urls: state.urls }).debug("Validating URLs");

      const airtopClient = config.configurable!.airtopClient;

      // Drop anything that is not a well-formed URL before opening browser windows
      const links = state.urls.map((url) => ({ url: url.url })).filter((url) => isUrl(url.url));

      const validateUrl = async (input: BatchOperationInput): Promise<BatchOperationResponse<Url>> => {
        const modelResponse = await airtopClient.windows.pageQuery(input.sessionId, input.windowId, {
          prompt: URL_VALIDATOR_PROMPT,
          configuration: {
            outputSchema: zodToJsonSchema(URL_VALIDATOR_OUTPUT_SCHEMA),
          },
        });

        if (!modelResponse.data.modelResponse || modelResponse.data.modelResponse === "") {
          throw new Error("An error occurred while validating the URL");
        }

        const response = JSON.parse(modelResponse.data.modelResponse) as UrlOutput;

        if (response.error) {
          throw new Error(response.error);
        }

        if (!response.isValid) {
          throw new Error("The URL does not match the criteria");
        }

        return {
          data: { url: input.operationUrl.url, isValid: response.isValid },
        };
      };

      // No-op handler: validation errors are neither logged nor rethrown here
      const handleError = async ({ error }: BatchOperationError) => {};

      const validatedUrls = await airtopClient.batchOperate(links, validateUrl, { onError: handleError });

      log.withMetadata({ urls: validatedUrls }).debug("Urls that were validated");

      const validUrls = validatedUrls.filter((url) => url.isValid);

      return new Command({
        update: {
          ...state,
          urls: validUrls,
        },
        goto: validUrls.length > 0 ? FETCH_THERAPISTS_NODE_NAME : ERROR_HANDLER_NODE_NAME,
      });
    };
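The response handling inside validateUrl can be read as a small parsing step with three failure cases: an empty response, a reported error, and a page that does not match the criteria. The sketch below isolates that step so it can be tried without an Airtop session. The UrlOutput shape is inferred from the fields the node reads, and the guards for non-JSON or non-object output are additions that are not in the recipe.

```typescript
// Assumed response shape, inferred from the fields the node reads (`isValid`, `error`).
interface UrlOutput {
  isValid: boolean;
  error?: string;
}

// Parses the raw model response and throws on each failure case the node checks.
const parseValidatorResponse = (raw: string | undefined): UrlOutput => {
  if (!raw) {
    throw new Error("An error occurred while validating the URL");
  }

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    // Extra guard (not in the recipe): the model may return text that is not JSON.
    throw new Error("The model response is not valid JSON");
  }

  // Extra guard (not in the recipe): JSON such as `null` or `42` is not a usable response.
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("The model response is not a JSON object");
  }

  const response = parsed as Partial<UrlOutput>;

  if (response.error) {
    throw new Error(response.error);
  }

  if (response.isValid !== true) {
    throw new Error("The URL does not match the criteria");
  }

  return { isValid: true };
};
```

Because every failure is signaled by throwing, a batch operation built on this helper would hand failed URLs to its error callback rather than returning them as results.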
  5. Fetch Therapists Node

Using Airtop’s pageQuery API together with batchOperate, we compile the list of therapists from the validated URLs, extracting each therapist’s name, phone number, personal website, and email address (when available on the page). The prompt limits extraction to the first 5 therapists on each page.

    const FETCH_THERAPISTS_PROMPT = `
    You are looking at a webpage that contains a list of therapists.
    Your task is to try to extract the following information from the webpage:
    For each therapist, extract the following information:
    - Name
    - Email
    - Phone
    - Personal website or detail page about the therapist in the webpage.
    - Source of the webpage
    Some of the information may not be available in the webpage, in that case just leave it blank.
    For example, if the webpage does not contain any email address, you should leave the email field blank.

    For the personal website or detail page about the therapist, you should extract the URL of the website.
    Only extract the first 5 therapists in the list.

    If you cannot find the information, use the error field to report the problem.
    If no errors are found, set the error field to an empty string.
    `;

    // Name of the fetch therapists node
    export const FETCH_THERAPISTS_NODE_NAME = "therapist-fetcher-node";

    /**
     * Fetches the therapists from the URLs in the state
     * @param state - The graph state containing the validated URLs
     * @param config - The graph config containing the Airtop client
     * @returns The updated graph state with the fetched therapists
     */
    export const fetchTherapistsNode = async (
      state: typeof StateAnnotation.State,
      config: RunnableConfig<typeof ConfigurableAnnotation.State>,
    ) => {
      const log = getLogger().withPrefix("[fetchTherapistsNode]");

      const websiteLinks = state.urls.map((url) => ({ url: url.url }));

      const airtopClient = config.configurable!.airtopClient;

      const fetchTherapists = async (input: BatchOperationInput): Promise<BatchOperationResponse<TherapistState>> => {
        const modelResponse = await airtopClient.windows.pageQuery(input.sessionId, input.windowId, {
          prompt: FETCH_THERAPISTS_PROMPT,
          configuration: {
            outputSchema: THERAPISTS_OUTPUT_JSON_SCHEMA,
          },
        });

        if (!modelResponse.data.modelResponse || modelResponse.data.modelResponse === "") {
          throw new Error("An error occurred while fetching the therapists");
        }

        const response = JSON.parse(modelResponse.data.modelResponse) as z.infer<typeof THERAPISTS_OUTPUT_SCHEMA>;

        if (response.error) {
          throw new Error(response.error);
        }

        return {
          data: { therapists: response.therapists },
        };
      };

      const handleError = async ({ error }: BatchOperationError) => {
        log.withError(error).error("An error occurred while fetching the therapists");
      };

      const results = await airtopClient.batchOperate(websiteLinks, fetchTherapists, { onError: handleError });

      log.withMetadata({ results }).debug("Fetched therapists successfully");

      // Each result holds the therapists found on one URL,
      // so we flatten them into a single list for the state
      return {
        ...state,
        therapists: results.flatMap((result) => result.therapists),
      };
    };
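The last step of the node merges the per-URL results into one list with flatMap. The sketch below shows that merge on sample data and adds a fallback for results that carry no therapists array, which the excerpt does not handle. The Therapist and PageResult shapes are simplified assumptions, and collectTherapists is a hypothetical helper name.

```typescript
// Simplified, assumed shapes; the recipe's own types may carry more fields.
interface Therapist {
  name?: string;
  email?: string;
  phone?: string;
  website?: string;
  source?: string;
}

interface PageResult {
  therapists?: Therapist[];
}

// Flattens one result per URL into a single list, skipping results without therapists.
const collectTherapists = (results: PageResult[]): Therapist[] =>
  results.flatMap((result) => result.therapists ?? []);

// Sample data: two pages with therapists and one page that returned nothing.
const sampleResults: PageResult[] = [
  { therapists: [{ name: "A. Example" }, { name: "B. Example" }] },
  {},
  { therapists: [{ name: "C. Example", phone: "555-0100" }] },
];

const allTherapists = collectTherapists(sampleResults);
```

Without the fallback, a result that lacks the therapists field would put an undefined entry into the list, which later nodes would then have to guard against.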
  6. Enrich Therapist Node

In this node, we run a similar process to the one above, picking up any extra information from the therapist’s personal website that might not have been extracted in the previous step, and we also generate a summary of their profile. This summary is used later to generate the personalized outreach message.

    const ENRICH_THERAPISTS_PROMPT = `
    You are looking at a webpage that contains info about a specific therapist.
    Your task is to enrich the therapist information with the following information:
    - Name
    - Email
    - Phone
    - Personal website of the therapist
    - Summary of the therapist's information from the webpage
    - Source of the webpage

    Some of the information may not be available in the webpage, in that case just leave it blank.
    For example, if the webpage does not contain any email address, you should leave the email field blank.

    For the personal website of the therapist, you should extract the URL of the website.

    If you cannot find the information, use the error field to report the problem.
    If no errors are found, set the error field to an empty string.`;

    /**
     * Enrich the therapists with the information from the website
     * @param state - The state of the therapist node.
     * @param config - The graph config containing the Airtop client
     * @returns The updated state of the therapist node.
     */
    export const enrichTherapistNode = async (
      state: typeof StateAnnotation.State,
      config: RunnableConfig<typeof ConfigurableAnnotation.State>,
    ) => {
      const log = getLogger().withPrefix("[enrichTherapistNode]");
      log.debug("Enriching therapists");

      const client = config.configurable!.airtopClient;

      const enrichmentInput: BatchOperationUrl[] = state.therapists
        .map((therapist) => {
          if (therapist.website) {
            return {
              url: therapist.website,
              context: { therapist: therapist },
            };
          }
          return null;
        })
        .filter(Boolean) as BatchOperationUrl[];

      const enrichOperation = async (input: BatchOperationInput) => {
        const response = await client.windows.pageQuery(input.sessionId, input.windowId, {
          prompt: ENRICH_THERAPISTS_PROMPT,
          configuration: {
            outputSchema: ENRICHED_THERAPIST_JSON_SCHEMA,
          },
        });

        if (!response.data.modelResponse || response.data.modelResponse === "") {
          throw new Error("An error occurred while enriching the therapist");
        }

        const enrichedTherapist = JSON.parse(response.data.modelResponse) as z.infer<typeof ENRICHED_THERAPIST_SCHEMA>;

        if (enrichedTherapist.error) {
          throw new Error(enrichedTherapist.error);
        }

        return {
          data: enrichedTherapist,
        };
      };

      const handleError = async ({ error }: BatchOperationError) => {
        console.error("An error occurred while enriching the therapist", error);
      };

      const enrichedTherapists = await client.batchOperate(enrichmentInput, enrichOperation, { onError: handleError });

      return {
        ...state,
        therapists: enrichedTherapists,
      };
    };
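The excerpt above returns the enriched record as-is, so a field that was found on the listing page but is blank on the personal website could be lost. One possible way to keep both, sketched under that assumption, is to merge the enriched values over the original record and let blank values fall back to what was already known. The mergeTherapist helper and the simplified Therapist shape are hypothetical.

```typescript
// Simplified, assumed shape; the recipe's own type may differ.
interface Therapist {
  name?: string;
  email?: string;
  phone?: string;
  website?: string;
  source?: string;
  summary?: string;
}

// Overlays enriched values on the original record,
// keeping the original value whenever the enriched one is missing or blank.
const mergeTherapist = (original: Therapist, enriched: Therapist): Therapist => {
  const merged: Therapist = { ...original };

  for (const key of Object.keys(enriched) as (keyof Therapist)[]) {
    const value = enriched[key];
    if (typeof value === "string" && value.trim() !== "") {
      merged[key] = value;
    }
  }

  return merged;
};

// Sample data: the detail page adds an email and a summary but has no phone number.
const fromListing: Therapist = { name: "A. Example", phone: "555-0100", website: "https://a.example" };
const fromDetailPage: Therapist = { name: "A. Example", phone: "", email: "a@a.example", summary: "Focuses on CBT." };
const mergedTherapist = mergeTherapist(fromListing, fromDetailPage);
```

In the recipe, the original record is attached to each batch URL as context, so a merge of this kind could in principle run inside the enrichment operation.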
  7. Outreach Message Node

LangChain’s LLM integrations make it easy to use the OpenAI API. We use the ChatOpenAI client to generate a personalized message for each therapist, and pair it with its withStructuredOutput method to ensure the output matches the expected schema.

    const responseSchema = z.object({
      message: z.string().describe("The outreach message for the therapist"),
      error: z.string().optional().describe("Error message if the request cannot be fulfilled"),
    });

    const outreachMessagePrompt = (therapist: Therapist) => {
      return `
    Generate a small outreach message for the following therapist:
    ${therapist.name}

    Use the following information to generate the message:
    ${therapist.summary}

    The message should be a small message that is 100 words or less.
    The goal of the message is to connect with the therapist to sell them an app that serves as a
    companion for their practice.
    `;
    };

    /**
     * Adds an outreach message to a therapist using LangChain's ChatOpenAI client
     * @param therapist - The therapist to add the outreach message to.
     * @param openAiClient - The ChatOpenAI client used to generate the message.
     * @returns The updated therapist with the outreach message.
     */
    const addMessageToTherapist = async (therapist: Therapist, openAiClient: ChatOpenAI): Promise<Therapist> => {
      const result = await openAiClient
        .withStructuredOutput(responseSchema)!
        .invoke([
          new SystemMessage("You are an AI assistant that generates outreach messages for therapists."),
          new HumanMessage(outreachMessagePrompt(therapist)),
        ]);

      if (result.message) {
        return {
          ...therapist,
          outreachMessage: result.message,
        };
      }

      // No message was generated; return the therapist unchanged
      return therapist;
    };

    // Name of the outreach message node
    export const OUTREACH_MESSAGE_NODE_NAME = "outreach-message-node";

    /**
     * Node that adds an outreach message to each therapist using LangChain's ChatOpenAI client
     * @param state - The state of the therapist node.
     * @param config - The graph config containing the ChatOpenAI client
     * @returns The updated state of the therapist node with the outreach messages.
     */
    export const outreachMessageNode = async (
      state: typeof StateAnnotation.State,
      config: RunnableConfig<typeof ConfigurableAnnotation.State>,
    ) => {
      const log = getLogger().withPrefix("[outreachMessageNode]");
      log.debug("Adding outreach messages to therapists");

      const openAiClient = config.configurable!.openAiClient;

      // Generate all messages concurrently
      const therapistsWithOutreachMessage = await Promise.all(
        state.therapists.map((therapist) => addMessageToTherapist(therapist, openAiClient)),
      );

      return {
        ...state,
        therapists: therapistsWithOutreachMessage,
      };
    };
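The prompt asks for a message of 100 words or less, but a structured-output schema only constrains the shape of the response, not its length. As a sketch, the length could be checked after generation with a helper like the one below. countWords and enforceWordLimit are hypothetical names that do not appear in the recipe, and truncating is only one option (regenerating the message would be another).

```typescript
// Word limit taken from the outreach prompt.
const MAX_WORDS = 100;

// Counts whitespace-separated words; an empty or blank string has zero words.
const countWords = (text: string): number => {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
};

// Trims a message to the word limit, appending "..." when it was cut.
const enforceWordLimit = (message: string, maxWords: number = MAX_WORDS): string => {
  const trimmed = message.trim();
  if (countWords(trimmed) <= maxWords) return trimmed;

  const words = trimmed.split(/\s+/);
  return `${words.slice(0, maxWords).join(" ")}...`;
};
```

A check like this could run on result.message before it is stored as the therapist's outreachMessage.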
  8. CSV Generator Node

In this node, we generate the content of the CSV file with the compiled information of all therapists. In the Live Demo, we provide both the CSV file and a preview of the content.

    /**
     * LangGraph Node: Generate a CSV file from the therapists state.
     * @param state - The therapists state.
     * @returns The updated state with the CSV file path and content.
     */
    export const csvGeneratorNode = async (state: typeof StateAnnotation.State) => {
      const log = getLogger().withPrefix("[csvGeneratorNode]");
      log.debug("Generating CSV file");

      const CSV_FILE_NAME = "lead-generation-results.csv";
      const columns = ["name", "email", "phone", "website", "source", "message"];
      let csvContent = `${columns.join(",")}\n`;

      state.therapists.forEach((therapist) => {
        // Wrap name and message in quotes to properly escape them
        const escapedName = `"${therapist.name?.replace(/"/g, '""')}"`;
        const escapedMessage = `"${therapist.outreachMessage?.replace(/"/g, '""')}"`;

        csvContent += `${escapedName},${therapist.email},${therapist.phone},${therapist.website},${therapist.source},${escapedMessage}\n`;
      });

      fs.writeFileSync(CSV_FILE_NAME, csvContent);

      log.info(`CSV file ${CSV_FILE_NAME} created successfully`);
      log.info(`File location: ${process.cwd()}/${CSV_FILE_NAME}`);

      return {
        ...state,
        csvContent,
        csvPath: CSV_FILE_NAME,
      };
    };
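The node above quotes only the name and message columns. A missing value is interpolated as the text "undefined", and a comma inside any other field would shift the columns. A more defensive variant, sketched here with the hypothetical helpers escapeCsvField and toCsvRow, applies the same quote-doubling rule to every field and writes missing values as empty fields.

```typescript
// Quotes a value when it contains a quote, comma, or line break, doubling embedded quotes.
// Missing values become empty fields instead of the text "undefined".
const escapeCsvField = (value: string | undefined | null): string => {
  if (value === undefined || value === null || value === "") return "";
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
};

// Builds one CSV row from a list of optional fields.
const toCsvRow = (fields: (string | undefined | null)[]): string =>
  fields.map(escapeCsvField).join(",");

// Sample row: a name with quotes, a missing email, a plain phone number, a value with a comma.
const sampleRow = toCsvRow(['Jane "JJ" Doe', undefined, "555-0100", "a,b"]);
// sampleRow is: "Jane ""JJ"" Doe",,555-0100,"a,b"
```

Each therapist row could then be built with toCsvRow from the same six columns the node already writes.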

Summary

This recipe demonstrates how to use Airtop to automate lead generation for reaching out to therapists of any kind. It uses several Airtop and LangChain APIs to scrape different pages, collect and enrich the information, and deliver it in a structured, easy-to-consume format.