Overview

This recipe demonstrates how to use Airtop to automate the process of finding LinkedIn profiles for a list of contacts. While email addresses are relatively easy to obtain, finding the corresponding LinkedIn profiles can be challenging and time-consuming. This project shows how to leverage Airtop’s parallel processing capabilities to efficiently search and extract LinkedIn profile URLs from Google search results.

The script processes a CSV file containing basic contact information (email, first name, last name), searches for each person on Google, and uses AI to identify and extract their LinkedIn profile URL from the search results. By running multiple browser sessions in parallel, we can process large lists of contacts efficiently.

Prerequisites

To get started, ensure you have Node.js and npm installed, along with an Airtop API key.

Getting Started

  1. Clone the repository:
$ git clone https://github.com/airtop-ai/recipe-linkedin-profile-enrichement.git
$ cd recipe-linkedin-profile-enrichment
  2. Install dependencies:
$ npm install
  3. Configure your environment: Create a .env file in the project root:
AIRTOP_API_KEY=<YOUR_API_KEY>
  4. Prepare your input data: Place your CSV file in the data directory with the following format:
email,firstName,lastName
john.doe@company.com,John,Doe
jane.smith@startup.io,Jane,Smith
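
To make the input format concrete, here is a minimal sketch of how such a file could be parsed into profile objects. The `UserProfile` shape mirrors the columns above; the `parseProfilesCsv` helper is hypothetical (it is not taken from the recipe) and assumes simple rows without quoted fields or embedded commas.

```typescript
// Shape of one input row; field names mirror the CSV header above.
interface UserProfile {
  email: string;
  firstName: string;
  lastName: string;
}

// Hypothetical helper: parses simple CSV text (no quoted fields) into profiles.
const parseProfilesCsv = (csvText: string): UserProfile[] => {
  const lines = csvText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  // Skip the header row, then map each remaining row to a profile.
  return lines.slice(1).map((line) => {
    const [email = '', firstName = '', lastName = ''] = line
      .split(',')
      .map((cell) => cell.trim());
    return { email, firstName, lastName };
  });
};

const sampleCsv = `email,firstName,lastName
john.doe@company.com,John,Doe
jane.smith@startup.io,Jane,Smith`;

const parsedProfiles = parseProfilesCsv(sampleCsv);
console.log(parsedProfiles);
```

For real-world files with quoted values, a dedicated CSV parsing library is likely a safer choice than splitting on commas.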

Script Walkthrough

1. Initialize the Airtop Client

First, we set up the AirtopClient with your API key:

import { AirtopClient } from '@airtop/sdk';

const client = new AirtopClient({
  apiKey: process.env.AIRTOP_API_KEY,
});

2. Load and Process Profiles

The script reads profiles from a CSV file and generates Google search queries for each person:

const generateGoogleSearchQuery = (userProfile: UserProfile) => {
  const query = `${userProfile.firstName} ${userProfile.lastName} ${userProfile.email} linkedin`;
  return `https://www.google.com/search?q=${encodeURIComponent(query)}`;
};
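
As a quick illustration of what this function produces, the sketch below calls it with the first sample row from the input CSV. The `UserProfile` interface is an assumption based on the CSV columns; the function body is the one shown above.

```typescript
// Assumed shape of a profile, based on the CSV columns.
interface UserProfile {
  email: string;
  firstName: string;
  lastName: string;
}

const generateGoogleSearchQuery = (userProfile: UserProfile): string => {
  const query = `${userProfile.firstName} ${userProfile.lastName} ${userProfile.email} linkedin`;
  return `https://www.google.com/search?q=${encodeURIComponent(query)}`;
};

const exampleUrl = generateGoogleSearchQuery({
  email: 'john.doe@company.com',
  firstName: 'John',
  lastName: 'Doe',
});

console.log(exampleUrl);
// https://www.google.com/search?q=John%20Doe%20john.doe%40company.com%20linkedin
```

`encodeURIComponent` escapes the spaces and the `@` sign, so names and email addresses with special characters still yield a valid search URL.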

3. Parallel Processing Setup

To optimize performance, the script processes profiles in parallel batches. You can control the batch size through the BATCH_SIZE constant:

const BATCH_SIZE = 1; // Process all profiles in parallel
// const BATCH_SIZE = 2; // Process profiles in batches of 2
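
One way to read these comments is that `BATCH_SIZE` is the number of profiles grouped into each batch, with all batches running concurrently; a value of 1 then gives every profile its own batch. The sketch below illustrates that interpretation only. The `chunk` and `processInBatches` helpers are hypothetical, and the `worker` callback stands in for whatever the recipe does with a browser session.

```typescript
const BATCH_SIZE = 2; // Assumed meaning: profiles per batch

// Splits a list into consecutive groups of at most `size` items.
const chunk = <T>(items: T[], size: number): T[][] => {
  if (size < 1) throw new Error('size must be at least 1');
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
};

// Runs all batches concurrently; items inside one batch run one after another.
const processInBatches = async <T, R>(
  items: T[],
  size: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> => {
  const perBatch = await Promise.all(
    chunk(items, size).map(async (batch) => {
      const results: R[] = [];
      for (const item of batch) {
        results.push(await worker(item));
      }
      return results;
    }),
  );
  // Flatten the per-batch results back into input order.
  return ([] as R[]).concat(...perBatch);
};

// Example: five placeholder items become three batches when BATCH_SIZE is 2.
console.log(chunk(['a', 'b', 'c', 'd', 'e'], BATCH_SIZE));
```

Under this reading, a larger batch size means fewer concurrent batches, at the cost of more sequential work inside each one.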

4. AI-Powered Profile Extraction

The script uses Airtop’s AI capabilities to identify and extract LinkedIn profile URLs from search results. The AI is specifically instructed to:

  • Look for URLs starting with “https://www.linkedin.com/in/”
  • Consider country-specific LinkedIn domains
  • Match profiles based on name and email domain
  • Exclude LinkedIn post URLs
  • Return only the profile URL or ‘Error’ if not found

const searchForLinkedInProfile = async (session, window, client, profile) => {
  const result = await client.windows.pageQuery(session.id, window.windowId, {
    prompt: `You are tasked with retrieving a person's LinkedIn profile URL.
    Please locate the LinkedIn profile for the specified individual and return only the URL.
    LinkedIn profile URLs begin with https://www.linkedin.com/in/ so use that to identify the profile.
    There may be profiles with country based subdomains like https://nl.linkedin.com/in/ that you should also use.
    If there are multiple links, return the one that most closely matches the profile based on the email domain and the name.
    Do not return any other text than the URL.
    Do not return any urls corresponding to posts that may begin with https://www.linkedin.com/posts/
    If you are unable to find the profile, return 'Error'`,
  });
  return result.data.modelResponse;
};
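
Because the model returns free text, it can help to validate the response before storing it. The following is a minimal sketch of such a check, not part of the recipe: `parseLinkedInProfileResponse` and the pattern are hypothetical, and the pattern is deliberately strict (it assumes two-letter country subdomains and rejects URLs with query strings).

```typescript
// Matches profile URLs on www or two-letter country subdomains,
// for example https://nl.linkedin.com/in/some-name
const LINKEDIN_PROFILE_PATTERN =
  /^https:\/\/(www|[a-z]{2})\.linkedin\.com\/in\/[^\s/?#]+\/?$/i;

// Hypothetical helper: returns the URL if it looks like a profile, otherwise null.
const parseLinkedInProfileResponse = (modelResponse: string): string | null => {
  const candidate = modelResponse.trim();
  // The prompt asks the model to answer 'Error' when no profile is found.
  if (candidate === 'Error') return null;
  return LINKEDIN_PROFILE_PATTERN.test(candidate) ? candidate : null;
};

console.log(parseLinkedInProfileResponse('https://www.linkedin.com/in/john-doe'));
console.log(parseLinkedInProfileResponse('https://www.linkedin.com/posts/some-post'));
console.log(parseLinkedInProfileResponse('Error'));
```

A `null` result can then be treated the same way as the model's own 'Error' answer, for example by retrying the search or leaving the column empty.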

5. Results Processing

The script saves the enriched profiles to a CSV file in the output directory:

const saveProfilesToFile = async (profiles: ProfileWithLinkedInProfile[]): Promise<void> => {
  const projectRoot = path.resolve(__dirname, '..');
  const outputDir = path.join(projectRoot, CONFIG.PATHS.OUTPUT_DIR);
  const filePath = path.join(outputDir, CONFIG.PATHS.OUTPUT_FILE);

  await fs.mkdir(outputDir, { recursive: true });

  const csvHeaders = ['email', 'firstName', 'lastName', 'linkedInProfile'];
  const csvRows = profiles.map((profile) => [
    profile.email,
    profile.firstName,
    profile.lastName,
    profile.linkedInProfile,
  ]);

  const csvContent = [csvHeaders.join(','), ...csvRows.map((row) => row.join(','))].join('\n');

  await fs.writeFile(filePath, csvContent);

  console.log(`Saved ${profiles.length} profiles to ${filePath}`);
};
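
Note that joining fields with a plain comma produces a malformed row if a value itself contains a comma or a quote (for example, a last name such as "Smith, Jr."). The sketch below shows one possible way to guard against that. The `escapeCsvField` and `toCsvRow` helpers are hypothetical additions, not part of the recipe, and follow the common RFC 4180 quoting convention.

```typescript
// Quotes a field when it contains a comma, double quote, or line break,
// doubling any embedded double quotes.
const escapeCsvField = (value: string): string =>
  /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;

// Builds one CSV row from already-stringified fields.
const toCsvRow = (fields: string[]): string => fields.map(escapeCsvField).join(',');

const exampleRow = toCsvRow([
  'jane.smith@startup.io',
  'Jane',
  'Smith, Jr.',
  'https://www.linkedin.com/in/jane-smith',
]);

console.log(exampleRow);
// jane.smith@startup.io,Jane,"Smith, Jr.",https://www.linkedin.com/in/jane-smith
```

In the function above, `row.join(',')` could be swapped for a helper like `toCsvRow(row)` if the input data may contain such characters.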

Running the Script

Execute the script with:

$ npm run start

The script will:

  1. Read profiles from your input CSV
  2. Create parallel browser sessions based on the BATCH_SIZE
  3. Search for each person on Google
  4. Extract LinkedIn profile URLs using AI
  5. Save the results to a new CSV file

Best Practices and Considerations

Parallel Processing

  • The BATCH_SIZE parameter lets you control how many browser sessions run at once
  • Setting BATCH_SIZE = 1 processes all profiles in parallel (fastest, but uses the most concurrent sessions)
  • Increase BATCH_SIZE to reduce the number of concurrent sessions if needed

Profile Matching Accuracy

The AI prompt is designed to:

  • Prioritize exact name matches
  • Consider email domain correlation
  • Handle international LinkedIn domains
  • Exclude non-profile LinkedIn URLs

Error Handling

The script includes robust error handling for:

  • Session creation failures
  • Window creation issues
  • Profile search timeouts
  • Invalid or missing profile data
  • Timeouts and unmatched profiles, which are retried
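
The retry behavior mentioned in this list can be sketched with a small generic wrapper. This is an illustration under stated assumptions, not the recipe's actual implementation: `withRetries`, `maxAttempts`, and `delayMs` are hypothetical names, and the operation is any async function that throws on failure.

```typescript
// Hypothetical helper: retries an async operation up to `maxAttempts` times,
// waiting `delayMs` milliseconds between attempts.
const withRetries = async <T>(
  operation: () => Promise<T>,
  maxAttempts: number,
  delayMs: number,
): Promise<T> => {
  if (maxAttempts < 1) throw new Error('maxAttempts must be at least 1');
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
};

// Example: an operation that fails twice, then succeeds on the third attempt.
let attemptCount = 0;
const flakySearch = async (): Promise<string> => {
  attemptCount += 1;
  if (attemptCount < 3) throw new Error('timeout');
  return 'https://www.linkedin.com/in/john-doe';
};

const retryResult = withRetries(flakySearch, 3, 0);
retryResult.then((url) => console.log(url, 'after', attemptCount, 'attempts'));
```

To retry when the AI cannot find a match, the wrapped operation could throw whenever the model response is 'Error', so that both timeouts and failed matches follow the same path.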

Summary

This recipe demonstrates how Airtop can automate the tedious process of finding LinkedIn profiles for your contacts. By combining parallel processing capabilities with AI-powered data extraction, you can efficiently enrich your contact database with LinkedIn profile URLs. The script is particularly useful for sales teams, recruiters, or anyone who needs to add LinkedIn profiles to a contact database.