Scrape Twitter Followers with Python and Selenium
Example of Scraped Data
{
"userId": "1710236730010349568",
"isBlueVerified": false,
"following": false,
"canDm": false,
"canMediaTag": true,
"createdAt": "Fri Oct 06 10:13:15 +0000 2023",
"defaultProfile": true,
"defaultProfileImage": true,
"description": "",
"fastFollowersCount": 0,
"favouritesCount": 456,
"followersCount": 64,
"friendCount": 7320,
"hasCustomTimelines": false,
"isTranslator": false,
"listedCount": 0,
"location": "",
"mediaCount": 0,
"name": "Paislie Dimitrov",
"normalFollowersCount": 64,
"pinnedTweetIdsStr": [],
"possiblySensitive": false,
"profileImageUrlHttps": "https://abs.twimg.com/sticky/default_profile_images/default_profile_normal.png",
"profileInterstitialType": "",
"username": "PaisliDimit",
"statusesCount": 0,
"translatorType": "none",
"verified": false,
"wantRetweets": false,
"withheldInCountries": []
}
Run Code Directly Without Setup
This guide provides full, ready-to-use code to scrape Twitter follower data. With Python and Selenium you can automate data collection and capture browser performance logs, unlocking Twitter insights with no extra setup required.
Step 1: Set Up Your Environment
First, install the dependencies listed in requirements.txt, which include Selenium, the library that lets us automate browser actions:
pip install -r requirements.txt
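If you are not working from the project's requirements.txt, installing Selenium directly is enough to follow the snippets below:

pip install selenium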
Step 2: Download ChromeDriver
Download the ChromeDriver build that matches your installed Chrome version from the official ChromeDriver download page.
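If Selenium does not pick the driver up automatically, you can point it at the downloaded binary yourself. A minimal sketch, assuming Selenium 4 and an example path you should replace with your own:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# The executable path below is an example; use wherever you saved ChromeDriver
service = Service(executable_path='/path/to/chromedriver')
browser = webdriver.Chrome(service=service)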
Step 3: Set Chrome Options
from selenium import webdriver

self.options = webdriver.ChromeOptions()
# Spoof a common desktop user agent so requests look like a regular browser
user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36'
self.options.add_argument(f'user-agent={user_agent}')
self.options.add_argument('--disable-gpu')
self.options.add_argument('--no-sandbox')
self.options.add_argument('--disable-dev-shm-usage')
self.options.add_argument(f"--remote-debugging-port={remote_debugging_port}")
# modify_random_canvas_js() and get_browser() are helpers from the full project;
# record_network_log=True enables the performance log read in Step 5
js_script_name = modify_random_canvas_js()
self.browser = self.get_browser(script_files=[js_script_name], record_network_log=True, headless=True)
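The get_browser() helper above presumably starts Chrome with performance (network) logging turned on. If you build the driver yourself instead, a minimal sketch of enabling the same log with plain Selenium 4 looks like this:

# Assumes plain Selenium 4 without the project's get_browser() helper
self.options.set_capability('goog:loggingPrefs', {'performance': 'ALL'})
self.browser = webdriver.Chrome(options=self.options)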
Step 4: Access The Target Page
import time

self.browser.switch_to.new_window('tab')
url = 'https://x.com/1_usd_promotion/verified_followers'
self.browser.get(url=url)
time.sleep(2)
exist_entry_id = []
result_list = []
# get_network() is a project helper that parses the performance log (Steps 5 and 6)
# and appends follower records to result_list
self.get_network(exist_entry_id, result_list)
print(f'follower result length = {len(result_list)}')
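Only the first batch of followers is loaded when the page opens; the rest arrive as you scroll. A short sketch of triggering more results (the scroll count and delay here are arbitrary choices, not values from the original project):

# Scroll a few times so the page requests more followers, then re-read the log
for _ in range(5):
    self.browser.execute_script('window.scrollTo(0, document.body.scrollHeight);')
    time.sleep(2)
    self.get_network(exist_entry_id, result_list)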
Step 5: Get The Browser Performance Log
import json

performance_log = self.browser.get_log("performance")
for packet in performance_log:
    msg = packet.get("message")
    message = json.loads(msg).get("message")
    packet_method = message.get("method")
    # Keep only Chrome DevTools Network events whose payload mentions the Following timeline
    if "Network" in packet_method and 'Following' in msg:
        request_id = message.get("params").get("requestId")
        # Fetch the response body for that request via the Chrome DevTools Protocol
        resp = self.browser.execute_cdp_cmd('Network.getResponseBody', {'requestId': request_id})
Step 6: Extract Data From Response
body = resp.get("body")
body = json.loads(body)
instructions = body['data']['user']['result']['timeline']['timeline'].get('instructions', None)
if not instructions:
continue
for instruction in instructions:
entries = instruction.get('entries', None)
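From here, each entry holds one follower. A sketch of pulling the user object out of an entry is shown below; the nesting mirrors the GraphQL followers timeline at the time of writing and may change, so treat the key names as assumptions rather than the project's exact parsing code:

for entry in entries or []:
    item_content = entry.get('content', {}).get('itemContent', {})
    user_result = item_content.get('user_results', {}).get('result', {})
    legacy = user_result.get('legacy', {})
    if not legacy:
        continue
    # Collect a few of the fields shown in the example output above
    result_list.append({
        'userId': user_result.get('rest_id'),
        'username': legacy.get('screen_name'),
        'name': legacy.get('name'),
        'followersCount': legacy.get('followers_count'),
        'description': legacy.get('description'),
    })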
Step 7: Important Considerations
- Log in to Twitter, then get your auth_token (a sketch of injecting it into the browser session appears after this list). Learn How to Get Auth Token
- You can use the ready-made API from Apify
- You can get the full code from GitHub
- Join our discussion group! Click Here
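A minimal sketch of logging the Selenium session in by injecting an existing auth_token cookie; the token value is a placeholder you must replace with your own:

self.browser.get('https://x.com')
self.browser.add_cookie({
    'name': 'auth_token',
    'value': 'YOUR_AUTH_TOKEN',  # placeholder: paste your own token here
    'domain': '.x.com',
    'path': '/',
    'secure': True,
})
self.browser.refresh()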
FAQ: Frequently Asked Questions
- Q: What is Web Scraping?
Web scraping is like using a special tool to collect information from websites automatically. Imagine a robot that helps gather data from a page so you don’t have to do it manually. In this case, we're focusing on Twitter data using Python and Selenium.
- Q: How do I start scraping Twitter data?
To start scraping Twitter data, you first need to set up your computer. This includes installing software called Selenium, which helps you control web browsers. Then, you download ChromeDriver, a helper tool for Google Chrome that allows Selenium to work with it.
- Q: What is ChromeDriver and why do I need it?
ChromeDriver is like a translator for Selenium and Google Chrome. It helps Selenium understand how to interact with the Chrome browser. You need it so that Selenium can automate actions like clicking buttons or entering information on Twitter.
- Q: What is a performance log in scraping?
A performance log is like a diary that records everything happening during your web scraping. It keeps track of all the data exchanges between your scraper (Selenium) and the Twitter page, helping you understand what requests your program is making.
- Q: What should I consider before scraping Twitter?
Before scraping Twitter, you need to log in to your Twitter account and get something called an auth_token, which proves you are allowed to access Twitter's data. Also, be careful to respect Twitter's rules so you do not get blocked.
- Q: How do I avoid getting blocked while scraping?
To avoid getting blocked, make sure to introduce delays between requests, rotate proxies, and avoid overwhelming Twitter's servers with too many requests in a short period.
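For example, a rough sketch of both ideas (the proxy address is a placeholder and the delay range is only a suggestion):

import random
import time

self.options.add_argument('--proxy-server=http://proxy.example.com:8080')  # placeholder proxy
time.sleep(random.uniform(2, 5))  # random pause between page loads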