Scrape Twitter Tweets with Python and Selenium

Example of Scraped Data

{
  "type": "tweet",
  "id": 1843447413824209200,
  "viewCount": "51275823",
  "url": "https://x.com/elonmusk/status/1843447413824209160",
  "twitterUrl": "https://twitter.com/elonmusk/status/1843447413824209160",
  "text": "It is a surefire way for the Dems to turn America in a one-party state, just like California",
  "isQuote": true,
  "retweetCount": 59493,
  "replyCount": 11090,
  "likeCount": 250068,
  "quoteCount": 1661,
  "createdAt": "Tue Oct 08 00:24:47 +0000 2024",
  "lang": "en",
  "quoteId": "1843379457605939258",
  "bookmarkCount": 11177,
  "isReply": false,
  "source": "Twitter for iPhone",
  "author": {
    "type": "user",
    "username": "elonmusk",
    "url": "https://x.com/elonmusk",
    "twitterUrl": "https://x.com/elonmusk",
    "id": "44196397",
    "name": "Elon Musk",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": "",
    "profilePicture": "https://pbs.twimg.com/profile_images/1849727333617573888/HBgPUrjG_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/44196397/1726163678",
    "description": "Read @America to understand why I’m supporting Trump for President",
    "location": "",
    "followers": 202400789,
    "following": 794,
    "protected": false,
    "status": "",
    "canDm": false,
    "canMediaTag": false,
    "createdAt": "Tue Jun 02 20:12:29 +0000 2009",
    "advertiserAccountType": "",
    "analyticsType": "",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {
        "urls": [
          {
            "display_url": "TheAmericaPAC.org",
            "expanded_url": "http://TheAmericaPAC.org",
            "url": "https://t.co/DjyKIO6ePx",
            "indices": [
              0,
              23
            ]
          }
        ]
      }
    },
    "fastFollowersCount": 0,
    "favouritesCount": 83676,
    "geoEnabled": false,
    "hasCustomTimelines": true,
    "hasExtendedProfile": false,
    "isTranslator": false,
    "mediaCount": 2637,
    "profileBackgroundColor": "",
    "statusesCount": 55447,
    "translatorTypeEnum": "",
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {
      "label": {
        "url": {
          "url": "https://twitter.com/X",
          "urlType": "DeepLink"
        },
        "badge": {
          "url": "https://pbs.twimg.com/profile_images/1683899100922511378/5lY42eHs_bigger.jpg"
        },
        "description": "X",
        "userLabelType": "BusinessLabel",
        "userLabelDisplayType": "Badge"
      }
    }
  },
  "quote": {
    "type": "tweet",
    "id": "1843379457605939258",
    "text": "Elon Musk explains how this will be our last real election if Kamala Harris wins.\n\nEveryone must watch this. https://t.co/DoBh9qM7K7",
    "retweetCount": 10725,
    "replyCount": 1848,
    "likeCount": 38268,
    "quoteCount": 790,
    "createdAt": "Mon Oct 07 19:54:45 +0000 2024",
    "lang": "en",
    "bookmarkCount": 5143,
    "author": {
      "type": "user",
      "username": "EndWokeness",
      "url": "https://x.com/EndWokeness",
      "twitterUrl": "https://x.com/EndWokeness",
      "id": "1552795969959636992",
      "name": "End Wokeness",
      "isVerified": false,
      "isBlueVerified": true,
      "verifiedType": "",
      "profilePicture": "https://pbs.twimg.com/profile_images/1563691268793946117/OedvhFeS_normal.jpg",
      "coverPicture": "https://pbs.twimg.com/profile_banners/1552795969959636992/1720913469",
      "description": "Fighting, exposing, and mocking wokeness. DM for submissions",
      "location": "",
      "followers": 3107102,
      "following": 1177,
      "protected": false,
      "status": "",
      "canDm": true,
      "canMediaTag": true,
      "createdAt": "Thu Jul 28 23:20:28 +0000 2022",
      "advertiserAccountType": "",
      "analyticsType": "",
      "entities": {
        "description": {
          "urls": []
        }
      },
      "fastFollowersCount": 0,
      "favouritesCount": 13138,
      "geoEnabled": false,
      "hasCustomTimelines": true,
      "hasExtendedProfile": false,
      "isTranslator": false,
      "mediaCount": 7219,
      "profileBackgroundColor": "",
      "statusesCount": 15502,
      "translatorTypeEnum": "",
      "withheldInCountries": [],
      "affiliatesHighlightedLabel": {}
    }
  }
}
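
A record like the one above usually needs a little conversion before analysis. The sketch below assumes the record has already been loaded into a Python dict; the helper name `normalize_tweet` and its output keys are illustrative and not part of the scraper. It also takes the tweet ID from `url` rather than from `id`: in the sample, the numeric `id` ends in ...200 while the URL ends in ...160, most likely because the 64-bit ID was rounded when it passed through a floating-point number.

```python
from datetime import datetime, timezone

# Timestamp layout used by the "createdAt" fields in the sample record
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def normalize_tweet(record):
    """Convert a few fields of a scraped tweet record into analysis-friendly types."""
    # The last path segment of the tweet URL is the full-precision tweet ID
    tweet_id = record["url"].rstrip("/").rsplit("/", 1)[-1]
    return {
        "id": tweet_id,
        "created_at": datetime.strptime(record["createdAt"], CREATED_AT_FORMAT),
        "view_count": int(record["viewCount"]),  # stored as a string in the sample
        "like_count": record["likeCount"],
        "username": record["author"]["username"],
    }


# A trimmed copy of the sample record shown above
sample = {
    "id": 1843447413824209200,
    "viewCount": "51275823",
    "url": "https://x.com/elonmusk/status/1843447413824209160",
    "likeCount": 250068,
    "createdAt": "Tue Oct 08 00:24:47 +0000 2024",
    "author": {"username": "elonmusk"},
}

tweet = normalize_tweet(sample)
print(tweet["id"], tweet["created_at"].isoformat())
```

Keeping the ID as a string avoids the same rounding problem if the data is later read by a tool that stores numbers as floats.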

Run Code Directly Without Setup
This guide provides complete, ready-to-use code for scraping tweet data. It uses Python and Selenium to automate Chrome and reads the browser's performance log to capture the responses Twitter returns. Beyond the environment steps below, no further setup is needed.

Advanced Tweet Filtering
Use Twitter's advanced search to target tweets that match your criteria. Filtering by keywords, dates, and hashtags keeps the collected data relevant to your research.
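
As a rough illustration of such filters, the sketch below assembles a search string from Twitter's search operators (`from:`, `since:`, `until:`, `lang:`, and hashtags) and turns it into a search URL. The function names and parameters are hypothetical and not taken from the scraper's code, and the URL layout (`q`, `src`, `f=live`) is an assumption based on how search URLs currently look in the browser.

```python
from urllib.parse import urlencode


def build_search_query(keywords="", from_user=None, hashtags=(),
                       since=None, until=None, lang=None):
    """Join the given filters into a single Twitter search string."""
    parts = []
    if keywords:
        parts.append(keywords)
    # Accept hashtags with or without the leading '#'
    parts.extend(f"#{tag.lstrip('#')}" for tag in hashtags)
    if from_user:
        parts.append(f"from:{from_user.lstrip('@')}")
    if since:
        parts.append(f"since:{since}")  # expected as YYYY-MM-DD
    if until:
        parts.append(f"until:{until}")  # expected as YYYY-MM-DD
    if lang:
        parts.append(f"lang:{lang}")
    return " ".join(parts)


def build_search_url(query, latest=True):
    """Build a search URL; f=live selects the 'Latest' tab."""
    params = {"q": query, "src": "typed_query"}
    if latest:
        params["f"] = "live"
    return "https://x.com/search?" + urlencode(params)


search_query = build_search_query(
    keywords="election",
    hashtags=["vote"],
    from_user="elonmusk",
    since="2024-10-01",
    until="2024-10-09",
)
print(search_query)
# election #vote from:elonmusk since:2024-10-01 until:2024-10-09
print(build_search_url(search_query))
```

The resulting string can be typed into the search box in Step 5 as `search_query`.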

Step 1: Set Up Your Environment

First, install the dependencies listed in requirements.txt, including Selenium, which lets us automate browser actions:

pip install -r requirements.txt

Step 2: Download ChromeDriver

Download the ChromeDriver release that matches your installed Chrome version from the ChromeDriver download page.

Step 3: Run Chrome for Testing

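This step starts a visible Chrome window with remote debugging enabled, so you can watch what the scraper does while debugging. If you don’t need to watch it, you can skip this step. On Windows, a batch file like this one starts Chrome on debugging port 9223: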

@echo off
start C:\software\chrome-win64\chrome.exe --remote-debugging-port=9223
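
Before Selenium tries to attach, it can help to confirm that Chrome is actually listening on the debugging port. The following is a minimal sketch using only the standard library; the function name is hypothetical, and the default port 9223 simply mirrors the batch file above.

```python
import socket


def is_debug_port_open(host="localhost", port=9223, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not reachable
        return False


print("Chrome debugging port open:", is_debug_port_open())
```

An open port only shows that some process is listening there, not that it is Chrome, so treat the result as a quick sanity check.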

Step 4: Setting Chrome Options

from selenium import webdriver

self.options = webdriver.ChromeOptions()
# Present a regular desktop Chrome user agent
user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36'
self.options.add_argument(f'user-agent={user_agent}')
self.options.add_argument('--disable-gpu')
self.options.add_argument('--no-sandbox')
self.options.add_argument('--disable-dev-shm-usage')

# Attach to the Chrome instance started in Step 3 (the port must match)
self.options.add_experimental_option("debuggerAddress", "localhost:9223")

# modify_random_canvas_js() and get_browser() are helpers that are not shown here
js_script_name = modify_random_canvas_js()
self.browser = self.get_browser(script_files=[js_script_name], record_network_log=True, headless=True)
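
The `get_browser` helper is not shown in this guide, so the following is only a guess at the part of its job that matters for Step 6: turning on Chrome's performance log. The names `chrome_arguments` and `build_options` are hypothetical. The `goog:loggingPrefs` capability and the `--headless=new` flag are standard Chrome options; whether the real helper uses them is an assumption.

```python
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36"
)


def chrome_arguments(headless=False):
    """Return the command-line arguments used in Step 4, optionally headless."""
    args = [
        f"user-agent={USER_AGENT}",
        "--disable-gpu",
        "--no-sandbox",
        "--disable-dev-shm-usage",
    ]
    if headless:
        args.append("--headless=new")
    return args


def build_options(debugger_address=None, headless=False):
    """Build ChromeOptions with performance logging enabled (assumed behavior)."""
    # Imported here so chrome_arguments() can be used without Selenium installed
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in chrome_arguments(headless):
        options.add_argument(argument)
    # Ask ChromeDriver to record DevTools events in the "performance" log
    options.set_capability("goog:loggingPrefs", {"performance": "ALL"})
    if debugger_address:
        # Attach to an already running Chrome instead of launching a new one
        options.add_experimental_option("debuggerAddress", debugger_address)
    return options


# Possible usage (requires Selenium and a matching ChromeDriver):
# from selenium import webdriver
# browser = webdriver.Chrome(options=build_options("localhost:9223"))
```

When attaching through `debuggerAddress`, Chrome is already running, so launch arguments such as the headless flag should not be expected to take effect.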

Step 5: Search for Tweet Data Using Selenium

from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# implicitly_wait() takes seconds and applies to every later find_element call,
# so set it once, before the first lookup
self.browser.implicitly_wait(20)

self.browser.switch_to.new_window('tab')
url = "https://x.com/explore"
self.browser.get(url=url)
search_box = self.browser.find_element(By.CSS_SELECTOR, '[data-testid="SearchBox_Search_Input"]')
search_box.send_keys(Keys.CONTROL + "a")  # Select all text
search_box.send_keys(Keys.DELETE)  # Clear any previous query
search_box.send_keys(search_query)

# Press Enter to submit the search
search_box.send_keys(Keys.RETURN)
# Locate the second tab in the tab list of the results page
second_div = self.browser.find_element(By.CSS_SELECTOR, '[data-testid="ScrollSnap-List"] [role="presentation"]:nth-of-type(2)')
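
Selenium also offers explicit waits, which block until one specific condition holds instead of setting a global timeout. The sketch below waits for the search box from this step to become clickable; the function name and the 20-second default are illustrative choices, not part of the original code.

```python
SEARCH_BOX_SELECTOR = '[data-testid="SearchBox_Search_Input"]'


def wait_for_search_box(browser, timeout=20):
    """Wait until the search box is clickable and return the element."""
    # Imported here so the module can be loaded without Selenium installed
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.CSS_SELECTOR, SEARCH_BOX_SELECTOR)
    # Raises selenium.common.exceptions.TimeoutException if the wait expires
    return WebDriverWait(browser, timeout).until(EC.element_to_be_clickable(locator))


# Possible usage with an existing WebDriver instance named `browser`:
# search_box = wait_for_search_box(browser)
# search_box.send_keys("python")
```

An explicit wait fails with a clear timeout error for the one element you care about, which tends to be easier to debug than a long implicit wait.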

Step 6: Monitor the browser network response

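import json

# The performance log is only available if logging was enabled when the browser
# was created (see record_network_log=True in Step 4)
performance_log = self.browser.get_log("performance")
for packet in performance_log:
    msg = packet.get("message")
    message = json.loads(msg).get("message")
    packet_method = message.get("method")

    # Keep only Network.* events that mention the SearchTimeline endpoint
    if "Network" in packet_method and 'SearchTimeline' in msg:
        document_url = message['params'].get('documentURL')
        # Skip events that do not come from the "Latest" (&f=live) results page
        if (not document_url) or ('&f=live' not in document_url):
            continue
        # The request ID identifies the request whose response body we want
        request_id = message.get("params").get("requestId")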
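
The snippet above stops at the request ID, and the guide does not show how the response body is read. One common way, sketched here as an assumption about the missing step, is the Chrome DevTools Protocol command `Network.getResponseBody`, which Selenium exposes through `execute_cdp_cmd`. The function names and the sample log entries (including the `HYPOTHETICAL` path segment) are made up for illustration.

```python
import base64
import json


def find_search_request_ids(performance_log, endpoint="SearchTimeline",
                            url_marker="&f=live"):
    """Collect request IDs of Network events for the search endpoint."""
    request_ids = []
    for packet in performance_log:
        raw = packet.get("message", "")
        if endpoint not in raw:
            continue
        message = json.loads(raw).get("message", {})
        if "Network" not in message.get("method", ""):
            continue
        params = message.get("params", {})
        document_url = params.get("documentURL") or ""
        request_id = params.get("requestId")
        # Keep each matching request once, in the order it was seen
        if url_marker in document_url and request_id and request_id not in request_ids:
            request_ids.append(request_id)
    return request_ids


def fetch_response_body(browser, request_id):
    """Read a response body through the DevTools Protocol."""
    response = browser.execute_cdp_cmd(
        "Network.getResponseBody", {"requestId": request_id}
    )
    body = response["body"]
    if response.get("base64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    return body


# Invented log entries shaped like Selenium's performance log
sample_log = [
    {"message": json.dumps({"message": {
        "method": "Network.requestWillBeSent",
        "params": {
            "requestId": "1.1",
            "documentURL": "https://x.com/search?q=python&src=typed_query&f=live",
            "request": {"url": "https://x.com/i/api/graphql/HYPOTHETICAL/SearchTimeline"},
        },
    }})},
    {"message": json.dumps({"message": {
        "method": "Network.requestWillBeSent",
        "params": {
            "requestId": "1.2",
            "documentURL": "https://x.com/search?q=python&src=typed_query",
            "request": {"url": "https://x.com/i/api/graphql/HYPOTHETICAL/SearchTimeline"},
        },
    }})},
    {"message": json.dumps({"message": {"method": "Page.loadEventFired", "params": {}}})},
]

print(find_search_request_ids(sample_log))  # ['1.1']
```

`Network.getResponseBody` can fail if the response has not finished loading or is no longer held by the browser, so real code would typically wrap `fetch_response_body` in a try/except and skip such requests.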

Step 7: Extract Data from response

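# body holds the response text for the request found in Step 6 (fetching it is
# not shown here); this code runs inside that loop, hence the continue statements
entries = json.loads(body)['data']['search_by_raw_query']['search_timeline']['timeline']['instructions'][0].get('entries', [])
if not entries:
    continue
for entry in entries:
    # Entries without itemContent (for example, pagination cursors) are skipped
    item_content = entry['content'].get('itemContent')
    if not item_content:
        continue
    tweet_result = item_content['tweet_results']['result']
    entry_id = entry['entryId']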
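
The same extraction can be wrapped in a standalone function that is easy to test without a browser. This is a sketch, not the original implementation: unlike the snippet above, it walks every instruction rather than only the first, and it tolerates missing keys below the timeline. The sample body is invented and far smaller than a real response.

```python
import json


def extract_tweet_results(body):
    """Return (entry_id, tweet_result) pairs from a SearchTimeline response body."""
    data = json.loads(body)
    timeline = data["data"]["search_by_raw_query"]["search_timeline"]["timeline"]
    results = []
    for instruction in timeline.get("instructions", []):
        for entry in instruction.get("entries", []):
            item_content = entry.get("content", {}).get("itemContent")
            if not item_content:
                continue  # not a tweet entry
            result = item_content.get("tweet_results", {}).get("result")
            if result:
                results.append((entry["entryId"], result))
    return results


# Invented response body with one tweet entry and one non-tweet entry
sample_body = json.dumps({"data": {"search_by_raw_query": {"search_timeline": {"timeline": {
    "instructions": [
        {"entries": [
            {"entryId": "tweet-1",
             "content": {"itemContent": {"tweet_results": {"result": {"rest_id": "1"}}}}},
            {"entryId": "cursor-top-0", "content": {"value": "abc"}},
        ]},
        {"type": "placeholder instruction without entries"},
    ]
}}}}})

for entry_id, tweet_result in extract_tweet_results(sample_body):
    print(entry_id, tweet_result)
```

Returning a list instead of using `continue` keeps the parsing logic independent of the surrounding network loop.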

Step 8: Important Considerations

Log in to Twitter first, then get your Twitter cookie. See "Learn How to Get Twitter Cookie".
You can also use the API from Apify.
You can get the full code from GitHub.

FAQ: Frequently Asked Questions

Q: What is Twitter scraping with Python and Selenium?
Twitter scraping is the process of collecting data from Twitter using automated tools like Python and Selenium. These tools allow you to simulate a browser, search for tweets, and gather information without manually searching.
Q: Why would I want to scrape Twitter data?
Scraping Twitter data can help you gather information for research, track specific topics, or analyze trends. It's useful for students, businesses, and anyone interested in understanding public opinions and discussions.
Q: What is ChromeDriver and why do I need it?
ChromeDriver is like a translator for Selenium and Google Chrome. It helps Selenium understand how to interact with the Chrome browser. You need it so that Selenium can automate actions like clicking buttons or entering information on Twitter.
Q: What is a performance log in scraping?
A performance log is like a diary that Chrome keeps while a page is open. It records browser events, including every network request and response exchanged with the Twitter page. Selenium can read this log, which lets your program see which requests the page made and find the ones that carry tweet data.
Q: Can I run the scraping script without seeing the browser?
Yes, you can run your script in a 'headless' mode, where the browser operates in the background, so you won't see it on your screen.
Q: Are there any tools I need to scrape Twitter?
Yes, you will need Python installed on your computer, along with the Selenium library and ChromeDriver. These tools together allow you to control the web browser and capture the data you want.