How to scrape Google reviews

 element under that button. So, you can extract the profile URL and username with:

user_contrib_element = review_html_element.locator('button[data-href*="/maps/contrib/"]:not([aria-label])')
user_url = await user_contrib_element.get_attribute("data-href")

username_element = user_contrib_element.locator("div").first
username = await username_element.text_content()

Note the use of the following two Playwright methods for data extraction:

  • get_attribute(): returns the value of the specified HTML attribute.
  • text_content(): returns the text contained in the element and its descendants.

Now, focus on the star rating:

The easiest way to get that information is by selecting the element whose aria-label attribute includes the word “star”:

stars_element = review_html_element.locator('[aria-label*="star"]')

With some custom logic, you can get the review score by looking at the number (from 1 to 5) contained inside the aria-label text:

stars_label = await stars_element.get_attribute("aria-label")

# Extract the review score from the stars label
stars = None
for i in range(1, 6):
    if stars_label and str(i) in stars_label:
        stars = i
        break

In the previous image, see how the review time info element is the next sibling after the stars node. Since a chained CSS locator can only match descendants of the current element, not its siblings, it’s better to rely on XPath instead:

time_sibling = stars_element.locator("xpath=following-sibling::span")
time = await time_sibling.text_content()

Finally, focus on the review text node. Reviews often contain long text, which is initially truncated and can be expanded with a “More” button:

The HTML of the “More” button

To access the full review, you first need to click the “More” button, which you can identify by targeting its aria-label content. Then, extract the full text from the surrounding container matched by this CSS selector:

div[tabindex="-1"][id][lang]

Remember that not all reviews have a “More” button, so you need to click it only if it’s actually visible:

more_element = review_html_element.locator('button[aria-label="See more"]')
if await more_element.count() > 0:
    await more_element.click()

Next, retrieve the review text with:

text_element = review_html_element.locator('div[tabindex="-1"][id][lang]')
text = await text_element.text_content()

5. Collect the scraped data

At this point, you have all the scraped data stored in Python variables. Use them to define a dictionary representing a single review, and then append it to the reviews list:

review = {
    "user_url": user_url,
    "username": username,
    "stars": stars,
    "time": time,
    "text": text
}
reviews.append(review)

This is the final step inside your for loop to scrape Google reviews.

6. Export to CSV

Outside the for loop – and after the async Playwright block – use Python’s built-in csv module to export the scraped reviews to CSV:

with open("google_maps_reviews.csv", mode="w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["user_url", "username", "stars", "time", "text"])
    writer.writeheader()
    writer.writerows(reviews)

This will create a google_maps_reviews.csv file, add a header row to it, and then populate it with the review data from the reviews list.
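Before relying on the file downstream, you can read it back with csv.DictReader as a quick sanity check. This is a minimal sketch (the helper name is just for illustration):

```python
import csv

def load_reviews(path: str) -> list[dict]:
    """Read the exported reviews back for a quick sanity check."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

# e.g. rows = load_reviews("google_maps_reviews.csv")
#      print(f"Exported {len(rows)} reviews")
```

Each row comes back as a dictionary keyed by the header names, so you can spot-check fields like username and stars directly.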

7. Complete code

Your scraper.py file should now contain:

import asyncio
from playwright.async_api import async_playwright
import csv

async def run():
    # Where to store the scraped data
    reviews = []

    async with async_playwright() as p:
        # Initialize a new Playwright instance
        browser = await p.chromium.launch(
            headless=False  # Set to True in production
        )
        context = await browser.new_context()
        page = await context.new_page()

        # The URL of the Google Maps reviews page
        url = "https://www.google.com/maps/place/Apple+Fifth+Avenue/@40.8391069,-74.2188908,10z/data=!4m12!1m2!2m1!1sapple+store!3m8!1s0x89c258f0741ceda7:0x4fd23cddb7a3d144!8m2!3d40.7638478!4d-73.9729785!9m1!1b1!15sCgthcHBsZSBzdG9yZSIDiAEBWg0iC2FwcGxlIHN0b3JlaAGSARFlbGVjdHJvbmljc19zdG9yZeABAA!16s%2Fg%2F1yl47t1xt?entry=ttu&g_ep=EgoyMDI1MDQwOC4wIKXMDSoASAFQAw%3D%3D"

        # Navigate to the target Google Maps page
        await page.goto(url)

        # Selecting the review HTML elements for the specific page
        review_html_elements = page.locator("div[data-review-id][jsaction]")
        # Wait for the element to be loaded and visible on the page
        await review_html_elements.first.wait_for(state="visible")

        # Iterate over the elements and scrape data from each of them
        for review_html_element in await review_html_elements.all():
            # Scraping logic
            user_contrib_element = review_html_element.locator('button[data-href*="/maps/contrib/"]:not([aria-label])')
            user_url = await user_contrib_element.get_attribute("data-href")

            username_element = user_contrib_element.locator("div").first
            username = await username_element.text_content()

            stars_element = review_html_element.locator('[aria-label*="star"]')
            stars_label = await stars_element.get_attribute("aria-label")

            # Extract the review score from the stars label
            stars = None
            for i in range(1, 6):
                if stars_label and str(i) in stars_label:
                    stars = i
                    break

            # Get the next sibling of the previous element with an XPath expression
            time_sibling = stars_element.locator("xpath=following-sibling::span")
            time = await time_sibling.text_content()

            # Select the "More" button and if it is present, click it
            more_element = review_html_element.locator('button[aria-label="See more"]')
            if await more_element.count() > 0:
                await more_element.click()

            text_element = review_html_element.locator('div[tabindex="-1"][id][lang]')
            text = await text_element.text_content()

            # Populate a new object with the scraped data and append it to the list
            review = {
                "user_url": user_url,
                "username": username,
                "stars": stars,
                "time": time,
                "text": text
            }
            reviews.append(review)

        # Close the browser and release its resources
        await browser.close()

    # Export to CSV
    with open("google_maps_reviews.csv", mode="w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["user_url", "username", "stars", "time", "text"])
        writer.writeheader()
        writer.writerows(reviews)

if __name__ == "__main__":
    asyncio.run(run())

As you can see, it’s possible to build a simple Google Reviews scraper with just a few lines of Python code.

To test your script, execute:

python scraper.py

If you configured Playwright to run in GUI mode, a browser window will open and navigate to the target Google Maps reviews page. Then, the script will start scraping the review data. A google_maps_reviews.csv file will appear in the project’s folder. Open that file, and you’ll see the scraped results:

The resulting CSV with the scraped data

Wonderful! Your Python script to scrape Google reviews is working like a charm.

8. Deploy to Apify

Deploying your scraper to Apify gives you access to infrastructure, such as anti-blocking features, scheduling, and integrations. It involves turning your scraper into an Actor so you can make it publicly available on Apify Store and earn passive income from those who need your solution.

To deploy your Google reviews scraper to Apify, you need an Apify account.

To create a new Google reviews scraping project on Apify:

  1. Log in to your Apify account
  2. Go to Apify Console
  3. Under the “Actors” dropdown, select “Development,” and then press the “Develop new” button:
Pressing the “Develop new” button

Next, click the “View all templates” button to see all Apify templates:

Clicking the “View all templates” button

In the Python section, select the “Playwright + Chrome” template, as that is the setup we used in this tutorial:

Selecting the “Playwright + Chrome” starter

Review the starter project code and click “Use this template” to create your own copy:

Clicking the “Use this template” button

You’ll then be taken to Apify’s built-in online IDE:

The Apify online IDE

There, you can customize your Actor by writing your Google reviews scraping logic directly in the cloud.

Inside main.py, you’ll find some starter code that reads the target URLs from Apify input arguments. The script then uses Playwright to launch a browser and navigate to the specified page.

From there, integrate the scraping logic you wrote in steps 3, 4, and 5 of this tutorial. Put everything together, and you’ll end up with the following Apify Actor code:

from __future__ import annotations

from urllib.parse import urljoin

from apify import Actor, Request
from playwright.async_api import async_playwright

async def main() -> None:
    async with Actor:
        # Retrieve the Actor input, and use default values if not provided
        actor_input = await Actor.get_input() or {}
        start_urls = actor_input.get("start_urls")

        # Exit if no start URLs are provided
        if not start_urls:
            Actor.log.info("No start URLs specified in actor input, exiting...")
            await Actor.exit()

        # Open the default request queue for handling URLs to be processed
        request_queue = await Actor.open_request_queue()

        # Enqueue the start URLs
        for start_url in start_urls:
            url = start_url.get("url")
            Actor.log.info(f"Enqueuing {url} ...")
            new_request = Request.from_url(url)
            await request_queue.add_request(new_request)

        # Launch Playwright and open a new browser context
        async with async_playwright() as playwright:
            # Configure the browser to launch in headless mode as per Actor configuration
            browser = await playwright.chromium.launch(
                headless=Actor.config.headless,
                args=["--disable-gpu"],
            )
            context = await browser.new_context()

            # Process the URLs from the request queue
            while request := await request_queue.fetch_next_request():
                # The URL of the Google Maps reviews page
                url = request.url
                page = await context.new_page()

                # Navigate to the target Google Maps page
                await page.goto(url)

                # Selecting the review HTML elements for the specific page
                review_html_elements = page.locator("div[data-review-id][jsaction]")
                # Wait for the element to be loaded and visible on the page
                await review_html_elements.first.wait_for(state="visible")

                # Iterate over the elements and scrape data from each of them
                for review_html_element in await review_html_elements.all():
                    # Scraping logic
                    user_contrib_element = review_html_element.locator('button[data-href*="/maps/contrib/"]:not([aria-label])')
                    user_url = await user_contrib_element.get_attribute("data-href")

                    username_element = user_contrib_element.locator("div").first
                    username = await username_element.text_content()

                    stars_element = review_html_element.locator('[aria-label*="star"]')
                    stars_label = await stars_element.get_attribute("aria-label")

                    # Extract the review score from the stars label
                    stars = None
                    for i in range(1, 6):
                        if stars_label and str(i) in stars_label:
                            stars = i
                            break

                    # Get the next sibling of the previous element with an XPath expression
                    time_sibling = stars_element.locator("xpath=following-sibling::span")
                    time = await time_sibling.text_content()

                    # Select the "More" button and if it is present, click it
                    more_element = review_html_element.locator('button[aria-label="See more"]')
                    if await more_element.count() > 0:
                        await more_element.click()

                    text_element = review_html_element.locator('div[tabindex="-1"][id][lang]')
                    text = await text_element.text_content()

                    # Populate a new object with the scraped data and append it to the list
                    review = {
                        "user_url": user_url,
                        "username": username,
                        "stars": stars,
                        "time": time,
                        "text": text
                    }

                    # Store the extracted data in the default dataset
                    await Actor.push_data(review)

                # Close the page in the browser
                await page.close()

                # Mark the request as handled to ensure it is not processed again
                await request_queue.mark_request_as_handled(request)

Note that you no longer need the CSV export logic, as that’s handled by the push_data() method:

await Actor.push_data(review)

This method lets you retrieve the scraped data via the API or export it in various formats directly from the Apify dashboard.
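As a sketch of what that API access looks like, Apify’s dataset items endpoint accepts a format query parameter; the dataset ID below is a placeholder you’d take from your Actor run details:

```python
# Sketch: build the Apify API URL for downloading a run's dataset.
# The dataset ID is a placeholder taken from your Actor run.
def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?format={fmt}"

# e.g. dataset_items_url("<YOUR_DATASET_ID>", "csv")
```

You can then fetch that URL with any HTTP client to pull the scraped reviews into your own pipeline.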

Next, click the “Save, Build & Start” button:

Clicking the “Save, Build & Start” button

Go to the “Input” tab and enter the target Google Maps URL manually:

Configuring the target page

Click “Save & Start” to run your Google reviews scraper Actor. Once the run is over, the results should look like this:

The resulting scraped reviews

Switch to the “Storage” tab:

Note the export options

From here, you can export your scraped data in multiple formats—including JSON, CSV, XML, Excel, HTML Table, RSS, and JSONL.

Et voilà! You’ve successfully scraped Google reviews on the Apify platform.

9. Next steps

This tutorial has walked you through the basics of scraping reviews from Google Maps. To take your scraper to the next level, consider implementing these advanced techniques:

  • Automated search navigation: Extend your Python web browser automation logic to visit Google Maps and search for the desired places for you. That reduces manual input and helps mimic real user behavior, lowering the risk of detection or blocking.
  • Load more reviews: Currently, the scraper only captures the initial set of visible reviews. By simulating scroll behavior and waiting for the page to load more content dynamically, you can collect a much larger dataset.
  • Proxy integration: Use proxy servers to rotate IP addresses and avoid rate limits or blocks. You can explore Apify’s built-in proxy tools or integrate your own. Discover more in the official documentation.
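The “load more reviews” idea above can be sketched with Playwright as follows. Note that the div[role="feed"] selector for the scrollable reviews panel is an assumption about Google Maps markup and may need adjusting:

```python
# Sketch: keep scrolling the reviews panel until no new reviews load.
# The 'div[role="feed"]' selector is an assumption and may change.
async def load_all_reviews(page, max_scrolls: int = 20) -> None:
    feed = page.locator('div[role="feed"]')
    previous_count = 0
    for _ in range(max_scrolls):
        # Scroll the panel to its bottom to trigger lazy loading
        await feed.evaluate("el => el.scrollTo(0, el.scrollHeight)")
        await page.wait_for_timeout(1500)  # Give new reviews time to load
        count = await page.locator("div[data-review-id][jsaction]").count()
        if count == previous_count:  # No new reviews appeared, stop
            break
        previous_count = count
```

You would call this right after waiting for the first review to become visible, before iterating over the review elements.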

Use a ready-made Google reviews scraper

Scraping reviews from Google Maps can be more complicated than what we’ve demoed in this article. When retrieving data at scale, you’ll have to deal with anti-scraping measures like rate limiters, IP bans, CAPTCHAs, and more.

The easiest way to overcome these obstacles is by using a pre-built Google reviews scraper that handles everything for you. Some benefits of this approach include:

  • No coding required: Start scraping instantly and with no technical knowledge required.
  • Block bypass: Avoid IP bans and CAPTCHAs automatically.
  • API access: Easily integrate scraped data into your applications.
  • Scalability: Handle large volumes of reviews with no effort.
  • Regular updates: Stay compliant with Google’s latest changes.
  • Reliable data extraction: Minimize errors and inconsistencies.

Apify offers thousands of Actors for various websites, including nearly 250 specifically for Google. If you’re interested in scraping reviews from Google Maps without building a scraper yourself, simply visit Apify Store and search for the “google reviews” keyword:

Selecting the “Google Maps Reviews Scraper” Actor

Select the “Google Maps Reviews Scraper” Actor, then click “Try for free” on its public page:

Clicking the “Try for free” button

The Actor will be added to your personal Apify dashboard. Configure it as needed, then click “Save & Start” to launch the Actor:

Clicking the “Save & Start” button

Wait for the Actor to finish, and enjoy your review data from Yellowstone National Park (the default location, which you can change in the “Input” section of the Apify platform):

The resulting data from Google Maps Scraper on Apify

And that’s it! You’ve successfully scraped review data from Google Maps with just a few clicks.

Conclusion

In this tutorial, you used Playwright with Python to build a Google reviews scraper that automates the extraction of review data from Google Maps. You focused on scraping reviews from a popular location and successfully deployed the scraper to Apify.

This project demonstrated how Apify helps streamline development by enabling scalable, cloud-based scraping with minimal setup. Feel free to explore other Actor templates and SDKs to expand your automation toolkit.

As shown in this blog post, using a ready-made Google Maps Reviews Scraper is the most efficient way to collect review data effectively and at scale.

Frequently asked questions

Can you scrape Google reviews?

Yes, you can scrape Google reviews, but it requires using a browser automation tool like Selenium, Playwright, or Puppeteer. That’s because Google Maps is highly interactive. Also, Google has recently cracked down on traditional scrapers that rely on HTTP clients and HTML parsers.

Is it legal to scrape Google reviews?

Yes, it’s legal to scrape reviews from Google since they’re publicly available data. However, for extra legal protection, you may also want to review and comply with Google’s Terms and Conditions, as scraping can sometimes violate platform policies even when it isn’t against the law. You can learn more in this comprehensive guide to the legality of web scraping.

How to scrape Google reviews?

To scrape Google reviews, use a headless browser controlled via a browser automation API like Playwright or Selenium. These tools let you load the Google Maps page, scroll through reviews, extract elements like usernames and ratings, and export the data to CSV or JSON.
