Using jQuery to Parse HTML and Extract Data

Your web page may sometimes need to use information from other web pages that do not provide an API. For instance, you may need to fetch stock price information from a web page in real time and display it in a widget of your web page. However, some of the stock price aggregation websites don’t provide APIs.

In such cases, you need to retrieve the source HTML of the web page and manually find the information you need. This process of retrieving and manually parsing HTML to find specific information is known as web scraping.

In this tutorial, you’ll learn how to scrape a web page using jQuery, a fast and versatile library for parsing and manipulating HTML. Although jQuery is traditionally used to interact with HTML and CSS from client-side JavaScript, its DOM traversal and manipulation capabilities, combined with its AJAX support, make it a solid choice for web scraping.


What Is Client-Side Scraping?

Client-side scraping involves fetching a web page’s HTML source from its URL and parsing that HTML to extract specific information.
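As a rough illustration of that fetch-then-parse flow, here is a minimal sketch. It assumes jQuery 3.x is loaded on the page and passed in as `$`, and that the target server allows cross-origin requests (CORS) from your page; otherwise the browser blocks the response. The names `scrapeText`, `url`, and `selector` are hypothetical and chosen only for this example.

```javascript
// Sketch only: `scrapeText` is a hypothetical helper, not part of jQuery.
// Assumes `$` is the jQuery function and the target URL permits CORS.
function scrapeText($, url, selector) {
  // $.ajax() returns a promise-like jqXHR; dataType "html" asks for the
  // response body as a plain HTML string.
  return $.ajax({ url: url, dataType: "html" }).then(function (html) {
    // $.parseHTML() converts the string into detached DOM nodes.
    // By default it does not keep <script> elements.
    var nodes = $.parseHTML(html);

    // Append the nodes to a detached container so that .find() can also
    // match elements that were at the top level of the fetched markup.
    return $("<div>")
      .append(nodes)
      .find(selector)
      .map(function (index, element) {
        return $(element).text().trim();
      })
      .get(); // convert the jQuery collection into a plain array of strings
  });
}
```

On a page where jQuery is loaded, a call might look like `scrapeText(jQuery, "https://example.com/tutorial", "pre code").then(function (blocks) { console.log(blocks); });`, where the URL is a placeholder. Passing `$` as a parameter is a design choice made here to keep the helper free of globals; using the global `$` directly would work equally well.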

For example, you might want to build a code search engine. A website such as Stack Overflow provides an API to access its questions and answers programmatically. However, other tutorial websites, such as this one from Draft.dev, have code blocks but do not provide an API for retrieving them. To read their code blocks, you have to use client-side scraping, as explained in this tutorial.

Implementing Client-Side Scraping Using jQuery

This tutorial shows you how to scrape a web page using jQuery. jQuery is a fast and powerful JavaScript library that supports traversing HTML documents, manipulating element attributes, and handling element events. It uses CSS selectors to select elements.
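To make the selector-based approach concrete, the following sketch selects elements with a CSS selector and reads their text and attributes. It assumes jQuery is loaded and passed in as `$`; `listLinks` and `root` are hypothetical names used only for illustration.

```javascript
// Sketch only: `listLinks` is a hypothetical helper, not part of jQuery.
// `root` can be anything jQuery accepts, such as a DOM element or a selector.
function listLinks($, root) {
  return $(root)
    .find("a[href]") // CSS attribute selector: anchors that have an href
    .map(function (index, element) {
      var $link = $(element);
      return {
        text: $link.text().trim(), // visible link text
        href: $link.attr("href")   // attribute value as written in the markup
      };
    })
    .get(); // plain array of { text, href } objects
}
```

For instance, `listLinks(jQuery, document.body)` would return one object per link on the current page.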

Prerequisites

Start by adding a reference to the jQuery library using the