How to parse an HTML page with Node.js?

Sometimes, we want to parse an HTML page with Node.js.

In this article, we’ll look at how to parse an HTML page with Node.js.

How to parse an HTML page with Node.js?

To parse an HTML page with Node.js, we can use the Cheerio library.

To install it, we run

npm i cheerio

Then we write

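const cheerio = require('cheerio');
// Parse the HTML string; $ works like the jQuery function.
const $ = cheerio.load('<h2 class="title">Hello world</h2>');

// Replace the heading text, then add a class to every h2.
$('h2.title').text('Hello there!');
$('h2').addClass('welcome');

// Serialize the modified document back to an HTML string.
console.log($.html());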

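to call cheerio.load, which parses the HTML string and returns the $ function we use to query the document.

Then we call $ with a CSS selector string to select the elements we want to manipulate.

We call text to set the text of the h2 element with the class title.

And we call addClass to add the welcome class to all h2 elements.

Finally, we call $.html to get the HTML of the modified document as a string. Because cheerio.load parses its input as a full document by default, the result is wrapped in html, head, and body tags.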

To get the HTML string from a web page, we can make a request with any HTTP client and pass the response body to cheerio.load.

Conclusion

To parse an HTML page with Node.js, we can use the Cheerio library.