log( "CHILD: url received from parent process", url) Ĭonst browser = await puppeteer. The code snippet below is a simple example of running parallel downloads with Puppeteer.Ĭonst downloadPath = path. Puppeteer is a headless Node library that provides a high level API for controlling Chromium or. 💡 If you are not familiar with how child process work in Node I highly encourage you to give this article a read. We can combine the child process module with our Puppeteer script and download files in parallel. Child process is how Node.js handles parallel programming. We can fork multiple child_proces in Node. Our CPU cores can run multiple processes at the same time. 💡 Learn more about the single threaded architecture of node here Puppeteer is a headless Node library that provides a high level API for controlling Chromium or Chrome over the DevTools protocol. Therefore if we have to download 10 files each 1 gigabyte in size and each requiring about 3 mins to download then with a single process we will have to wait for 10 x 3 = 30 minutes for the task to finish. It can only execute one process at a time. You see Node.js in its core is a single-threaded system.
However, if you have to download multiple large files things start to get complicated. In this next part, we will dive deep into some of the advanced concepts.