How to scrape content and email the results to yourself on an automated schedule

Use case

Scrape content with the Dynamic scraper (or any other tool), then email (or post) the results when the task completes.

Fully automate this task so it runs once a day.

Set up the Dynamic scraper

This approach can be applied to any task, because they all support ‘chain task’, i.e. the ‘After done. run’ setting.

We will use the Dynamic scraper as the example and scrape Wikipedia for content.

  1. Scrape the Historical table data.
  2. In the example we use #mw-content-text table as the selector.
  3. Add the item.
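
For readers who want to see what the selector is doing, here is a rough Python sketch of the same idea outside the tool. It is not how the Dynamic scraper works internally. It uses only the standard library, keeps the rows of any table found inside the element whose id is mw-content-text, and runs on a small made-up HTML sample. The names TableTextParser, scrape_tables and SAMPLE_HTML are our own, and the sketch assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collect table rows (as lists of cell text) found inside one container id."""

    # Void elements have no closing tag, so they must not change the depth count.
    VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
            "link", "meta", "source", "track", "wbr"}

    def __init__(self, container_id="mw-content-text"):
        super().__init__()
        self.container_id = container_id
        self.depth = 0        # 0 means we are outside the container
        self.in_cell = False
        self.rows = []        # finished rows
        self._row = None      # row currently being built
        self._cell = []       # text chunks of the current cell

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.container_id:
            self.depth = 1
        if not self.depth:
            return
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self.in_cell = True
            self._cell = []

    def handle_endtag(self, tag):
        if tag in self.VOID or not self.depth:
            return
        if tag in ("td", "th") and self.in_cell:
            # Collapse runs of whitespace inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self.in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        self.depth -= 1

    def handle_data(self, data):
        if self.in_cell:
            self._cell.append(data)


def scrape_tables(html, container_id="mw-content-text"):
    """Return every table row inside the container as a list of cell strings."""
    parser = TableTextParser(container_id)
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample page: the first table sits outside the container and is skipped.
SAMPLE_HTML = """
<div id="other"><table><tr><td>ignored</td></tr></table></div>
<div id="mw-content-text">
  <table>
    <tr><th>Year</th><th>Event</th></tr>
    <tr><td>2001</td><td>Example event</td></tr>
  </table>
</div>
"""

rows = scrape_tables(SAMPLE_HTML)
```

On the sample above, rows holds the header row and one data row. Only the table inside mw-content-text is kept, which is the point of scoping the selector to that container.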

Set up the output

  1. Add the new item to the macro code
  2. Add the current page as the URL target

Create schedule to run task once a day

  1. Find and hover over the schedule textbox
  2. A popup appears. Select every hour and 0

This will run our task automatically every day at hour 0, i.e. midnight (12:00 am).
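
To make the "hour 0" rule concrete, here is a small Python sketch of how the next run time could be worked out. This is only an illustration of the schedule logic, not the tool's own scheduler, and the function name next_daily_run is ours.

```python
from datetime import datetime, timedelta


def next_daily_run(now, hour=0):
    """Return the next time strictly after `now` that falls on `hour`:00."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed (or is exactly now), so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


# A task that finishes at 13:30 is next due at midnight of the following day.
example = next_daily_run(datetime(2024, 1, 15, 13, 30))
```

With hour=0 the task is always due at the coming midnight, which matches the once-a-day schedule set above.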

Click run to verify the scraper is saving content

Set up the Post emailer

We want to email the results to ourselves. (We could also choose to post them to a WordPress blog, etc.)

Create the Post emailer task

  1. Set the article folder to the Dynamic scraper output folder

An easy shortcut to find the output folder of the Dynamic scraper is to use ‘copy path’ from the preview article dropdown menu.

Because the tasks are chained, there is no need to send emails on a schedule.

Clear the default 15-minute email send schedule.

Set the email start to a date in the past. This will make the emails send immediately.

  1. Set the date in the past
  2. Schedule 1 email for each article created
    i.e. 1 article for the Dynamic scraper example
  3. Select the email address to send to
  4. Check that the send list shows our email as send now

If there is no subject, change it to ‘Subject: filename’
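
As a rough picture of what "1 email per article, subject taken from the filename" means, here is a Python sketch that builds (but does not send) one message per file in a folder. It is not the Post emailer's own code. The function name build_emails is ours, and the .txt extension is an assumption, since the actual output format depends on the task settings.

```python
from email.message import EmailMessage
from pathlib import Path


def build_emails(article_folder, sender, recipient):
    """Build one email per article file, using the filename as the subject."""
    messages = []
    # Assumption: articles are saved as .txt files in the output folder.
    for path in sorted(Path(article_folder).glob("*.txt")):
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = recipient
        msg["Subject"] = path.stem          # filename without the extension
        msg.set_content(path.read_text(encoding="utf-8"))
        messages.append(msg)
    return messages
```

Each message could then be handed to smtplib.SMTP.send_message from the standard library. The server details are left out here because they depend on your mail provider.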

The settings for this task are now complete.

Click run to test that the email is being sent

Automate the workflow using chain task

Let's automate the 2 tasks

  1. Dynamic scraper runs once a day at midnight
  2. Chain it to start post emailer
  3. Post emailer runs and sends content to inbox
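
The three steps above boil down to "run one task, and when it is done, start the next". Here is a minimal Python sketch of that chaining idea. The functions run_chain, scrape_task and email_task are placeholders of our own and stand in for the real tasks, which are configured in the tool rather than in code.

```python
def run_chain(tasks):
    """Run (name, function) pairs in order, feeding each result to the next task."""
    log = []
    result = None
    for name, task in tasks:
        result = task(result)   # the next task only starts after this one returns
        log.append(name)
    return result, log


def scrape_task(_previous):
    # Placeholder: pretend the scraper saved one article.
    return ["article-1.txt"]


def email_task(articles):
    # Placeholder: pretend one email is sent per article.
    return [f"sent {name}" for name in articles]


result, log = run_chain([
    ("Dynamic scraper", scrape_task),
    ("Post emailer", email_task),
])
```

Only the first task needs a schedule. The second one is triggered by the first finishing, which is why the Post emailer's own send schedule was cleared earlier.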

Edit the Dynamic scraper task

Find the ‘After done. run’ box.

It is under the schedule text box.

Click get tasks

This loads all tasks into the dropdown

Select the Post emailer task to start from the list

The completed settings should show the Post emailer task selected in the ‘After done. run’ box.

Now click run

Output

Our Dynamic scraper task runs immediately

When it finishes, it starts the Post emailer

The timestamps verify that it is working

Double check the Dynamic scraper and verify it has been scheduled to run again at midnight

Done

These 2 tasks are automated and will run endlessly.
