Individual URL scraping

I am new to SCM.
Is there a way to scrape content within individual URLs?

  1. get the URL of each job page on the job list page
  2. extract the content in each URL obtained in 1.

I was able to do 1, but I don’t know how to do 2.
I would also like to know if it is possible to execute JavaScript when doing 2.
Specifically, for step 2, I want to retrieve any text on the page that contains the @ symbol. If there is a way to do this without using JavaScript, I would like to know that as well.

Just paste the URLs of those pages into the Google Maps scraper.


Does that mean it is impossible to do it all at once?

Yep, just paste in as many URLs as you need.

Understood.
It would be helpful to be able to get a list and scrape individual pages, like a loop function.

Not sure what you mean.

You want something to generate list of pages to scrape?

Sorry for the lack of clarity.
What I want in the end is to retrieve the list and the content of each item in the list in a single run.
For example, scraping the content of the Amazon product list page and the individual pages for each of those products at once.
By looping, I mean going from the list page to the individual page, then back to the list page and then to the individual page of the next item.
Is this clear to you?

Gotcha,

Unfortunately, it will have to be a two-step process, split across two tasks.

  1. Gather all individual pages
  2. Process each page individually

The dynamic scraper doesn’t have any flow control or looping
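If you need the loop today, the two steps above can be sketched outside SCM in plain Python. This is only a rough illustration using the standard library, not anything SCM does internally. The function names (`extract_links`, `extract_at_text`, `scrape`) are made up for the example, and it assumes the job links are ordinary `<a href>` tags in the static HTML. It never executes JavaScript, so text that only appears after scripts run will not be found.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


class TextCollector(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def extract_links(list_html, base_url):
    """Step 1: return absolute URLs for every link on the list page."""
    parser = LinkCollector(base_url)
    parser.feed(list_html)
    return parser.links


def extract_at_text(page_html):
    """Step 2: return whitespace-delimited tokens that contain '@'."""
    parser = TextCollector()
    parser.feed(page_html)
    return re.findall(r"\S*@\S*", " ".join(parser.chunks))


def fetch(url):
    """Download a page as text (static HTML only, no JavaScript)."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def scrape(list_url):
    """Visit the list page, then each linked page, in one loop."""
    return {url: extract_at_text(fetch(url))
            for url in extract_links(fetch(list_url), list_url)}
```

Calling `scrape("https://example.com/jobs")` (a placeholder URL) would return a dictionary mapping each linked page to the @-containing tokens found on it. In practice you would probably filter the links down to the job pages before fetching them.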

The proposal would be to allow this via webhooks

e.g.:

  1. A task runs and scrapes a list of URLs
  2. The task's output is set to a webhook, which posts the JSON data
  3. The webhook calls the SCM API to duplicate or edit an existing task, updating only the list of target URLs from that output
  4. The webhook runs the new task
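To make the flow above more concrete, here is a rough sketch of what the receiving end of such a webhook could look like. Everything SCM-specific in it is an assumption: the `{"urls": [...]}` payload shape is invented for the example, and `start_followup_task` is a hypothetical placeholder for steps 3 and 4, since the SCM API calls for them are not documented here.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def urls_from_payload(body):
    """Parse the posted JSON and return a de-duplicated list of URLs.

    The {"urls": [...]} shape is an assumption, not SCM's real output format.
    """
    data = json.loads(body)
    urls = []
    for url in data.get("urls", []):
        is_http = isinstance(url, str) and url.startswith(("http://", "https://"))
        if is_http and url not in urls:
            urls.append(url)
    return urls


def start_followup_task(urls):
    """Hypothetical placeholder for steps 3 and 4.

    A real version would call the SCM API to update a task's target URLs
    and run it; here it only reports what it would do.
    """
    print(f"would start a follow-up task for {len(urls)} URLs")


class WebhookHandler(BaseHTTPRequestHandler):
    """Accepts the JSON posted in step 2 and hands the URLs on."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            urls = urls_from_payload(self.rfile.read(length))
        except (ValueError, AttributeError):
            # Malformed JSON, or JSON that is not an object.
            self.send_response(400)
            self.end_headers()
            return
        start_followup_task(urls)
        self.send_response(204)
        self.end_headers()


def serve(host="127.0.0.1", port=8000):
    """Run the receiver until interrupted."""
    HTTPServer((host, port), WebhookHandler).serve_forever()
```

Running `serve()` starts the receiver on a local port. The first task's webhook output would then point at that address.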

I mention webhooks because, once the Google Sheets integration is finished, I want SCM to start integrating with other external tools via webhooks,
e.g. IFTTT's Webhooks service (triggers, queries, and actions).

Of course, webhooks are complicated and a lot like programming, so I will need to find ways to make them as painless as possible.


I see.
It wouldn’t take that much work to get them separately, so I’ll work on them separately now.
Thanks for the support!

By the way, can I use webhooks already?
Is there a documentation page?

Not ready yet!

But it will be here:

Feature is here:
