Scrape to single CSV instead of separate TXTs?

The static scraper only allows scraping website content to individual TXT files or one large TXT file. However, to further process the scraped data, I want it in a CSV/Google Sheet with one row for each URL, like column A = URL and column B = scraped content. How can this be achieved?

The static scraper doesn’t have CSV or Google spreadsheet support right now.

You could do it using the Dynamic scraper instead.

eg:

Selectors are:

Project sample

title body csv.zip (1.7 KB)


I did not think of that - thank you!

Let me know how it goes.

Also let me know whether the Google export works fine as well.

Initial feedback:

  1. I use innerText with the body selector because “detect article” skips too much of the content I need. Unfortunately, the result is a broken CSV. I guess this is because the quotes, commas, and semicolons inside the scraped content are treated as delimiters, which results in new rows. I’m currently testing this with Google Sheets output, but seeing that it first writes everything into a CSV as well instead of an XLSX or something, I assume I’ll run into the same issue. Edit: yes, same issue. While the Google export itself works fine, I end up with way more rows than input URLs because, depending on the scraped content, it creates new rows.

  2. I can’t figure out how to have the input URL in column A of the output file and the scraped content for that URL in column B. I need this so I can later merge the scraped data into existing sheets, matching on the URLs already in those sheets so that the scraped content ends up in the correct rows.
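To illustrate what a well-formed file for both points would look like, here is a minimal Python sketch. It is not part of the scraper and says nothing about how the scraper writes its files; it only assumes you already have (URL, content) pairs from somewhere. The function names and sample URLs are made up for illustration. The idea is that a CSV writer which quotes every field keeps quotes, commas, semicolons, and line breaks inside one cell, so each URL stays on exactly one record with the URL in column A and the content in column B.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize (url, content) pairs to CSV text, one record per URL.

    Hypothetical helper: every field is quoted, so delimiters and
    line breaks inside the content cannot start a new record.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    writer.writerow(["url", "content"])  # column A, column B
    for url, content in rows:
        writer.writerow([url, content])
    return buffer.getvalue()


def csv_to_rows(text):
    """Parse the CSV text back into (url, content) pairs, skipping the header."""
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader)  # header row
    return [tuple(row) for row in reader]


# Made-up sample data containing the characters that broke the export.
sample = [
    ("https://example.com/a", 'He said "hi", then left; see below.\nSecond line.'),
    ("https://example.com/b", "Plain text"),
]

csv_text = rows_to_csv(sample)
parsed = csv_to_rows(csv_text)

# One record per input URL, content unchanged after the round trip.
assert parsed == sample
```

A file written this way should import into Google Sheets with one row per URL, since quoted line breaks stay inside the cell. The URL column can then serve as the key for matching against existing sheets.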

Can you export the project task for me?

I will have a closer look into it.


I’ve sent the export via DM.

P.S. Point 2 might be something you already have an answer to.