Site scraper

A scraper that loads target.url in a headless browser and compares the text of the element matched by target.selector to the value specified by target.positive_text.
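As a rough illustration of that check, here is a minimal sketch using pyppeteer (the library named under Running). The function names fetch_element_text and is_positive are hypothetical, and the whitespace-trimmed exact match is an assumption; the actual comparison in scraper.py may differ.

```python
from typing import Optional


def is_positive(element_text: str, positive_text: str) -> bool:
    """Return True when the element text equals the expected text.

    Assumption: surrounding whitespace is ignored and the match is exact.
    """
    return element_text.strip() == positive_text.strip()


async def fetch_element_text(url: str, selector: str,
                             browser_path: Optional[str] = None) -> str:
    """Load url in headless Chromium and return the selected element's text."""
    # Imported lazily so is_positive can be used without pyppeteer installed.
    from pyppeteer import launch

    options = {"headless": True}
    if browser_path:
        # Only pass executablePath when a browser was configured; otherwise
        # pyppeteer falls back to its own downloaded Chromium.
        options["executablePath"] = browser_path

    browser = await launch(**options)
    try:
        page = await browser.newPage()
        await page.goto(url)
        # Evaluate a small function in the page against the first match.
        return await page.querySelectorEval(selector, "(el) => el.textContent")
    finally:
        await browser.close()
```

A caller could run the coroutine with asyncio.run(fetch_element_text(url, selector)) and pass the result to is_positive together with target.positive_text.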

When a positive outcome is observed for a target, an email is sent to target.subscribers. Subsequent runs send an email only if the result differs from that of the previous run.
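The notification rule can be sketched as a small pure function. This is one reading of the rule, not the project's actual code: should_notify is a hypothetical name, and treating "no previous run" as None is an assumption.

```python
from typing import Optional


def should_notify(previous: Optional[bool], current: bool) -> bool:
    """Decide whether this run should send an email.

    previous is the outcome of the prior run, or None if there was none
    (an assumed convention). current is the outcome of this run.
    """
    if previous is None:
        # First run: only a positive outcome triggers an email.
        return current
    # Later runs: email only when the outcome changed since the last run.
    return current != previous
```

Under this reading, a change from positive back to negative also sends an email, since the result differs from the run before it.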

Requirements

Python 3.9 with pip
A SendGrid account
Chromium (technically only the static libraries it depends on)

Running

Copy config.json.example to config.json and populate its values. The browser value is optional; if it is not specified, pyppeteer will download a browser.

Copy SiteScraper.service.example to SiteScraper.service, modify it as needed, and hardlink it into your systemd unit directory.

Copy SiteScraper.timer.example to SiteScraper.timer, modify it as needed, and hardlink it into your systemd unit directory.

Run pip3 install -r requirements.txt to install the dependencies.

Run systemctl enable --now SiteScraper.timer
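Before enabling the timer, it can help to sanity-check config.json. The sketch below checks that each target carries the four fields mentioned above (url, selector, positive_text, subscribers). The top-level "targets" list is an assumption, as are the helper names; consult config.json.example for the real layout.

```python
import json

# Field names taken from the description above.
REQUIRED_TARGET_KEYS = ("url", "selector", "positive_text", "subscribers")


def missing_target_keys(target: dict) -> list:
    """Return the required keys that are absent from one target entry."""
    return [key for key in REQUIRED_TARGET_KEYS if key not in target]


def load_targets(path: str = "config.json") -> list:
    """Load the config file and return its targets, raising on missing keys."""
    with open(path, encoding="utf-8") as fh:
        config = json.load(fh)
    # Assumption: targets live in a top-level "targets" list.
    targets = config.get("targets", [])
    for index, target in enumerate(targets):
        missing = missing_target_keys(target)
        if missing:
            raise ValueError(f"target {index} is missing keys: {missing}")
    return targets
```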

abjugard/SiteScraper
