urlwatch 2
... monitors webpages for you
urlwatch is intended to help you watch webpages for changes and get notified (via e-mail, in your terminal, or through various third-party services) when they change. The change notification includes the URL that has changed and a unified diff of what has changed.
Quick Start
1. Run `urlwatch` once to migrate your old data or start fresh
2. Use `urlwatch --edit` to customize your job list (this will create/edit `urls.yaml`)
3. Use `urlwatch --edit-config` if you want to set up e-mail sending
4. Add `urlwatch` to your crontab (`crontab -e`) to monitor webpages periodically
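The job list edited via `urlwatch --edit` is a YAML file. A minimal sketch of a single job entry might look like this (the name and URL are placeholders, and the available keys depend on your urlwatch version):

```yaml
# One job per YAML document; separate multiple jobs with "---"
name: "Example page"           # hypothetical job name
url: "https://example.com/"    # hypothetical URL to watch
filter:
  - html2text                  # convert HTML to plain text before diffing
```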
The checking interval is defined by how often you run `urlwatch`. You can use e.g. crontab.guru to figure out the schedule expression for the checking interval; we recommend running it no more often than every 30 minutes (that would be `*/30 * * * *`). If you have never used cron before, check out the crontab command help.
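Putting the steps together, the crontab entry for the 30-minute interval suggested above could look like this (adjust the command if `urlwatch` is not on cron's default `PATH`):

```
*/30 * * * * urlwatch
```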
On Windows, `cron` is not installed by default. Use the Windows Task Scheduler instead, or see this StackOverflow question for alternatives.
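As a sketch of the Task Scheduler route, a comparable 30-minute task can be created from the Windows command line with `schtasks` (the task name here is arbitrary, and you may need the full path to the `urlwatch` executable):

```shell
schtasks /Create /SC MINUTE /MO 30 /TN "urlwatch" /TR "urlwatch"
```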
The Handbook
- Introduction
- Dependencies
- Jobs
- Filters
  - Built-in filters
  - Picking out elements from a webpage
  - Chaining multiple filters
  - Extracting only the `<body>` tag of a page
  - Filtering based on an XPath expression
  - Filtering based on CSS selectors
  - Using XPath and CSS filters with XML and exclusions
  - Filtering PDF documents
  - Sorting of webpage content
  - Reversing of lines or separated items
  - Watching Github releases
  - Remove or replace text using regular expressions
  - Using a shell script as a filter
- Reporters
- Advanced Topics
  - Adding URLs from the command line
  - Using word-based differences
  - Ignoring connection errors
  - Overriding the content encoding
  - Changing the default timeout
  - Supplying cookie data
  - Comparing with several latest snapshots
  - Receiving a report every time urlwatch runs
  - Using Redis as a cache backend
  - Watching changes on .onion (Tor) pages
  - Watching Facebook Page Events
- Migration from 1.x