Releases: webrecorder/browsertrix-crawler
Browsertrix Crawler v1.7.0
What's Changed
- Add option to save local/sessionStorage by @ikreymer in #856
- Support downloading seed file from URL by @tw4l in #852
- Use consistent profile directory name (merge 1.6.4 change) by @ikreymer in #859
- async fetch: allow retrying async fetch if interrupted by @ikreymer in #863
- Support option to fail crawl on content check by @ikreymer in #861
- Fix docs mistaking --waitUntil with --pageLoadTimeout by @tw4l in #864
- deps update: by @ikreymer in #867
- url queueing: log skipped URLs as errors if depth === 0 by @ikreymer in #868
- Add documentation for --failOnContentCheck and update CLI options in docs by @tw4l in #869
- Capitalization fix for log messages by @SuaYoo in #870
- quickfix: WACZ upload retry support: by @ikreymer in #871
- Don't trim to limit if limit is default of 0 by @tw4l in #873
- behavior logging: remove last line dupe check for behavior logs by @ikreymer in #874
- deps: bump base to brave 1.80.125 by @ikreymer in #875
New Contributors
Full Changelog: v1.6.4...v1.7.0
Browsertrix Crawler v1.7.0-beta.1
What's Changed
- Fix docs mistaking --waitUntil with --pageLoadTimeout by @tw4l in #864
- deps update: by @ikreymer in #867
- url queueing: log skipped URLs as errors if depth === 0 by @ikreymer in #868
- Add documentation for --failOnContentCheck and update CLI options in docs by @tw4l in #869
Full Changelog: v1.7.0-beta.0...v1.7.0-beta.1
Browsertrix Crawler v1.7.0-beta.0
What's Changed
- base: bump to brave 1.80.113 by @ikreymer in #857
- Add option to save local/sessionStorage by @ikreymer in #856
- Support downloading seed file from URL by @tw4l in #852
- Use consistent profile directory name (merge 1.6.4 change) by @ikreymer in #859
- async fetch: allow retrying async fetch if interrupted by @ikreymer in #863
- Support option to fail crawl on content check by @ikreymer in #861
Full Changelog: v1.6.3...v1.7.0-beta.0
Browsertrix Crawler v1.6.4
What's Changed
- profiles: use a fixed profile dir instead of creating a new temp dir on each load by @ikreymer in #858
Full Changelog: v1.6.3...v1.6.4
Browsertrix Crawler v1.6.3
What's Changed
- content-type compare for rewriting: use case-insensitive check by @ikreymer in #849
- Disable disk utilization check by default by @tw4l in #850
- cleanup: remove dead pywb code from argparser and docs by @rien333 in #847
- version: bump to 1.6.3 by @ikreymer in #851
New Contributors
Full Changelog: v1.6.2...v1.6.3
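For readers unfamiliar with the case-insensitive content-type check mentioned in #849, the idea can be sketched roughly as follows. This is an illustration only, not the crawler's actual code: the helper name `isContentType` and the choice to strip header parameters are assumptions made for the example.

```typescript
// Hypothetical helper (not taken from the crawler source): compares a
// Content-Type header value against an expected MIME type, ignoring case.
function isContentType(
  header: string | null | undefined,
  expected: string,
): boolean {
  if (!header) {
    return false;
  }
  // Assumption for this sketch: drop parameters such as "; charset=utf-8"
  // before comparing, then lowercase both sides.
  const mime = header.split(";")[0].trim().toLowerCase();
  return mime === expected.trim().toLowerCase();
}
```

With a check like this, a header such as `Text/HTML; charset=UTF-8` is treated the same as `text/html`.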
Browsertrix Crawler v1.6.2
What's Changed
- lang code fixes: by @ikreymer in #834
- Add WARC-Protocol header by @ikreymer in #715
- tmpdir: use os.tmpdir() instead of hardcoded '/tmp' by @ikreymer in #842
- Remove hardcoded /tmp prefix from path by @tw4l in #843
- optimization: normalize dedup status: treat 0 (response code not yet known) or 206 as 200… by @ikreymer in #835
- remove early serialization which may result in missing WARC-Protocol and security metadata by @ikreymer in #844
- deps: bump brave 1.79.118 by @ikreymer in #845
Full Changelog: v1.6.1...v1.6.2
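The dedup status normalization in #835 can be pictured with a small sketch. The release note is truncated, so this covers only the part it states (a status of 0, meaning the response code is not yet known, or 206 is treated as 200). The function name `normalizeDedupStatus` is hypothetical and the crawler's real implementation may differ.

```typescript
// Hypothetical sketch, not the crawler's code: map status codes that
// should deduplicate together onto a single value.
function normalizeDedupStatus(status: number): number {
  // 0 = response code not yet known, 206 = partial content.
  // Per the release note, both are treated as 200 for dedup purposes.
  if (status === 0 || status === 206) {
    return 200;
  }
  // All other status codes are left unchanged in this sketch.
  return status;
}
```

Under this assumption, a partial (206) response and a full (200) response for the same URL would share one dedup status, while other codes such as 404 stay distinct.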
Browsertrix Crawler v1.6.1
What's Changed
- state: add trimqueue() redis command to trim queue / seen list by @ikreymer in #821
- Config Policy Update by @ikreymer in #822
- Deps update 1.6.1 by @ikreymer in #826
- more dependency updates: by @ikreymer in #827
- support pause interrupt: by @ikreymer in #825
Full Changelog: v1.6.0...v1.6.1
Browsertrix Crawler v1.6.0
Browsertrix Crawler v1.6.0-beta.1
What's Changed
Full Changelog: v1.6.0-beta.0...v1.6.0-beta.1
Browsertrix Crawler v1.6.0-beta.0
What's Changed
Full Changelog: v1.5.11...v1.6.0-beta.0