Tags: TheJokersThief/daft-scraper
Tags
Fix three issues I faced when using the scraper (#35)

* Update USER_AGENT for scraper

  To mitigate the denied error.

* Update images in ListingMedia(Schema)

  This change mitigates the following issue:

```python
Traceback (most recent call last):
  File "/Users/myuser/Documents/daft-main/main.py", line 49, in <module>
    for listing in listings:
                   ^^^^^^^^
  File "/Users/myuser/Documents/daft-main/.venv/lib/python3.12/site-packages/daft_scraper/search/init.py", line 57, in search
    yield from self._get_listings(listing_data)
  File "/Users/myuser/Documents/daft-main/.venv/lib/python3.12/site-packages/daft_scraper/search/init.py", line 92, in _get_listings
    Listing(ListingSchema().load(listing['listing'])),
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myuser/Documents/daft-main/.venv/lib/python3.12/site-packages/marshmallow/schema.py", line 792, in load
    return self._do_load(
           ^^^^^^^^^^^^^^
  File "/Users/myuser/Documents/daft-main/.venv/lib/python3.12/site-packages/marshmallow/schema.py", line 999, in _do_load
    raise exc
marshmallow.exceptions.ValidationError: {'media': {'images': {0: defaultdict(<class 'dict'>, {'imageLabels': {'value': ['Not a valid string.']}}), 2: defaultdict(<class 'dict'>, {'imageLabels': {'value': ['Not a valid string.']}}), 4: defaultdict(<class 'dict'>, {'imageLabels': {'value': ['Not a valid string.']}})}}}
```

* Update Page Logic

  `from = <value>` no longer works, so the search now uses `page = <value>`.

Tested in my local environment. Before this update, the scraper would keep loading the same page 50 times.
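The page-logic change in the commit above can be illustrated with a small, self-contained sketch. It is not the library's actual code: `iter_listings`, `fetch_page`, and `fake_backend` are hypothetical names, and the 1-based `page` numbering and page size of 3 are assumptions made for the demo. The sketch requests pages by a `page` parameter and stops early if the server returns the same page twice, which is the symptom the commit describes.

```python
from typing import Callable, Iterator, List, Optional

# Hypothetical stand-in for one search request: takes query parameters,
# returns the listing IDs found on that page.
FetchPage = Callable[[dict], List[int]]


def iter_listings(fetch_page: FetchPage, max_pages: int = 50) -> Iterator[int]:
    """Yield listings page by page using a `page` query parameter.

    Stops on an empty page, or when a page is identical to the previous
    one (a sign that the server ignored the paging parameter).
    """
    previous: Optional[List[int]] = None
    for page in range(1, max_pages + 1):  # assumption: pages are 1-based
        listings = fetch_page({"page": page})
        if not listings or listings == previous:
            break
        yield from listings
        previous = listings


def fake_backend(params: dict) -> List[int]:
    """Toy server with 7 listings that honours only `page` (3 per page)."""
    data = list(range(1, 8))
    page_size = 3
    start = (params.get("page", 1) - 1) * page_size
    return data[start:start + page_size]


if __name__ == "__main__":
    print(list(iter_listings(fake_backend)))
```

The duplicate-page check is a defensive choice for this sketch: if a paging parameter silently stops working again, the loop ends after one repeated page instead of re-reading it up to `max_pages` times.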
Bump urllib3 from 1.26.18 to 1.26.19 (#29)

Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.18 to 1.26.19.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/1.26.19/CHANGES.rst)
- [Commits](urllib3/urllib3@1.26.18...1.26.19)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>