If you are web scraping with Scrapy, you may want to scrape many categories without simply following every link with a crawler.
If you can find a sitemap in JSON format, you can flatten its nested lists and dictionaries into a new list, then use that list to build your URLs or the query string parameters for the URLs you want to scrape.
Sound like hard work? Not really: it takes 8 lines of code inside a function and off you go. Just print the type() of each value regularly to check what type you are iterating through.
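To give a rough idea of the approach, here is a minimal sketch, not the exact code from the video. The sitemap shape, the "children" key, the example.com base URL and the "category" query parameter are all made up for illustration; only the 'code' and 'name' keys come from the video's topic. A real site's JSON will look different, so check the types as you go.

```python
from urllib.parse import urlencode

# Hypothetical sitemap: a real site's JSON structure will differ.
sitemap = {
    "categories": [
        {"code": "100", "name": "Garden", "children": [
            {"code": "110", "name": "Tools", "children": []},
            {"code": "120", "name": "Plants"},
        ]},
        {"code": "200", "name": "Kitchen", "children": []},
    ]
}


def flatten(node, found=None):
    """Walk nested dicts and lists, collecting (code, name) pairs."""
    if found is None:
        found = []
    # Uncomment while exploring to see what you are iterating through:
    # print(type(node))
    if isinstance(node, dict):
        if "code" in node and "name" in node:
            found.append((node["code"], node["name"]))
        for value in node.values():
            flatten(value, found)
    elif isinstance(node, list):
        for item in node:
            flatten(item, found)
    return found


pairs = flatten(sitemap)

# Assumed URL pattern: swap in the real base URL and parameter name.
urls = [
    "https://example.com/search?" + urlencode({"category": code})
    for code, _name in pairs
]
```

The resulting urls list could then be used as the start URLs for a Scrapy spider. Recursion is used here because the nesting depth of a sitemap is usually not known in advance.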
Timings:
0:00 - Intro: about sitemaps
4:05 - Start the code
19:00 - Using slice to get 'code' and 'name'
Any questions? Add a comment and I'll be pleased to reply!
Dr Pi.
#webscraping #json #sitemap
Video: How to use Python to parse JSON sitemaps | Flatten nested dictionaries to get codes for WEB SCRAPING. Uploaded to the channel Python 360 on 12 October 2020.