Web Scraping News Sites to a database with Python, MySQL and Scrapy

Published: 01 January 2021
on channel: Python 360

Starting with a working Scrapy spider that yields to a CSV, I modify the code, add pipelines.py, configure a MySQL database, and troubleshoot user permissions and Scrapy files until we have a working spider sending data to MySQL.

The MySQL server is "remote" so I highlight the difference between:

GRANT ALL PRIVILEGES ON newz.* TO 'user1'@'localhost';
and
GRANT ALL PRIVILEGES ON newz.* TO 'user1'@'%';
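The two statements differ only in the host part of the account name. Because the server is remote, the spider connects from another machine, so an account limited to 'localhost' will not match that connection, while '%' matches any host. As a rough illustration only, here is a simplified Python model of that matching. The function name `host_pattern_matches` is made up for this sketch, and it ignores netmask notation and hostname resolution, which the real server also takes into account.

```python
import re


def host_pattern_matches(pattern: str, client_host: str) -> bool:
    """Simplified model of matching the host part of a MySQL account name.

    '%' stands for any run of characters and '_' for exactly one character,
    as in SQL LIKE patterns. Everything else must match literally.
    """
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, client_host, flags=re.IGNORECASE) is not None


# The spider runs on a different machine from the database server.
print(host_pattern_matches("localhost", "localhost"))     # local connection
print(host_pattern_matches("localhost", "192.168.1.50"))  # remote client, no match
print(host_pattern_matches("%", "192.168.1.50"))          # remote client, matches
```

A narrower pattern such as '192.168.1.%' would allow only clients on that subnet, which is usually a safer choice than '%' outside a test setup.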

There is still more to do on this, but if you can follow/use my code then you will be able to add more spiders easily, as they will all use the same items.py file and the same columns in the database.

pipelines.py will also be set up and ready, so the only adjustment to make will be to stop it deleting the table each time.

(# Comment out line 39 in pipelines.py)
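For reference, here is a minimal sketch of a pipeline with the table reset behind a flag, so it can be switched off instead of commented out. This is not the code from the repository: the table name `articles`, the `title` and `url` columns, the `VARCHAR(2048)` size and the `drop_existing` flag are all assumptions made for illustration.

```python
# Hypothetical table name; the real one is defined in the linked repository.
TABLE = "articles"


def setup_statements(drop_existing: bool) -> list[str]:
    """Return the SQL to run when the spider opens."""
    statements = []
    if drop_existing:
        # This is the step to switch off once several spiders share the table.
        statements.append(f"DROP TABLE IF EXISTS {TABLE}")
    statements.append(
        f"CREATE TABLE IF NOT EXISTS {TABLE} "
        "(id INT AUTO_INCREMENT PRIMARY KEY, title TEXT, url VARCHAR(2048))"
    )
    return statements


class MySQLPipeline:
    """Scrapy-style item pipeline that writes each item to one shared table."""

    drop_existing = False  # set to True to start from an empty table on every run

    def __init__(self, connection):
        # Any DB-API connection that accepts %s placeholders,
        # for example one returned by mysql.connector.connect(...).
        self.connection = connection

    def open_spider(self, spider):
        cursor = self.connection.cursor()
        for statement in setup_statements(self.drop_existing):
            cursor.execute(statement)
        self.connection.commit()

    def process_item(self, item, spider):
        cursor = self.connection.cursor()
        cursor.execute(
            f"INSERT INTO {TABLE} (title, url) VALUES (%s, %s)",
            (item.get("title"), item.get("url")),
        )
        self.connection.commit()
        return item  # pass the item on to any later pipeline

    def close_spider(self, spider):
        self.connection.close()
```

Scrapy creates pipeline objects itself, so in a real project the connection would normally be opened inside open_spider (or passed in through a from_crawler class method) rather than handed to the constructor as it is here.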

All of the code is at: 🌏 https://github.com/RGGH/Scrapy14

I am purposefully documenting this project in depth, as hopefully you can use it for reference when making your own Scrapy/MySQL spiders.

Chapters ##

0:00 Intro - me talking
3:12 Showing GitHub - Scrapy14
4:29 "ModuleNotFoundError" - a solution
5:42 Python script to format headers into a headers dictionary
5:58 Running Scrapy Spider as a Script
11:11 Empty database!
17:19 Comparing against a previous successful MySQL project
20:22 Pipelines now working
24:16 Allow larger VARCHAR for URL
26:00 Working!
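The header-formatting script from the 5:42 chapter is not reproduced in this description. A minimal sketch of that kind of helper, assuming the headers were copied from the browser's developer tools as "Name: value" lines, could look like this (the function name `headers_to_dict` and the sample values are made up):

```python
def headers_to_dict(raw: str) -> dict[str, str]:
    """Turn pasted 'Name: value' lines into a headers dictionary."""
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blanks, HTTP/2 pseudo-headers such as ':authority', and stray text.
        if not line or line.startswith(":") or ":" not in line:
            continue
        # Split on the first colon only, so values containing ':' stay intact.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers


raw_headers = """
User-Agent: Mozilla/5.0
Accept: text/html
Accept-Language: en-GB,en;q=0.9
"""

print(headers_to_dict(raw_headers))
```

The resulting dictionary can then be passed to a request through Scrapy's `headers` argument.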

How to Web Scrape Amazon using Python, Scrapy and MySQL | View output in phpmyadmin
==========================================================================
🌏    • How to Web Scrape Amazon using Python...  

Visit redandgreen blog for more Tutorials
=========================================
🌏 http://redandgreen.co.uk/about/blog/

Subscribe to the YouTube Channel
=================================
🌏    / drpicode  

Follow on Twitter - to get notified of new videos
=================================================
🌏   / rngweb  

Buy Dr Pi a coffee (or Tea)
☕ https://www.buymeacoffee.com/DrPi

Thumbs up yeah? (cos Algos..)

#webscraping #MySQL #python

