bugsfamily: Python Crawler with Beautiful Soup + import re (Regular Expressions)

Thursday, March 14, 2019

Python Crawler with Beautiful Soup + import re (Regular Expressions)

"""
Crawler
@author: Dazhuang
"""
import requests
from bs4 import BeautifulSoup
import re  # regular expressions

s = 0
r = requests.get('https://book.douban.com/')
soup = BeautifulSoup(r.text, 'lxml')
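pattern = soup.find_all('a')  # previously the tag here was 'p', 'comment-content'
for item in pattern:
    print(item.string)  # None when the tag contains more than one child node
# NOTE: the regular-expression literal is truncated in the source and is left as-is.
# It needs one capture group that yields the numeric star rating.
pattern_s = re.compile('...')
p = re.findall(pattern_s, r.text)
for star in p:
    s += int(star)  # add each captured rating to the running total
print(s)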
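
The regular expression in the script above did not survive, so the summing step cannot be run as published. The following is a minimal offline sketch of that step only, under stated assumptions: the sample markup, the `user-stars allstarNN rating` class names, and the names `SAMPLE_HTML`, `STAR_PATTERN` and `sum_stars` are all hypothetical and are not taken from the original post. It uses only the standard `re` module, so it runs without a network request.

```python
import re

# Hypothetical sample markup. The class names are assumptions made for
# illustration; they are not recovered from the original post.
SAMPLE_HTML = """
<span class="user-stars allstar50 rating" title="excellent"></span>
<span class="user-stars allstar40 rating" title="good"></span>
<span class="user-stars allstar30 rating" title="fair"></span>
"""

# A single capture group grabs the digits that encode the rating, so
# findall() returns just those digit strings.
STAR_PATTERN = re.compile(r'<span class="user-stars allstar(\d+) rating"')


def sum_stars(html: str) -> int:
    """Return the sum of every star value captured from the markup."""
    return sum(int(star) for star in STAR_PATTERN.findall(html))


total = sum_stars(SAMPLE_HTML)
print(total)
```

Because the pattern has exactly one capture group, `findall` returns a list of strings rather than tuples, which is why each element can be passed straight to `int`. Using `\d+` instead of `.*?` keeps `int` from failing on a non-numeric match.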

