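These scripts have little practical value in themselves, but every site uses different anti-scraping measures, so scraping a variety of sites is good practice for working around them.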
from lxml import etree
import requests
from fake_useragent import UserAgent

url = 'https://www.qidian.com/rank/yuepiao?style=1'
# Send a random User-Agent so the request looks like it comes from a browser
headers = {
    'User-Agent': UserAgent().random
}
response = requests.get(url, headers=headers).text
# Build the etree object from the page source
e = etree.HTML(response)
# Parse out the list of book titles
names = e.xpath('//*[@id="rank-view-list"]/div/ul/li/div[2]/h4/a/text()')
# Parse out the list of authors
authors = e.xpath('//*[@id="rank-view-list"]/div/ul/li/div[2]/p[1]/a[1]/text()')
# Pair each title with its author and print them
for name, author in zip(names, authors):
    print(name, ':', author)
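One thing to watch with the script above: titles and authors are collected into two separate lists, so if any entry lacks an author link, `zip` silently pairs the remaining titles with the wrong authors. A minimal sketch of a per-row alternative follows. It makes several assumptions. The `SAMPLE` markup is a made-up, well-formed stand-in rather than the real page, `extract_rows` is a hypothetical helper name, and the standard library's `xml.etree.ElementTree` is swapped in for lxml so the example runs offline. lxml elements offer the same `findall` and `findtext` methods, so the loop should carry over, though it has not been checked against the live page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the ranking markup (NOT the real page).
# The second entry deliberately has no author link.
SAMPLE = """
<html><body>
<div id="rank-view-list"><div><ul>
  <li><div>cover</div><div><h4><a>Book A</a></h4><p><a>Author A</a></p></div></li>
  <li><div>cover</div><div><h4><a>Book B</a></h4><p></p></div></li>
  <li><div>cover</div><div><h4><a>Book C</a></h4><p><a>Author C</a></p></div></li>
</ul></div></div>
</body></html>
"""


def extract_rows(markup):
    """Return (title, author) pairs, reading both fields from the same <li>."""
    root = ET.fromstring(markup.strip())
    rows = []
    # Select each list entry once, then use paths relative to that entry.
    for li in root.findall(".//*[@id='rank-view-list']/div/ul/li"):
        name = li.findtext("div[2]/h4/a", default="").strip()
        author = li.findtext("div[2]/p[1]/a[1]", default="").strip()
        # A missing author becomes None instead of shifting later rows.
        rows.append((name, author or None))
    return rows


rows = extract_rows(SAMPLE)
for name, author in rows:
    print(name, ':', author)
```

Because each title and author come from the same `<li>`, a missing field only affects its own row.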