Extracting the content of a web page tag with Python
<cite class="CitationContent" id="CR1">
Anderson, C. (2008). The end of theory: The data deluge makes the scientific method obsolete.
<em class="EmphasisTypeItalic">Wired,</em>
<em class="EmphasisTypeItalic">16</em>, 07.
</cite>
How can I extract only this part: Anderson, C. (2008). The end of theory: The data deluge makes the scientific method obsolete.
2 answers
from bs4 import BeautifulSoup
html = """
<cite class="CitationContent" id="CR1">
Anderson, C. (2008). The end of theory: The data deluge makes the scientific method obsolete.
<em class="EmphasisTypeItalic">Wired,</em>
<em class="EmphasisTypeItalic">16</em>, 07.
</cite>
"""
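# 'html5lib' requires the html5lib package; the built-in 'html.parser' also works here
soup = BeautifulSoup(html, 'html5lib')
# get_text() returns all text inside <cite>, including the text of the nested <em> tags
print(soup.find('cite').get_text())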
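Note that get_text() also returns the text of the nested <em> tags ("Wired," and "16"), so it gives back more than the question asks for. A minimal sketch of one way to keep only the text that comes before the first nested tag is shown here. The helper name leading_text is hypothetical, and the built-in 'html.parser' is used instead of 'html5lib' only so the example needs no extra package.

```python
from bs4 import BeautifulSoup, NavigableString

html = """
<cite class="CitationContent" id="CR1">
Anderson, C. (2008). The end of theory: The data deluge makes the scientific method obsolete.
<em class="EmphasisTypeItalic">Wired,</em>
<em class="EmphasisTypeItalic">16</em>, 07.
</cite>
"""


def leading_text(tag):
    """Return the stripped text that precedes the first child element of tag.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    parts = []
    for child in tag.children:
        if not isinstance(child, NavigableString):
            break  # stop at the first nested tag, e.g. <em>
        parts.append(str(child))
    return "".join(parts).strip()


soup = BeautifulSoup(html, "html.parser")
cite = soup.find("cite", id="CR1")
first_text = leading_text(cite)
print(first_text)
```

This relies on the wanted text being the first thing inside <cite>. If the citation text could appear after a nested tag, the loop would need a different stopping rule.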