
Q: How do I remove the space inside a number I get from web scraping? Error: could not convert string to float: '1\xa0364'

Stack Overflow user

Asked 2022-04-01 22:18:46

3 answers, 326 views, 0 followers, score -1

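In my code I try to get price data from a website. The site puts a space inside the price, and float() raises an error: could not convert string to float: '1\xa0364'. The code is supposed to extract the price from the page, but the space inside the price causes the error. I'm not sure whether the rest of the code works, because execution never gets past this point.

The actual price is 1364, but the scraped text is '1\xa0364'.

See the code: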

import requests
from bs4 import BeautifulSoup

URL = 'https://www.reebok.se/zig-kinetica-ii-edge-gore-tex/H05172.html'
headers = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0) Gecko/20100101 Firefox/98.0'}

def check_price():
    page = requests.get(URL, headers=headers)
    soup = BeautifulSoup(page.content, 'html.parser')

    title = soup.find(class_='gl-heading gl-heading--regular gl-heading--italic name___1EbZs').get_text()
    print(title)
    price = soup.find(class_='gl-price-item gl-price-item--sale notranslate').get_text()
    # This line raises ValueError: price[0:5] is '1\xa0364', and float()
    # does not accept the non-breaking space (U+00A0) between the digits.
    converted_price = float(price[0:5])

python

python-3.x

web-scraping

precision

3 Answers

Stack Overflow user

Accepted answer

Answered 2022-04-01 23:36:30

If you just want to remove the whitespace, you can do it in either of the following ways:

split + join

>>> ''.join("1\xa0364".split())
'1364'

regex replace

>>> import re
>>> re.sub(r"\s", "", "1\xa0364")
'1364'
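The split + join approach above can be wrapped in a small helper that also does the float conversion. This is only a minimal sketch: the name parse_price is made up for illustration, and it assumes the price has no decimal comma or currency symbol. It relies on the fact that float() ignores leading and trailing whitespace but not whitespace between digits.

```python
def parse_price(text: str) -> float:
    """Drop every whitespace character, then convert the rest to float.

    parse_price is a hypothetical helper name, not part of any library.
    """
    # str.split() with no arguments splits on any Unicode whitespace,
    # which includes the non-breaking space '\xa0' (U+00A0).
    digits = "".join(text.split())
    return float(digits)


print(parse_price("1\xa0364"))  # 1364.0
```

In the question's code this would replace the float(price[0:5]) call, for example converted_price = parse_price(price[0:5]).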

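You may also find this answer useful: Python Remove Comma In Dollar Amount. It essentially extracts the digits and the decimal point from a string and ignores everything else. It can sometimes give false positives: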

>>> other_option("Error: 404 file not found.  Try again in 10 seconds")
404.10
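The code of the linked answer is not reproduced on this page, so the following is only a hypothetical stand-in for the idea it describes (keep digits and the decimal point, discard the rest). The function name keep_digits_and_dot is an assumption, not the linked answer's code. It shows where the false positive comes from: unrelated digits and periods in the text get glued together.

```python
def keep_digits_and_dot(text: str) -> str:
    """Keep only ASCII digits and '.', dropping every other character.

    Hypothetical illustration; not the implementation from the linked answer.
    """
    return "".join(ch for ch in text if ch in "0123456789.")


# Works for the scraped price:
print(keep_digits_and_dot("1\xa0364"))  # 1364

# But a sentence with unrelated numbers and a period also yields a "number":
print(keep_digits_and_dot("Error: 404 file not found.  Try again in 10 seconds"))  # 404.10
```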

Score: 1

Stack Overflow user

Answered 2022-04-01 22:27:40

You can use replace for this kind of thing. Your code should look like this:

price_str = "1\xa0364"
if len(price_str) >= 4 : # removing white space just for values with 4 or more chars
    price = float(price_str.replace(u'\xa0', u''))
else:
    price = float(price_str)
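The length check above is not strictly required, because str.replace returns the string unchanged when the character is absent. A shorter variant, as a sketch only (the helper name to_float is made up for illustration):

```python
def to_float(price_str: str) -> float:
    # replace() is a no-op for strings without '\xa0', so short prices
    # such as "364" pass through untouched.
    return float(price_str.replace("\xa0", ""))


print(to_float("1\xa0364"))  # 1364.0
print(to_float("364"))       # 364.0
```

Unlike the split + join approach, this removes only the non-breaking space, so any other separator (an ordinary space, for instance) would still make float() fail.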

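Score: 1

Stack Overflow user

Answered 2022-04-01 23:49:03

You can also use a regex to extract the price from the page's script tag, where it is already formatted for a simple float conversion.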

import requests, re

URL = 'https://www.reebok.se/zig-kinetica-ii-edge-gore-tex/H05172.html'
HEADERS ={"user-Agent":'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0) Gecko/20100101 Firefox/98.0'}

def check_price():
    page = requests.get(URL , headers=HEADERS)  
    name, price = [re.search(f'(?<!Brand",)"{i}":"?(.*?)[",]', page.text).group(1) for i in ['name', 'price']]
    print(f'{name}: {float(price)}')
    
check_price()
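To see what the regex does without requesting the page, it can be run against a small inline string. The SAMPLE text below is invented for illustration and is only an assumption about the general shape of the embedded product data; the real page may differ. The helper extract_field is also hypothetical, and it returns None instead of raising when nothing matches.

```python
import re

# Invented sample; NOT copied from the real page.
SAMPLE = (
    '{"@type":"Product","name":"Zig Kinetica II Edge GORE-TEX",'
    '"brand":{"@type":"Brand","name":"Reebok"},'
    '"offers":{"price":1364,"priceCurrency":"SEK"}}'
)


def extract_field(text, key):
    """Return the first value for `key`, skipping a "name" that follows Brand",."""
    # Same pattern as in the answer: optional opening quote, then a lazy
    # capture up to the next double quote or comma.
    pattern = rf'(?<!Brand",)"{re.escape(key)}":"?(.*?)[",]'
    match = re.search(pattern, text)
    return match.group(1) if match else None


name = extract_field(SAMPLE, "name")
price = extract_field(SAMPLE, "price")
print(f"{name}: {float(price)}")  # Zig Kinetica II Edge GORE-TEX: 1364.0
```

Because the capture stops at the first comma or double quote, a product name that itself contains a comma would be cut short.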

Score: 1

Page content originally provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/71713275
