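I used Beautiful Soup and Python requests to pull some data from a secondary URL referenced in a website's main HTML source (I think this is called a dynamic reference); it appears there as a link to a .js file. I got the data (a list of lists) with Beautiful Soup, but it is all one string with a length of 16000+. Each character, comma, and so on is counted as one item. I was later able to get the data I needed with Selenium, but is there still a way to convert the string data I have into a list?
Here is a sample secondary URL of the kind referenced by the main URL/website. Take this page: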
http://www.tennisabstract.com/cgi-bin/player.cgi?p=KeiNishikori
When I look at its HTML source, it references the data in the following file.
<script type="text/javascript" src="http://www.minorleaguesplits.com/tennisabstract/cgi-bin/jsmatches/KeiNishikori.js"></script>
But when I extract my data from there (I need a variable named matchmx), I get something like this:
["20170102","Brisbane","Hard","A","L","5","3","","F","6-2 2-6 6-3","3","Grigor Dimitrov","17","7","","R","25.6344969199","188","BUL","0","108","4","0","69","49","36","9","12","2","5","7","2","77","52","41","12","13","5","7","1","20170107-M-Brisbane-F-Grigor_Dimitrov-Kei_Nishikori.html","","","2017-M020-300","",
"20170102","Brisbane","Hard","A","W","5","3","","QF","6-1 6-1","3","Jordan Thompson","79","","WC","R","22.7049965777","","AUS","0","61","3","0","34","19","18","10","7","0","0","1","2","47","28","15","5","7","3","8","2","2017-M020-295","","3","2",.....
and so on, but everything comes out as individual strings, giving me a length of something like 1000.
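To make the symptom concrete, here is a minimal sketch of what is probably happening, assuming the scraped value is one Python string. The `raw` value is a shortened, hypothetical stand-in for the real data, not the actual file contents.

```python
# Hypothetical, shortened stand-in for the text scraped from the .js file.
raw = '[["20170102","Brisbane","Hard"],["20170102","Brisbane","Hard"]]'

# The value is a str, so len() counts characters rather than rows.
print(type(raw).__name__)
print(len(raw))

# Indexing returns single characters (brackets, quotes, commas), not entries.
print(raw[0], raw[1], raw[2])
```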
Posted on 2018-02-08 16:09:37
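Hi, try the code below. ast.literal_eval parses a string containing a Python literal (here, a list of lists) into the actual object, without executing arbitrary code.
import ast
# A string that looks like a list of lists
p = '[["abcd","abcd"],["abcd","abcd"]]'
print(ast.literal_eval(p))        # [['abcd', 'abcd'], ['abcd', 'abcd']]
print(type(ast.literal_eval(p)))  # <class 'list'>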
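Applied to the question's data, the same idea might look like the sketch below. It assumes the .js file assigns the array with `var matchmx = [[...]];` and that every entry is a quoted string, which is what the sample in the question suggests but is not confirmed. `js_text` is a hypothetical, shortened stand-in for the downloaded file, and `extract_matchmx` is an illustrative helper name.

```python
import ast
import re

# Hypothetical, shortened stand-in for the text of the downloaded .js file.
js_text = '''
var fullname = 'Kei Nishikori';
var matchmx = [["20170102","Brisbane","Hard","A","L","5","3","","F","6-2 2-6 6-3"],
["20170102","Brisbane","Hard","A","W","5","3","","QF","6-1 6-1"]];
var other = 1;
'''

def extract_matchmx(text):
    """Return the matchmx array found in JavaScript source as a list of lists."""
    # Assumed layout: "var matchmx = [[ ... ]];". Capture from '[[' to the ']]'
    # that closes the assignment; DOTALL lets the array span several lines.
    match = re.search(r"var\s+matchmx\s*=\s*(\[\[.*?\]\])\s*;", text, re.DOTALL)
    if match is None:
        raise ValueError("matchmx assignment not found")
    # literal_eval only accepts literals, so it does not run any code.
    return ast.literal_eval(match.group(1))

rows = extract_matchmx(js_text)
print(len(rows))    # number of matches (rows), not characters
print(rows[0][1])   # second field of the first row
```

If the real file contains JavaScript-only values such as `null` or `true`, `ast.literal_eval` will raise an error, and the captured text would need to be parsed another way.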
https://stackoverflow.com/questions/48672245