I am developing a small web scraping application in Go using colly, a web scraping framework built in Go.
This is the site's HTML:
<div clas="cc">
<div class="list">
<span class="countrybg" style="background-image: url(countryimage);"></span>
<span class="continet">Asia</span>
<span class="country">india</span>
</div>
<div class="list">
<span class="countrybg" style="background-image: url(countryimage);"></span>
<span class="continet">Africa</span>
<span class="country">Brazil</span>
</div>
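</div>
Now I want to get all three span elements one by one and append them to an array.
I tried the code below, but it does not work: it returns AsiaAfrica.
I want the values separately, and I also want the image URL from the countrybg class.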
c := make([]string, 10)
element.ForEach(".list span", func(_ int, elem *colly.HTMLElement) {
	result := element.ChildText("span:nth-child(2)")
	c = append(c, result)
})
The expected output should look like this:
countrybg = ['image1url' ,'image2url']
continet = ['Asia' ,'Africa']
country = ['india' ,'Brazil']
Can anyone help me get this?
Answered 2021-10-23 05:58:06
I ran a local server on port 8081 and tried to fetch the values you are looking for. There are many ways to do what you need; this is just one of them:
package main

import (
	"fmt"
	"regexp"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector()

	countrybgs := []string{}
	continents := []string{}
	countries := []string{}

	// Captures the URL inside "background-image: url(...);".
	r := regexp.MustCompile(`background-image: url\((.*)\);`)

	/*
		<div clas="cc">
		<div class="list">
		<span class="countrybg" style="background-image: url(image1url);"></span>
		<span class="continet">Asia</span>
		<span class="country">india</span>
		</div>
		<div class="list">
		<span class="countrybg" style="background-image: url(image2url);"></span>
		<span class="continet">Africa</span>
		<span class="country">Brazil</span>
		</div>
		</div>
	*/

	// Visit every span and sort its value into a slice based on its class.
	c.OnHTML("span", func(e *colly.HTMLElement) {
		switch class := e.Attr("class"); class {
		case "countrybg":
			// Skip spans whose style does not match instead of panicking on [1].
			if m := r.FindStringSubmatch(e.Attr("style")); m != nil {
				countrybgs = append(countrybgs, m[1])
			}
		case "continet":
			continents = append(continents, e.Text)
		case "country":
			countries = append(countries, e.Text)
		}
	})

	// Visit returns an error if the request fails; report it rather than ignore it.
	if err := c.Visit("http://localhost:8081"); err != nil {
		fmt.Println("visit failed:", err)
		return
	}

	fmt.Println(countrybgs)
	fmt.Println(continents)
	fmt.Println(countries)
}
Output:
> go run .
[image1url image2url]
[Asia Africa]
[india Brazil]
https://stackoverflow.com/questions/69647647
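The pattern in the answer expects the style to read exactly `background-image: url(...);`. On real pages the URL is often quoted or padded with spaces, so a more tolerant extraction may be useful. The following is only a sketch using the standard library; `extractBackgroundURL` and `bgURLPattern` are hypothetical names, not part of colly, and the sample styles are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// bgURLPattern captures the argument of url(...), tolerating optional
// whitespace and optional single or double quotes around the URL.
var bgURLPattern = regexp.MustCompile(`url\(\s*['"]?([^'")]*?)['"]?\s*\)`)

// extractBackgroundURL is a hypothetical helper: it returns the URL found in
// a style attribute, and false when the style contains no url(...) value.
func extractBackgroundURL(style string) (string, bool) {
	m := bgURLPattern.FindStringSubmatch(style)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	// Made-up style values covering unquoted, quoted, and missing URLs.
	styles := []string{
		`background-image: url(image1url);`,
		`background-image: url("https://example.com/a.png");`,
		`color: red;`,
	}
	for _, s := range styles {
		if u, ok := extractBackgroundURL(s); ok {
			fmt.Println("found:", u)
		} else {
			fmt.Println("no url in:", s)
		}
	}
}
```

Returning a boolean alongside the URL lets the caller decide what to do with spans that carry no background image, instead of indexing into a nil match.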