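I'm fairly new to Go and am trying to scrape a few web pages using colly. Two of the pages have incomplete links; the code and output are below.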
    func PaloNet() {
        c := colly.NewCollector(
            colly.AllowedDomains("security.paloaltonetworks.com"),
        )
        c.OnHTML(".list", func(e *colly.HTMLElement) {
            PaloNetlinks := e.ChildAttrs("a", "href")
            fmt.Println("\n\n PaloAlto Security: \n\n", PaloNetlinks)
        })
        c.Visit("https://security.paloaltonetworks.com/")
    }

Output:
/CVE-2022-0031 /CVE-2022-003/CVE-2022-0006 /CVE-2022-0030 /CVE-2022-0029 /PAN-SA-2022-0005 /CVE-2022-28199 /PAN-SA-2022-0004 /CVE-2022-0028 /PAN-2022-0003 /CVE-2022-0024 /CVE-2022-0026 /CVE-2022-0025 /CVE-2022-0027 /PAN-2022-0001/PAN-2022 0002/CVE-2022-2022/CVE 2022-2022/CVE-2022-0022 /CVE-2022-0022 /CVE-2021-44142 /CVE-2022-0016 /CVE-2022-0017 /CVE-2022-0020 /CVE-2022-0011 /csv?
As you can see, the links are missing the "https://security.paloaltonetworks.com/" part. What is the best way to add the beginning of the link?
Posted on 2022-11-22 11:14:43
You can do it like this:
    func PaloNet() {
        // Base URL without a trailing slash, because each href already starts with "/".
        visitUrl := "https://security.paloaltonetworks.com"
        urls := []string{}
        c := colly.NewCollector(
            colly.AllowedDomains("security.paloaltonetworks.com"),
        )
        c.OnHTML(".list", func(e *colly.HTMLElement) {
            PaloNetlinks := e.ChildAttrs("a", "href")
            // Prepend the base URL to every relative link.
            for _, link := range PaloNetlinks {
                urls = append(urls, visitUrl+link)
            }
            fmt.Println("\n\n PaloAlto Security: \n\n", urls)
        })
        c.Visit(visitUrl)
    }

https://stackoverflow.com/questions/74531470
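Plain string concatenation works here because every href on that page starts with "/". If a page mixes relative and absolute links, resolving each href against the base URL is a bit safer. Below is a minimal sketch of that idea using only the standard library's net/url package; the helper name resolveLinks is hypothetical and not part of colly or of the answer above. Inside a colly callback, e.Request.AbsoluteURL(href) should give a similar result without a helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveLinks (hypothetical helper) turns each href into an absolute URL
// by resolving it against base. Hrefs that are already absolute are
// returned unchanged; hrefs that cannot be parsed are skipped.
func resolveLinks(base string, hrefs []string) ([]string, error) {
	baseURL, err := url.Parse(base)
	if err != nil {
		return nil, err
	}
	abs := make([]string, 0, len(hrefs))
	for _, h := range hrefs {
		ref, err := url.Parse(h)
		if err != nil {
			continue // skip malformed hrefs
		}
		abs = append(abs, baseURL.ResolveReference(ref).String())
	}
	return abs, nil
}

func main() {
	// Sample hrefs taken from the output shown in the question.
	links, err := resolveLinks(
		"https://security.paloaltonetworks.com/",
		[]string{"/CVE-2022-0031", "/PAN-SA-2022-0005"},
	)
	if err != nil {
		fmt.Println("bad base URL:", err)
		return
	}
	for _, l := range links {
		fmt.Println(l)
	}
}
```

In the PaloNet function, the slice returned by e.ChildAttrs("a", "href") could be passed to such a helper in place of the concatenation loop.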