How can I get the seller's name on a specific website?

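Stack Overflow user

Asked 2022-08-17 09:54:25

Answers 1 · Views 53 · Followers 0 · Votes -1

I'm building a web scraper. Given a specific web page, I'm trying to get the name of the seller, which is shown in the top-right corner (on this olx site, you can see that the seller's name is Ionut). When I run the code below, it should write the name to the index.csv file, but the file is empty. I think the problem is in the HTML parser, although it looks fine to me.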

package main

import (
    "encoding/csv"
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/gocolly/colly"
)

func main() {
    //setting up the file where we store collected data
    fName := filepath.Join("D:\\", "go projects", "cwst go", "CWST-GO", "target folder", "index.csv")
    file, err := os.Create(fName)
    if err != nil {
        log.Fatalf("Could not create file, error :%q", err)
    }
    defer file.Close()
    //writer that writes the collected data into our file
    writer := csv.NewWriter(file)
    //after the file is written, what it is in the buffer goes in writer and then passed to file
    defer writer.Flush()

    //collector
    c := colly.NewCollector(
        colly.AllowedDomains("https://www.olx.ro/"),
    )

    //HTML parser
    c.OnHTML(".css-1fp4ipz", func(e *colly.HTMLElement) { //div class that contains wanted info

        writer.Write([]string{
            e.ChildText("h4"), //specific tag of the info
        })
    })

    fmt.Printf("Scraping page :  ")
    c.Visit("https://www.olx.ro/d/oferta/bmw-xdrixe-seria-7-2020-71000-tva-IDgp7iN.html")

    log.Printf("\n\nScraping Complete\n\n")
    log.Println(c)

}

web

web-scraping

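1 Answer

Stack Overflow user

Accepted answer

Answered 2022-08-17 10:53:53

You don't need to include https:// or the trailing / in the allowed domains. Pass only the hostname: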

c := colly.NewCollector(
    colly.AllowedDomains("www.olx.ro"),
)
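
A likely reason the file stayed empty without any visible error: as far as I know, colly compares the hostname of each requested URL against the AllowedDomains entries, so an entry such as "https://www.olx.ro/" never matches, and Visit returns an error that the question's code discards. The sketch below is not part of the original answer and uses only the standard library; hostFromURL is a hypothetical helper name that shows one way to derive the bare hostname from a full URL before passing it to colly.AllowedDomains.

```go
package main

import (
	"fmt"
	"log"
	"net/url"
)

// hostFromURL is a hypothetical helper (not part of colly). It returns the
// bare hostname of rawURL, without scheme, port, or path.
func hostFromURL(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	host := u.Hostname()
	if host == "" {
		// A string without a scheme, such as "www.olx.ro", is parsed as a
		// path, so there is no host to extract.
		return "", fmt.Errorf("no host found in %q", rawURL)
	}
	return host, nil
}

func main() {
	host, err := hostFromURL("https://www.olx.ro/d/oferta/bmw-xdrixe-seria-7-2020-71000-tva-IDgp7iN.html")
	if err != nil {
		log.Fatal(err)
	}
	// The value printed here is the form to pass to colly.AllowedDomains.
	fmt.Println(host)
}
```

It is also worth checking the error returned by c.Visit in the scraper itself (for example with log.Println), so a rejected domain is reported instead of silently producing an empty file.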

Votes 3

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/73386406
