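Unfortunately, I am not an experienced scraper yet. However, I need to get key statistics for multiple stocks from Yahoo Finance.
I am somewhat familiar with scraping data directly from HTML using read_html(), html_nodes(), and html_text(). However, the MSFT key statistics page is more complicated, and I am not sure whether all the statistics are delivered via XHR, JS, or Doc. My guess is that the data is stored in JSON.
If anyone knows a good way to extract and parse the data from this page with R, please answer my question. Thanks in advance!
Alternatively, if there is a more convenient way to pull these metrics via quantmod or Quandl, please let me know. That would be an excellent solution!
The goal is to have the tickers/symbols as row names/row labels and the statistics as columns. An example of what I need can be found at this Finviz link: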
https://finviz.com/screener.ashx
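The reason I want to scrape Yahoo Finance is that its key statistics also include Enterprise Value and EBITDA figures.
EDIT: I mean the key statistics page, for example: https://finance.yahoo.com/quote/MSFT/key-statistics/ . The code should produce a data frame with stock symbols as rows and key statistics as columns.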
Answered 2018-12-30 19:09:41
Code
library(rvest)
library(tidyverse)
# Define stock name
stock <- "MSFT"
# Extract and transform data
df <- paste0("https://finance.yahoo.com/quote/", stock, "/financials?p=", stock) %>%
  read_html() %>%
  html_table() %>%
  map_df(bind_cols) %>%
  # Transpose
  t() %>%
  as_tibble()
# Set first row as column names (coerced to a plain character vector)
colnames(df) <- as.character(df[1, ])
# Remove first row
df <- df[-1, ]
# Add stock name column
df$Stock_Name <- stock
Result
  Revenue `Total Revenue` `Cost of Revenu… `Gross Profit`
  <chr>   <chr>           <chr>            <chr>
1 6/30/2… 110,360,000     38,353,000       72,007,000
2 6/30/2… 96,571,000      33,850,000       62,721,000
3 6/30/2… 91,154,000      32,780,000       58,374,000
4 6/30/2… 93,580,000      33,038,000       60,542,000
# ... with 25 more variables: ...
Edit:
Or, for convenience, as a function:
get_yahoo <- function(stock) {
  # Extract and transform data
  x <- paste0("https://finance.yahoo.com/quote/", stock, "/financials?p=", stock) %>%
    read_html() %>%
    html_table() %>%
    map_df(bind_cols) %>%
    # Transpose
    t() %>%
    as_tibble()
  # Set first row as column names (coerced to a plain character vector)
  colnames(x) <- as.character(x[1, ])
  # Remove first row
  x <- x[-1, ]
  # Add stock name column
  x$Stock_Name <- stock
  return(x)
}
Usage: get_yahoo(stock)
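The result shown in this answer comes back with every column typed as <chr>, and the amounts carry thousands separators, so they need converting before any arithmetic. A minimal sketch of that step follows; parse_amount is a hypothetical helper name, not part of the original answer, and it assumes the cells look like the ones printed in the result (digits and commas only).

```r
# Hypothetical helper (not from the original answer): turn strings such as
# "110,360,000" into numbers by removing the thousands separators.
parse_amount <- function(x) {
  as.numeric(gsub(",", "", x, fixed = TRUE))
}

# Values copied from the `Total Revenue` column of the result printed above.
total_revenue <- c("110,360,000", "96,571,000", "91,154,000", "93,580,000")
total_revenue_num <- parse_amount(total_revenue)
print(total_revenue_num)
```

Cells that are not plain numbers (a dash for a missing value, for instance) would become NA with a warning, so it may be worth checking the converted columns with anyNA() before using them.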
Answered 2018-12-30 19:55:19
I hope this is what you are looking for:
library(quantmod)
library(plyr)
what_metrics <- yahooQF(c("Price/Sales",
                          "P/E Ratio",
                          "Price/EPS Estimate Next Year",
                          "PEG Ratio",
                          "Dividend Yield",
                          "Market Capitalization"))
Symbols <- c("XOM","MSFT","JNJ","GE","CVX","WFC","PG","JPM","VZ","PFE","T","IBM","MRK","BAC","DIS","ORCL","PM","INTC","SLB")
metrics <- getQuote(paste(Symbols, sep="", collapse=";"), what=what_metrics)
To get the list of available metrics, call:
yahooQF()
Answered 2018-12-30 19:23:52
You can use lapply to get prices for more than one stock.
library(quantmod)
Symbols <- c("XOM","MSFT","JNJ","GE","CVX","WFC","PG","JPM","VZ","PFE","T","IBM","MRK","BAC","DIS","ORCL","PM","INTC","SLB")
StartDate <- as.Date('2015-01-01')
Stocks <- lapply(Symbols, function(sym) {
  Cl(na.omit(getSymbols(sym, from=StartDate, auto.assign=FALSE)))
})
Stocks <- do.call(merge, Stocks)
In this case, I extract the closing price with the function Cl().
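After the merge, each column should be named after its symbol with a ".Close" suffix (for example "XOM.Close"), since that is how getSymbols labels the close column. If plain ticker labels are preferred, the suffix can be stripped. The sketch below only demonstrates the renaming on a stand-in character vector, so it runs without a network connection; close_names and clean_names are illustrative names, not part of the original answer.

```r
# Assumption: column names follow the "<SYMBOL>.Close" pattern.
Symbols <- c("XOM", "MSFT", "JNJ")

# Stand-in for colnames(Stocks), built here so the example runs offline.
close_names <- paste0(Symbols, ".Close")

# Strip the trailing ".Close" to leave just the ticker.
clean_names <- sub("\\.Close$", "", close_names)
print(clean_names)

# With the real merged object, the same idea would be:
# colnames(Stocks) <- sub("\\.Close$", "", colnames(Stocks))
```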
https://stackoverflow.com/questions/53980350