Below is the code I want to run. It comes from a Coursera course. I cannot get the sqldf function to run in the following code:
data = read.delim(file = 'purchases.txt', header = FALSE, sep = '\t', dec = '.')
str(data)
colnames(data) = c('customer_id', 'purchase_amount', 'date_of_purchase')
data$date_of_purchase = as.Date(data$date_of_purchase, "%Y-%m-%d")
data$days_since = as.numeric(difftime(time1 = "2016-01-01",
time2 = data$date_of_purchase,
units = "days"))
head(data)
summary(data)
library(sqldf)
customers = sqldf("SELECT customer_id ,
MIN(days_since) AS 'recency',
COUNT(*) AS 'frequency',
AVG(purchase_amount) AS 'amount'
FROM data GROUP BY 1")

Posted on 2018-07-17 18:29:36
The sqldf package must be installed in R before library() can load it and make the sqldf() function available.
To install sqldf in R, use the install.packages() function.
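Because install.packages() only needs to run once, a script can check for the package before installing it. The following is a minimal sketch of that idea, not part of the original answer; the helper name `ensure_package` is hypothetical and is demonstrated on the `stats` package, which ships with R, so that no download is triggered.

```r
# Hypothetical helper (not from the original answer): install a package
# only when it is not already available, then report whether it can be loaded.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" is bundled with R, so this call never triggers an installation.
ensure_package("stats")
```

With such a helper, `if (ensure_package("sqldf")) library(sqldf)` would install sqldf on the first run and simply load it on later runs.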
Here is a fully reproducible version of the OP's code, including the install.packages() call that installs sqldf:
textFile <- "
001,42.5,2017-01-01
001,38.7,2017-05-02
002,47.9,2017-06-05"
# commented out original data read section
# data = read.delim(file = 'purchases.txt', header = FALSE, sep = '\t', dec = '.')
# str(data)
# replace with inline data and read.csv()
data <- read.csv(text=textFile,header=FALSE,stringsAsFactors=FALSE)
colnames(data) = c('customer_id', 'purchase_amount', 'date_of_purchase')
data$date_of_purchase = as.Date(data$date_of_purchase, "%Y-%m-%d")
data$days_since = as.numeric(difftime(time1 = "2016-01-01",
time2 = data$date_of_purchase,
units = "days"))
head(data)
summary(data)
# only need to run install.packages() once
install.packages("sqldf")
library(sqldf)
customers = sqldf("SELECT customer_id ,
MIN(days_since) AS 'recency',
COUNT(*) AS 'frequency',
AVG(purchase_amount) AS 'amount'
FROM data GROUP BY 1")
customers

...and the output:
> customers
customer_id recency frequency amount
1 1 -486.7917 2 40.6
2 2 -520.7917 1 47.9
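The recency values are negative because the sample purchase dates (2017) fall after the reference date 2016-01-01. The fractional part (.7917) is most likely a time zone effect: difftime() converts the character string "2016-01-01" to a date-time in the session's local time zone, while the Date column is treated as midnight UTC. That explanation is an assumption about the session that produced this output. A minimal sketch of one way to get whole days, by subtracting Date objects directly (the variable names here are illustrative):

```r
# Reference date from the question and the three sample purchase dates.
reference_date <- as.Date("2016-01-01")
purchase_dates <- as.Date(c("2017-01-01", "2017-05-02", "2017-06-05"))

# Subtracting Date objects yields a difftime in days with no time-of-day part.
days_since <- as.numeric(reference_date - purchase_dates)
days_since  # -366 -487 -521
```

In the original script this would correspond to `data$days_since = as.numeric(as.Date("2016-01-01") - data$date_of_purchase)`.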
https://stackoverflow.com/questions/51378771