首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >如何根据r中的条件求差

如何根据r中的条件求差
EN

Stack Overflow用户
提问于 2021-07-29 18:14:28
回答 1查看 119关注 0票数 0

我在r中有下面提到的df:

df<-

代码语言:javascript
复制
ID    Date                 Instance    Price    Volume     Grade
K-1   2021-07-01 04:01:20     1        1000     2          5000
K-1   2021-07-02 22:01:29     2        1000     1.4        4500
K-1   2021-07-04 13:05:04     3        800      1.4        4500
K-2   2021-07-01 09:11:26     1        7000     1.5        3500
K-2   2021-07-01 14:01:40     2        2500     1.5        3000
K-3   2021-07-05 12:23:24     1        1000     2.1        2000
K-4   2021-07-06 14:11:40     1        15000    2.5        1500
K-4   2021-07-08 14:19:25     2        18000    2.7        2500

通过使用上面的dataframe,我需要根据特定的条件导出以下细节。

在哪里,我需要根据Price列,VolumeGrade中的变化是在更高的方面还是在更低的方面来识别。Instance只不过是一个特殊的惟一ID的原始修改。

  • Diff_DateTime是唯一ID的最后一个实例减去该ID的第一个实例之间的日期差异。如果小于24小时,则生成一个值(以小时为单位)或在Days.
  • Rng_Price中为桶0-500的差值范围,500-15001500-30003000+
  • Rng_Vol为桶0-0.5的差值范围,0.5-11-22+
  • Rng_Grd为桶0-5001, 500-1000, 1000-2000and2000+`

的差值范围。

初始和最终是基于实例的第一和最后的价格、数量和等级值。

所需Df<-

代码语言:javascript
复制
ID   Count_Instance    Chng_Pri    Chng_Vol  Chng_Grd     Diff_DateTime  Rng_Pric   Rng_Vol    Rng_Grd      Initial_Pri     Final_Pri      Initial_Vol   Final_Vol      Initial_Grd   Final_Grd
    
K-1    3             Low         Low       Low          3.5 Days       0-500      0.5-1.0     0-500        1000            800            2             1.4            5000          4500
K-2    2             Low         Constant  Low          5h             3000+      0        0-500        7000            2500           1.5           1.5            3500          3000
K-3    1             No Inst     No Inst   No Inst      No Inst        No Inst    No Inst     No Inst      No Inst       No Inst        No Inst       No Inst        No Inst       No Inst 
K-4    2             High        High      High         2 Days         1500-3000  0-0.5      500-1000     15000           18000          2.5           2.7            1500          2500
EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2021-07-29 19:42:11

你可以用

代码语言:javascript
复制
library(tidyverse)

df %>% 
  group_by(ID) %>% 
  arrange(ID, Instance) %>% 
  summarise(
    Count_Instance = n(),
    Diff_Date = if_else(Count_Instance != 1, 
                        difftime(last(Date), first(Date), units=c("hours")),
                        NA_real_),
    Rng_Price = case_when(
      Count_Instance == 1                     ~ NA_character_,
      abs(last(Price) - first(Price)) <= 500  ~ "0-500",
      abs(last(Price) - first(Price)) <= 1500 ~ "500 - 1500",
      abs(last(Price) - first(Price)) <= 3000 ~ "1500 - 3000",
      TRUE                                    ~ "3000+"
      ),
    Rng_Vol = case_when(
      Count_Instance == 1                      ~ NA_character_,
      abs(last(Volume) - first(Volume)) == 0   ~ "0",
      abs(last(Volume) - first(Volume)) <= 0.5 ~ "0 - 0.5",
      abs(last(Volume) - first(Volume)) <= 1   ~ "0.5 - 1",
      abs(last(Volume) - first(Volume)) <= 2   ~ "1 - 2",
      TRUE                                     ~ "2+"
      ),
    Rng_Grd = case_when(
      Count_Instance == 1                     ~ NA_character_,
      abs(last(Grade) - first(Grade)) == 0    ~ "0",
      abs(last(Grade) - first(Grade)) <= 500  ~ "0 - 500",
      abs(last(Grade) - first(Grade)) <= 1000 ~ "500 - 1000",
      abs(last(Grade) - first(Grade)) <= 2000 ~ "1000 - 2000",
      TRUE                                    ~ "2000+"
      ),
    Initital_Pri = if_else(Count_Instance != 1, first(Price), NA_real_),
    Final_Pri    = if_else(Count_Instance != 1, last(Price), NA_real_),
    Initial_Vol  = if_else(Count_Instance != 1, first(Volume), NA_real_),
    Final_Vol    = if_else(Count_Instance != 1, last(Volume), NA_real_),
    Initial_Grd  = if_else(Count_Instance != 1, first(Grade), NA_real_),
    Final_Grd    = if_else(Count_Instance != 1, last(Grade), NA_real_)
    )

回传

代码语言:javascript
复制
# A tibble: 4 x 12
  ID    Count_Instance Diff_Date       Rng_Price   Rng_Vol Rng_Grd   Initital_Pri Final_Pri Initial_Vol Final_Vol
  <chr>          <int> <drtn>          <chr>       <chr>   <chr>            <dbl>     <dbl>       <dbl>     <dbl>
1 K-1                3 81.062222 hours 0-500       0.5 - 1 0 - 500           1000       800         2         1.4
2 K-2                2  4.837222 hours 3000+       0       0 - 500           7000      2500         1.5       1.5
3 K-3                1        NA hours NA          NA      NA                  NA        NA        NA        NA  
4 K-4                2 48.129167 hours 1500 - 3000 0 - 0.5 500 - 10~        15000     18000         2.5       2.7
# ... with 2 more variables: Initial_Grd <dbl>, Final_Grd <dbl>

缺少三件事:

您需要一些函数才能将provided.

  • Instead转换成您想要的格式。由于没有算法来创建实例计数为1的“format”值,所以缺少了character/string.

,所以不能将No Inst放入具有实数的列中,除非您想将这些数字转换为character/string.。

数据

代码语言:javascript
复制
df <- structure(list(ID = c("K-1", "K-1", "K-1", "K-2", "K-2", "K-3", 
"K-4", "K-4"), Date = structure(c(1625104880, 1625256089, 1625396704, 
1625123486, 1625140900, 1625480604, 1625573500, 1625746765), class = c("POSIXct", 
"POSIXt"), tzone = ""), Instance = c(1, 2, 3, 1, 2, 1, 1, 2), 
    Price = c(1000, 1000, 800, 7000, 2500, 1000, 15000, 18000
    ), Volume = c(2, 1.4, 1.4, 1.5, 1.5, 2.1, 2.5, 2.7), Grade = c(5000, 
    4500, 4500, 3500, 3000, 2000, 1500, 2500)), row.names = c(NA, 
-8L), class = c("tbl_df", "tbl", "data.frame"))
票数 4
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/68581067

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档