Below is the input data for a function in the Scala code I am writing:
List(
  (1,SubscriptionState(CNN,ONLINE,Seq(12))),
  (1,SubscriptionState(SKY,ONLINE,Seq(12))),
  (1,SubscriptionState(FOX,ONLINE,Seq(12))),
  (2,SubscriptionState(CNN,ONLINE,Seq(12))),
  (2,SubscriptionState(SKY,ONLINE,Seq(12))),
  (2,SubscriptionState(FOX,ONLINE,Seq(12))),
  (2,SubscriptionState(CNN,OFFLINE,Seq(13))),
  (2,SubscriptionState(SKY,ONLINE,Seq(13))),
  (2,SubscriptionState(FOX,ONLINE,Seq(13))),
  (3,SubscriptionState(CNN,OFFLINE,Seq(13))),
  (3,SubscriptionState(SKY,ONLINE,Seq(13))),
  (3,SubscriptionState(FOX,ONLINE,Seq(13)))
)
SubscriptionState here is just a case class:
case class SubscriptionState(channel: Channel, state: ChannelState, subIds: Seq[Long])
I want to transform it into:
Map(
  1 -> Map(
    SubscriptionState(SKY,ONLINE,Seq(12)) -> 1,
    SubscriptionState(CNN,ONLINE,Seq(12)) -> 1,
    SubscriptionState(FOX,ONLINE,Seq(12)) -> 1),
  2 -> Map(
    SubscriptionState(SKY,ONLINE,Seq(12,13)) -> 2,
    SubscriptionState(CNN,ONLINE,Seq(12)) -> 1,
    SubscriptionState(FOX,ONLINE,Seq(12,13)) -> 2,
    SubscriptionState(CNN,OFFLINE,Seq(13)) -> 1),
  3 -> Map(
    SubscriptionState(SKY,ONLINE,Seq(13)) -> 1,
    SubscriptionState(FOX,ONLINE,Seq(13)) -> 1,
    SubscriptionState(CNN,OFFLINE,Seq(13)) -> 1)
)
How can I do this in Scala?
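
To try the input above in a REPL or a Scala script, the types it relies on have to exist. The question does not show Channel or ChannelState, so the sealed traits and case objects below are hypothetical stand-ins, and the name list for the input value is likewise an assumption. A minimal sketch (Scala 2.13 or later):

```scala
// Hypothetical stand-ins: the question does not define Channel or ChannelState.
sealed trait Channel
case object CNN extends Channel
case object SKY extends Channel
case object FOX extends Channel

sealed trait ChannelState
case object ONLINE extends ChannelState
case object OFFLINE extends ChannelState

// The case class as given in the question.
case class SubscriptionState(channel: Channel, state: ChannelState, subIds: Seq[Long])

// The input from the question as (id, state) pairs; the name `list` is assumed.
val list: List[(Int, SubscriptionState)] = List(
  (1, SubscriptionState(CNN, ONLINE, Seq(12L))),
  (1, SubscriptionState(SKY, ONLINE, Seq(12L))),
  (1, SubscriptionState(FOX, ONLINE, Seq(12L))),
  (2, SubscriptionState(CNN, ONLINE, Seq(12L))),
  (2, SubscriptionState(SKY, ONLINE, Seq(12L))),
  (2, SubscriptionState(FOX, ONLINE, Seq(12L))),
  (2, SubscriptionState(CNN, OFFLINE, Seq(13L))),
  (2, SubscriptionState(SKY, ONLINE, Seq(13L))),
  (2, SubscriptionState(FOX, ONLINE, Seq(13L))),
  (3, SubscriptionState(CNN, OFFLINE, Seq(13L))),
  (3, SubscriptionState(SKY, ONLINE, Seq(13L))),
  (3, SubscriptionState(FOX, ONLINE, Seq(13L)))
)
```

Because SubscriptionState is a case class, two instances with the same channel, state and subIds are equal, which is what lets it serve as a Map key in the desired output.
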

Posted 2022-11-16 23:24:30
This is how I would approach the problem. It may not be a perfect solution, but it produces the result you expect.
val result: Map[Int, Map[SubscriptionState, Int]] = list
  .groupBy(_._1)
  .view
  .mapValues { statesById =>
    statesById
      .groupBy { case (_, subscriptionState) => (subscriptionState.channel, subscriptionState.state) }
      .map { case (_, groupedStatesById) =>
        val subscriptionState = groupedStatesById.head._2 // groupedStatesById should contain at least one element
        val allSubIds = groupedStatesById.flatMap(_._2.subIds)
        val updatedSubscriptionState = subscriptionState.copy(subIds = allSubIds)
        updatedSubscriptionState -> allSubIds.size
      }
  }.toMap

Posted 2022-11-17 08:33:17
Here is a "simple" solution using groupMap and groupMapReduce.
list
  .groupMap(_._1)(_._2)
  .view
  .mapValues {
    _.groupMapReduce(ss => (ss.channel, ss.state))(_.subIds)(_ ++ _)
      .map { case (k, v) => SubscriptionState(k._1, k._2, v) -> v.length }
  }
  .toMap
groupMap turns the data into a Map[Int, List[SubscriptionState]], and mapValues turns each List into the appropriate Map. (The view and toMap wrappers make mapValues more efficient and safer.)
groupMapReduce turns a List[SubscriptionState] into a Map[(Channel, ChannelState), Seq[Long]], where each value holds the merged subIds.
The map call on that inner Map then reshapes the entries as needed to produce a Map[SubscriptionState, Int].
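
The three steps described above can be traced one at a time on a small sample. This is only an illustrative sketch: the Channel and ChannelState definitions are hypothetical stand-ins (the question does not show them), and sample, byId, merged and counted are names invented here. It assumes Scala 2.13 or later, where groupMap and groupMapReduce are available.

```scala
// Hypothetical stand-ins for types the question does not define.
sealed trait Channel
case object CNN extends Channel
case object SKY extends Channel
sealed trait ChannelState
case object ONLINE extends ChannelState
case object OFFLINE extends ChannelState
case class SubscriptionState(channel: Channel, state: ChannelState, subIds: Seq[Long])

// A subset of the question's rows for id 2.
val sample: List[(Int, SubscriptionState)] = List(
  (2, SubscriptionState(CNN, ONLINE, Seq(12L))),
  (2, SubscriptionState(SKY, ONLINE, Seq(12L))),
  (2, SubscriptionState(CNN, OFFLINE, Seq(13L))),
  (2, SubscriptionState(SKY, ONLINE, Seq(13L)))
)

// Step 1: group by id, keeping only the state from each pair.
val byId: Map[Int, List[SubscriptionState]] = sample.groupMap(_._1)(_._2)

// Step 2: within one id, concatenate the subIds of entries sharing (channel, state).
val merged: Map[(Channel, ChannelState), Seq[Long]] =
  byId(2).groupMapReduce(ss => (ss.channel, ss.state))(_.subIds)(_ ++ _)

// Step 3: rebuild a SubscriptionState key from each entry and attach the number of subIds.
val counted: Map[SubscriptionState, Int] =
  merged.map { case ((channel, state), ids) => SubscriptionState(channel, state, ids) -> ids.length }
```

Here the two SKY/ONLINE rows collapse into a single entry whose subIds hold both 12 and 13, while CNN stays split because its state differs between the rows.
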
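I am not sure what the purpose of the inner Map is. Each value is just the length of the subIds field, so it can be read directly from the key without any lookup in the Map.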
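
One possible reading of that remark, sketched under assumptions: if the count is always subIds.length, the inner Map could be replaced by a plain list of merged states. The helper name mergeById is hypothetical, as are the Channel and ChannelState stand-ins, and this changes the result type from what the question asked for.

```scala
// Hypothetical stand-ins for types the question does not define.
sealed trait Channel
case object CNN extends Channel
case object SKY extends Channel
sealed trait ChannelState
case object ONLINE extends ChannelState
case object OFFLINE extends ChannelState
case class SubscriptionState(channel: Channel, state: ChannelState, subIds: Seq[Long])

// Hypothetical helper: merge states per id, without storing a separate count.
def mergeById(list: List[(Int, SubscriptionState)]): Map[Int, List[SubscriptionState]] =
  list
    .groupMap(_._1)(_._2)
    .view
    .mapValues { states =>
      states
        .groupMapReduce(ss => (ss.channel, ss.state))(_.subIds)(_ ++ _)
        .map { case ((channel, state), ids) => SubscriptionState(channel, state, ids) }
        .toList
    }
    .toMap

val mergedStates = mergeById(List(
  (2, SubscriptionState(SKY, ONLINE, Seq(12L))),
  (2, SubscriptionState(CNN, OFFLINE, Seq(13L))),
  (2, SubscriptionState(SKY, ONLINE, Seq(13L)))
))

// The count is recovered from the merged state itself.
val skyCount: Option[Int] = mergedStates(2).find(_.channel == SKY).map(_.subIds.length)
```

Whether this shape is preferable depends on how the result is consumed; if callers really need a Map keyed by SubscriptionState, the answer's original version remains the one to use.
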

Posted 2022-11-17 12:42:00
An attempt using foldLeft:
list.foldLeft(Map.empty[Int, Map[SubscriptionState, Int]]) { (acc, next) =>
  val subMap = acc.getOrElse(next._1, Map.empty[SubscriptionState, Int])
  val channelSub = subMap.find { case (sub, _) => sub.channel == next._2.channel && sub.state == next._2.state }
  acc + (next._1 -> channelSub.fold(subMap + (next._2 -> next._2.subIds.length)) { case (sub, _) =>
    val subIds = sub.subIds ++ next._2.subIds
    (subMap - sub) + (sub.copy(subIds = subIds) -> subIds.length)
  })
}
I noticed that the count is never used during the fold and can be computed from subIds. Also, because subIds can differ between entries, the inner Map is fairly useless: you have to use find rather than get to retrieve a value from it. So, if you control your ADT, you could use an intermediate ADT, for example:
case class SubscriptionStateWithoutIds(channel: Channel, state: ChannelState)
You can then rewrite your foldLeft as follows:
list.foldLeft(Map.empty[Int, Map[SubscriptionStateWithoutIds, Seq[Long]]]) { (acc, next) =>
  val subMap = acc.getOrElse(next._1, Map.empty[SubscriptionStateWithoutIds, Seq[Long]])
  val withoutId = SubscriptionStateWithoutIds(next._2.channel, next._2.state)
  val channelSub = subMap.get(withoutId)
  acc + (next._1 -> (subMap + channelSub.fold(withoutId -> next._2.subIds) { seq => withoutId -> (seq ++ next._2.subIds) }))
}
The biggest advantage of the intermediate ADT is that it allows a cleaner groupMapReduce version:
list.groupMap(_._1)(sub => SubscriptionStateWithoutIds(sub._2.channel, sub._2.state) -> sub._2.subIds)
  .map { case (key, value) => key -> value.groupMapReduce(_._1)(_._2)(_ ++ _) }

https://stackoverflow.com/questions/74467744