How to get Min, Max and Length between dates for each year?
I have an RDD of type RDD[String]; as an example, here is a part of it:
1990,1990-07-08
1994,1994-06-18
1994,1994-06-18
1994,1994-06-22
1994,1994-06-22
1994,1994-06-26
1994,1994-06-26
1954,1954-06-20
2002,2002-06-26
1954,1954-06-23
2002,2002-06-29
1954,1954-06-16
2002,2002-06-30
...
Result:
(1982,52)
(2006,64)
(1962,32)
(1966,32)
(1986,52)
(2002,64)
(1994,52)
(1974,38)
(1990,52)
(2010,64)
(1978,38)
(1954,26)
(2014,64)
(1958,35)
(1998,64)
(1970,32)
I group it nicely, but my problem is the v.size part: I do not know how to calculate that length.
Just to put it in perspective, here are the expected results:
It is not a mistake that 2002 appears twice, but ignore that.
Define the date format:
import java.time.LocalDate, java.time.format.DateTimeFormatter
val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd")
And an ordering:
implicit val localDateOrdering: Ordering[LocalDate] = Ordering.by(_.toEpochDay)
Create a function that receives "v" and returns MAX(date_of_matching_year) - MIN(date_of_matching_year) = LENGTH (in days):
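def f(v: Iterable[Array[String]]): Int = {
  // Parse the date column (index 1) of each row.
  val parsedDates = v.map(row => LocalDate.parse(row(1), formatter))
  // All dates in a group share one year, so the day-of-year difference is the length in days.
  parsedDates.max.getDayOfYear - parsedDates.min.getDayOfYear
}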
Then replace v.size with f(v).
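To see the pieces working together, here is a minimal sketch that runs without Spark. It assumes the rows are split on the comma and grouped by the year column, and it uses a plain Scala `Seq` in place of the RDD (the collection `groupBy` stands in for the RDD grouping, which the question does not show). The names `YearSpans`, `lines`, `span` and `result` are hypothetical, and `span` is a variant of `f` that also returns the min and max dates asked for in the title.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

object YearSpans {
  val formatter: DateTimeFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd")
  implicit val localDateOrdering: Ordering[LocalDate] = Ordering.by(_.toEpochDay)

  // A few lines copied from the excerpt in the question (not the full data set).
  val lines: Seq[String] = Seq(
    "1994,1994-06-18", "1994,1994-06-22", "1994,1994-06-26",
    "1954,1954-06-20", "1954,1954-06-23", "1954,1954-06-16"
  )

  // Returns (min date, max date, length in days) for the rows of one year.
  def span(v: Iterable[Array[String]]): (LocalDate, LocalDate, Int) = {
    val dates = v.map(row => LocalDate.parse(row(1), formatter))
    val (lo, hi) = (dates.min, dates.max)
    // Same-year dates, so the day-of-year difference is the span in days.
    (lo, hi, hi.getDayOfYear - lo.getDayOfYear)
  }

  // Split each line into columns, group by the year column, then compute the span per year.
  val result: Map[String, (LocalDate, LocalDate, Int)] =
    lines.map(_.split(",")).groupBy(_(0)).map { case (year, v) => year -> span(v) }

  def main(args: Array[String]): Unit =
    result.toSeq.sortBy(_._1).foreach(println)
}
```

On this small excerpt the lengths come out as 7 for 1954 and 8 for 1994; the full data set would of course produce different numbers. If a group could ever contain dates from more than one year, `java.time.temporal.ChronoUnit.DAYS.between(lo, hi)` would be a safer way to compute the length than the day-of-year difference.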