bigdata - Group by time AND another dimension in R (xts matrix)?
I am trying to use the apply.daily/weekly/monthly functions from xts in R, but I need the applied function to work on one subset at a time. For example, given

    library(xts)
    x <- xts(data.frame(value = 1:100, code = rep(1:5, 20)),
             seq(as.Date('2011-01-01'), by = 1, length.out = 100))

Step 1: I'd like to roll up by week and "code", so that I'd have something like
row 1: week = week 1, code = 1, sum(all entries that have a code of 1 and fall in week 1)
row 2: week = week 1, code = 2, sum(all entries that have a code of 2 and fall in week 1)
...
row 70: week = week 10, code = 1, sum(all entries that have a code of 1 and fall in week 10)
Step 2: I'd like the same number of rows for each week, because I want a matrix: one row per code and one column per week. I'd prefer not to create separate week variables first, as the first answer suggests, because I'm going to need to cut again by month, day, hour, minute, and maybe a custom time duration. I'm happy to bypass the first step, because that's only an intermediate output.
Unfortunately, in my real data I can't subset manually because I have 10,000+ "codes" and 53M rows.
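If your data were

    x <- data.frame(value = 1:100, code = rep(1:5, 20),
                    date = seq(as.Date('2011-01-01'), by = 1, length.out = 100))

I'd try

    library(dplyr)

    # make a new column that holds the week of the year for each date
    # (%W = week number, 00-53, with Monday as the first day of the week)
    x$date <- as.POSIXct(x$date)
    # as.POSIXct() on a Date gives midnight UTC, so format in UTC as well
    # to keep every timestamp on its original calendar day
    x$week <- as.character(strftime(x$date, format = "%W", tz = "UTC"))

    result <- x %>% group_by(week, code) %>% summarize(sum = sum(value))

    #    week code sum
    # 1    00    1   1
    # 2    00    2   2
    # 3    01    1   6
    # 4    01    2   7
    # 5    01    3  11
    # 6    01    4  13
    # 7    01    5   5
    # 8    02    1  27
    # 9    02    2  12
    # 10   02    3  13
    # ..  ...  ... ...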
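For step 2 of the question (one row per code, one column per period), a minimal base-R sketch could look like the following. It is not part of the original answer: the helper name sum_by_period is hypothetical, and the approach swaps the dplyr grouping for cut() plus tapply() so that no week column has to be stored first. It assumes the date column is a Date; it has not been tried at the 53M-row scale mentioned in the question.

```r
# Same toy data as in the answer, with a plain Date column
x <- data.frame(value = 1:100,
                code  = rep(1:5, 20),
                date  = seq(as.Date("2011-01-01"), by = 1, length.out = 100))

# Hypothetical helper: sum `value` for every (code, period) pair and
# return a matrix with one row per code and one column per period.
# `breaks` is handed to cut(), e.g. "day", "week", "month" or "2 weeks".
sum_by_period <- function(df, breaks) {
  period <- cut(df$date, breaks = breaks)  # factor labelled by period start
  m <- tapply(df$value, list(code = df$code, period = period), sum)
  m[is.na(m)] <- 0                         # (code, period) pairs with no rows
  m
}

weekly  <- sum_by_period(x, "week")   # weeks start on Monday, as with %W
monthly <- sum_by_period(x, "month")

weekly[, 1:3]                         # first three weeks, all five codes
```

The columns are labelled with the start date of each period (the first weekly column is "2010-12-27") rather than "00", "01", and so on. If the time column were a POSIXct instead, cut() should also accept finer units such as "hour" or "15 min", which may cover the hour, minute, and custom-duration cases.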