bigdata - Group by time AND another dimension in R (xts matrix)?
I am trying to use the apply.daily/weekly/monthly functions from xts in R, but I need the apply function to work on subsets at a time. For example,
x = xts(data.frame(value=1:100, code=rep(1:5,20)), seq(as.Date('2011-01-01'), by=1, length.out=100))
Step 1: I'd like to roll this up by week and "code", so I'd have something like
Row 1: week = week 1, code = 1, sum(all entries that have a code of 1 and fall in week 1)
Row 2: week = week 1, code = 2, sum(all entries that have a code of 2 and fall in week 1)
...
Row 70: week = week 10, code = 1, sum(all entries that have a code of 1 and fall in week 10)
Step 2: I'd like the same number of rows for each week, because I want a matrix: one row per code and one column per week. I'd prefer not to create separate week variables, as the first answer suggests, because I'm going to need to cut this again by month, day, hour, minute, and maybe a custom time duration. I'm happy to bypass the first step, because that's just intermediate output.
Unfortunately, in my real data I can't subset manually because I have 10,000+ "codes" and 53M rows.
If your data were
x <- data.frame(value=1:100, code=rep(1:5,20), date = seq(as.Date('2011-01-01'), by=1, length.out=100))
I'd try
library(dplyr)
# make a new column that holds the week of the date
# ("%W" = week of the year, 00-53, with Monday as the first day of the week)
x$date <- as.POSIXct(x$date)
x$week <- as.character(strftime(x$date, format="%W"))
# sum the values within each week/code combination
result <- x %>% group_by(week, code) %>% summarize(sum = sum(value))
#    week code sum
# 1    00    1   1
# 2    00    2   2
# 3    01    1   6
# 4    01    2   7
# 5    01    3  11
# 6    01    4  13
# 7    01    5   5
# 8    02    1  27
# 9    02    2  12
# 10   02    3  13
# ..  ...  ... ...
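The long result above covers step 1 of the question. For step 2 (one row per code, one column per week), here is a minimal base R sketch. It assumes the same toy data frame, with `date` kept as a `Date` column, and that a code with no entries in a period should show 0. The helper name `sum_by_period` is hypothetical and is not part of xts or dplyr. The period is defined by a `strftime`-style format string, so no separate week, month, or day column has to be created up front.

```r
# Toy data from the question: 100 daily values, codes cycling through 1..5
x <- data.frame(value = 1:100,
                code  = rep(1:5, 20),
                date  = seq(as.Date("2011-01-01"), by = 1, length.out = 100))

# Hypothetical helper: sum `value` for every code (rows) and period (columns).
# `fmt` is a strftime format string that defines the period,
# e.g. "%W" = week of the year (Monday start), "%Y-%m" = calendar month.
sum_by_period <- function(df, fmt) {
  period <- format(df$date, fmt)
  # tapply builds the code x period matrix in one pass
  m <- tapply(df$value, list(code = df$code, period = period), sum)
  # combinations that never occur come back as NA; treat them as 0 (assumption)
  m[is.na(m)] <- 0
  m
}

by_week  <- sum_by_period(x, "%W")     # one column per week
by_month <- sum_by_period(x, "%Y-%m")  # same call, different cut
```

Cutting by hour or minute would work the same way if `date` were a `POSIXct` timestamp, with a format such as `"%Y-%m-%d %H"`. Note that `format` on a `POSIXct` uses the local time zone unless `tz` is given. Whether this scales to 53M rows and 10,000+ codes has not been tested here.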