R - Split data frame into two groups
I've simulated a data.frame:
library(plyr)
library(ggplot2)
count <- rev(seq(0, 500, 20))
tide <- seq(0, 5, length.out = length(count))
df <- data.frame(count, tide)
count_sim <- unlist(llply(count, function(x) rnorm(20, x, 50)))
count_sim_df <- data.frame(tide = rep(tide, each = 20), count_sim)
and I can plot it like this:
ggplot(df, aes(tide, count)) +
  geom_jitter(data = count_sim_df, aes(tide, count_sim),
              position = position_jitter(width = 0.09)) +
  geom_line(color = "red")
I want to split count_sim_df into 2 groups: high and low. When I plot the split count_sim_df, it should look like this (everything in green and blue is photoshopped). The bit I'm finding tricky is getting an overlap between high and low around the middle values of tide.
This is how I want to split count_sim_df into high and low:
- assign half of count_sim_df to high, and half of count_sim_df to low
- reassign values of count to create an overlap between high and low around the middle values of tide
Here's a way to generate the sample dataset and the groupings using relatively little code and base R:
library(ggplot2)
count <- rev(seq(0, 500, 20))
tide <- seq(0, 5, length.out = length(count))
df <- data.frame(count, tide)
count_sim_df <- data.frame(tide = rep(tide, each = 20),
                           count = rnorm(20 * nrow(df), rep(count, each = 20), 50))
# Width of the central band of tide quantiles in which the groups overlap
margin <- 0.3
# "high" above the band, "low" below it, and a random pick inside it
count_sim_df$`tide level` <- with(count_sim_df,
  factor((tide >= quantile(tide, 0.5 + margin / 2) |
          (tide >= quantile(tide, 0.5 - margin / 2) &
           sample(0:1, length(tide), TRUE))),
         labels = c("low", "high")))
ggplot(df, aes(x = tide, y = count)) +
  geom_line(colour = "red") +
  geom_point(aes(colour = `tide level`), count_sim_df, position = "jitter") +
  scale_colour_manual(values = c(high = "green", low = "blue"))
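To see where the overlap comes from, the grouping expression can be pulled out into a small function and checked on its own. This is only a sketch: the helper name assign_tide_level and the seed are made up for illustration and are not part of the answer above, and it uses sample(c(TRUE, FALSE), ...) in place of sample(0:1, ...), which should behave the same way here.

```r
# Hypothetical helper (not in the original answer): labels each tide value
# "low" or "high", with a random mix of both inside a central band.
assign_tide_level <- function(tide, margin = 0.3) {
  # Band edges, taken as quantiles of tide around the median
  lower <- quantile(tide, 0.5 - margin / 2, names = FALSE)
  upper <- quantile(tide, 0.5 + margin / 2, names = FALSE)
  # One coin flip per row; it only matters for rows inside the band
  coin <- sample(c(TRUE, FALSE), length(tide), replace = TRUE)
  is_high <- tide >= upper | (tide >= lower & coin)
  factor(is_high, levels = c(FALSE, TRUE), labels = c("low", "high"))
}

set.seed(1)  # arbitrary seed, only so the example is repeatable
tide  <- rep(seq(0, 5, length.out = 26), each = 20)
level <- assign_tide_level(tide)

# Range of tide values covered by each group; the ranges should overlap
# only in the middle of the tide scale
print(tapply(tide, level, range))
```

A larger margin widens the band in which the two groups mix, and margin = 0 should give a clean split at the median tide.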