R variable names in loop, get, etc
I'm still relatively new to R. Trying to use dynamic variable names in a loop, I'm running into all sorts of problems. My initial code looks something like this (but bigger):
data.train$pclass_f <- as.factor(data.train$pclass)
data.test$pclass_f <- as.factor(data.test$pclass)
which I'm trying to build into a loop. I was imagining something like this:
datalist <- c("data.train", "data.test")
for (i in datalist){
  i$pclass_f <- as.factor(i$pclass)
}
which doesn't work. A little research implies that, in order to convert a string from datalist into a variable, I need to use the get function. So my next attempt was:
datalist <- c("data.train", "data.test")
for (i in datalist){
  get(i$pclass_f) <- as.factor(get(i$pclass))
}
which still doesn't work:
Error in i$pclass : $ operator is invalid for atomic vectors
So I tried:
datalist <- c("data.train", "data.test")
for (i in datalist){
  get(i)$pclass_f <- as.factor(get(i)$pclass)
}
which still doesn't work:
Error in get(i)$pclass_f <- as.factor(get(i)$pclass) : could not find function "get<-"
So I tried:
datalist <- c("data.train", "data.test")
for (i in datalist){
  get(i[pclass_f]) <- as.factor(get(i[pclass]))
}
which still doesn't work:
Error in get(i[pclass]) : object 'pclass' not found
So I tried:
datalist <- c("data.train", "data.test")
for (i in datalist){
  get(i)[pclass_f] <- as.factor(get(i)[pclass])
}
which still doesn't work:
Error in `[.data.frame`(get(i), pclass) : object 'pclass' not found
I've now realized that I never included any data, so nobody can run this themselves. To show it's not a data problem:
> class(data.train$pclass)
[1] "integer"
> class(data.test$pclass)
[1] "integer"
> datalist
[1] "data.train" "data.test"
The problem you have relates to the way data frames, and most other objects, are treated in R. In many programming languages, objects are (or at least can be) passed to functions by reference. In C++, if I pass a pointer to an object to a function which manipulates that object, the original is modified. For the most part, this is not the way things work in R.
When an object is created like this:
x <- list(a = 5, b = 9)
and then copied like this:
y <- x
initially y and x both point to the same object in RAM. As soon as y is modified at all, a copy is created, so assigning y$c <- 12 has no effect on x.
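A minimal sketch of this copy-on-modify behaviour, reusing the x and y from above. Nothing here goes beyond base R; the names() calls are only there to make the difference visible.

```r
# Create a list, then "copy" it by plain assignment.
x <- list(a = 5, b = 9)
y <- x

# Modifying y makes R create a separate copy; x keeps its two elements.
y$c <- 12

names(x)  # "a" "b"
names(y)  # "a" "b" "c"
```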
get() doesn't return the named object in a way that can be modified without first assigning it to another variable (which would mean the original variable is left unaltered).
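As a rough illustration of that point, the sketch below uses a made-up three-row data frame in place of the real data.train. The assign() call at the end is not part of the answer above; it is included only to show that writing the modified copy back under the original name takes an explicit extra step.

```r
# Toy stand-in for one of the data frames (values are hypothetical).
data.train <- data.frame(pclass = c(1L, 3L, 2L))

# get() hands back the value; the change below only affects tmp.
tmp <- get("data.train")
tmp$pclass_f <- as.factor(tmp$pclass)
changed_before <- "pclass_f" %in% names(data.train)  # FALSE: original untouched

# Writing the copy back by name needs an explicit assign() (an addition
# for illustration, not something the answer recommends).
assign("data.train", tmp)
changed_after <- "pclass_f" %in% names(data.train)   # TRUE
```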
The correct way of doing this in R is to store your data frames in a named list. You can then loop through the list and use the replacement syntax to change the columns.
datalist <- list(data.train = data.train, data.test = data.test)
for (df in names(datalist)){
  datalist[[df]]$pclass_f <- as.factor(datalist[[df]]$pclass)
}
You could also use:
datalist <- setNames(lapply(list(data.train, data.test), function(data) {
  data$pclass_f <- as.factor(data$pclass)
  data
}), c("data.train", "data.test"))
This uses lapply to process each member of the list, returning a new list with the modified columns.
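For anyone who wants to try this without the original data, here is a self-contained variant with made-up values for pclass. It assumes the list is given its names up front, in which case lapply carries them over and the setNames step is not needed.

```r
# Toy data frames standing in for the real ones (values are made up).
data.train <- data.frame(pclass = c(1L, 3L, 2L))
data.test  <- data.frame(pclass = c(2L, 2L, 1L))

datalist <- list(data.train = data.train, data.test = data.test)

# lapply() keeps the names of its input list, so the result stays named.
datalist <- lapply(datalist, function(data) {
  data$pclass_f <- as.factor(data$pclass)
  data
})

class(datalist$data.train$pclass_f)  # "factor"

# The top-level data.train and data.test variables are not changed;
# the modified versions live inside datalist.
"pclass_f" %in% names(data.train)    # FALSE
```

From here on, the modified data frames are reached as datalist$data.train and datalist$data.test (or datalist[["data.train"]] when the name is held in a string).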
In theory, you could achieve what you were originally trying to do by using the [[ operator on the global environment, but that would be an unconventional way of doing things and may lead to confusion later on.
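For completeness, one way that approach might look is sketched below, again with made-up data. It assumes the code is run at the top level of a script or session, so that data.train and data.test actually live in the global environment; the variable name env is just a label chosen here.

```r
# Toy data frames (hypothetical values), defined at top level.
data.train <- data.frame(pclass = c(1L, 3L, 2L))
data.test  <- data.frame(pclass = c(2L, 2L, 1L))

# Environments are modified in place, so [[<- on the global environment
# changes the original variables rather than a copy.
env <- globalenv()
for (i in c("data.train", "data.test")) {
  env[[i]]$pclass_f <- as.factor(env[[i]]$pclass)
}

class(data.train$pclass_f)  # "factor"
```

This works only because environments, unlike lists and data frames, are not copied on modification. The named-list approach avoids relying on that.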