R variable names in loop, get, etc


I'm still relatively new to R. I'm trying to use dynamic variable names in a loop and am running into all sorts of problems. My initial code looks like this (but bigger):

data.train$pclass_f <- as.factor(data.train$pclass)
data.test$pclass_f <- as.factor(data.test$pclass)

which I'm trying to build into a loop, imagining something like this:

datalist <- c("data.train", "data.test")
for (i in datalist) {
  i$pclass_f <- as.factor(i$pclass)
}

That doesn't work. A little research implies that, in order to convert a string from datalist into a variable, I need to use the get function. So my next attempt was:

datalist <- c("data.train", "data.test")
for (i in datalist) {
  get(i$pclass_f) <- as.factor(get(i$pclass))
}

That still doesn't work: Error in i$pclass : $ operator is invalid for atomic vectors. So I tried:

datalist <- c("data.train", "data.test")
for (i in datalist) {
  get(i)$pclass_f <- as.factor(get(i)$pclass)
}

That still doesn't work: Error in get(i)$pclass_f <- as.factor(get(i)$pclass) : could not find function "get<-". So I tried:

datalist <- c("data.train", "data.test")
for (i in datalist) {
  get(i[pclass_f]) <- as.factor(get(i[pclass]))
}

That still doesn't work: Error in get(i[pclass]) : object 'pclass' not found. So I tried:

datalist <- c("data.train", "data.test")
for (i in datalist) {
  get(i)[pclass_f] <- as.factor(get(i)[pclass])
}

That still doesn't work: Error in `[.data.frame`(get(i), pclass) : object 'pclass' not found

I've now realized that I never included the data, so nobody can run this themselves, but to show that it's not a data problem:

> class(data.train$pclass)
[1] "integer"
> class(data.test$pclass)
[1] "integer"
> datalist
[1] "data.train" "data.test"

The problem you have relates to the way data frames and most other objects are treated in R. In many programming languages, objects are (or at least can be) passed to functions by reference. In C++, if I pass a pointer to an object to a function that manipulates that object, the original is modified. For the most part, this is not the way things work in R.

When an object is created like this:

x <- list(a = 5, b = 9) 

and then copied like this:

y <- x 

Initially, y and x point to the same object in RAM. But as soon as y is modified at all, a copy is created. So assigning y$c <- 12 has no effect on x.
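
A quick way to see this behaviour, using the same x and y as above (the expected results are shown as comments):

```r
x <- list(a = 5, b = 9)
y <- x           # no copy is made yet; both names refer to the same object

y$c <- 12        # modifying y triggers a copy, so x is left alone

names(x)         # "a" "b"
names(y)         # "a" "b" "c"
```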

get() doesn't return the named object in a way that can be modified without first assigning it to another variable (which would mean the original variable is left unaltered).
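
If the variable names really do have to stay as strings, one possible workaround (a sketch only, not the approach recommended here) is to fetch the value with get(), modify that working copy, and write it back with assign(). The two small data frames are made-up stand-ins for the real data, and tmp is just an illustrative name:

```r
# Hypothetical stand-ins for the real data frames
data.train <- data.frame(pclass = c(1L, 3L, 2L))
data.test  <- data.frame(pclass = c(2L, 2L, 1L))

datalist <- c("data.train", "data.test")
for (i in datalist) {
  tmp <- get(i)                          # a value to work on, not a reference
  tmp$pclass_f <- as.factor(tmp$pclass)  # modify the working copy
  assign(i, tmp)                         # store it back under the original name
}

class(data.train$pclass_f)  # "factor"
```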

The correct way of doing this in R is to store your data frames in a named list. You can then loop through the list and use the replacement syntax to change the columns.

datalist <- list(data.train = data.train, data.test = data.test)
for (df in names(datalist)) {
  # Build the factor column from the existing integer column pclass
  datalist[[df]]$pclass_f <- as.factor(datalist[[df]]$pclass)
}

You could also use:

datalist <- setNames(lapply(list(data.train, data.test), function(data) {
  data$pclass_f <- as.factor(data$pclass)
  data  # return the modified data frame
}), c("data.train", "data.test"))

This uses lapply to process each member of the list, returning a new list with the modified columns.
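
A self-contained run of the list approach might look like the sketch below. The two data frames are made-up stand-ins for the real data. Because the input list is already named here, lapply() keeps those names and setNames() is not needed. Note that the original variables are untouched until a result is copied back out of the list.

```r
# Hypothetical stand-ins for the real data frames
data.train <- data.frame(pclass = c(1L, 3L, 2L))
data.test  <- data.frame(pclass = c(2L, 2L, 1L))

datalist <- list(data.train = data.train, data.test = data.test)

# lapply() preserves the names of a named list
datalist <- lapply(datalist, function(data) {
  data$pclass_f <- as.factor(data$pclass)
  data
})

names(datalist)                      # "data.train" "data.test"
class(datalist$data.train$pclass_f)  # "factor"
class(data.train$pclass_f)           # "NULL": the original is untouched

# Copy a modified data frame back out of the list if it is needed by name
data.train <- datalist$data.train
```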

In theory, you could achieve what you were originally trying to do by using the [[ operator on the global environment, but it would be an unconventional way of doing things and may lead to confusion later on.
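
For completeness, a sketch of what that might look like. The environment is stored under a local name first (env, chosen here purely for illustration), because a function call such as globalenv() cannot itself be the target of an assignment. The data frames are again made-up stand-ins.

```r
env <- globalenv()

# Hypothetical stand-ins, created directly in the global environment
env$data.train <- data.frame(pclass = c(1L, 3L, 2L))
env$data.test  <- data.frame(pclass = c(2L, 2L, 1L))

for (i in c("data.train", "data.test")) {
  # Environments are modified in place, so this changes the global variables
  env[[i]]$pclass_f <- as.factor(env[[i]]$pclass)
}

class(env$data.train$pclass_f)  # "factor"
```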

