R base package: usage notes for the factor function


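Description:

The function factor is used to encode a vector as a factor (the terms "category" and "enumerated type" are also used for factors). If the argument ordered is TRUE, the factor levels are assumed to be ordered. For compatibility with S there is also a function ordered.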
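
As a minimal sketch of the difference between an unordered and an ordered factor, the snippet below encodes a small character vector both ways. The vector `sizes` and its values are hypothetical and chosen only for illustration.

```r
# Hypothetical data, chosen only for illustration
sizes <- c("small", "large", "medium", "small")

# Unordered factor: levels default to the sorted unique values
f <- factor(sizes)
levels(f)        # "large" "medium" "small"
as.integer(f)    # 3 1 2 3 (the internal integer codes)

# Ordered factor: levels are compared in the order given
o <- factor(sizes, levels = c("small", "medium", "large"), ordered = TRUE)
is.ordered(o)    # TRUE
o[1] < o[2]      # TRUE, because "small" precedes "large" in the level order
```

Giving levels explicitly is what makes the ordering meaningful here; with the default alphabetical levels, "large" would sort before "small".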


Usage:

factor(x = character(), levels, labels = levels,
       exclude = NA, ordered = is.ordered(x), nmax = NA)

ordered(x, ...)

is.factor(x)
is.ordered(x)

as.factor(x)
as.ordered(x)

addNA(x, ifany = FALSE)


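Arguments:

x : a vector of data, usually taking a small number of distinct values.

levels : an optional vector of the unique values (as character strings) that x might have taken. The default is the unique set of values taken by as.character(x), sorted into increasing order of x. Note that this set can be specified as smaller than sort(unique(x)).

labels : either an optional character vector of labels for the levels (in the same order as levels after removing those in exclude), or a character string of length 1. Duplicated values in labels can be used to map different values of x to the same factor level.

exclude : a vector of values to be excluded when forming the set of levels. This may be a factor with the same level set as x; otherwise it should be a character vector.

ordered : logical flag to determine whether the levels should be regarded as ordered (in the order given).

nmax : an upper bound on the number of levels; see "Details" in the R help page for factor.

... : (in ordered(.)): any of the above, apart from ordered itself.

ifany : only add an NA level if it is used, i.e. if any(is.na(x)).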
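
The levels, labels and ifany arguments described above can be sketched together as follows. The input vector `x` and the label prefix "grp" are hypothetical and used only for illustration.

```r
x <- c("a", "b", "c", "a")   # hypothetical input

# levels smaller than sort(unique(x)): unmatched values become NA
f1 <- factor(x, levels = c("a", "b"))
is.na(f1)                    # FALSE FALSE TRUE FALSE

# a length-1 labels string is expanded with a sequence number
f2 <- factor(x, labels = "grp")
levels(f2)                   # "grp1" "grp2" "grp3"

# addNA() adds an NA level by default; with ifany = TRUE only if x contains NA
nlevels(addNA(f2))                 # 4
nlevels(addNA(f2, ifany = TRUE))   # 3 (f2 has no missing values)
nlevels(addNA(f1, ifany = TRUE))   # 3 ("a", "b" and NA)
```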


Examples:

(ff <- factor(substring("statistics", 1:10, 1:10), levels = letters))
as.integer(ff) # the internal codes
(f. <- factor(ff)) # drops the levels that do not occur
ff[, drop = TRUE] # the same, more transparently

factor(letters[1:20], labels = "letter")

class(ordered(4:1)) # "ordered", inheriting from "factor"
z <- factor(LETTERS[3:1], ordered = TRUE)
## and "relational" methods work:
stopifnot(sort(z)[c(1,3)] == range(z), min(z) < max(z))


## suppose you want "NA" as a level, and to allow missing values.
(x <- factor(c(1, 2, NA), exclude = NULL))
is.na(x)[2] <- TRUE
x # [1] 1    <NA> <NA>
is.na(x)
# [1] FALSE TRUE FALSE

## More rational, since R 3.4.0 :
factor(c(1:2, NA), exclude = "" ) # keeps <NA> , as
factor(c(1:2, NA), exclude = NULL) # always did
## exclude = <character>
z # ordered levels 'A < B < C'
factor(z, exclude = "C") # does exclude
factor(z, exclude = "B") # ditto

## Now, labels maybe duplicated:
## factor() with duplicated labels allowing to "merge levels"
x <- c("Man", "Male", "Man", "Lady", "Female")
## Map from 4 different values to only two levels:
(xf <- factor(x, levels = c("Male", "Man" , "Lady", "Female"),
              labels = c("Male", "Male", "Female", "Female")))
#> [1] Male Male Male Female Female
#> Levels: Male Female

## Using addNA()
Month <- airquality$Month
table(addNA(Month))
table(addNA(Month, ifany = TRUE))