INTRODUCTION TO R
Factor levels
When you first get a data set, you will often notice that it contains
factors with specific factor levels. However, sometimes you will want to
change the names of these levels for clarity or other reasons. R allows
you to do this with the function levels():
o levels(factor_vector) <- c("name1", "name2",...)
A good illustration is the raw data you receive from a survey. A
common question on questionnaires is the sex of the respondent.
Here, for simplicity, just two categories were recorded, "M" and "F".
(You usually need more categories for survey data; either way, you use a
factor to store the categorical data.)
o survey_vector <- c("M", "F", "F", "M", "M")
o factor_survey_vector <- factor(survey_vector)
Recording the sex with the abbreviations "M" and "F" can be convenient
if you are collecting data with pen and paper, but it can introduce
confusion when analyzing the data. At that point, you will often want to
change the factor levels to "Male" and "Female" instead of "M" and "F"
for clarity.
Watch out: the order in which you assign the levels is important. If
you type levels(factor_survey_vector), you'll see that it outputs
[1] "F" "M". If you don't specify the levels of the factor when creating
it, R assigns them alphabetically. To correctly map "F" to "Female" and
"M" to "Male", the levels should be set to c("Female", "Male"), in this
order.
o In other words, list the new names in the same order as the existing
levels (alphabetical by default) when assigning levels
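
The renaming steps above can be sketched end to end as follows. This is a minimal illustration, assuming the factor is built with factor(survey_vector) and no explicit levels; the name wrong_order is hypothetical and is used only to show what happens when the new names are given in the wrong order.

```r
# Raw survey responses as recorded on paper
survey_vector <- c("M", "F", "F", "M", "M")

# Convert to a factor; with no levels given, R sorts them alphabetically
factor_survey_vector <- factor(survey_vector)
levels(factor_survey_vector)   # [1] "F" "M"

# Rename the levels, listing the new names in the same order as the old ones
levels(factor_survey_vector) <- c("Female", "Male")
factor_survey_vector           # Male Female Female Male Male

# Pitfall (hypothetical variable): reversing the names swaps the labels
# without any warning, so every "M" is now reported as "Female"
wrong_order <- factor(survey_vector)
levels(wrong_order) <- c("Male", "Female")
wrong_order                    # Female Male Male Female Female
```

Checking levels() before renaming is a cheap safeguard, because R raises no error when the new names are listed in the wrong order.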