char_tolower and char_toupper are replacements for tolower and toupper, based on the stringi package. The stringi case-conversion functions are preferable to the base functions because they handle Unicode case conversion correctly. In addition, the *_tolower functions provide an option for preserving acronyms.

char_tolower(x, keep_acronyms = FALSE, ...)

char_toupper(x, ...)

Arguments

x

the input object whose character/tokens/feature elements will be case-converted

keep_acronyms

logical; if TRUE, do not lowercase any all-uppercase words (applies only to *_tolower functions)

...

additional arguments passed to the stringi functions (e.g. stri_trans_tolower), such as locale
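
To illustrate why a locale argument matters, the following minimal sketch calls the underlying stringi functions directly and compares English and Turkish case rules. The locale identifiers "en_US" and "tr_TR" are chosen here only for illustration. Going by the description of ... above, the same locale value should be forwardable through char_tolower and char_toupper, but that is not shown here.

```r
library(stringi)

# Lowercasing "I" depends on the locale:
# English rules give "i", Turkish rules give dotless "ı" (U+0131).
lower_en <- stri_trans_tolower("I", locale = "en_US")
lower_tr <- stri_trans_tolower("I", locale = "tr_TR")

# Uppercasing "i" under Turkish rules gives dotted "İ" (U+0130).
upper_tr <- stri_trans_toupper("i", locale = "tr_TR")

print(c(lower_en = lower_en, lower_tr = lower_tr, upper_tr = upper_tr))
```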

Examples

txt <- c(txt1 = "b A A", txt2 = "C C a b B")
char_tolower(txt)
#>        txt1        txt2
#>     "b a a" "c c a b b"
char_toupper(txt)
#>        txt1        txt2
#>     "B A A" "C C A B B"

# with acronym preservation
txt2 <- c(text1 = "England and France are members of NATO and UNESCO",
          text2 = "NASA sent a rocket into space.")
char_tolower(txt2)
#>                                               text1
#> "england and france are members of nato and unesco"
#>                                               text2
#>                    "nasa sent a rocket into space."
char_tolower(txt2, keep_acronyms = TRUE)
#>                                               text1
#> "england and france are members of NATO and UNESCO"
#>                                               text2
#>                    "NASA sent a rocket into space."
char_toupper(txt2)
#>                                               text1
#> "ENGLAND AND FRANCE ARE MEMBERS OF NATO AND UNESCO"
#>                                               text2
#>                    "NASA SENT A ROCKET INTO SPACE."
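
The acronym-preserving behaviour can be approximated outside the package as well. The sketch below is not the package's implementation. It assumes an acronym is a run of two or more ASCII capital letters bounded by word boundaries, and the helper name tolower_keep_acronyms is hypothetical. It records where such runs occur, lowercases the whole string with stringi, and then writes the original runs back into the same positions.

```r
library(stringi)

# Hypothetical helper: lowercase everything except all-uppercase ASCII words.
# Assumes lowercasing does not change string length (true for ASCII input).
tolower_keep_acronyms <- function(x) {
  # locate runs of 2+ capital letters delimited by word boundaries
  m <- gregexpr("\\b[A-Z]{2,}\\b", x, perl = TRUE)
  acronyms <- regmatches(x, m)
  # lowercase the full strings, then restore the acronyms in place
  out <- stri_trans_tolower(x)
  regmatches(out, m) <- acronyms
  out
}

res <- tolower_keep_acronyms(c(
  "England and France are members of NATO and UNESCO",
  "NASA sent a rocket into space."
))
print(res)
```

Matching on positions keeps the sketch short, but it would need revisiting for text where lowercasing alters the number of characters.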