Hello. How would you suggest I replace the character values in one variable with made-up labels? My goal is to anonymize the names contained in that variable. The actual data has over 50 observations and over 15 unique name values.
# package library
library(tidyverse)
# create sample data
sample_data <- tibble(
  name = c("james", "mary", "michael", "patricia", "james", "mary", "michael", "patricia", "james", "mary", "michael", "patricia"),
  value = c(sample(x = as.character(seq(100, 300, 100)), size = 12, replace = TRUE))
)
sample_data
#> # A tibble: 12 × 2
#>    name     value
#>    <chr>    <chr>
#>  1 james    200
#>  2 mary     100
#>  3 michael  200
#>  4 patricia 300
#>  5 james    100
#>  6 mary     100
#>  7 michael  300
#>  8 patricia 100
#>  9 james    200
#> 10 mary     100
#> 11 michael  300
#> 12 patricia 300
# goal is to recode/relabel/anonymize each name to a new, made-up name
# something like this (values are re-sampled here, so only the name column matters)
target_data <- tibble(
  name = c("apple", "banana", "cherry", "date", "apple", "banana", "cherry", "date", "apple", "banana", "cherry", "date"),
  value = c(sample(x = as.character(seq(100, 300, 100)), size = 12, replace = TRUE))
)
target_data
#> # A tibble: 12 × 2
#>    name   value
#>    <chr>  <chr>
#>  1 apple  200
#>  2 banana 200
#>  3 cherry 200
#>  4 date   300
#>  5 apple  300
#>  6 banana 200
#>  7 cherry 200
#>  8 date   100
#>  9 apple  200
#> 10 banana 200
#> 11 cherry 100
#> 12 date   300
Created on 2024-10-23 with reprex v2.1.0
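For anyone landing here later, below is a minimal sketch of one way this kind of relabeling could be done. It is not necessarily what either answer in the thread suggested. It assumes a generic `person_01`, `person_02`, ... labeling scheme (my own placeholder, not from the data), and it uses fixed values instead of `sample()` so the result is reproducible. The names `unique_names`, `lookup`, and `anonymized` are made up for illustration.

```r
library(tidyverse)

# Same shape as the sample data above, but with fixed values
sample_data <- tibble(
  name = rep(c("james", "mary", "michael", "patricia"), times = 3),
  value = rep(c("100", "200", "300"), times = 4)
)

# One made-up label per unique name, in order of first appearance
# (the "person_%02d" pattern is an assumption; any label vector of the same length works)
unique_names <- unique(sample_data$name)
lookup <- setNames(
  sprintf("person_%02d", seq_along(unique_names)),
  unique_names
)

# Replace each name with its label via the named-vector lookup
anonymized <- sample_data %>%
  mutate(name = unname(lookup[name]))

anonymized
```

Because the labels are generated from `unique()`, this should scale to 15 or more distinct names without listing them by hand. If the order of first appearance could itself reveal who is who, shuffling `unique_names` with `sample()` before building `lookup` is one option.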
EDIT: I marked a solution before I could respond properly and locked myself out of the thread. Anyway, I've selected the suggestion by mduvekot as the best answer because it works for both the example and my actual data. However, I am also going to investigate how to implement the suggestion by AlexisW.
I appreciate your time and attention.