Hi all, I'm new to working with strings and characters in R. Wondering if someone could illustrate how to separate first name and last name into separate columns. Additionally, I would like to be able to match the person's title, email and phone with their name in my data frame. Any assistance would be appreciated. Thanks!
I think the following code gets you what you need for this particular data set. It relies on every first and last name being a single word: by default, separate() splits on any run of non-alphanumeric characters, so if Ursula von der Leyen were in the list, the code would give her last name as von and push der Leyen into the Position column. I also took advantage of the email addresses and phone numbers alternating in the same order as the names, so I just had to pick out the odd elements of the emailandphone vector to get the emails and the even elements to get the phone numbers.
library(dplyr)
library(rvest)
library(tidyr)
# Download and parse the staff contact page
link <- "https://dps.mn.gov/divisions/hsem/contact/Pages/staffcontacts.aspx"
page <- read_html(link)
# Each title node holds "First Last, Position"
name <- page %>% html_nodes(".dps_staffcontactpage_title") %>% html_text()
head(name)
#> [1] "Joe Kelly, Director"
#> [2] "Kevin Reed, Deputy Director"
#> [3] "Devan Armstrong, Program and Policy Analysist"
#> [4] "Jacob Beauregard, Mutual Aid and Logistics Coordinator"
#> [5] "Robert Berg, REP Planner"
#> [6] "Mari Bostrom, REP Planner, GIS"
# Emails and phone numbers come back interleaved in one character vector
emailandphone <- page %>% html_nodes(".dps_staffcontactpage") %>% html_text()
head(emailandphone)
#> [1] "joseph.kelly@state.mn.us" "651-201-7404"
#> [3] "kevin.reed@state.mn.us" "651 201-7405"
#> [5] "devan.armstromg@state.mn.us" "651-201-7494"
#MNEMA = as_tibble(name, emailandphone, stringsAsFactors = FALSE)
#MNEMA
# Emails sit at the odd positions (1, 3, ..., 39)
Emails <- emailandphone[seq(from = 1, to = 39, by = 2)]
head(Emails)
#> [1] "joseph.kelly@state.mn.us" "kevin.reed@state.mn.us"
#> [3] "devan.armstromg@state.mn.us" "jacob.beauregard@state.mn.us"
#> [5] "robert.m.berg@state.mn.us" "mari.bostrom@state.mn.us"
# Phone numbers sit at the even positions (2, 4, ..., 40)
Phones <- emailandphone[seq(from = 2, to = 40, by = 2)]
head(Phones)
#> [1] "651-201-7404" "651 201-7405" "651-201-7494" "651-201-7474" "651-201-7458"
#> [6] "651-201-7437"
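The hard-coded 39 and 40 above only fit a page with exactly 20 staff entries. As a rough sketch of the same odd/even idea without fixed lengths, the interleaved vector can be reshaped into a two-column matrix. The sample vector is just the first six values printed above, and pairs is a name made up for this illustration; the approach still assumes that every entry has exactly one email followed by one phone number.

```r
# First six scraped values, copied from the head(emailandphone) output above
emailandphone <- c(
  "joseph.kelly@state.mn.us", "651-201-7404",
  "kevin.reed@state.mn.us", "651 201-7405",
  "devan.armstromg@state.mn.us", "651-201-7494"
)

# Guard: an odd length would mean some entry lacks an email or a phone number
stopifnot(length(emailandphone) %% 2 == 0)

# Fill a two-column matrix row by row: column 1 = emails, column 2 = phones
pairs <- matrix(emailandphone, ncol = 2, byrow = TRUE)
Emails <- pairs[, 1]
Phones <- pairs[, 2]

# Cheap sanity check: only the first column should contain "@"
stopifnot(all(grepl("@", Emails)), !any(grepl("@", Phones)))
```

If the page ever lists someone without a phone number, the checks stop with an error instead of silently shifting every later row.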
MNEMA <- tibble(name, Emails, Phones)
head(MNEMA)
#> # A tibble: 6 x 3
#>   name                                                   Emails         Phones
#>   <chr>                                                  <chr>          <chr>
#> 1 Joe Kelly, Director                                    joseph.kelly@~ 651-201~
#> 2 Kevin Reed, Deputy Director                            kevin.reed@st~ 651 201~
#> 3 Devan Armstrong, Program and Policy Analysist          devan.armstro~ 651-201~
#> 4 Jacob Beauregard, Mutual Aid and Logistics Coordinator jacob.beaureg~ 651-201~
#> 5 Robert Berg, REP Planner                               robert.m.berg~ 651-201~
#> 6 Mari Bostrom, REP Planner, GIS                         mari.bostrom@~ 651-201~
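# With no sep given, separate() splits on runs of non-alphanumeric characters;
# extra = "merge" keeps everything after the second split together in Position
MNEMA <- separate(MNEMA, col = name, into = c("First", "Last", "Position"), extra = "merge")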
head(MNEMA)
#> # A tibble: 6 x 5
#>   First  Last       Position                             Emails         Phones
#>   <chr>  <chr>      <chr>                                <chr>          <chr>
#> 1 Joe    Kelly      Director                             joseph.kelly@~ 651-201~
#> 2 Kevin  Reed       Deputy Director                      kevin.reed@st~ 651 201~
#> 3 Devan  Armstrong  Program and Policy Analysist         devan.armstro~ 651-201~
#> 4 Jacob  Beauregard Mutual Aid and Logistics Coordinator jacob.beaureg~ 651-201~
#> 5 Robert Berg       REP Planner                          robert.m.berg~ 651-201~
#> 6 Mari   Bostrom    REP Planner, GIS                     mari.bostrom@~ 651-201~
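If multi-word surnames ever do show up, one possible workaround is to split in two passes: first at the first comma, then at the first space. This is only a sketch on toy data. The staff data frame and the FullName column are names made up for the example, and the Ursula von der Leyen row is hypothetical (it is not on the page). It assumes that every entry has the form "First Last, Position" with a single-word first name.

```r
library(tidyr)

# Toy data: the first and last rows are copied from the output above; the
# middle row is hypothetical, added only to exercise a multi-word surname
staff <- data.frame(
  name = c("Joe Kelly, Director",
           "Ursula von der Leyen, President",
           "Mari Bostrom, REP Planner, GIS"),
  stringsAsFactors = FALSE
)

# Pass 1: split once at the first comma, so later commas stay in Position
staff <- separate(staff, col = name, into = c("FullName", "Position"),
                  sep = ",\\s*", extra = "merge")

# Pass 2: split once at the first space, so the rest of the name stays in Last
staff <- separate(staff, col = FullName, into = c("First", "Last"),
                  sep = "\\s+", extra = "merge")

staff
```

Here the hypothetical row ends up with First = "Ursula" and Last = "von der Leyen". A two-word first name such as "Mary Ann" would still be split in the wrong place, so it is worth eyeballing the result on real data.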