I have 2 columns in my data frame and I am trying to create a third column called 'ManagerCount'. In Excel I can calculate this with the COUNTIF function, using the column as the range and the cell as the criteria. However, I have been unable to achieve this in RStudio. Example below:-
Perhaps I am just slow this morning but I do not understand the logic of the ManagerCount column. What determines the values of 1, 0, and 2 that appear?
Hi, it counts how many times the code in the 'EmployeeID' column appears in the 'ManagerID' column. So 80001 appears once in the ManagerID column, 54629 appears twice, and 76524 and 14973 do not appear, i.e. 0.
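For anyone following along, the COUNTIF logic described here can be sketched in base R. The data frame below is only an illustration: the EmployeeID values come from the description above, but the ManagerID values (including 99999) are made up so that the counts work out the same way.

```r
# Illustrative data only: ManagerID values are assumed, not taken from the real data
DF <- data.frame(
  EmployeeID = c(80001, 54629, 76524, 14973),
  ManagerID  = c(54629, 80001, 54629, 99999)
)

# For each EmployeeID, count the rows whose ManagerID equals it
# (the equivalent of Excel's COUNTIF(ManagerID_range, EmployeeID_cell))
DF$ManagerCount <- sapply(DF$EmployeeID, function(id) sum(DF$ManagerID == id))

DF
```

This compares every EmployeeID against the whole ManagerID column, which is fine for small data but may be slow on very large data frames.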
Getting the result of how many times each ID appears in the ManagerID column is easy. Making that a new column in the original data frame is a little clunky because the counts of zero appear as NA after the left_join. It may be that the data frame DF2 is all you need or it may be there is a more elegant solution.
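The code for this reply is not shown in the thread, so here is a rough sketch of the approach described, assuming dplyr is installed. The name DF2 follows the text above; the data values are illustrative only (the ManagerID column is made up), and filling the NA values with `coalesce()` is just one possible way to do it.

```r
library(dplyr)

# Illustrative data only: ManagerID values are assumed
DF <- data.frame(
  EmployeeID = c(80001, 54629, 76524, 14973),
  ManagerID  = c(54629, 80001, 54629, 99999)
)

# Count how many times each ID appears in the ManagerID column
DF2 <- DF %>% count(ManagerID, name = "ManagerCount")

# Attach the counts to each employee; employees who never appear
# as a manager get NA from the join, which is then replaced with 0
DF <- DF %>%
  left_join(DF2, by = c("EmployeeID" = "ManagerID")) %>%
  mutate(ManagerCount = coalesce(ManagerCount, 0L))

DF
```

If only the counts per manager are needed, DF2 on its own may be enough and the join can be skipped.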
Welcome! I'm afraid you'll need to supply some more info in order for helpers to be able to understand your problem.
Can you please try to compose a small, self-contained reproducible example that illustrates your problem? (follow that link for instructions and explanations) A reprex makes it much easier for others to understand your issue and figure out how to help.
Since you're new here, it might also be helpful to know how to properly format code and console output that you post. Using proper code formatting makes the site easier to read, prevents confusion (unformatted code can get garbled by the forum software), and is generally considered the polite thing to do. Check out this FAQ to find out how; it's as easy as the click of a button!
Since you are showing only a part of your code and none of your data, this is something of a guess. The two columns you are comparing in the sum() function, id and manager_id, are factors but they do not have the same levels.
(The error message mentions the column manager_gpid. Is there a typo somewhere?)
Judging from your previous post, this makes sense, since not all employees are managers. You can fix this by converting both columns to type character.
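A minimal sketch of that fix is below. Since your data was not posted, the data frame `df` and its values are made up for illustration; only the column names `id` and `manager_id` come from your code. Comparing two factors with `==` fails when their level sets differ, so both columns are converted with `as.character()` before counting.

```r
# Made-up data: two factor columns whose level sets differ
df <- data.frame(
  id         = factor(c("80001", "54629", "76524", "14973")),
  manager_id = factor(c("54629", "80001", "54629", "99999"))
)

# Convert both columns from factor to character so == compares the values
df$id         <- as.character(df$id)
df$manager_id <- as.character(df$manager_id)

# Count how many times each id appears in manager_id
df$ManagerCount <- vapply(
  df$id,
  function(x) sum(df$manager_id == x),
  integer(1),
  USE.NAMES = FALSE
)

df
```

If the columns come from `read.csv()`, passing `stringsAsFactors = FALSE` when reading the file should avoid the factors in the first place.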