I've recently found myself in a position where I have to QC large amounts of data. As some background, I did some work with R in college for my thesis; however, it has been a few years since I have needed it, so for the most part I am getting back into it. Mostly I'm looking for ideas, thoughts, and package suggestions for how to go about what I describe below. Any input is appreciated.
So generally, I will have two Excel files (a Working File and a Final Report), from which I only really care about two columns, say "id" and "Result", in each spreadsheet. In my mind, what I want to do is import just those columns into two data sets (keeping "id" and "Result" paired, i.e. each "id" stays with its "Result"), then compare the two for any differences in "Result". The "id" values should be identical in both spreadsheets (if not, that is an entirely different problem). What I would hope to get back is either confirmation that everything matches, or a list of the "id"/"Result" pairs that are inconsistent between the two documents.
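For what it's worth, here is a minimal sketch of the kind of thing described above, assuming the readxl and dplyr packages; the file names are placeholders, and the column names "id" and "Result" are taken from the description:

```r
library(readxl)
library(dplyr)

# Read each spreadsheet and keep only the two columns of interest
working <- read_excel("working_file.xlsx") %>% select(id, Result)
final   <- read_excel("final_report.xlsx") %>% select(id, Result)

# Join on id so each id stays paired with its Result from both files,
# then keep only the rows where the Results disagree
mismatches <- inner_join(working, final, by = "id",
                         suffix = c("_working", "_final")) %>%
  filter(Result_working != Result_final)

if (nrow(mismatches) == 0) {
  message("All Results match.")
} else {
  print(mismatches)   # ids with inconsistent Results, side by side
}
```

Joining on "id" (rather than comparing row-by-row) also has the side benefit that it tolerates the two files being sorted differently.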
Would it also be possible to include a small margin of error, something like 1%, just to accommodate rounding if that has happened between the documents? Hopefully it hasn't, but in either case it's a thought that occurred to me.
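Building on the same join idea, a tolerance could be expressed as a filter on the absolute difference; this sketch assumes "Result" is numeric and uses a 1% relative tolerance as in the question:

```r
library(readxl)
library(dplyr)

working <- read_excel("working_file.xlsx") %>% select(id, Result)
final   <- read_excel("final_report.xlsx") %>% select(id, Result)

tol <- 0.01  # 1% relative tolerance to absorb rounding differences

# Flag only pairs whose Results differ by more than 1% of the larger value
mismatches <- inner_join(working, final, by = "id",
                         suffix = c("_working", "_final")) %>%
  filter(abs(Result_working - Result_final) >
           tol * pmax(abs(Result_working), abs(Result_final)))
```

Rows that differ only by rounding then pass silently, while genuine discrepancies still show up in `mismatches`.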
I really do appreciate any pointers as I fiddle around with this.