I need to use the capture() function from the rebus package, but for the life of me, I cannot find decent references or examples online. It makes me wonder whether this function is even in common use anymore.
Does anyone know of a good reference for this function? I am a newbie in R.
library(rebus)
require(stringi)
#> Loading required package: stringi
# use help(capture) to get the function signature and example
# Usage
# capture is good with match functions
(rx_price <- capture(digit(1, Inf) %R% DOT %R% digit(2)))
#> <regex> ([[:digit:]]+\.[[:digit:]]{2})
(rx_quantity <- capture(digit(1, Inf)))
#> <regex> ([[:digit:]]+)
(rx_all <- DOLLAR %R% rx_price %R% " for " %R% rx_quantity)
#> <regex> \$([[:digit:]]+\.[[:digit:]]{2}) for ([[:digit:]]+)
stringi::stri_match_first_regex("The price was $123.99 for 12.", rx_all)
#>      [,1]             [,2]     [,3]
#> [1,] "$123.99 for 12" "123.99" "12"
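To make the result of that help-page example a bit more concrete, here is a small sketch of how the captured pieces might be pulled out of the match matrix and used as numbers. The variable names `m`, `price` and `quantity` are just illustrative choices of mine, not anything from the rebus documentation.

```r
library(rebus)
library(stringi)

# Same pattern as in the help-page example: "$<price> for <quantity>".
rx_price    <- capture(digit(1, Inf) %R% DOT %R% digit(2))
rx_quantity <- capture(digit(1, Inf))
rx_all      <- DOLLAR %R% rx_price %R% " for " %R% rx_quantity

m <- stri_match_first_regex("The price was $123.99 for 12.", rx_all)

# Column 1 holds the whole match; columns 2 and 3 hold the two capture() groups.
price    <- as.numeric(m[1, 2])
quantity <- as.integer(m[1, 3])

price * quantity
```

So each capture() call adds one extra column to what the stringi match functions return, in the order the groups appear in the pattern.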
First, look for any vignettes and, if none, check the Description in the package's index. Here we find
Description: Build regular expressions piece by piece using human readable code. This package contains core functionality, and is primarily intended to be used by package developers.
(If I were in your position, that would make me wonder whether rebus and its capture() are the right tool. See the task view for possible alternatives.)
One of the hardest hurdles I had to overcome in learning R was learning how to decipher the help pages and the function signatures.
Think of the user-facing portion of R as school algebra writ large:
f(x) = y
The help page describes the arguments x and the result(s) y of the function.
The usage is complicated by the sprinkling of non-standard operators like %R%.
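As a rough illustration of reading a function as f(x) = y, this is one way to look at both sides for capture() from the console. It only uses base R inspection functions; the input "abc" is an arbitrary example of mine.

```r
library(rebus)

# The "x" side: which arguments does capture() accept?
args(capture)

# The "y" side: what kind of object comes back?
rx <- capture("abc")
class(rx)
rx
```

Checking the class of the result this way is often quicker than hunting for the "Value" section of the help page.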
Why not come back with a description of the problem you're trying to solve and the reasons that you chose this approach?
Hi, thanks very much for that. I too find the help pages not that helpful for beginners, so I have been relying on online samples, YouTube videos and tutorials. The thing with rebus and capture() is that, for some reason, these online resources are very much lacking.
I have to use rebus and capture() for a piece of work that I am submitting, so I can't use another function.
The result is a lazy regular expression that captures one or zero digits, preferring zero. Buried in the documentation is that DGT is the generic class for 'digit'.
%R% is a concatenation operator, named because %c% was considered too hard to type. (You can't make this up.)
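A minimal sketch of %R% in action, reusing DOLLAR and digit() from the help-page example earlier in the thread. The literal " each" and the test strings are made up for illustration. Loosely speaking, %R% glues the pieces together in order, much as paste0() would, while keeping the result a regex object.

```r
library(rebus)
library(stringi)

# Build one pattern out of three pieces, left to right.
rx <- DOLLAR %R% digit(1, Inf) %R% " each"
rx

# The result can be handed to stringi like any other regular expression.
stri_detect_regex(c("$5 each", "5 each"), rx)
```

Only the first string should be detected, since the second one lacks the literal dollar sign.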
Next comes another capture enclosing
or('Y', 'YO','M','-', "")
The or operator
or takes multiple character vector inputs and returns a character vector of the inputs separated by pipes.
Putting it all together in loose terms:
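Here is a small sketch of that or() call wrapped in capture(), to show what ends up in the captured group. The name `rx_suffix` and the test strings are my own. One thing worth noticing is that regex alternatives are tried from left to right, so with "Y" listed before "YO" the shorter one is tried first.

```r
library(rebus)
library(stringi)

rx_suffix <- capture(or("Y", "YO", "M", "-", ""))
rx_suffix

# Column 2 of the match matrix is the captured alternative.
# For the input "YO", the alternative "Y" succeeds before "YO" is ever tried.
stri_match_first_regex("YO", rx_suffix)[1, 2]
stri_match_first_regex("M", rx_suffix)[1, 2]
```

If the longer alternative should win, listing "YO" before "Y" would be one way to get that, assuming that is what the task actually needs.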
Gimme a regular expression for something that may or may not have in it a number before another number, but at least let there be one number followed by one of those other characters.
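For comparison, this is roughly how I would hand-write a plain regular expression from that loose description. To be clear, this is my guess at the shape of the pattern, not the exact output of the rebus call under discussion; `rx_guess` and the sample inputs are hypothetical.

```r
library(stringi)

# HYPOTHETICAL pattern, hand-written from the loose description:
#   (\d??)        an optional digit, lazy (prefers to match nothing)
#   (\d)          one required digit
#   (Y|YO|M|-|)   one of the listed suffixes, or nothing
rx_guess <- "(\\d??)(\\d)(Y|YO|M|-|)"

m <- stri_match_first_regex(c("7M", "3-"), rx_guess)

m[, 1]  # whole match for each input
m[, 4]  # the captured suffix for each input
```

The three pairs of parentheses play the same role as three capture() calls would in rebus, so the match matrix has one column for the whole match plus one per group.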
Regular expressions are totally great. They are hard but powerful. I can't for the life of me understand why someone would interpose a barrier to learning them effectively with this meta-regex tool.