I was under the impression that the creators of dplyr are familiar with SQL [1] and did (and still do) use it as a direct inspiration [2,3]. But SQL is not very well suited for data analysis [4], so the design of dplyr is about taking the good parts but reformulating other parts with data analysis in mind [5,6].
(not speaking with authority, but I have sources!)
1:
That said, I am very familiar with SQL
see: Disagree with Hadley's comment about databases - #15 by hadley
2:
SQL is the inspiration for dplyr’s conventions, so the translation is straightforward
source: 13 Relational data | R for Data Science
3:
Thanks to Kirill Müller, dplyr has a new experimental family of row mutation functions inspired by SQL’s UPDATE, INSERT, UPSERT, and DELETE.
source: dplyr 1.0.0: last minute additions
4: for example, "Why SQL is not for Analysis, but dplyr is" | by Kan Nishida | learn data science
5:
If you’ve used a database before, you’ve almost certainly used SQL. If so, you should find the concepts in this chapter familiar, although their expression in dplyr is a little different. Generally, dplyr is a little easier to use than SQL because dplyr is specialised to do data analysis
source: 13 Relational data | R for Data Science
6:
[...] dplyr maybe might be better than SQL in some ways. But I think it is, because it's trying to solve a much, much smaller problem than SQL is trying to solve. [...] I think you can rethink the language and the interface, and of course, we've learned a bunch about programming and programming languages and the 40 years since SQL has been around. So I think there's some really nice things about dplyr that just make life a little bit more pleasant.
source: SuperDataScience
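To make the SQL inspiration from [2] and [3] a bit more concrete, here is a minimal sketch of my own (not taken from any of the sources above). The `orders` table, its columns, and its values are made up for illustration, and the SQL in the comments is only a rough equivalent of the dplyr code, not what dplyr generates. It assumes dplyr 1.0.0 or later, where the `rows_*()` functions were introduced as experimental.

```r
library(dplyr)

# Hypothetical toy table, invented purely for illustration.
orders <- tibble(
  id       = 1:4,
  customer = c("a", "a", "b", "c"),
  amount   = c(10, 25, 5, 40)
)

# Roughly the same idea as this SQL:
#   SELECT customer, SUM(amount) AS total
#   FROM orders
#   WHERE amount > 5
#   GROUP BY customer
#   ORDER BY total DESC
totals <- orders %>%
  filter(amount > 5) %>%                 # WHERE
  group_by(customer) %>%                 # GROUP BY
  summarise(total = sum(amount)) %>%     # SELECT ... SUM(...)
  arrange(desc(total))                   # ORDER BY ... DESC

# Row mutation in the spirit of SQL's UPDATE:
#   UPDATE orders SET amount = 30 WHERE id = 2
# rows_update() matches rows on the key column given in `by`.
updated <- rows_update(orders, tibble(id = 2L, amount = 30), by = "id")
```

The verbs line up with the SQL clauses closely, but they are written in the order the data flows rather than in the fixed order SQL requires, which is the kind of reformulation I think [5] and [6] are describing.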