This is a companion discussion topic for the original entry at https://blog.rstudio.com/2021/03/18/bi-and-data-science-the-tradeoffs
In the previous posts in our series on Data Science and Business Intelligence, we first discussed how data science can complement and augment self-service BI tools to deliver more combined value. We then explored the strengths and challenges of the two approaches, both of which aim to help an organization get more insights from its data and make better decisions.
In this post, we’ll share insights from organizations that have used both types of tools and offer some guidance on when to use each. We’ll also set the stage for future blog posts where we will explore specific integration points for BI and Data Science tools.
Don’t Get Trapped into a False Choice
In our prior post, we explored the strengths and challenges of both BI tools and open source data science. We won’t repeat those arguments here. Instead, we’ll hear from users who seem to understand that both approaches have their place.
BI tools are often an easier place for an organization to start when approaching an analytic problem. They provide a lower barrier to entry for the typical business user, who may not be comfortable coding in R or Python. The built-in features make it easy to visualize, explore, and analyze data using a point-and-click approach and then to share that analysis with others.
For example, this user prefers Power BI for creating quick and easy visualizations, but switches to R and Shiny when a dashboard needs a highly interactive user interface.
“Power BI is an easy to build visualization tool widely used in our organization to make data accessible to non-data people. This is a really great tool when we want to create a dashboard for trends and track some metrics. But it becomes very difficult when we want to enable high levels of user interactivity with the dashboard. That’s where R Shiny helped us to build intuitive and highly interactive user interfaces.”
Meanwhile, this biotech firm views Spotfire and Tableau as fine products so long as you are satisfied with their built-in capabilities, but sees R as more flexible.
“RStudio is code based, so in the beginning tools like Spotfire and Tableau have [their] advantages since many things are already built in, but in terms of flexibility RStudio will win over the longer term.”
The individuals below describe how they apply this flexibility and power from two different industry perspectives. The first is from a financial industry leader.
“Most of the work the data scientists did used the R language. They did a great job satisfying management’s constant barrage of questions because iterative analysis is so easy with tools like R, and the powerful visualization tools made communication of results easy for sales people to grasp. As the CEO, I was gratified at how clear the presentations were and at how quickly presenters answered my difficult questions, in some cases on the fly during the presentations.
As an R user myself, I know its code-based workflow lends itself to rapid iteration while, at the same time, documenting the process used. It was easy to unroll the tape to see every step that led to any conclusion.”
– Art Steinmetz, former Chairman and CEO of Oppenheimer Funds
The second individual describes how he uses R in the beverages industry:
“The R ecosystem has vast power to quickly solve problems. With R, I can incorporate nearly any AI/ML model into a dashboard or Shiny app, without being reliant on proprietary data science tools. Executives can be confident I am using the best analytic approach for a given problem, and I can rapidly apply new approaches as they become available.”
– Paul Ditterline, Director of Data Science at Heaven Hill Brands
While these accounts are only anecdotal evidence, they do show awareness of both approaches to data analysis and give some insight into why companies opt for each solution. They illustrate that BI tools may struggle as the questions get more complex, requiring greater analytic depth to answer and more customization in how the analysis is done and presented. Users will encounter a relatively low ceiling to the complexity of questions they can answer.
On the other hand, code-friendly data science tools represent a relatively high barrier to entry. To get the most out of the tools, those who create the analyses need some understanding of coding in R or Python, and familiarity with applying and interpreting advanced analytic methods. However, the flexibility and analytic breadth of code-friendly data science combine to provide a very high ceiling for answering difficult, valuable questions for an organization.
This just leaves open the question, “How should I select my approach?”
Match Your Data Science Approach to Application Needs
We expect firms to continue struggling with this tradeoff between BI tools and open source data science for years to come. As we argued in our first post on the topic, this isn’t about choosing between the two approaches, but how to exploit the strengths of each while mitigating their challenges.
In the table below, the Use When You… column augments the table we presented last week. While this guide won’t be correct for every case, it at least provides a starting point for those times a data science leader needs a quick answer for an urgent project.
| | Strengths | Challenges | Use When You... |
|---|---|---|---|
| Self-service BI Tools | | | |
| Open Source Data Science | | | |
Table 1: Guidelines for when you should apply BI Tools or open source data science.
Summary
RStudio is dedicated to the proposition that code-friendly data science is uniquely powerful, and that everyone can learn to code. We support this through our education efforts, our Community site, and making R easier to use through our open source projects such as the tidyverse. Our software is already used by millions of people to analyze data every day.
However, code-friendly data science does present a higher barrier to entry compared to BI tools, which are very valuable for the wider community of analysts and business users in an organization. Because of this, it is critical to leverage both, and use data science to augment and complement your BI tools.
In our next posts, we will explore specific points of integration between these tools. We’re happy to help you explore these topics, so if you’d like to learn more about how RStudio products can help augment and complement your BI approaches, you can set up a meeting with our Customer Success team.
To Learn More
- See the second blog post in our BI series for more information on how RStudio tackles the challenges of open source data science listed in the table above. RStudio Team provides security, scalability, package management and the centralized management of development and deployment environments, delivering the enterprise features many organizations require.
- Read this recent interview for more information on why RStudio focuses on code-friendly data science.
- For more information on what RStudio is doing to make deep learning and AI available in the R ecosystem, see the RStudio AI blog.
- Explore the enterprise value of an open source, code-friendly approach in our blog post series on the importance and benefits of Serious Data Science.