The past few years have seen a huge shift in the business intelligence (BI) and analytics market.
On one hand, the ability to write code and perform advanced analytics has become widespread, ceasing to be the sole preserve of software engineers and data scientists.
This is reflected by the fact that programming is now taught routinely on undergraduate courses – and that for those who missed it at university, there’s no shortage of online courses. On the other hand, the availability of powerful, low-cost (or even free) database and data analysis tools has increased dramatically, and, largely as a result, the move from proprietary software towards open source technologies is gaining momentum.
Within this migration, one very strong trend is the rising popularity of R.
A powerful programming language for statistical and scientific computing, R was originally developed in the academic world, but is now widely used in business across industries as diverse as finance, insurance, pharmaceuticals and retail.
R’s rise to prominence is driven by several attractive attributes:
- It’s free, open source, and widely taught in universities for quantitative subjects.
- It’s comprehensive, with thousands of packages available covering almost any need your organisation is likely to have.
- It’s cross-platform and can be run on Windows, Linux or macOS.
- It’s well integrated with most of the leading BI and analytics software products on the market.
- As its popularity grows, more data scientists and business analysts are becoming skilled in its use, creating a positive adoption loop.
At EY, we’re seeing R’s standing rise every day among our clients; as they use it for data analytics as well as model development and testing, it is often supplementing or replacing spreadsheets and other statistical software.
When clients ask us to review models built in R, we commonly find it is being used to perform statistical and scenario analysis, or stress tests, on large datasets. For these kinds of uses, R scripts are easier to manage, can handle much bigger data volumes, and allow automation of reporting and use of a full range of statistical methods.
However, with great power comes great responsibility.
R programmes can implement complex algorithms that would have been inconceivable using traditional spreadsheet tools. But this brings with it the problems, and calls for the disciplines, normally associated with software development, such as code maintenance, code reviews, testing and documentation.
Put simply, when using Excel there is a relatively well-known pool of mistakes the modeller can make.
In contrast, the range of possible bugs and errors in R is far wider, spread across thousands of packages, and potential issues are much harder to recognise.
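To illustrate the testing and documentation disciplines in practice, here is a minimal sketch in base R. The function `apply_shock` and its arguments are hypothetical, invented purely for illustration; they are not taken from any client model or R package. The idea is simply that a calculation states its assumptions explicitly and ships with checks a reviewer can rerun.

```r
# Hypothetical helper (illustrative only): apply a proportional shock
# to a vector of exposures, e.g. shock = -0.10 for a 10% fall.
apply_shock <- function(exposures, shock) {
  # Fail loudly on inputs the calculation was never designed for.
  stopifnot(is.numeric(exposures), !anyNA(exposures))
  stopifnot(is.numeric(shock), length(shock) == 1, shock > -1)
  exposures * (1 + shock)
}

# Checks that document the expected behaviour.
shocked <- apply_shock(c(100, 200), -0.10)
stopifnot(isTRUE(all.equal(shocked, c(90, 180))))

# Text passed in by mistake is rejected rather than silently coerced.
bad_input <- try(apply_shock(c("100", "200"), -0.10), silent = TRUE)
stopifnot(inherits(bad_input, "try-error"))
```

In a spreadsheet, the equivalent assumptions usually live in the modeller's head; here they sit next to the calculation and fail visibly when violated.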
What’s more, obscure code features, such as timezone settings or the data type assigned to a variable, can have less obvious but far-reaching consequences when hidden in data-loading scripts or embedded in the visualisations used to present model results.
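Two small base R examples show how this kind of issue can arise. The values and variable names are made up for illustration, and the timezone part assumes the standard timezone database is available on the machine running the script.

```r
# Data type pitfall: numbers stored as a factor.
amounts <- factor(c("5", "10", "20"))

# as.numeric() on a factor returns the internal level codes,
# not the values the labels represent.
wrong <- as.numeric(amounts)                 # 3 1 2
right <- as.numeric(as.character(amounts))   # 5 10 20

# Timezone pitfall: one instant falls on two different calendar dates.
trade_time <- as.POSIXct("2024-01-15 23:30:00", tz = "America/New_York")
date_ny  <- format(trade_time, "%Y-%m-%d", tz = "America/New_York")  # "2024-01-15"
date_utc <- format(trade_time, "%Y-%m-%d", tz = "UTC")               # "2024-01-16"
```

Neither line raises an error or a warning, which is why such settings can pass unnoticed through a data-loading script and only show up, if at all, as a discrepancy in the reported results.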
Moreover, reviewing R models requires skills not typically available in most companies. This means that to anyone but the developer, the model becomes a “black box”, with arcane calculations and logic.
Business leaders who rely on R models need to be aware of these risks and ensure they’re addressed.
One approach is to seek external code assurance and establish code best practices. Our analytics team is being asked with increasing frequency to review R models – one reason why we’ve invested actively in building a team of experienced R modellers.
EY has been at the forefront of model risk and assurance services for many years. Building on this solid base, we’re now extending our experience – gained through thousands of model reviews – to a new generation of business tools written in R and other software.
We understand both the startling benefits and the specific risks associated with R models, and we’re ready to help you adopt and use R safely through our R model assurance services. Please contact us to learn more.