Hi everyone, Stefano from the UQLab Dev Lair here.
Here’s another post to answer a personal “all-time FAQ” about one of the core design choices behind UQLab: its programming language. As you know, back in 2013, when we started the architectural design of UQLab, we chose MATLAB. In this post I’d like to share with you the thought process that led us to that choice.
MATLAB is seen by some scientific communities as a questionable choice on two main grounds:
- It is a slow language that can only be used for quick prototyping, but it’s not good for complex computational tasks
- It is non-free, hence it should not be used for academic/research purposes
Here I will try to give you my 5 cents on those two topics, and add some more food for thought on other aspects of the problem.
MATLAB is slow
This is an argument most often heard from experienced modellers used to developing their code in C, C++ or FORTRAN. And, when working on strictly non-vectorizable code (e.g. FDTD solvers, N-body simulations etc.), it is to some degree true. A for loop in MATLAB can[1] be significantly slower than a similar one in a compiled language that supports advanced loop control optimizations (e.g. C).
However, when dealing with linear algebra, which in uncertainty quantification (UQ) accounts for the overwhelming majority of the actual computational cost, that margin shrinks considerably. This is due to the innately vectorized nature of matrix operations, the very building blocks of MATLAB (MATrix LABoratory). Indeed, most matrix operations make use of a compiled version of the LAPACK libraries[2], which are FORTRAN based. On properly designed code, the performance difference between MATLAB and other “faster” languages is in the single-digit percentage range.
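To make the loop-versus-vectorized point concrete, here is a minimal sketch that times the same matrix-vector product written both ways. It is an illustration, not a benchmark: the problem size `n` and all variable names are arbitrary choices, and the actual timings will depend on your MATLAB version and hardware.

```matlab
% Illustrative sketch: explicit loops vs. a vectorized matrix-vector product.
n = 2000;                 % problem size (arbitrary, for illustration only)
A = rand(n);              % random n-by-n matrix
x = rand(n, 1);           % random n-by-1 vector

% Version 1: explicit double loop over rows and columns
tic;
yLoop = zeros(n, 1);
for i = 1:n
    for j = 1:n
        yLoop(i) = yLoop(i) + A(i, j) * x(j);
    end
end
tLoop = toc;

% Version 2: vectorized product, handled by compiled linear-algebra routines
tic;
yVec = A * x;
tVec = toc;

% Report both timings and check that the two results agree
fprintf('loop: %.4f s, vectorized: %.4f s\n', tLoop, tVec);
fprintf('max abs difference: %.2e\n', max(abs(yLoop - yVec)));
```

Both versions compute the same result up to floating-point round-off; only the second one lets MATLAB hand the work to its compiled libraries.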
MATLAB is neither free nor open source
Yup, MATLAB is neither free nor open source. Academic licenses (for research and students), however, are quite affordable (i.e. ~50$/year for the basic version). This brings up an interesting discussion that borders on the philosophical: why is it “widely accepted” in academia that hardware must be expensive, but software must be free? At the end of the day, a modelling workstation is a very expensive machine. Graphics cards that run modern machine-learning tools are also expensive. Personally, I don’t see a problem in using non-free software if it provides a powerful tool for the development and deployment of computer algorithms, all the more so if it was designed by an academic for academic purposes.
But is this all? Are those the two reasons why we chose MATLAB? To be fair, these were not our reasons for choosing a specific programming language; rather, we considered them potential drawbacks.
The main reasons for our choice follow purely from the analysis of our intended audience: practitioners in applied sciences and engineering, the keywords being “applied” and “engineering”. Our goal is to make UQ available to a wide audience of people who are not necessarily experts in programming, nor in UQ. In this sense, a common ground that can be found between most of our users is a scientific/technical education. No matter the field, equations (formulas if you are an engineer) are the universal shared language of scientists.
As a consequence, Bruno and I focused on languages that offer a functional programming paradigm, as opposed to object-oriented programming, as it is by far the most natural approach to scientific computing. The overhead needed to handle the abstractions of object-oriented programming forces users to waste precious time on “getting their code to run”, rather than on “getting their algorithms to function”.
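As a small illustration of how directly a formula can turn into code in this style, here is a minimal sketch in plain MATLAB (it does not use the UQLab API). The model, the uniform input distribution and the sample size are all hypothetical, chosen only for the example.

```matlab
% Hypothetical model y = x1*sin(x2) + x3^2, written almost as on paper.
% Each row of X is one input sample; each column is one input variable.
model = @(X) X(:,1) .* sin(X(:,2)) + X(:,3).^2;

N = 1e5;              % number of Monte Carlo samples (arbitrary)
X = rand(N, 3);       % inputs assumed independent and uniform on [0, 1]

Y = model(X);         % evaluate all samples in one vectorized call

% Crude Monte Carlo estimates of the output mean and variance
fprintf('mean: %.4f, variance: %.4f\n', mean(Y), var(Y));
```

No classes, constructors or inheritance are needed to get from the equation to a first estimate, which is the kind of low overhead I have in mind here.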
In 2013, very few functional programming languages were both reasonably fast at linear algebra and well known in most applied science education programs as well as outside academia. MATLAB was by far the most successful and well-known option.
What’s more, UQ is most often performed at the prototyping stage, rather than during production. These were the main reasons for our choice. However, for the skeptics, here’s a rundown of a number of other reasons that need to be considered when designing software for a specific audience and/or task:
- the availability of a powerful integrated development environment (IDE) with strong debugging tools. Today a few options exist for the main languages (e.g. Spyder for Python, or Julia’s own IDE Juno), but the choice back then was quite poor
- the very gentle learning curve of MATLAB
- the possibility of hiding parts of the code (for licensing purposes)
- native object-oriented code support: even though the UQLab libraries are written in a functional programming paradigm to be accessible to applied scientists/engineers, its core is actually fully object-oriented (OOP). This is because the core doesn’t deal with science, but just with IT-related business, for which encapsulation is instead extremely important.
And I’m pretty sure in the months we spent identifying what to use to write UQLab other good arguments were made, but these are the ones that come to my mind first.
But then, would you still use MATLAB if you were to start from scratch in 2019? Unfortunately, I ran out of font-ink with this lengthy post, therefore I won’t be able to provide an answer here…
'til the next one
Stefano
[1] More recent versions of MATLAB have dramatically increased the performance of for loops through improvements of the built-in just-in-time compiler.
[2] See also: LAPACK in MATLAB - MATLAB & Simulink