MATLAB? But but Python/R/C++/COBOL

ste · May 8, 2019, 10:06am

Hi everyone, Stefano from the UQLab Dev Lair here.

Here’s another post to answer a personal “all-time FAQ” about one of the core design choices behind UQLab: its programming language. As you know, our choice back in 2013, when we started the architectural design of UQLab, was to use MATLAB. In this post I’d like to share with you the thought process that led to that choice.

MATLAB is seen by some scientific communities as a questionable choice on two main grounds:

It is a slow language that can only be used for quick prototyping, but it’s not good for complex computational tasks
It is non-free, hence it should not be used for academic/research purposes

Here I will try to give you my 5 cents on those two topics, and add some more food for thought on other aspects of the problem.

MATLAB is slow

This is an argument most often heard from experienced modellers used to developing their code in C, C++ or FORTRAN. And, when working on strictly non-vectorizable code (e.g. FDTD solvers, N-body simulations etc.), it is to some degree true. A for loop in MATLAB can^[1] be significantly slower than an equivalent one in a compiled language that supports advanced loop optimizations (e.g. C).
However, when dealing with linear algebra, which in uncertainty quantification (UQ) accounts for the overwhelming majority of the actual computational cost, that margin shrinks considerably. This is due to the inherently vectorized nature of matrix operations, the very building blocks of MATLAB (MATrix LABoratory). Indeed, most matrix operations make use of a compiled version of the LAPACK libraries^[2], which are FORTRAN based. On properly designed code, the performance difference between MATLAB and other “faster” languages is in the single-digit percentage range.
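
To make the loop-versus-library point concrete, here is a minimal sketch. It is in Python rather than MATLAB and uses only the standard library, so it illustrates the general principle (an interpreted loop versus a single call whose iteration runs in compiled code), not MATLAB's actual LAPACK path. The function names `dot_loop` and `dot_builtin` are hypothetical, and the timings depend on the machine and interpreter.

```python
import operator
import timeit


def dot_loop(x, y):
    """Dot product with an explicit, interpreted for loop (hypothetical helper)."""
    total = 0.0
    for i in range(len(x)):
        total += x[i] * y[i]
    return total


def dot_builtin(x, y):
    """Same dot product, with the iteration pushed into C-implemented built-ins."""
    return sum(map(operator.mul, x, y))


if __name__ == "__main__":
    n = 200_000
    x = [0.5 * i for i in range(n)]
    y = [2.0 for _ in range(n)]

    # Time 20 repetitions of each variant; absolute numbers are machine-dependent.
    t_loop = timeit.timeit(lambda: dot_loop(x, y), number=20)
    t_builtin = timeit.timeit(lambda: dot_builtin(x, y), number=20)
    print(f"explicit loop: {t_loop:.3f} s, built-ins: {t_builtin:.3f} s")
```

On CPython the second variant is typically the faster one, for the same reason a vectorized MATLAB expression tends to beat a hand-written loop: the per-element work happens in compiled code rather than in the interpreter.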

MATLAB is neither free nor open source

Yup, MATLAB is neither free nor open source. Academic licenses (for research and students), however, are quite affordable (i.e. ~$50/year for the basic version). This brings up an interesting discussion that borders on the philosophical: why is it “widely accepted” in Academia that hardware must be expensive, but software must be free? At the end of the day, a modelling workstation is a very expensive machine. Graphics cards that run modern machine-learning tools are also expensive. Personally, I don’t see the problem in using non-free software if it provides a powerful tool for the development and deployment of computer algorithms, more so if it was designed by an academic for academic purposes.

But is this all? Are those the two reasons why we chose MATLAB? To be fair, these were not our reasons to choose a specific programming language, but were rather considered as potential drawbacks.
The main reasons for our choice follow purely from the analysis of our intended audience: practitioners in applied sciences and engineering, the keywords being applied and engineering. Our goal is to make UQ available to a wide audience of people who are not necessarily experts in programming, nor in UQ. In this sense, the common ground shared by most of our users is a scientific/technical education. No matter the field, equations (formulas if you are an engineer) are the universal shared language of scientists.

As a consequence, Bruno and I focused on languages that offer a functional programming paradigm, as opposed to object-oriented programming, as it is by far the most natural approach to scientific computing. The overhead necessary to handle the abstractions in object-oriented programming forces users to waste precious time in “getting their code to run”, rather than “getting their algorithms to function”.

In 2013, the number of functional programming languages that wouldn’t be horribly slow when dealing with linear algebra, and that were well known in most applied science education programs as well as outside academia, was very small, and MATLAB was by far the most successful and well-known option.
What is more, UQ is most often performed at the prototyping stage, rather than during production. These were the main reasons for our choice. However, for the skeptics, here’s a rundown of a number of other reasons that need to be considered when designing software for a specific audience and/or task:

the availability of a powerful integrated development environment (IDE) with good debugging tools. Today a few options exist for the main languages (e.g. Spyder for Python, or Julia’s own IDE Juno), but the choice back then was quite poor
the extremely fast learning curve of MATLAB
the possibility of hiding parts of the code (for licensing purposes)
native object-oriented code support: even though the UQLab libraries are written in a functional programming paradigm to be accessible to applied scientists/engineers, its core is actually fully object-oriented (OOP). This is because the core doesn’t deal with science, but just with IT-related business, for which encapsulation is instead extremely important.

And I’m pretty sure in the months we spent identifying what to use to write UQLab other good arguments were made, but these are the ones that come to my mind first.

But then, would you still use MATLAB if you were to start from scratch in 2019? Unfortunately, I ran out of font-ink with this lengthy post, therefore I won’t be able to provide an answer here…

'til the next one
Stefano

[1] More recent versions of MATLAB have dramatically increased the performance of for loops through improvements of the built-in just-in-time compiler.
[2] See also: LAPACK in MATLAB - MATLAB & Simulink

gramian · January 15, 2020, 10:31am

Well, and then there is GNU Octave, an open-source alternative to MATLAB.

timueh · January 17, 2020, 3:15pm

I like Matlab. It’s the language I was taught at university (whether that’s good or bad is not up for discussion). When I started doing my PhD, however, I wanted to learn a real programming language, and I chose Julia, because it seems to be a very promising language to bet on.

Why Julia? As pointed out in the post, there is always the question of object-oriented vs. functional. And Julia happens to be in the middle. By using the idea of multiple dispatch, it provides, in my opinion, the right tradeoff.

But then again… it seems that the choice of one’s go-to programming language is similar to one’s choice of (non-)religion…

ste · January 17, 2020, 3:58pm

hi @timueh, good choice with Julia. It’s one of my top 3 favorite languages, and it’s primarily functional, while also admitting OO and fancier stuff. Last but not least, programming today is not the same as in the '90s, hence the machine-learning and HPC friendliness of Julia will surely come in handy in the future.

That’s the language I’d probably choose if I were to design UQLab next year

As far as languages as religions go, I sincerely hope it’s not the dominant mindset (although I admit it’s the case for some communities…). Each of the available languages serves a purpose and has a specific philosophy, user base, features, etc. Once the goal is clear, the language choice (assuming know-how is available) becomes rather “constrained”. I feel that focusing on a language due to prior belief is a dangerous choice, especially in a project that is expected to run long-term and to have a number of different developers over time who will have to maintain and support it. But then again, this is just my perspective, as an OS- and programming-language-agnostic type of guy.

@gramian Thanks for your comment. Yes, GNU Octave also existed and was considered early on during the design process of UQLab. But Octave 3 was painfully slow in 2012 (although I think it already officially supported BLAS and LAPACK), with poor portability across OSes, a relatively small user base and a rather limited object-oriented stack that didn’t allow us to implement what we planned. Hence the choice of Matlab instead.

[edit: typos]

timueh · January 19, 2020, 2:28pm

So, let’s assemble a task force, and let’s go @ste: UQLab4Julia!?..

Just one tiny clarification: Julia is not OO. It’s also not purely functional. The idea of multiple dispatch is to combine the best of the two worlds. I recommend this talk by Stefan Karpinski.
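
For readers who have not met multiple dispatch, here is a rough sketch of the idea, written in Python rather than Julia. It is a hand-rolled toy, not how Julia implements it: the `dispatch` decorator, the `combine` function and the `Gaussian`/`Uniform` classes are all hypothetical names, and the lookup matches exact argument types only (no subtype resolution, which Julia does provide).

```python
# Shared table: (function name, tuple of argument types) -> implementation
_registry = {}


def dispatch(*types):
    """Register an implementation for one exact combination of argument types."""
    def decorator(func):
        name = func.__name__
        _registry[(name, types)] = func

        def wrapper(*args):
            # Pick the implementation from the runtime types of ALL arguments,
            # not just the first one as in classic single-dispatch OO methods.
            key = (name, tuple(type(a) for a in args))
            impl = _registry.get(key)
            if impl is None:
                raise TypeError(f"no method {name} for argument types {key[1]}")
            return impl(*args)

        wrapper.__name__ = name
        return wrapper
    return decorator


class Gaussian:
    pass


class Uniform:
    pass


@dispatch(Gaussian, Gaussian)
def combine(a, b):
    return "gaussian+gaussian"


@dispatch(Gaussian, Uniform)
def combine(a, b):
    return "gaussian+uniform"


if __name__ == "__main__":
    print(combine(Gaussian(), Gaussian()))
    print(combine(Gaussian(), Uniform()))
```

The functions stay free-standing, as in functional code, while the behaviour still specializes on types, as in OO code; which method runs depends on every argument at once.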

damarginal · January 20, 2020, 9:00am

Sometimes I forget that this is a rather important point before getting involved in whatever holy wars are raging out there.

This reminds me of a blog post I read recently on MATLAB, Julia, and Python. The author is a veteran MATLAB user/programmer (he has even written some textbooks on it) who is aware of the changing landscape. I think his post is not vitriolic and pretty nuanced, coming from a perspective of numerical computation. I like his comparison of these three languages to cars. I didn’t come from a strict numerical computation background and have a bias toward Python, as it has served me well in hacking my way through things before, from text processing to gluing together all the apps and codes I was dealing with. Most probably, from the perspective of numerical computation and despite the recent wave, there are indeed better alternatives, old and new.