Do data scientists use JS/C/C++ in their projects?

I come from a quantitative economics background and code only in Python and R, the two primary languages for data analysis. I’ve seen a lot of job offers that give bonus points or extra weight to candidates who are comfortable with JS/C/C++.

I recently joined FCC, which I found through a Medium post, and so far I have really enjoyed playing around with the JS tutorials. I even installed the latest release of Node.js just to explore the runtime (this is the first time since 2014 that I have decided to get some JS exposure).

Since I’m already experienced in Python and find JS similar and therefore simple, completing the challenges and some projects on FCC shouldn’t take long. But I researched how JS would help me as a tool in my job applications and in the projects I would work on, and found very little information, except that D3.js looks similar to what we plot with ggplot2 in R and matplotlib in Python 3. So I’m still unsure about this: since data scientists are not developers but still have to write code, does JS provide leverage in projects that primarily use either Py3 or R?

It makes web-based visualisations possible, that’s the main benefit; nothing else can do that [directly]. It’s also good for doing scraping-style things with headless browsers (headless Chrome primarily) - there are Python/etc bindings, but nothing really works as well as the JS bindings (however for the actual automation of scraping Python has better libraries, so :man_shrugging:). It’s available effectively everywhere, unlike every other language.

It’s very similar to Python in many ways, and most of what you can do in Python you can do in JS. However, libraries are far less mature. More importantly, you can’t just write bindings to C/Fortran libraries in JS, which significantly limits speed and power.

Which leads into C/C++ - all the time-critical stuff in NumPy/SciPy isn’t written in Python, it’s C or Fortran. Python lets you do that fairly easily, JS does not.
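To give a feel for how little ceremony Python needs to call compiled C code, here is a minimal sketch using the standard-library `ctypes` module to call `sqrt` from the system C math library. It assumes a typical Linux or macOS system where that library can be located; `c_sqrt` is just an illustrative name, and this is not how NumPy itself is built (NumPy uses compiled extension modules).

```python
import ctypes
import ctypes.util
import math

# Locate the C math library. The lookup is platform-dependent (assumption:
# a typical Linux/macOS system). If it returns None, CDLL(None) falls back
# to the symbols already loaded into the current process on POSIX systems.
libm_path = ctypes.util.find_library("m")
libm = ctypes.CDLL(libm_path)

# Declare the C signature: double sqrt(double)
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double


def c_sqrt(x: float) -> float:
    """Call the C library's sqrt directly from Python."""
    return libm.sqrt(x)


result = c_sqrt(9.0)
# Compare against Python's own math.sqrt as a sanity check.
print(result, math.isclose(result, math.sqrt(9.0)))
```

The same idea, scaled up to whole numerical libraries, is what makes the Python scientific stack fast even though Python itself is not.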

tl;dr

  • doing anything that goes in a browser? JS
  • doing lightweight processing? JS will work fine, but otherwise Python/R
  • doing something that needs to be very highly optimised? C/Fortran

The job “data scientist” is a giant umbrella term for doing everything. I’m starting to see articles around of people wanting to separate the title to be more specific based on the tasks at hand. Some titles include “Data Analyst” and “Data Engineer”, both of which have specific tools and needs.

Going back to your question, I’d agree that the benefit of learning JS is for web-based data visualizations (even R and Shiny make use of JS, so you can fully capitalize on the strengths of both R and JS).
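One common way the two sides fit together is for the analysis language to do the number crunching and hand the result to the browser as JSON, where a JS library such as D3.js draws it. Here is a minimal Python sketch of that handoff, using only the standard library. The data, the field names and the `summarise` helper are all made up for illustration.

```python
import json
import statistics

# Hypothetical sample data: a few observations per region.
observations = {
    "north": [12.0, 15.5, 14.2],
    "south": [9.8, 11.1, 10.4],
}


def summarise(data):
    """Reduce each series to the few numbers a chart needs."""
    return [
        {
            "region": region,
            "mean": round(statistics.mean(values), 2),
            "n": len(values),
        }
        for region, values in sorted(data.items())
    ]


# A list of flat records is a shape that browser-side charting code
# can typically consume directly once the JSON is parsed.
payload = json.dumps(summarise(observations))
print(payload)
```

On the browser side, the JS code would fetch this JSON and bind it to chart elements; that part is where knowing some JS pays off.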

C/C++ are more the kind of languages you use for developing your own software. If your job entails implementing fast, long-lived software, that is where I would go. Even R has a C++ API package called Rcpp, which I’ve seen used for speeding up certain calculations.

In sum, it depends on what your focus is going to be. Each of those languages has its uses for a data scientist.


Thanks @erictleung and @DanCouper. That gave me clarity on languages that are not usually used in DS but certainly have their own importance.

“usually” is relative. Like the others mentioned, it depends on what you do. I see a lot of JVM languages (like Java and Scala) but that’s just the corner of the DS room that I prefer to hang out in (data engineering, “big data”, etc.). I suggest you find your niche, then find what is standard within that niche.


Thanks for the correction. ‘usually’ is actually the wrong term to use here (I apologise for that). JVM languages certainly have their own place and without them there would have been no DS.

No. In fact, C and C++ will just confuse the issue. They expose the implementation details of electronic computers, which is far removed from data science/big data (by which I assume you mean information science or computational science, i.e. the fundamentals that everything is based on).

JavaScript is used mainly for visualization. Nowadays, most visualizations are done in the browser, and using JavaScript there is unavoidable. There are many libraries that make it easier to use JS for visualizations, and as you mentioned, D3.js is one of them and quite a powerful one at that.
Apart from that, JS lacks the all-powerful libraries of Python and R for actually analyzing the data.
