Is data science a safe career to pursue? What will the future look like in terms of jobs? Will automation soon take over, leaving data scientists in the middle of nowhere? How would you compare a software engineering career to data science?

Data Science is here to stay for a long, long time. So, yes, it is a ‘safe’ career to pursue.

Silicon Valley ‘Buzzwords’

Consider all the hyped-up, buzzwordy technologies promoted out of Silicon Valley in the last couple of decades: web, cloud, mobile, SaaS (software as a service), and big data. I assume it is safe to say that the ‘web’, or internet, was not all hype. The internet is pervasive throughout the world and is a game changer for everyday life. We could say the same for the smartphone and mobile revolution. Cloud, SaaS and Big Data are all playing out now. So we can probably agree that these are not just hype and buzzwords but long-term trends in technology. Moreover, all of these trends are still alive and kicking today.

The next big trend in Silicon Valley is ML & AI (and some related areas like VR and AR). If history is any guide to how new technologies from Silicon Valley will play out, we can be fairly confident that ML & AI will also become not just hype but real trends. Now, guess what sort of professionals you need to leverage and capitalize on this trend? I hope you didn’t guess ‘stand-up comics’. 🙂

The Continuum

Machine Learning is just the short term continuum in software engineering and part of the broader continuum of industrial revolution and automation in general. Somewhat simplistically,

Don’t just take my word for it [1], take a look at some of these quotes from industry stalwarts –

“A breakthrough in machine learning would be worth ten Microsofts” (Bill Gates, Chairman, Microsoft)

“Machine learning is the next Internet” (Tony Tether, Director, DARPA)

So what’s different with ML? Instead of a human (the software engineer) codifying the heuristics, rules or algorithms to apply to the data to arrive at the output, in ML we use examples of the data and the output to let the computer discover the program or algorithm! As problems become more and more complex, and the collection, storage and analysis of data become cheap and ubiquitous, the computer is better suited to learning the algorithm from the data.
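As a minimal sketch of this difference, compare a rule a human wrote by hand with a rule the computer discovers from labeled examples. The transaction data, the function names and the simple threshold search below are all illustrative assumptions, not taken from any particular library or from a real system.

```python
# Traditional software: a human picks the rule and hard-codes it.
def handwritten_rule(amount):
    return amount > 1000  # cutoff chosen by the engineer


# Machine learning (in miniature): the cutoff is discovered from examples.
def learn_threshold(examples):
    """Return the cutoff that misclassifies the fewest (value, label) pairs."""
    values = sorted(v for v, _ in examples)
    # Candidate cutoffs: midpoints between consecutive observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_cutoff, best_errors = None, len(examples) + 1
    for cutoff in candidates:
        # Count examples where the prediction disagrees with the label.
        errors = sum((v > cutoff) != label for v, label in examples)
        if errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff


# Hypothetical labeled examples: (transaction amount, was it flagged?)
examples = [(120, False), (300, False), (450, False),
            (700, True), (900, True), (1500, True)]

cutoff = learn_threshold(examples)


def learned_rule(amount):
    return amount > cutoff


print(cutoff)                 # the cutoff inferred from the data
print(handwritten_rule(700))  # what the hand-coded rule says
print(learned_rule(700))      # what the learned rule says
```

The hand-coded cutoff reflects the engineer's guess, while the learned cutoff moves whenever the examples change. Real ML libraries do the same thing with far richer models than a single threshold.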

In a few more years, I fully expect more and more software engineering roles to require at least some basic data and ML proficiency. Today, we have full-stack engineers who demonstrate prowess in front-end, back-end and database technologies. Perhaps ML will be added to that list of skills in the not-so-distant future.

Broad Applicability

Just as software is pervasive in all industries today, so is the applicability of ML and AI. Take a look at the competitions on Kaggle (a worldwide competitive platform for data scientists) to see the wide range of problems being tackled: detecting tumors in medical scans, predicting the number of patrons a restaurant will receive, forecasting home prices, identifying individual whales from images, finding toxic comments on online forums, and so on. At Kabbage, our data science team uses ML to identify and price creditworthy businesses, detect fraudulent accounts, predict customer lifetime value and churn, optimize marketing spend, assist sales in identifying leads, and much more.

ML is responsible for the development of whole new business opportunities and domains such as FinTech[2], InsurTech[3], EdTech, and HealthTech[4]. Software, data and predictive technologies are at the core of how these companies set themselves apart from traditional Finance, Insurance, Education and Health sector companies.

Natural Evolution of Technology

The cloud computing and big data revolutions are helping organizations instrument, collect, store and process data reliably and at scale. Now that organizations are equipped to handle all that data, what’s next? Data is the lifeblood of businesses. It is the key to unlocking answers about how the company’s products are performing in the market, what customers think, what problems need to be solved, how to make operations more efficient, how to better manage vendors and suppliers, and so on. Enter data analytics and ML to help answer all these questions and more.

Innovate or die

If you are thinking that AI and ML are nice to have but not a necessity for businesses, think again. History is full of incumbents that ceased to be relevant because they failed to innovate, or failed to recognize significant technological trends early and adapt. Amazon is now the undisputed king of retail (over Walmart), having taken advantage of the internet for retail. Apple beat Nokia, once the world leader in mobile phones, because Nokia failed to see the smartphone revolution coming. Google was an early adopter of ML and algorithms (Google’s PageRank is arguably a form of ML), which led to all but the demise of Yahoo. Salesforce capitalized on SaaS and cloud, and Oracle is now struggling to hold its footing. Intel failed to see the mobile revolution and now the applicability of GPUs for deep learning, and Nvidia is laughing all the way to the bank (NVDA stock price is up 1000%+ in 2.5 years).

Companies have to adapt to the coming ML/AI revolution or they will no longer be relevant in tomorrow’s world.


So far, I hope I have convinced you that data science is an all-pervasive trend that is here to stay and that it is the natural continuation of software engineering. What about the threat of data science getting automated away?

Data Science Automation

Recognize this [5]?

It’s a snippet of code written in assembly language. There was a time when programmers actually had to program computers in assembly language. Maybe you are too young to know that this was even a thing. Gradually, with advances in compilers, programming languages became more and more programmer friendly. We went from Fortran to Basic to C to C++ to Java to Python and JavaScript and Go and Scala. Moreover, each of these high-level languages has a rich suite of libraries that covers a vast set of functionality. Has that made the programmer obsolete? Far from it. The democratization of programming has allowed ever more software engineers to thrive and build more and more amazing software products.

Today, data scientists are maybe at the ‘Fortran or C’ stage. Sure, there are some really good open source libraries in Python and R for the ML algorithms and common functionality [6], [7], [8] that make our jobs a hundred times easier. However, Data Science doesn’t begin and end with just the algorithms. There are many, many pieces to a data science pipeline. Getting at the different data sets and corralling them into a form suitable for prediction problems is one of our biggest and most time-consuming challenges today. As data sets get larger, managing the compute infrastructure necessary to train those algorithms is another big challenge. Many companies, such as H2O, Databricks and DataRobot, are working on data science platforms to simplify these tasks. That’s great news for us Data Scientists. We can now focus more time and energy on actually thinking about and solving the critical business problems. That’s where the true value of the data scientist lies.
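To make the data-corralling step concrete, here is a minimal sketch in plain Python. The two data sets, their field names and the build_feature_rows helper are all hypothetical, invented for illustration. Real pipelines usually lean on dedicated libraries and deal with far messier inputs.

```python
from collections import defaultdict

# Hypothetical raw inputs coming from two separate systems.
accounts = [
    {"account_id": 1, "industry": "retail"},
    {"account_id": 2, "industry": "restaurant"},
]
transactions = [
    {"account_id": 1, "amount": 250.0},
    {"account_id": 1, "amount": 100.0},
    {"account_id": 2, "amount": 75.0},
    {"account_id": 3, "amount": 999.0},  # no matching account, so it is dropped
]


def build_feature_rows(accounts, transactions):
    """Join the two data sets on account_id and aggregate the transactions
    into one flat feature row per account, ready for a prediction model."""
    amounts = defaultdict(list)
    for t in transactions:
        amounts[t["account_id"]].append(t["amount"])

    rows = []
    for acct in accounts:
        amts = amounts.get(acct["account_id"], [])
        rows.append({
            "account_id": acct["account_id"],
            "industry": acct["industry"],
            "n_transactions": len(amts),
            "total_amount": sum(amts),
            # Guard against accounts with no transactions at all.
            "avg_amount": sum(amts) / len(amts) if amts else 0.0,
        })
    return rows


rows = build_feature_rows(accounts, transactions)
for row in rows:
    print(row)
```

Even in this toy version, most of the code is about joining and reshaping data rather than about any algorithm, which is the point of the paragraph above.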

Business leaders used to rely mostly on experience and intuition to make key business decisions. Now, you (the data scientist) can assist business leaders by using the available data to guide, challenge or form completely new intuitions about how the business should react, pivot or reinforce previous decisions.


TL;DR – In conclusion, data science is just getting started, and the tsunami is yet to come! Get on the surfboard and get ready to ride that massive wave!

References

[1] https://courses.cs.washington.ed…

[2] https://www.huffingtonpost.com/e…

[3] https://www.vertafore.com/resour…

[4] https://www.techemergence.com/ma…

[5] x86assembly

[6] scikit-learn: machine learning in Python

[7] Keras Documentation

[8] TensorFlow