Three areas of progression in data science
The amount of data being created and made available today is extraordinary. According to a study by PwC, the accumulated digital universe of data has grown ten-fold in the last six years. To put that in perspective, there are now 40 times more bytes than there are stars in the entire observable universe. It therefore comes as no surprise that the number of specialist careers in data science is growing at an equally rapid rate. As well as the number of workers in this sector increasing, the jobs themselves have evolved. As the industry changes at breakneck speed, it is important to consider the developments happening now that will change the way the sector operates.
1. Now: Data Science demand outstrips supply
With this much data available, a skills shortage has arisen that prevents it from being used effectively. In 2019 the LinkedIn Workforce Report found a staggering 150,000 unfilled data scientist jobs in the US alone. A couple of changes can be expected to combat this problem.
Firstly, universities will provide more degrees specific to data science at both postgraduate and undergraduate level. This will be coupled with more modules focused on how data analysis can be applied in practice, which will encourage collaboration between data scientists and experts in other fields.
Currently, many people entering their first data science job do not come straight from university. They have either related experience or a PhD in a STEM subject. In the future, these jobs are increasingly likely to be filled by people coming straight from undergraduate degrees.
Companies will also look to upskill their current employees using training programs, centres of excellence, and online classes. This will boost the number of internal workers with data science skills, keeping the company in line with technological advancements.
Despite these measures to close the skills gap, it is still likely to continue to grow. A report from Indeed showed a 29% increase in demand for data scientists in the past year, and at this growth rate demand for data scientists will continue to outstrip supply.
2. Next: Role augmentation
Automation is another major change that has affected people in every industry. From self-checkouts in supermarkets to the trading floors of large investment banks, many jobs have been either altered or replaced by machines.
Roles in data science and machine learning are no different.
Automation can already be used to carry out simple but previously time-consuming tasks, such as replacing missing data, scaling and normalising, and identifying collinearity. Now, and in the future, automation will increasingly be used to produce complex machine learning models themselves.
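As a rough illustration of the simple tasks mentioned above, the following is a minimal sketch using only the Python standard library. The column names, the toy values, the mean-based fill and the 0.95 correlation threshold are all assumptions made for the example, not a description of any particular automation product.

```python
from math import sqrt
from statistics import mean, pstdev


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def standardise(values):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def collinear_pairs(columns, threshold=0.95):
    """Return pairs of column names whose absolute correlation meets the threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                pairs.append((a, b))
    return pairs


# Hypothetical toy data: two height columns carry the same information.
raw = {
    "height_cm": [150.0, 160.0, None, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [34.0, 21.0, 45.0, None, 29.0],
}

# Fill the gaps, then scale every column.
clean = {name: standardise(impute_mean(col)) for name, col in raw.items()}

print(collinear_pairs(clean))
```

In this toy data the two height columns are flagged as collinear, while age is not. Each step is mechanical, which is why tasks like these are natural candidates for automation.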
On the one hand, this will begin to open up a data science capability to organisations that don’t have dedicated data scientists today.
On the other hand, this will again begin to change the role of the data scientists who are in place today. Their focus can shift towards designing, creating and refining more complex, unique machine learning models and algorithms that are better suited to human, rather than machine, development. The importance of domain knowledge also cannot be overstated. Whilst no human can compete with the sheer amount of work a computer can complete, people will always be needed to give meaning to the data and to provide insight on the output of these models.
3. Later: Quantum computing
Quantum computing currently sounds like a concept you might only hear about in a sci-fi film. Although it is a change that will take longer to bear fruit than the aforementioned industry developments, at the speed at which research is progressing, it is quickly becoming a reality.
Essentially, quantum computers perform calculations based on the probability of an object’s state before it is measured, which means they have the potential to process substantially more data.
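To make the idea of calculating with probabilities a little more concrete, here is a minimal sketch in plain Python. It is a classical toy calculation, not a quantum program: it assumes a single quantum bit in an equal superposition, and the function names are hypothetical. The probability of each measurement outcome is the squared magnitude of its amplitude, and describing n quantum bits takes 2 to the power n amplitudes, which hints at where the extra capacity comes from.

```python
from math import sqrt


def measurement_probabilities(amplitudes):
    """Probability of each outcome is the squared magnitude of its amplitude."""
    return [abs(a) ** 2 for a in amplitudes]


def state_vector_size(n_qubits):
    """Number of amplitudes needed to describe n quantum bits."""
    return 2 ** n_qubits


# Assumed example state: an equal superposition of the outcomes 0 and 1.
equal_superposition = [1 / sqrt(2), 1 / sqrt(2)]
probs = measurement_probabilities(equal_superposition)

print(probs)                  # each outcome is equally likely
print(state_vector_size(50))  # amplitudes needed for 50 quantum bits
```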
Quantum computers are still a long way from consistently out-performing classical computers in all tasks, but progress has been made. In October 2019, Google reported that its quantum computer had solved a problem that it estimated would have taken 10,000 years using classical methods, achieving the fabled “Quantum Supremacy”.
There is a whole research area dedicated solely to Artificial Intelligence (AI) and quantum computing. There are structural similarities in the maths that sits behind both, which means that quantum computing will drive efficiencies in the way that some models are processed. This could truly open up the potential of machine learning.
Over time, and with the continued exponential growth in data, there’s also an interesting question around whether quantum computing will render the cloud obsolete. As the quantity and complexity of data increase, the cost of the cloud may come to outweigh the efficiencies it provides to organisations. With a cheap, readily available quantum machine, it may be more prudent to move back towards an on-premises solution, rather than distributing computing power and storage.
Whilst quantum computing shows so much promise, there are also several big hurdles to tackle before it becomes a reality. Currently the machines are complicated both to make and to use. The algorithms created thus far are extremely complex and require a mix of both classical and quantum techniques. Eventually a high-level language, comparable to C for classical computers, will be created to help reduce complexity, but that is still some way off. The machines themselves are also extremely delicate. In superconducting designs, some parts need to be kept at temperatures close to absolute zero, far colder than liquid nitrogen (−196 °C), and are so sensitive that they can even be disrupted by footsteps.
Clearly these problems are significant, but the scale of the challenge is not dissimilar to that faced by classical computers in the 1960s and 1970s. If these hurdles are overcome, it will permanently revolutionise how we are able to deal with data.
2020 and beyond
From the way in which we teach and learn data science to the technology that supports our data analysis, there are many changes that will disrupt the concept and practice of data science in the coming years. What is certain, though, is that the importance of organisations having a strong data science capability, and ideally one that is embedded across the firm, will only increase.