Hoping to land this year's 'hottest job'? Here's what you need to be a data scientist

Skills in Hadoop or Spark are 'a big plus,' one recruiter says

Now that "data scientist" has been named this year's hottest job, it's only natural to wonder if you've got what it takes to fill it.

With a median base salary of $116,840 and top rankings from recruiting site Glassdoor for job and career opportunities, the position currently suffers from a chronic shortage of candidates, despite universities' best efforts to launch new training programs.

It's a wide-open opportunity for people with the right stuff, in other words. Could that include you? It all depends, said Adam Flugel, a data-science recruiter with Burtch Works.

Among the base requirements for the position are deep statistical knowledge and the ability to work with predictive analytics, Flugel said in an interview this week. R and Python skills are commonly needed, as is evidence of general coding ability.

A traditional predictive-analytics professional might need only a fairly basic working knowledge of Python. In the data-science realm, by contrast, Flugel looks for someone who can build their own tools in Python without being limited to one or two libraries, for example, or who can tackle unstructured data like video, images and natural language.

There is some leeway in terms of experience with specific tools, Flugel pointed out. Java or C++ skills, for instance, can sometimes stand in for Python experience.

"What's really important is the ability to code and work with statistics and predictive analytics," he explained. "An employer can teach the syntax of Python on the job a whole lot more easily than the method and mind-set of coding."

Also important is experience with both relational databases and nonrelational big-data databases. SQL is usually a requirement; skills in Hadoop or Spark are "a big plus," Flugel said.

Another big area that's attractive to companies is experience with machine-learning algorithms, Flugel said.

SQL, Hadoop, Python, Java and R were the top 5 skills identified in a recent report from data-focused vendor CrowdFlower.

On the educational front, there's roughly a 50/50 split among data scientists between those with Ph.D.s and those with Master's degrees, Flugel said, while "a handful" have just Bachelor's degrees. Math, statistics and computer science are among the most common areas of specialization, but there's also a fair bit of representation from other disciplines with a quantitative edge, such as neuroscience and computational psychology or biology.

Flugel had some tips for those considering a career in data science. First, those with backgrounds in IT or computer science or coding generally shouldn't underestimate the importance of the quantitative and statistical skills required: "It takes significant schooling," he said.

There are numerous good "bootcamp" programs today focused on imparting specific skills, Flugel added. Competitions on data-science site Kaggle can be another valuable resource, he pointed out, and in some cases can even lead to job offers.

Still, data science is not a career for those without passion.

"It's a lifelong thing," Flugel said. "You have to stay current, and you have to keep learning on your own. The best data scientists I work with have their job during the day but then have pet projects at home experimenting with new techniques or tools."

It's not an easy route to a high salary, in other words: "It takes huge devotion and legitimate interest," he said.
