These are people who filter, manage and analyze data sets. They apply human learning to machine learning to create the kind of intelligence their organizations can act on. They understand infrastructure and data relationships, and eat statistical analysis with a side of production coding for breakfast.
Historically speaking, data science was mostly an academic discipline until about a decade ago, and we knew it as statistics until about 20 years ago. Every company uses data of some kind, even if it’s as basic as how much it earned and spent in a period. But those are rudimentary ledgers compared to the crazy genius-level data juggling required to be a data scientist. Data scientists take data from across functions and processes, and understand the story it tells. More importantly, they understand what that story means for the organization’s growth plans.
A straight line from early data science to now cuts through three corporate culture shifts:
First we measured, then we educated our workforce, and then we created the technologies they needed. All at once, it seemed, technology and human ability to measure and make sense of huge amounts of data (Big Data) converged, and corporate data science was born.
Most jobs in the agency, adtech and media sales fields involve some data manipulation, or at least an understanding of data terms. Whether it’s open rates, click-through, pageviews, bounce rates, abandoned shopping carts, sales opportunities won or lost, calls made per day…it’s almost impossible to get through a workday without being held accountable to some kind of numbers.
But that doesn’t mean business intelligence and data science are interchangeable disciplines or titles.
Business intelligence typically looks at that data and talks about what’s already happened. It’s primarily a reactive or responsive function. Data scientists take that data and use it to build models that predict what is likely to happen next. This requires advanced skills, tools that can manipulate staggering amounts of data, and sometimes multiple computers running in clusters or in parallel to provide enough processing power. The data sources differ, too. Business intelligence usually comes from simple internal data sources, whereas data science might pull data from dozens of internal and external sources.
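To make the difference concrete, here is a minimal Python sketch. The revenue figures are invented for illustration, and the function names (`describe`, `fit_trend`, `forecast`) are hypothetical. The first function is the business intelligence view, a summary of what already happened. The other two fit a straight-line trend and project it forward, which is about the simplest thing that could be called a predictive model. Real data science work would use far richer models and far more data.

```python
# Hypothetical monthly revenue (in thousands), invented for illustration.
monthly_revenue = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]


def describe(values):
    """Business-intelligence-style summary: what already happened."""
    return {
        "total": sum(values),
        "average": sum(values) / len(values),
        "best_month": values.index(max(values)) + 1,  # months count from 1
    }


def fit_trend(values):
    """Least-squares straight line through (month, value) points."""
    n = len(values)
    months = range(1, n + 1)
    mean_x = sum(months) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in months)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(values, months_ahead):
    """Data-science-style question: where is the trend heading?"""
    slope, intercept = fit_trend(values)
    return intercept + slope * (len(values) + months_ahead)


summary = describe(monthly_revenue)        # looks backward
next_month = forecast(monthly_revenue, 1)  # looks forward

print(summary)
print(round(next_month, 1))
```

The summary answers "how did we do?" while the forecast answers "what should we expect?" A straight line is only a stand-in here; it assumes the past trend simply continues.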
Visualizing the flow of data through a process or system helps managers see trouble spots early so they can take action. This is true whether your organization is a shipping warehouse or a data warehouse. This kind of visualization takes serious tooling: platforms like Tableau and Hadoop, SQL and NoSQL databases, programming languages like R, Python, Java and JavaScript, and various APIs. If it sounds techy and serious, it’s because data science is techy and serious.
Graphical representations of data built with these tools go far beyond simple Excel charts. Most published charts look back at what has already happened, but the same tools can visualize possible future scenarios; businesses just tend to keep their predictions private.
See examples of these visualizations at Tableau’s Public Gallery.
Beyond simply measuring what your company does, or has done, a data scientist fills a strategic role, guiding your company by spotting trends as they emerge. A data scientist will find bubbles before they burst, and help you understand social, geographic, technological, economic and other factors that can affect your business.
Baseball writer Bill James once wrote, “I find it remarkable that, in listing offences, the league will list first—meaning best—not the team which scored the most runs, but the team with the highest batting average. It should be obvious that the purpose of an offense is not to compile a high batting average.”
And that’s where many businesses live. They know they should be measuring something, but end up measuring the wrong things. Batting averages don’t convert into wins and not all corporate measures convert into revenues. If revenue is important to your business, you need to identify the outputs and activities that affect revenue generation.
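One rough way to check whether a measure tracks revenue is to compute its correlation with revenue. The sketch below uses invented weekly figures and hypothetical metric names (`demo_calls`, `page_views`); the Pearson correlation is one common choice of yardstick, not the only one. Correlation does not prove that a metric drives revenue, but a metric with no relationship to revenue at all is a good candidate for the "batting average" pile.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly figures, invented for illustration.
revenue = [10, 12, 14, 16, 18]
metrics = {
    "demo_calls": [5, 6, 7, 8, 9],            # rises in step with revenue
    "page_views": [900, 700, 800, 700, 900],  # no clear relationship
}

# Rank metrics by how strongly they move with revenue (in either direction).
ranked = sorted(
    metrics,
    key=lambda name: abs(pearson(metrics[name], revenue)),
    reverse=True,
)

for name in ranked:
    print(name, round(pearson(metrics[name], revenue), 2))
```

With these made-up numbers, `demo_calls` ranks first and `page_views` shows no linear relationship with revenue.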
Similarly, if brand awareness is a priority, you’ll need a way to measure that; if social impact is your game, think about how to measure that. Data science is how you predict outcomes before they happen.
If you’re thinking about hiring a data scientist, remember there’s a gulf between business intelligence and data science. The combination of academic learning and technical skills required to be a data scientist comes at a price: a data scientist hire is likely to cost six figures, and maybe as much as 50 percent more than a business intelligence analyst or a data analyst.
Now all you need to do is run the numbers to see whether you need a scientist or an analyst.