Data Science: The Tools, Techniques, and Topics
by Kevin Croxall, Ph.D.
Lesson 1: What is Data Science?
Welcome to your correspondence course on Data Science. While we did not promise
college credits, we did promise seven short lessons to bring you up to speed on this powerful
discipline. Read on to ingest your first lesson.
Businesses (and job seekers, and training programs) want a simple, straightforward answer
to the question “What is Data Science?” In reality, the answer is nuanced and somewhat
esoteric. I would posit that this is because defining the field of data science means combining
skills, insight, and problem solving in a business context. The exact mixture of these depends
on the individuals involved and the questions that must be answered or, more to the point, the
questions that must be asked. Indeed, there are many ways to define data science as a domain,
likely as many as there are people to whom the question is posed. In the less-than-immortal
words of Wikipedia:
Data science is an interdisciplinary field that uses scientific methods, processes,
algorithms and systems to extract knowledge and insights from data in various forms, both
structured and unstructured, similar to data mining.
Data science is a “concept to unify statistics, data analysis, machine learning and their
related methods” in order to “understand and analyze actual phenomena” with data. It
employs techniques and theories drawn from many fields within the context of
mathematics, statistics, information science, and computer science.
As someone trained as a modern scientist, I could say that the above words describe, in a loose
fashion, the work I did before transitioning into the field of data science. Indeed, as an
astrophysicist I had diverse flavors of data from which I extracted knowledge and insights that
could only be quantified and justified by statistical data analysis and comparison to theory.
Furthermore, ALL science uses data. It is only in the realms of pseudoscience, and perhaps string
theory (physics burn), where data is replaced by esoteric dreams based on vapid hopes.
Thus, we could say that data science is merely the application of the scientific method to
questions that do not fit into the academic categories of traditional science, such as business
applications. Indeed, many professionals interchange the terms data science, business
analytics, business intelligence, predictive modeling, and statistics without a thought to the
differences between those practices. Sometimes, scoundrels will even rebrand these earlier
approaches and solutions as “data science” to make them more attractive.
So how do you define data science as a leader in your company? Focus on the skills needed
for the successful application of the scientific method: critical thinking, problem solving,
the ability to design an experiment, and the ability to carry out an investigation that is
unbiased in its approach. Simply put, data science is problem solving in a digital environment.
So how do you succeed at data science? Well, first you hire a data scientist. To know what to
look for in one, be sure to read Lesson 2: What makes a Data Scientist?, which will
help you hunt down these rare creatures.
About Kevin Croxall:
Dr. Kevin Croxall is currently the Director of Data Science at Expeed Software. After a
decade as a research astrophysicist, he transitioned to life as a Data Scientist.
He has since worked on numerous projects as a team member and leader to
solve complex quandaries in the government and commercial sectors. From training
machine learning algorithms and dissecting data to designing systems to interpret data
and telling the forgotten tales of the data, he has done it all and is ready to find more
data he can tame.
For more information on Expeed Software, please visit: [Link]
Copyright ©2019 Expeed Software, LLC. All Rights Reserved. DS012919