Data analytics is the process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information.
I have written about these topics from a 30,000-foot view in another freeCodeCamp piece, and now I want to tackle data analytics from a different perspective. Specifically, I want to help you answer two questions:
- What is data analytics?
- How much do you know about data analytics already? (hint: a lot more than you think)
To answer these questions we need to first articulate the building blocks of this discipline.
Mastery of data analytics rests on three interrelated perspectives that serve as the foundations of this science: inferential thinking, computational thinking, and critical engagement with questions of real-world relevance.
Let me define each:
- Inferential thinking: the ability or skill to interpret, combine ideas, and draw a series of conclusions from certain data.
- Computational thinking: a set of problem-solving methods that involve expressing problems and their solutions in ways that a computer could also execute.
- Critical engagement: the practice of making data-led judgements (arguments, policies, decisions, and so on).
Now that you know the foundations of data analytics, let me give you a couple of reasons why you should be enthusiastic about studying it.
There's a low barrier to entry
First of all, learning data analytics requires practice, patience, and application. But unlike many academic fields, the barrier to entry to start learning and getting your hands dirty with data is very low.
Because data is all around us, and we are constantly producing more of it, you can start learning these topics for free, at home, and with little formal academic guidance.
You can measure your steps, or your weight, or the amount of time you spend reading. Each of these activities produces data that you can measure and test.
Data gives you insights into human behavior
Second, data helps you separate signal from noise. This means that data can help you see new truths, understand topics at a deeper level, and explain the “why” behind human behavior.
Nearly all life decisions require data inputs and analysis. Seemingly non-mathematical concepts – like weighing the pros and cons of where to work, or how to invest your time – are in fact data analytics problems. More on this later.
You Already Know the Fundamentals of Data Science
If these two arguments have not won you over, here is a third: you are already good at data analytics.
Yes, you read that correctly.
You might not be great at data analytics in the academic sense, but you very much understand the foundations already.
Don’t believe me? Quickly answer these questions:
- Do you cross a busy street without looking?
- Do you think learning is important?
- Are you taller than the average person in your home city?
These questions may seem unrelated, but in reality data, and an understanding of data analytics, underpin each answer.
At their core, these types of questions weigh future values and assign probabilities to unknown outcomes. Without data analytics, we would be hamstrung in our decision-making process.
And the answers are obvious! Of course you check traffic before crossing a street, because you know that the risk of being hit by a car, although low, carries a cost far greater than the cost of turning your head to check.
This cost-benefit analysis weighs return against risk over a specific time period. The risk-return ratio is a core data analytics concept.
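The street-crossing reasoning can be written down as a tiny expected-cost comparison. The sketch below is only an illustration: every number in it (the chance of an accident, the cost assigned to it, the cost of looking) is a made-up assumption, and it assumes for simplicity that looking removes the risk entirely.

```python
# All values below are hypothetical, chosen only to illustrate the comparison.
P_HIT_WITHOUT_LOOKING = 0.001   # assumed chance of being hit if you do not look
COST_OF_BEING_HIT = 1_000_000   # assumed cost of an accident, in arbitrary units
COST_OF_LOOKING = 1             # assumed cost of turning your head, same units


def expected_cost(probability: float, cost: float) -> float:
    """Expected cost = probability of the event times the cost if it happens."""
    return probability * cost


# Simplifying assumption: looking first removes the risk entirely,
# so the only cost of looking is the effort of turning your head.
risk_of_not_looking = expected_cost(P_HIT_WITHOUT_LOOKING, COST_OF_BEING_HIT)
should_look = risk_of_not_looking > COST_OF_LOOKING

print(f"Expected cost of not looking: {risk_of_not_looking:.0f}")
print(f"Cost of looking: {COST_OF_LOOKING}")
print("Look before crossing" if should_look else "Do not bother looking")
```

Even with a very small probability of an accident, the expected cost of not looking dwarfs the cost of looking, which is the intuition you apply without doing any arithmetic.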
Let’s now look at the second question. Why should one care about learning? You care about learning because you believe that the value of the knowledge you are acquiring will provide use to you in a future state.
You are assigning probabilities to these future states. You don’t know when or by how much the learning will help you, but you believe that the future value of this learning is larger than the present value of not learning it. In other words, you have a hypothesis and are testing it.
Let’s review the third question.
Height is a continuous variable because, between the world’s shortest and tallest person, the number of possible heights is theoretically infinite.
So are you taller than the average person where you come from? In order to answer this you need to know roughly the average height of people in your city and your height in comparison.
Unless your home city is very small, you will need to take a sample. Perhaps you will think of your family, friends, and high school classmates and make an inference about the whole city from that sample (but beware of sampling bias!). Maybe you intuitively know where you stand on the height spectrum and will answer from there.
Everything you just did in your head is a core data analytics building block: understanding a population, making inferences from a sample, and comparing averages.
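Here is a minimal Python sketch of that height comparison. The city, its height distribution, the sample size, and the reader's height are all invented for illustration. Note that the sketch draws a truly random sample, which your circle of family and friends is not, so real-life estimates are more exposed to sampling bias.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical city: 100,000 residents with heights (cm) drawn from a
# normal distribution. Real height data would not be this tidy.
population = [random.gauss(170, 10) for _ in range(100_000)]

# A random sample of 200 residents stands in for "the people you know".
sample = random.sample(population, 200)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

my_height = 175  # hypothetical height of the reader, in cm

print(f"Population mean: {population_mean:.1f} cm")
print(f"Sample mean:     {sample_mean:.1f} cm")
print("Taller than the sample average:", my_height > sample_mean)
```

The sample mean will rarely match the population mean exactly, but with a random sample of this size it usually lands close enough to answer the question.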
Data Science in Practice
The important point is that, without your realizing it, these three questions drew on the foundations of applied data analytics.
You touched, at least informally, on distributions and random sampling, properties of several statistics (median, mean, maximum, variation), hypothesis testing, estimating models and making predictions, correlation, regression, and classification.
How did it feel? Hopefully you found the questions fun and light-hearted.
Any class in data analytics will start by helping students gain a solid understanding of classical statistical concepts: probability theory (for example, the complement rule, the multiplication rule, and permutations) as well as distributions of data (categorical and numerical) and of probabilities.
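As a quick taste of those probability rules, the following Python sketch applies the complement rule, the multiplication rule, and a permutation count. The fair-die and five-item examples are my own illustrations rather than anything from a specific course.

```python
import math
from fractions import Fraction

# Complement rule: P(not A) = 1 - P(A).
p_six = Fraction(1, 6)             # chance of rolling a six on a fair die
p_not_six = 1 - p_six              # 5/6

# Multiplication rule for independent events: P(A and B) = P(A) * P(B).
p_two_sixes = p_six * p_six        # 1/36

# Combining both rules: chance of at least one six in two rolls.
p_at_least_one_six = 1 - p_not_six * p_not_six   # 11/36

# Permutations: ordered arrangements of 3 items chosen from 5.
ordered_arrangements = math.perm(5, 3)           # 60

print(p_not_six, p_two_sixes, p_at_least_one_six, ordered_arrangements)
```

Using `Fraction` keeps the results exact, which makes it easier to check them by hand.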
Moreover, a student of data analytics will learn the law of averages, sampling variability, permutation testing, and Bayes’s Rule, which describes the probability of an event based on prior knowledge of conditions that might be related to the event.
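Bayes’s Rule is easier to grasp with numbers. The sketch below returns to the street-crossing example: how likely is it that a car is approaching, given that you hear an engine? All three input probabilities are hypothetical, and the function name and parameters are mine, chosen for illustration.

```python
def bayes_posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """Return P(H | E) given P(H), P(E | H), and P(E | not H)."""
    # Total probability of seeing the evidence, with or without H being true.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers:
#   prior               = chance a car is approaching at any given moment
#   likelihood          = chance you hear an engine when a car is approaching
#   false_positive_rate = chance you hear an engine when no car is approaching
posterior = bayes_posterior(prior=0.10, likelihood=0.90, false_positive_rate=0.20)

print(f"P(car approaching | engine noise) = {posterior:.3f}")
```

Under these assumed numbers, hearing an engine raises the probability of an approaching car from 10% to about 33%: the evidence updates your prior belief, but other sources of engine noise keep it from being conclusive.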
Even if you don’t know all of these topics yet, your intuition is a strong guide that can help you tackle and then master this content.
Your everyday knowledge – like why you value learning and how to safely cross a street – can inform your deeper engagement in these topics.
Data Analytics and the Real World
Data is everywhere. Builders, designers, governments, engineers, and companies are accelerating their capture and analysis of data.
You should be too.
Here is a diverse array of organizations doing interesting work with data to shape how consumers and users interact with their products.
- RaleighDigital uses categorical population datasets to inform their clients about Search Engine Optimization. If you care about how Google prioritizes websites, you need to understand PageRank, an algorithm that scores web pages based on the quantity and quality of the links pointing to them.
- Carlypso uses sampling data and the law of averages to recommend products. They comb through hundreds of examples, find averages, and provide guidance based on ranges.
- Ever wondered how water is filtered and cleaned? Pool CleanerIO relies on water sampling reports to recommend products.
- Not all of us can hit a golf ball like a Professional Golfers' Association (PGA) tour player. But do you think that it is better to make two short putts or one longer one? The PGA is now using putting data to help players improve their decision making when close to the hole. A number of businesses are following suit, like GolfingInformer.com, which leverages permutation testing to advise users on their golf swings.
- William Pitt, a real estate company, uses dozens of nontraditional variables to recommend homes. These variables include the number of permits issued to build swimming pools, the change in the number of coffee shops within a one-mile (1.6 km) radius, and building energy consumption relative to other structures in the same zip code. By taking these nontraditional variables into consideration, Pitt can be more prescriptive and better help people identify neighborhoods and homes they want to evaluate for purchase.
- Musicians can now learn how many people listen to different genres of music, when they listen to a particular song, and how long they listen to each track. This gives musicians real-time data that can shape decisions about how, and to whom, songs are marketed, based on the preferences of listeners.
This list is eclectic, ranging from technology to athletics to real estate to the musical arts, because I want you to be inspired by the breadth of applications and services that rely on data to improve user experiences.
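To give one of these examples some substance, here is a simplified sketch of the PageRank idea mentioned in the first item of the list. It is a textbook-style power iteration over a tiny invented link graph, not Google's production system; the page names, the damping factor of 0.85, and the iteration count are all assumptions for illustration.

```python
def pagerank(
    links: dict[str, list[str]],
    damping: float = 0.85,
    iterations: int = 50,
) -> dict[str, float]:
    """Simplified PageRank by power iteration over a small link graph."""
    pages = list(links)
    n = len(pages)
    ranks = {page: 1 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page gets a baseline share from the "random jump" term.
        new_ranks = {page: (1 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its current score evenly among its links.
                share = damping * ranks[page] / len(outgoing)
                for target in outgoing:
                    new_ranks[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for target in pages:
                    new_ranks[target] += damping * ranks[page] / n
        ranks = new_ranks
    return ranks


# Hypothetical four-page web: each key links to the pages in its list.
toy_web = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}

for page, score in sorted(pagerank(toy_web).items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph, "home" ends up with the highest score because every other page links to it, while "orphan" scores lowest because nothing links to it. That is the "quantity and quality of links" intuition in miniature.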
Bringing It All Together
Data analytics is a topic that you are already well on your way to understanding.
Yes, you will need to learn the specific vocabulary of the field and practice on data sets. But by thinking intuitively about risk, return, numbers, and modeling data, you have already made a strong start.
If you want to build hardware, software, innovative new products to help pets reduce anxiety, a web design agency, the next search engine, your own company, governmental departments, or nonprofits, you will benefit from being comfortable with data.
Data analytics helps you discover useful information. Information can help you make good decisions, avoid pitfalls, and maximize what you do. If for no other reason, this makes data analytics worthy of your time.