What is Data Science? Making Meaning From Big Data

Written by Lexi Pasi, Ph.D. | Dec 12, 2023 7:46:45 PM

Though data analysis may seem like a modern practice, as a species we’ve been analyzing data for hundreds, if not thousands, of years. As computing advanced, classical statistics gave way to Excel spreadsheets and databases. We went from testing narrow hypotheses on small datasets to digging through so-called “big data” to surface insights and trends. 

Today, when we talk about analytics, we often talk about “data science.” More than just a buzzword, data science is an important step in transforming the raw ingredients of massive datasets into fully-baked, usable insights. But what exactly is data science and how can organizations use it effectively? Let’s discuss. 

What is data science?

Put simply, data science is the practice of making meaning from massive amounts of data. 

What we call “data” is only partial information: a sample or snapshot that represents a piece of the full picture. In the early days of statistics, we worked with small amounts of data under tightly controlled parameters. But as technology advanced, data became easier to gather and store. The increase in volume brought us into a new era of “big data.”

Big data isn’t just high-volume. It also comes from many different sources, which can make it messy, chaotic, and imprecisely measured. While it can be very powerful, it’s important to remember that big data is a raw material: it requires processing with advanced tools and techniques before it can be put to use.

Data science emerged to help us use big data to answer questions, build technology, and create value. It combines classical statistical analysis methods in clever ways, executing them through high-powered computing machines.

Who works in data science?

Data science is a complex discipline, one where making mistakes can lead to incorrect conclusions or insights. The best data science teams bring together diverse subject matter experts who collaborate to build effective models that lead to meaningful outcomes. 

Data science teams typically consist of a broad range of experts: 

  • Mathematicians focus on understanding how data behaves by analyzing structures and patterns within the data.
  • Experimental scientists design experiments to help us learn about massive-scale data beyond what’s possible through mathematical analysis.
  • Computer scientists specialize in high-power computing, which helps us run models on hundreds of terabytes of data. 
  • Machine learning experts use powerful tools and techniques for analyzing large datasets.

Depending on the industry, other subject matter experts might help put data and insights in context. For example, Zartico’s data science team includes ecologists, who help us model the movement and interactions of people and places.

Did you know…?

Zartico’s Research & Data Science teams include eight members with advanced degrees in a variety of mathematical and scientific fields, including seven with Ph.D.s in mathematics, large-scale crowd dynamics, high-performance computing, or experimental physics, and a specialist in computational ecology.

This team spends hundreds of hours each week discovering, designing, and testing methodologies — such as normalization and hot spot filtering — that turn the raw data we receive into destination intelligence our partners can use.

What do data scientists consider when approaching their work?

Data science is, most certainly, a science, but there’s also an art to the practice. Because data can be misleading when there’s ambiguity around its context, data science teams must be cognizant of how they extract meaning from raw data points. To conduct accurate analyses, data scientists must:

1. Be careful when extrapolating data.

Data is partial information, and the process of extrapolating it out to create a full picture is not as straightforward as it sounds. To draw accurate conclusions, data scientists must work to understand what the data really is: where it came from and how it was generated. 

At Zartico, we work with geolocation data to answer the question “How are people moving?” Geolocation data doesn’t actually show how people are moving, though. It shows who was using their phone at a certain location, with certain privacy settings enabled, that fed into a certain portfolio of data. Our data science team then uses a variety of techniques to account for the issues that can arise when making assumptions based on partial information.

Getting it wrong can have serious consequences, and you can see examples in the news. The political polling industry has been rocked by errors related to extrapolation. Research has shown that women are likelier than men to suffer adverse effects from medication because they are underrepresented in clinical trials. It has even been suggested that the 1986 Challenger disaster resulted from incorrect assumptions based on limited data. 
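To make the extrapolation problem concrete, here is a minimal sketch of one classical correction, post-stratification weighting. It is an illustration only, not a description of Zartico’s methodology. The function name, the two age groups, the trip counts, and the population shares are all hypothetical, chosen to show how a sample that over-represents one group can skew a simple average.

```python
from collections import Counter


def poststratify_mean(sample, population_shares):
    """Estimate a population mean from a sample whose group mix is skewed.

    sample: list of (group, value) pairs.
    population_shares: dict mapping group -> share of the real population
        (assumed to be known from an outside source and to sum to 1).
    """
    totals = Counter()
    counts = Counter()
    for group, value in sample:
        totals[group] += value
        counts[group] += 1

    missing = set(population_shares) - set(counts)
    if missing:
        # No amount of weighting can recover a group we never observed.
        raise ValueError(f"no observations for groups: {sorted(missing)}")

    # Weight each group's own average by how common that group really is.
    return sum(
        share * totals[group] / counts[group]
        for group, share in population_shares.items()
    )


# Hypothetical data: trips per day, with younger people over-represented
# among the observed devices.
sample = [("under_40", 4), ("under_40", 6), ("under_40", 5), ("40_plus", 2)]
population_shares = {"under_40": 0.5, "40_plus": 0.5}

naive = sum(value for _, value in sample) / len(sample)   # 4.25
adjusted = poststratify_mean(sample, population_shares)   # 3.5
```

The naive average treats the sample as if it were the whole population. The adjusted estimate is only as good as the assumed population shares, which is exactly why it matters to understand where the data came from.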

2. Know that data is subjective and evolving.

People tend to think of data as being hard and fast and immutable — but the longer you work with data, the more you realize that it’s subject to interpretation. What’s more, the world is always evolving, which means that the data we use to describe it is changing, too. 

Questions that seem simple become complex when you start to dig in. Even determining whether an observation represents a resident or a visitor — a vital element of the destination insights we provide — requires our data scientists to agree on a meaningful definition of an individual’s home location. Seems straightforward — but what about students? What about digital nomads? Is someone’s mailing address even relevant to the questions we’re asking these days? 

Our team makes a decision so we can “mathematize” data points into our platform, but we also pay constant attention to how our world is changing and evolve our models accordingly.
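As a rough illustration of how much the definition matters, the sketch below labels the same person as a resident or a visitor depending on how “home” is defined. This is a hypothetical heuristic (the place seen most often during a chosen window of hours), not the definition Zartico uses. The function names, the hour window, and the sample observations are all assumptions made for the example.

```python
from collections import Counter


def infer_home(observations, home_hours=range(0, 6)):
    """Return the region seen most often during home_hours, or None.

    observations: list of (hour_of_day, region) pairs, with hour in 0..23.
    home_hours: the hours assumed to indicate "being at home".
    """
    seen = Counter(region for hour, region in observations if hour in home_hours)
    if not seen:
        return None
    return seen.most_common(1)[0][0]


def classify(observations, destination, home_hours=range(0, 6)):
    """Label a device as 'resident', 'visitor', or 'unknown' for a destination."""
    home = infer_home(observations, home_hours)
    if home is None:
        return "unknown"
    return "resident" if home == destination else "visitor"


# Hypothetical student: observed overnight mostly in a college town,
# but seen more often overall in their hometown.
student = [
    (1, "college_town"),
    (2, "college_town"),
    (3, "hometown"),
    (14, "hometown"),
    (15, "hometown"),
]

overnight_label = classify(student, "college_town")                      # "resident"
all_day_label = classify(student, "college_town", home_hours=range(24))  # "visitor"
```

Neither answer is wrong. Each follows from a choice about what “home” means, which is why that choice has to be made deliberately and revisited as behavior changes.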

3. Use off-the-shelf tools wisely.

Some data providers rely on “black-box” algorithms and machine learning tools that are purchased as-is and then applied to data sets. As the name implies, these tools are opaque: you put in your data and receive the results, but these algorithms don’t generally explain or reveal their calculations. 

These tools can be helpful, but only to a point. If you don’t know what’s going on behind the scenes, you don’t know if the tool is extrapolating data in a way that makes sense for your particular use case. And because they’re built and trained on historical data, they can’t adapt to account for changes in the data sets or the evolving nature of data over time.

Because big data is unpredictable, it’s critical to have human insight into how changes in the raw data will affect an algorithm’s output. Without a clear understanding of the methodologies you’re applying, it’s hard to trust the answers they provide. 

How Zartico uses data science to empower the world’s places

At Zartico, we do the data processing dirty work so you don’t have to. 

We look at the raw data and start with two key questions: What is the data? And what are we trying to do with it? This question-oriented approach ensures that we make the best choices about how we use data, so we can give our customers the signals they need to make data-led decisions. 

By taking a first-principles approach to data science, we don’t just throw data into prepackaged black-box tools. We are intentional about the techniques and technologies we use to ensure that we have confidence in the insights our products provide. And because both technology and data are always evolving, we know the work is never done. Staying up to date, and even developing methodologies that can adapt to future changes, is essential, which is why Zartico customers receive updates when adjustments are made.

Our team has crafted a deeper understanding of how data behaves, how people move and spend, and the forces that drive these patterns. We use this understanding to build mathematical tools that allow us to use partial information to paint full pictures of consumer behavior.

Zartico’s data science team transforms massive-scale data into tools that destination organizations and airports use to create stronger communities. To learn how data science can help you realize new possibilities for your community, book a demo.