Chapter One Part I: Getting Started with Data
It can be confusing and overwhelming to know where to begin your data investigation journey. That leads to the common question: where do I start my analysis?
The answer is that there is no right or wrong way to begin your analysis, and that is the beauty of data exploration. The whole goal is to explore your data and understand it well enough to create an accurate, valid and reliable analysis that can be visualised in impactful ways and help users make decisions.
Firstly, you need to understand the context and background of your data:
What is the main aim/purpose of the business/not-for-profit organisation?
What is the data showing in relation to your organisation’s aim?
What kind of data is this?
What was the main reason this data was collected?
Now that you have answers to the questions above, use them to formulate the specific question you need to ask of your data. Beginning with specific questions acts as a guide that keeps your analysis focused, and it will help you create a strong, focused dashboard that tells the right story.
Data visualisation is a way for you to tell a story with data to help your end-user. If you start off with vague questions you can still explore the data, but you will end up with a vague story that doesn’t tell your user much. The result is a weak data visualisation that doesn’t lead to a useful outcome.
A specific question, by contrast, leads to a powerful story, and this can be turned into a data visualisation that has structure and impact. It is therefore important to form a clear, concise and specific hypothesis, as this will guide your analytical journey.
What do you do if you’re working with big data?
Big data is a term that we all hear frequently, but what does it refer to? For this first section, we are going to keep things simple. Big data refers to a massive volume of data, large enough that it can be difficult to process and analyse.
It can be overwhelming if you’re analysing big data for the first time, and even more nerve-wracking if this is also the first time you are analysing your data.
As a first step, take a subset of your data. This will make it easier to explore and familiarise yourself with the data you are working with, and easier to understand the relationships between the different variables within your dataset.
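One possible way to take such a subset is sketched below in Python. It uses reservoir sampling, a general technique chosen here for illustration rather than anything this chapter prescribes, because it picks a random sample in a single pass without loading the whole dataset into memory. The function name reservoir_sample and the stand-in dataset are assumptions made for the example.

```python
import random

def reservoir_sample(rows, k, seed=None):
    """Return up to k rows chosen uniformly at random from any iterable."""
    rng = random.Random(seed)
    sample = []
    for index, row in enumerate(rows):
        if index < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Keep each later row with probability k / (index + 1).
            slot = rng.randint(0, index)
            if slot < k:
                sample[slot] = row
    return sample

# Hypothetical stand-in for a large dataset: 100,000 row numbers.
big_dataset = range(100_000)
subset = reservoir_sample(big_dataset, k=1000, seed=42)
print(len(subset))
```

For a real file you could pass an iterator of rows, for example csv.DictReader(open_file), in place of big_dataset. Fixing the seed makes the subset repeatable, which helps when you want to compare notes with a colleague.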
Find the patterns within your data
The easiest way to first understand your data is to look for the trends that exist within your dataset. This can reveal interesting patterns of behaviour, which can lead to more informed decision-making. The first place to start is to play around with your time variables: start plotting your data at a daily, weekly, monthly, quarterly and yearly level, whatever makes sense for your data. What are you starting to see?
Do you have missing dates? These can be just as revealing, because they tell you something about your data quality. Maybe you’ll want to highlight this and investigate it further.
Is there a seasonal trend?
Are there outliers in your data? If yes, did a specific event trigger them?
How are things changing over time? Consider taking a look at the percentage difference next.
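A few of the checks above can be sketched in a handful of lines of Python. This is only an illustration under stated assumptions: the daily sales records are invented, and the helper names monthly_totals, missing_dates and percentage_change are hypothetical. In practice you would probably do the same thing in your dashboard tool or a data library.

```python
from collections import defaultdict
from datetime import date, timedelta

# Invented daily sales records: (date, amount). 2 February is absent.
records = [
    (date(2023, 1, 30), 100.0),
    (date(2023, 1, 31), 120.0),
    (date(2023, 2, 1), 90.0),
    (date(2023, 2, 3), 150.0),
    (date(2023, 2, 4), 35.0),
]

def monthly_totals(records):
    """Sum amounts per (year, month), in chronological order."""
    totals = defaultdict(float)
    for day, amount in records:
        totals[(day.year, day.month)] += amount
    return dict(sorted(totals.items()))

def missing_dates(records):
    """List calendar days between the first and last record with no data."""
    days = {day for day, _ in records}
    start, end = min(days), max(days)
    span = (end - start).days + 1
    return [start + timedelta(days=n) for n in range(span)
            if start + timedelta(days=n) not in days]

def percentage_change(totals):
    """Percentage difference of each period against the one before it."""
    keys = list(totals)
    changes = {}
    for previous, current in zip(keys, keys[1:]):
        if totals[previous] != 0:  # skip periods with nothing to compare to
            changes[current] = (
                (totals[current] - totals[previous]) / totals[previous] * 100
            )
    return changes

totals = monthly_totals(records)
print(totals)
print(missing_dates(records))
print(percentage_change(totals))
```

Changing the grouping key in monthly_totals, for instance to the year alone or to day.isocalendar()[:2] for ISO weeks, gives the other time levels mentioned above.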
Now that you’re exploring your data, more questions will arise. Start noting down some of those questions, keep them specific and evaluate whether they are worth exploring.
The more you explore your data, the more the question “why?” will pop up in your head. This is because data can tell you what is happening and when it is happening; however, it cannot always tell you why it is happening. The “why” is often the most important part of the story that people want to understand, so how can you discover it with data?
One way to discover why things are happening is to put together the background story. For example, suppose you discover a specific month where sales underperformed more than usual; it looks like an outlier and doesn’t fit the normal seasonal trend you have found. The first thing you can do is talk to your team and your manager to find out whether a specific event happened during that period, and whether the data is accurate for that time period.
Data quality in any organisation will never be 100%, and your data won’t have all the answers you want, but it can act as guidance and offer you more information than you had to begin with. This is why it’s important to look critically at your data in these key ways:
Which questions is your data not able to answer?
How is your data being collected?
How error-prone is your data?
Is your data sample biased?
How representative is your data?
What is the level of uncertainty within your data?
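Some of these questions need a conversation with the people who collect the data, but a rough first look at how error-prone a dataset is can be automated. The sketch below is a minimal example, assuming rows are held as dictionaries; the field names, the sample rows and the helper profile_quality are all invented for illustration. It reports the share of missing values per field and the number of exact duplicate rows.

```python
def profile_quality(rows, fields):
    """Report missing-value share per field and count exact duplicate rows."""
    total = len(rows)
    missing_share = {}
    for field in fields:
        # Treat None and empty strings as missing values.
        missing = sum(1 for row in rows if row.get(field) in (None, ""))
        missing_share[field] = missing / total if total else 0.0

    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(row.get(field) for field in fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)

    return {"rows": total, "missing_share": missing_share,
            "duplicate_rows": duplicates}

# Invented sample rows; the field names are hypothetical.
rows = [
    {"order_id": "A1", "region": "North", "amount": "120"},
    {"order_id": "A2", "region": "", "amount": "80"},
    {"order_id": "A2", "region": "", "amount": "80"},
    {"order_id": "A3", "region": "South", "amount": None},
]
report = profile_quality(rows, ["order_id", "region", "amount"])
print(report)
```

A high missing share or many duplicates does not tell you why the problem exists, but it does tell you where to start asking.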
All of these are key questions to ask of your data. Data sits in a lifecycle that continues to grow and develop, which is why working with data is a journey.
A good-quality data visualisation is a dashboard that improves the quality of insight and the ability to make decisions. This depends on how effective the dashboard is at communicating the information garnered from the data.