Whether your company is digital or not, Data Analytics can help you optimize it through intelligent analysis of your data and the answers found in it.
Portuguese version can be found here.
What is Data Analytics?
It's something you may have heard or read good things about, including that it can help you improve and optimize, but what exactly is it?
Data analysis is a process of inspecting, cleaning, transforming, and modeling data to discover useful information, inform conclusions and support decision-making.
In short, let's look at the information we have, treat it, classify it and organize it so that later we can support decision-making at a tactical and/or strategic level.
In summary, the process a data analyst follows looks something like this:
- Define the question or problem we are trying to solve.
- Collect the raw data relevant to the problem.
- Treat this data (formatting, relating, removing things that we are not going to use, etc.).
- Analyze the data.
- Create ways to visualize this data (plot graphs, generate reports, spreadsheets, etc.).
- Share the result.
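To make these steps a bit more concrete, here is a minimal sketch in plain JavaScript. The question, the sales records and every variable name are made up for illustration; a real project would collect the data from a file or an API instead.

```javascript
// 1. Question (hypothetical): which category sold the most?
// 2. Collect: made-up raw records, as they might arrive from a CSV export.
const rawSales = [
  { category: 'books', amount: '12.50' },
  { category: 'Games', amount: '30' },
  { category: 'books ', amount: '7.50' },
  { category: 'games', amount: 'n/a' }, // unusable row
];

// 3. Treat: normalize category names, parse amounts, drop invalid rows.
const cleanSales = rawSales
  .map((row) => ({
    category: row.category.trim().toLowerCase(),
    amount: Number.parseFloat(row.amount),
  }))
  .filter((row) => Number.isFinite(row.amount));

// 4. Analyze: total amount per category.
const totals = {};
for (const { category, amount } of cleanSales) {
  totals[category] = (totals[category] ?? 0) + amount;
}

// 5. Visualize: a tiny text bar chart, one '#' per 5 units, largest first.
const chart = Object.entries(totals)
  .sort(([, a], [, b]) => b - a)
  .map(([category, total]) => {
    const bar = '#'.repeat(Math.round(total / 5));
    return `${category.padEnd(6)} ${bar} ${total}`;
  })
  .join('\n');

// 6. Share: print the chart (a report or spreadsheet would go here).
console.log(chart);
```

Each numbered comment maps to one item in the list, so the script can be read top to bottom as a tiny version of the whole process.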
As usual, there are several JS libraries that solve the same problem in different ways. Here we are going to look at just one: Tidy.js.
Tidy.js is a library inspired by a collection of R packages (the Tidyverse), with a very clean and simple syntax that prioritizes the readability of your code. The documentation is full of examples and explains each method in detail.
There's no secret to starting a project; if you've already started a Node.js project, you'll see that it's more of the same. Just create a package.json file, add the lib alongside our JS files, and we're done.
We need two commands in the terminal: one to start the project and one to install the lib. First, create a folder on your desktop and navigate to it in the terminal, then run npm init -y followed by npm install @tidyjs/tidy.
Now we can open the project in a code editor (VS Code for me).
Let's make a quick change to the package.json file so that it accepts the ES module syntax (no need to worry about this, it's just a detail of how Node.js loads modules, but if you want to know more, search for "Node.js CommonJS vs ES modules"). Just open the file and add the line "type": "module".
To start with the code, let's create a .js file in the root of our project. The name doesn't matter right now, but I'd suggest something like main.js. In this file we will put a small example from the Tidy documentation, just for testing:
Earlier we talked about the steps to analyze data; this example shows a very simple form of them.
We have the variable data holding a collection of items with properties a and b, each with a numerical value and in random order, which would be our raw data. Imagine that the problem here is to order this collection based on a third property (ab), which is the result of multiplying a by b.
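To see what those two steps amount to, here is a rough plain JavaScript equivalent that uses only built-in array methods. It is not how Tidy is implemented; the values, the variable names and the choice of descending order are assumptions made for illustration.

```javascript
// Items with numeric `a` and `b` (made-up values), in no particular order.
const points = [
  { a: 1, b: 10 },
  { a: 3, b: 12 },
  { a: 2, b: 10 },
];

// Step 1: derive `ab` for every item without touching the originals.
const withProduct = points.map((p) => ({ ...p, ab: p.a * p.b }));

// Step 2: order by `ab`, largest first. Copy first, because sort() mutates.
const ordered = [...withProduct].sort((x, y) => y.ab - x.ab);

console.log(ordered);
```

This works fine for two steps, but it gets harder to read as more transformations pile up, which is the gap a library like Tidy aims to fill.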
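To run the code, run node main.js in the terminal (use your own file name if you chose a different one). The output is the collection with the ab property added to each item, ordered by that property.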
So here we have our data analysis with JS, without any headaches, in a simple and fast way. But what if we try something closer to the real world, with more data?
For a more complex example, I'll use a dataset from Kaggle, a community for data scientists that, among other things, hosts datasets of all types and sizes for you to use in your study projects.
I will use a dataset containing information about cyclists: the City Bike Dataset.