Complete Guide to Visualizing Data: Chart Types and Tools

Let's talk about the world of data visualization. A few questions come up right away: What is visualization? What are visualizations for? Which chart types suit which kinds of data? Which tools let us build visualizations? Keep reading to find the answers.

To visualize is to transmit something to a receiver. On the emotional side, that could be love, happiness, anger or frustration; on the statistical side, a chart can show a rise or fall in the world economy; on the color side, red can signal warning or danger, while green can mean success. We receive this kind of information every day without asking how the result was achieved. Why is red associated with danger and green with success?

On the technological side, visualizations provide a graphical representation of data. Large datasets are taken and organized so that they give the receiver relevant information. For example, suppose we have a dataset from a school containing every student's grades in each subject. With visualization, this data can be loaded into a table and charted to answer questions such as: Which course has the worst average? In which subject do students get the best grades? You could also show the top 10 students with the best grades in the entire school. Taking a larger example, suppose we have a dataset covering all the schools in Chile and want to know which are the best schools in each region, or which are the 5 lowest-rated schools in the Los Lagos Region. All of this and much more can be achieved in just a few minutes thanks to data visualization.
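
Before any chart is drawn, the questions in the school example come down to grouping and ranking. The following is a minimal Python sketch of that step, assuming a tiny made-up dataset; the student names, subjects, grades and the helper average_by are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (student, subject, grade)
grades = [
    ("Ana", "Math", 6.5),
    ("Ana", "History", 5.8),
    ("Luis", "Math", 4.2),
    ("Luis", "History", 6.1),
    ("Sofia", "Math", 5.0),
    ("Sofia", "History", 6.9),
]

def average_by(records, key_index):
    """Group grades by the chosen column and return {key: mean grade}."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[2])
    return {key: round(mean(values), 2) for key, values in groups.items()}

subject_avg = average_by(grades, 1)   # average per subject
student_avg = average_by(grades, 0)   # average per student

# Subject with the worst average, and the (up to) 10 best students
worst_subject = min(subject_avg, key=subject_avg.get)
top_students = sorted(student_avg, key=student_avg.get, reverse=True)[:10]

print(worst_subject, top_students)
```

A visualization tool performs this kind of aggregation behind the scenes and then draws the result as a chart.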

Types of charts

A statistical chart is a visual representation of a series of data. There are many types of charts today, and choosing one depends on the kind of variable being shown. Qualitative variables describe qualities that cannot be expressed numerically; they can be ordinal (with a natural order) or nominal (purely categorical). Quantitative variables describe quantities or numerical values; they can be discrete, taking only integer values such as the number of students or the number of schools, or continuous, taking any value within a range, such as an average age or a grade point average. With this distinction clear, we can look at the most important charts used to create visualizations.

Bar chart

A bar chart is a graphical representation, on Cartesian axes, of the frequencies of a qualitative or discrete variable. It can be drawn vertically or horizontally. There are three types: the simple bar chart; the grouped bar chart, which places the bars for several fields side by side; and the stacked bar chart, where all the fields share a single stacked bar.

Histogram

A histogram is a bar chart used to represent the frequencies of a continuous quantitative variable. It is recognizable because its bars touch, with no gaps between them.
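
The bars of a histogram sit together because each one covers an interval of a continuous range. A minimal Python sketch of that binning follows, assuming equal-width intervals; the function histogram_counts and the sample ages are hypothetical.

```python
def histogram_counts(values, start, width, bins):
    """Count how many values fall into each of `bins` equal-width intervals.

    Interval i covers [start + i*width, start + (i+1)*width); the last
    interval also includes its upper edge. Values outside the range are skipped.
    """
    counts = [0] * bins
    for v in values:
        index = int((v - start) // width)
        if index == bins and v == start + bins * width:
            index = bins - 1  # place the upper edge in the last interval
        if 0 <= index < bins:
            counts[index] += 1
    return counts

ages = [12.5, 13.1, 13.8, 14.2, 14.9, 15.0, 15.3, 16.7, 17.0]  # made-up data
counts = histogram_counts(ages, start=12, width=1, bins=5)

# Text rendering: one row per interval, one '#' per observation
for i, count in enumerate(counts):
    print(f"{12 + i}-{13 + i}: {'#' * count}")
```

The choice of interval width is a design decision: wider intervals smooth the shape, narrower ones show more detail.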

 

Line chart

A line chart is a graphical representation, on Cartesian axes, of the relationship between two variables. It clearly reflects changes over a period, so it is usually used to show trends over time.

Pareto chart

A Pareto chart is a vertical bar chart whose bars are ordered by descending frequency, which makes it easy to identify and prioritize the categories that matter most.
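
The descending ordering described above (the layout usually called a Pareto chart) can be prepared in a few lines. This is a minimal Python sketch with a made-up list of reasons for returned books; the cumulative percentage column is an addition that such charts commonly display, not something stated in this article.

```python
from collections import Counter

# Hypothetical reasons given for returned books
reasons = ["late", "damaged", "late", "wrong item", "late",
           "damaged", "late", "other", "late", "damaged"]

counts = Counter(reasons)
total = sum(counts.values())

# most_common() yields categories by descending frequency
pareto = []
cumulative = 0
for category, count in counts.most_common():
    cumulative += count
    pareto.append((category, count, round(100 * cumulative / total, 1)))

for category, count, cumulative_pct in pareto:
    print(f"{category:<12}{count:>3}{cumulative_pct:>8}%")
```

Reading the rows from the top shows which few categories account for most of the total.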

  

Pie chart

Also known as a sector chart, it is a circular representation of the relative frequencies of a qualitative or discrete variable that allows them to be compared quickly and simply. As a rule of thumb, it should have at least two and no more than five slices.
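
Each slice of this chart is a relative frequency, that is, a share of the total. The sketch below computes those shares in Python and, as one possible way to respect the five-field limit, merges the smallest categories into an "Other" slice. The function pie_shares, the merging strategy and the genre figures are assumptions for illustration.

```python
def pie_shares(counts, max_slices=5):
    """Return [(label, percent)] with at most `max_slices` entries.

    When there are too many categories, the smallest ones are merged
    into a single 'Other' slice.
    """
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    if len(ordered) > max_slices:
        head, tail = ordered[:max_slices - 1], ordered[max_slices - 1:]
        ordered = head + [("Other", sum(value for _, value in tail))]
    total = sum(value for _, value in ordered)
    return [(label, round(100 * value / total, 1)) for label, value in ordered]

# Made-up units sold per genre
genre_sales = {"Adventure": 40, "Romance": 25, "Science fiction": 15,
               "Poetry": 8, "Biography": 7, "Fairytale": 5}

shares = pie_shares(genre_sales)
print(shares)
```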

           

Pictogram

A pictogram represents the frequencies of a qualitative or discrete variable using figures or symbols.

Scatter plot

A scatter plot shows the relationship between two variables as points on Cartesian axes.

Cartogram

A cartogram is a map that presents statistical data by region, either by printing the value on each area or by coloring the areas according to the data they represent.

Creating a visualization

To start a visualization, immerse yourself in the context and nature of the data you will be working with. This makes clear what can and cannot be done in the tool you are using. The point matters because it lets you align expectations with users and avoid errors in the final product. Studying the data beforehand takes about 70% of the total time of a visualization project. If the data contains errors, the entire process will be faulty.

As a team, we must come up with a proposal based on data visualization best practices. This will be our mockup, which gives the Product Owner and the users a visual reference for how elements are laid out on the screen and how the previously defined KPIs are represented in the different charts.

We recommend preparing at least two mockup proposals and iterating on them with the team receiving the work. This makes it easier to reach a final agreement, which then translates into the list of requirements that serves as the basis for the progress checklist.

Once the mockup has been defined, work begins on the storyboard, which depends on the groups of users and what they need from the dashboard. The same visualization can be used by both a general manager and an area manager, yet their interests and the granularity of the metrics they watch are completely different: the general manager expects to follow the metrics at a macro level, while the area manager tracks the evolution shown on the dashboard day by day. This simple fact means that the filters, elements and interactions within the dashboard will differ between them. A storyboard must therefore be created for each group of user profiles that will use the dashboard. This work is also reflected in the requirements document for the progress checklist.

There are different tools for creating visualizations. Amazon Web Services offers QuickSight, a fast, cloud-based, fully managed business intelligence service. QuickSight lets you create and publish interactive visualizations, and its pricing is pay-per-use: customers pay only for what they use. It connects to a wide range of data sources such as MySQL, PostgreSQL, MariaDB, Aurora, SQL Server and Redshift, among others, and it also supports Excel files, files stored on Amazon S3, GitHub and Twitter, among others.

Another tool for visualizations is Power BI, a set of business analytics tools that was born as a solution integrated with Office 365. It allows connection to hundreds of data sources, simplified data preparation and ad hoc analysis. Its main advantage is flexibility: it lets you extract information, organize it, transform it and combine data from multiple sources, and then create interactive, customizable visualizations. It is also multiplatform.

Finally, there is Tableau, a business intelligence platform that offers effective, secure and flexible end-to-end analysis and turns your data into information you can act on. Tableau was designed for the individual, but it scales to any company.

There are four ways to work in Tableau:

Tableau Desktop: Considered a “model of excellence” in visual analysis, this is where the analysis takes place. Thanks to its easy-to-use interface, Tableau Desktop revolutionized the business intelligence industry.

Tableau Online: The answer for self-service analysis in the cloud. There is no server to manage; it is secure and scalable, with no hardware to maintain.

Tableau Prep: A tool that gets your data ready for analysis and lets more people combine, clean and shape data quickly and confidently.

Tableau Server: A business analytics tool suited to corporate environments. It lets you share and manage data and information on-premises or in the public cloud.

Visualization Development

Imagine you own a bookstore in your city and it is turning a healthy profit, so much so that you are thinking of expanding the premises and stocking more books. But you don't know which books are the best sellers, which genre sells the most, or which books sell the least. So you decide to store the data in a dataset and connect it to a tool that lets you answer all your questions about your books.

The first thing a visualization tool does is identify whether each field is a measure or a dimension. Measures are quantitative values and are usually shown in green, while dimensions are qualitative values and are usually shown in blue. The application also offers a list of all the available charts, and the one to apply depends on the user's need. In this case, the bookstore owner wants to identify which books sell the most and which sell the least.

Having identified the types of data and the chart to use, the user's next task is to find out which functions the tool offers. One of the most used is the filtering system, which lets you filter by date (year, month or day), by gender (female or male), by geographical data (continent, country or city), and so on. You can also create new fields called “calculated fields”. These come from calculations the user needs for the visualization and do not modify the source dataset. For example, you can calculate the total or the average of two original fields, join two string fields into a single string field, or compute the difference between two dates.
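
To make the idea of calculated fields concrete, here is a minimal Python sketch that derives new fields from a row without touching the source. The field names (store_units, online_units, ordered, delivered), the sample values and the function with_calculated_fields are hypothetical; each BI tool has its own formula syntax for the same operations.

```python
from datetime import date

# Hypothetical source row; it is never modified
row = {
    "title": "Treasure Island",
    "author": "R. L. Stevenson",
    "store_units": 30,
    "online_units": 12,
    "ordered": date(2020, 3, 2),
    "delivered": date(2020, 3, 6),
}

def with_calculated_fields(source):
    """Return a copy of the row extended with derived fields."""
    total = source["store_units"] + source["online_units"]
    return {
        **source,
        "total_units": total,                                  # total of two fields
        "average_units": total / 2,                            # average of two fields
        "label": f'{source["title"]} ({source["author"]})',    # two strings joined
        "delivery_days": (source["delivered"] - source["ordered"]).days,  # date difference
    }

enriched = with_calculated_fields(row)
print(enriched["total_units"], enriched["average_units"],
      enriched["label"], enriched["delivery_days"])
```

Because the function returns a new dictionary, the original row stays exactly as it was loaded.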

There are also functions that return the maximum or minimum value in a set of data. For example, after calculating the total sales for each genre, they show that adventure novels report the most sales while fairytale books report the least. In addition, these tools can show a range of values based on a value entered by the user. If the user wants to see the sales greater than $35,000, he enters that value and the application returns every value above it, or, conversely, every value below it. You can also configure the application to display the top N items the user requires. With this kind of function, the bookstore owner can see a top 10 of the best-selling books and identify which genre each book belongs to.
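
The three operations in this paragraph (maximum and minimum per genre, a threshold filter and a top N ranking) can be sketched in plain Python as follows. The book titles, genres and sales figures are made up, and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (title, genre, sales in dollars)
books = [
    ("Treasure Island", "Adventure", 52000),
    ("Moby Dick", "Adventure", 38000),
    ("Cinderella", "Fairytale", 9000),
    ("Dracula", "Horror", 36000),
    ("Snow White", "Fairytale", 12000),
]

# Total sales per genre, then the genres with the maximum and minimum totals
genre_totals = defaultdict(int)
for _, genre, sales in books:
    genre_totals[genre] += sales
best_genre = max(genre_totals, key=genre_totals.get)
worst_genre = min(genre_totals, key=genre_totals.get)

def sales_above(rows, threshold):
    """Titles whose sales are strictly greater than the threshold."""
    return [title for title, _, sales in rows if sales > threshold]

def top_n(rows, n):
    """The n best-selling books as (title, genre) pairs."""
    ranked = sorted(rows, key=lambda row: row[2], reverse=True)
    return [(title, genre) for title, genre, _ in ranked[:n]]

print(best_genre, worst_genre)
print(sales_above(books, 35000))
print(top_n(books, 2))
```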

Suppose the dataset also contains information about customers. With this data we can determine which people buy books, see which gender (male or female) buys the most, create age ranges (child, adolescent, young adult, adult or older adult) and identify the age range with the highest purchase rate. This helps the bookstore owner decide which books to stock: if children account for few purchases, buying many children's books will most likely not be profitable.
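
Creating age ranges means mapping each customer's age to a labeled interval and then counting purchases per interval. A minimal Python sketch follows; the cut-off ages, the labels and the sample ages are assumptions chosen only for illustration.

```python
from bisect import bisect_right
from collections import Counter

# Assumed cut-offs: under 13, 13-17, 18-29, 30-59, 60 and over
AGE_LIMITS = [13, 18, 30, 60]
AGE_LABELS = ["child", "adolescent", "young adult", "adult", "older adult"]

def age_range(age):
    """Return the label of the interval that contains the given age."""
    return AGE_LABELS[bisect_right(AGE_LIMITS, age)]

# Hypothetical ages of customers, one entry per purchase
purchase_ages = [8, 15, 22, 25, 34, 41, 67, 29]

purchases_by_range = Counter(age_range(age) for age in purchase_ages)
busiest_range, busiest_count = purchases_by_range.most_common(1)[0]

print(dict(purchases_by_range))
print(busiest_range, busiest_count)
```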

These tools are not far behind on the statistical side. The vast majority have built-in statistical functions such as sum, average, difference between values, conversion to a percentage, difference between percentages, standard deviation, variance and many others. Tableau, for example, lets us analyze trend lines and shows their R-squared and p-values.
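
For readers curious about what sits behind a trend line, the sketch below fits a straight line by ordinary least squares and computes r squared in plain Python. It is an illustration of the statistic, not Tableau's implementation, and it does not compute the p value. The function linear_trend and the monthly figures are hypothetical.

```python
from statistics import mean

def linear_trend(xs, ys):
    """Least-squares line y = slope * x + intercept, plus its r squared."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)  # squared correlation coefficient
    return slope, intercept, r_squared

months = [1, 2, 3, 4, 5, 6]
units_sold = [20, 24, 23, 30, 31, 35]  # made-up monthly sales

slope, intercept, r_squared = linear_trend(months, units_sold)
print(f"slope={slope:.2f} intercept={intercept:.2f} r2={r_squared:.3f}")
```

An r squared close to 1 means the line explains most of the variation in the data; a value close to 0 means it explains very little.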

Visualizations make an impact on users thanks to how much they can interact with their data, and color is an essential element when delivering a product. Returning to the red and green example from the beginning, we could build a chart that separates book sales into three bands. Poor sales are assigned red; normal sales, which don't have much impact, are assigned yellow; and the sales that matter most to the bookstore owner, high sales, are assigned green. Every time the owner refreshes the chart, he will see which books fall in the green band, and those are the ones he will reorder from his supplier. For books in the yellow band, he will wait until the stock on display is sold. Books in the red band could be discounted so that they sell, or he could rearrange the store and place them at the front so that customers see them first, perhaps generating a future sale. Colors give the user a visual impact, but they should not be overused: it is advisable to use no more than 6 colors, or the user will lose focus on the information.
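
The three color bands amount to a simple threshold rule. Here is a minimal Python sketch, assuming hypothetical cut-offs of 20 and 100 units; in practice the owner would choose the thresholds.

```python
def sales_color(units_sold, low=20, high=100):
    """Map units sold to a traffic-light color.

    The thresholds are assumptions: below `low` is poor (red),
    from `low` up to but not including `high` is normal (yellow),
    and `high` or more is high (green).
    """
    if units_sold < low:
        return "red"
    if units_sold < high:
        return "yellow"
    return "green"

# Made-up units sold per title
units = {"Treasure Island": 140, "Dracula": 60, "Cinderella": 12}
colors = {title: sales_color(sold) for title, sold in units.items()}
print(colors)
```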

The last step is to create the dashboard, where the charts created in the previous steps are placed in a single view. The actions allowed on the dashboard depend on the application being used. Some applications let you create filters, for example a filter by year: if the user selects 2019, all the charts switch to the data for that year, and if the user changes to 2020, they show the data for that year, and so on. You can also choose which charts change and which don't, since sometimes we don't want every chart to change; when you create the action, you choose which chart or charts will follow the filter.
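
Conceptually, a dashboard filter action passes the filtered rows only to the charts that were told to follow it. The Python sketch below models that behavior with made-up data; the chart names, the rows and the function apply_year_filter are hypothetical and do not correspond to any particular tool's API.

```python
def apply_year_filter(charts, rows, year, targets):
    """Return {chart name: rows it displays}.

    Charts listed in `targets` follow the filter; the rest keep all rows.
    """
    filtered = [row for row in rows if row["year"] == year]
    return {name: (filtered if name in targets else rows) for name in charts}

rows = [
    {"year": 2019, "city": "Santiago", "units": 120},
    {"year": 2019, "city": "Viña del Mar", "units": 45},
    {"year": 2020, "city": "Santiago", "units": 150},
]
charts = ["sales_by_city", "top_books", "all_time_total"]

# Only two of the three charts follow the year filter
view = apply_year_filter(charts, rows, 2019,
                         targets={"sales_by_city", "top_books"})

for name, shown in view.items():
    print(name, len(shown))
```

Selecting 2020 instead would call the same function with a different year, which is what happens each time the user changes the filter.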

Suppose we have created a cartogram that shows all the cities in Chile. If the user wants to see how many books are sold in a particular city, such as Viña del Mar, we create an action on that chart and apply it to the other charts; then, when the user selects any city, all the charts update to show the data for that city. Depending on the application, some also let you add images and URLs to complement the information on the dashboard. With that, the dashboard creation process ends. The bookstore owner is now clear about the decisions to make for his business: he knows which types of books he is going to buy and which books he will probably put on clearance, and he also knows which cities in Chile generate the most income and in which cities it may be best to close.

Ready to transform your data into strategic decisions?

At Kranio, we help you implement data visualization solutions adapted to the needs of your business, allowing you to interpret information clearly and effectively. Contact us and discover how we can boost your data strategy.

Team Kranio

September 16, 2024