Scimago Graphica


Data formats and arrangement

Graphica currently works with file-based data; future updates will add more data sources.

File-based data sources include:

CSV: Delimited text file. When opening the file, you can specify the separator character and whether the first row contains the names of the variables or dimensions.

XLSX: Microsoft Excel file. When opening it, select the sheet with the data to be loaded.

GraphML: A file format for graphs based on XML.

GML: Graph Modeling Language.

GEXF: Graph Exchange XML Format (partial support). This is the format used by the Gephi tool.

CSV Graph: A format specific to SCImago Graphica in which a graph is defined in a plain text file with two parts, one for the nodes and the other for the edges. The file must use the .csv extension.

Edge list: Text file in which a network is represented as a list of edges. The file must use the .edges extension.
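To illustrate what the two CSV options listed above (separator character and header row) control, here is a minimal Python sketch. It is not how Graphica itself loads files; the function name `read_delimited`, the fallback column names, and the sample text are assumptions made for this example.

```python
import csv
import io

# Hypothetical sample: semicolon-separated, first row holds the variable names.
SAMPLE = "Genre;Year;Sales\nRomance;2010;24\nLiterature;2010;5\n"


def read_delimited(text, separator=",", has_header=True):
    """Parse delimited text into (column_names, rows)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))
    if not rows:
        return [], []
    if has_header:
        # The first row names the variables; the rest are observations.
        return rows[0], rows[1:]
    # Without a header row, fall back to generic names: col1, col2, ...
    names = [f"col{i + 1}" for i in range(len(rows[0]))]
    return names, rows


names, rows = read_delimited(SAMPLE, separator=";", has_header=True)
print(names)
print(rows)
```

Choosing the wrong separator would leave each line as a single unsplit cell, which is why the separator is worth checking when a file opens with only one column.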

Data shape

When working with tools such as Excel, it is very common to work with table formats such as the following:

| Genre/Year | 2010 | 2020 | 2030 |
| --- | --- | --- | --- |
| Fantasy & Sci Fi | 10 | 16 | 28 |
| Literature | 5 | 9 | 13 |
| Mystery/Crime | 20 | 23 | 29 |
| Romance | 24 | 22 | 19 |

However, to work with tools such as Graphica (and other advanced data visualization tools), the data must use an organization called Tidy Data. In this way of organizing data, each variable must have its own column, each row represents an observation, and each cell must contain a single value.

| Genre | Year | Sales |
| --- | --- | --- |
| Fantasy & Sci Fi | 2010 | 10 |
| Literature | 2010 | 5 |
| Mystery/Crime | 2010 | 20 |
| Romance | 2010 | 24 |
| Fantasy & Sci Fi | 2020 | 16 |
| Literature | 2020 | 9 |
| Mystery/Crime | 2020 | 23 |
| Romance | 2020 | 22 |
| Fantasy & Sci Fi | 2030 | 28 |
| Literature | 2030 | 13 |
| Mystery/Crime | 2030 | 29 |
| Romance | 2030 | 19 |

Organizing the data this way allows Graphica to automatically detect which variables the dataset contains, along with their type and scope.
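To see why one variable per column makes this kind of detection straightforward, consider the rough Python sketch below. It is only an illustration and does not reproduce Graphica's actual detection logic; the function `infer_column_types` and the two labels `number` and `text` are assumptions made for this example.

```python
def infer_column_types(header, rows):
    """Guess 'number' or 'text' for each column of a tidy table."""

    def is_number(value):
        try:
            float(value)
            return True
        except ValueError:
            return False

    types = {}
    for index, name in enumerate(header):
        # In tidy data every cell of a column belongs to the same variable,
        # so a single pass over the column is enough to classify it.
        values = [row[index] for row in rows]
        types[name] = "number" if all(is_number(v) for v in values) else "text"
    return types


header = ["Genre", "Year", "Sales"]
rows = [["Romance", "2010", "24"], ["Literature", "2010", "5"]]
detected = infer_column_types(header, rows)
print(detected)
```

With the wide layout, the year values would sit in the header row rather than in a column, so a column-by-column pass like this could not recognize Year as a variable at all.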

Tools such as OpenRefine can be used to prepare and reshape data into this form.
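As an alternative to a dedicated tool, the reshaping itself can be sketched in a few lines of Python. The example below converts the wide Genre/Year table from this section into the tidy Genre, Year, Sales layout. The function name `wide_to_tidy` and its parameters are assumptions made for this illustration, not part of Graphica.

```python
# Wide table from the Data shape example: one column per year.
WIDE = [
    ["Genre/Year", "2010", "2020", "2030"],
    ["Fantasy & Sci Fi", 10, 16, 28],
    ["Literature", 5, 9, 13],
    ["Mystery/Crime", 20, 23, 29],
    ["Romance", 24, 22, 19],
]


def wide_to_tidy(table, id_name="Genre", var_name="Year", value_name="Sales"):
    """Turn each (row, year column) cell into its own observation row."""
    header, *body = table
    column_labels = header[1:]  # the year labels stored in the header row
    tidy = [[id_name, var_name, value_name]]
    # Go column by column so the output is grouped by year,
    # matching the order of the tidy table shown in this section.
    for col_index, label in enumerate(column_labels, start=1):
        for row in body:
            tidy.append([row[0], int(label), row[col_index]])
    return tidy


tidy = wide_to_tidy(WIDE)
for line in tidy:
    print(line)
```

The 4 genres and 3 year columns produce 12 observation rows plus the header. The resulting rows could then be written out with Python's `csv` module and opened as a CSV file.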

References

OpenRefine. https://openrefine.org/

Tidy Data. https://r4ds.had.co.nz/tidy-data.html

The GraphML File Format. http://graphml.graphdrawing.org/
