Supported Formats
SocNetV supports the following network data formats:
- GraphML (.graphml or .xml)
- GML (.gml or .xml)
- GraphViz (.dot)
- Adjacency Matrix (.sm, .adj, or .csv)
- Pajek (.net, .paj, or .pajek)
- UCINET’s Data Language (.dl)
- Edge List (.lst or .list)
- Weighted Lists (.wlst or .wlist)
The default network data format of SocNetV is GraphML.
If you create a new network and press Ctrl+S
to save it, SocNetV will try to save it in GraphML format by default.
Loading and Importing Files
-
GraphML files can be loaded directly using
File > Load
or specified via the command line. -
For other formats, use
File > Import
.
GraphML files
GraphML is a highly versatile and widely adopted XML-based format designed for graph data representation. SocNetV supports both reading and saving networks in GraphML format, making it an excellent choice for interoperability with other network analysis tools.
Features of GraphML
- XML-Based: Uses XML structure to represent nodes, edges, and their attributes.
- Custom Attributes: Allows defining custom attributes for nodes and edges, such as labels, weights, or types.
- Directed and Undirected Graphs: Supports both directed and undirected graph structures.
- Hierarchical Graphs: Facilitates multi-level graph representations.
Examples
Example 1: A simple GraphML File
Below is an example of a simple directed graph with two nodes and one edge:
Explanation:
All GraphML files consist of a graphml element and a variety of sub-elements: graph, node, edge, keys. SocNetV understands all of them:
-
Nodes: Each
element has an id attribute, which uniquely identifies the node. In this case, the nodes are labeled n0 through n10. -
Edges: Each
element defines a relationship between two nodes, indicated by source and target. For example, indicates an edge from node n0 to node n2. -
Graph Structure: The graph is undirected as indicated by the edgedefault=“undirected” attribute in the
element.
Here is the visualization of the above example using Mermaid, which demonstrates the directed graph structure:
How to Use GraphML Files in SocNetV
Loading GraphML Files
To load a GraphML file in SocNetV:
- Go to File > Load.
- Select the desired
.graphml
file. - The network will be rendered on the canvas with all its nodes, edges, and attributes.
Saving to GraphML files
To save a network as a GraphML file:
- Go to File > Save As.
- Choose the GraphML format from the list.
- Specify the filename and save.
Pros and Cons of GraphML
Advantages:
- Standardized Format: As an XML-based format, GraphML is standardized and widely supported by many graph-analysis tools (such as SocNetV, Cytoscape, yEd, etc) making it easy to exchange graph data.
- Extensible and Flexible: With support for custom attributes, GraphML is highly flexible and can describe complex networks with rich metadata.
- Schema Validation: GraphML supports schema validation, allowing users to ensure that graph data adheres to a predefined structure, which is useful for maintaining data integrity.
Disadvantages:
- File Size: Since GraphML is an XML-based format, it can be larger in size compared to binary formats, especially for large graphs. This can make file handling and transfer slower.
- Complexity for Simple Graphs: For simpler, smaller networks, GraphML may be overkill compared to simpler formats like GML or CSV, which may be easier to use for basic graph tasks.
- Parsing Overhead: Parsing XML can be more computationally expensive than parsing other formats, especially when working with large graphs, potentially affecting performance.
Reference
- GraphML Official Specification: GraphML Specification
GML Files
GML (Graph Modeling Language) is a simple, text-based file format designed for describing graphs. It is widely used due to its human-readable structure and its ability to represent both small and large graphs in a flexible way. GML files are commonly used for storing and exchanging graph data and are supported by many graph-processing tools, including SocNetV.
Features of GML
- Human-readable format: GML is a text-based format that is easy to read and edit manually, making it accessible even to users without specialized graph software.
- Flexibility in Graph Representation: GML supports both directed and undirected graphs, as well as the inclusion of additional attributes for nodes and edges, such as labels, weights, and other custom properties.
- Graph-Level Properties: GML allows you to define global properties for the graph, such as directionality, making it easy to specify the overall structure of the network.
Examples
Example 1: A simple GML File
Explanation
- Nodes: Each node is defined by an
id
(unique identifier) and an optionallabel
that describes the node. In this example, three nodes are defined: “Alice”, “Bob”, and “Charlie”, each with a unique ID. - Edges: The edges connect nodes using the
source
andtarget
attributes. For example, an edge from node1
(Alice) to node2
(Bob) has a weight of1.0
. Similarly, other edges are defined with weights that indicate the strength of the connection. - Graph Properties: The
directed 1
property at the beginning of the graph indicates that the graph is directed. If this value were set to0
, the graph would be undirected.
Mermaid Visualization
This Mermaid diagram visualizes the directed graph described in the GML example. It captures the relationships between “Alice”, “Bob”, and “Charlie”, along with the weights assigned to the edges.
How to Use GML in SocNetV
- Loading GML Files: In SocNetV, you can load a GML file by selecting File > Import from the menu and choosing your GML file. SocNetV will parse the file and display the network graph, preserving node labels, edge weights, and any other attributes defined in the GML.
- Saving Networks as GML: Exporting a network from SocNetV in the GML format is not supported.
Supported GML Elements in SocNetV
SocNetV supports the following key features of GML files:
- Graph Type: Directed (
directed 1
) and undirected graphs (directed 0
). - Node Attributes:
id
,label
, and simple custom attributes such ascolor
orsize
for nodes. - Edge Attributes:
source
,target
, and simple attributes likeweight
andcolor
for edges. - Basic Graph Structure: SocNetV can handle simple graph structures with nodes and edges but does not support hierarchical elements or advanced GML features like nested graphs.
Pros and Cons of GML
Advantages:
- Human-readable: As a text-based format, GML files are easy to read and edit by hand, making it simple to inspect and modify graph data.
- Wide tool support: Many graph analysis tools, including SocNetV, support GML, making it a versatile format for sharing and processing graphs.
- Flexibility: GML allows the inclusion of various attributes for both nodes and edges, making it adaptable to different types of networks.
Disadvantages:
- Scalability Issues: For very large graphs, GML can become inefficient because it is a text-based format, leading to larger file sizes and slower parsing times compared to binary formats.
- Limited Structure: While GML is flexible, it lacks the extensive validation or schema enforcement seen in more structured formats, which can lead to inconsistencies in large, complex graphs.
- Manual Edits Can Be Error-Prone: While human-readable, manual edits to a GML file can easily introduce syntax errors or formatting mistakes, especially in large graphs.
Reference
- GML Official Specification: Graph Modeling Language (GML) Specification
GraphViz (DOT) Files
GraphViz is a graph description language that uses the “DOT format” to define and represent graphs. SocNetV supports importing GraphViz files for analysis and visualization. These files describe graphs using a simple text-based syntax, making them easy to create and edit.
Key Features
- Directed and Undirected Graphs: GraphViz supports both graph types using intuitive syntax (
digraph
for directed graphs andgraph
for undirected graphs). - Attribute Definitions: Customize nodes and edges with attributes like colors, shapes, and weights to enrich graph visualizations.
- Hierarchical Structures: Allows defining subgraphs and clusters (partially supported in SocNetV).
Examples
Example 1 : A simple .dot file
Explanation
- Graph Type: The keyword
digraph
indicates a directed graph, where edges have direction. - Node Attributes: Nodes in the graph can have attributes, such as
color
(red) andshape
(box or circle). - Edges: Directed edges are defined using
->
, such asa -> b
. Attributes likeweight
andcolor
can also be applied to edges. - Clusters and Groups: Nodes are grouped by attribute definitions, making visualization more intuitive.
Mermaid Visualization:
This diagram replicates the structure of the example GraphViz file, illustrating the directed relationships between nodes.
How to Use GraphViz Files in SocNetV
- Loading Files: To import a GraphViz file, navigate to File > Import and select the
.dot
file. SocNetV will parse the graph and display it for further analysis. - Visualization Options: SocNetV will render the graph with its attributes (e.g., node colors and shapes) if supported. Unrecognized attributes will be ignored.
- Limitations: SocNetV does not support advanced GraphViz features like subgraphs, clusters, or complex attribute combinations.
Supported DOT Elements in SocNetV
SocNetV supports the following basic elements of GraphViz files:
- Graph Type:
digraph
(directed) andgraph
(undirected). - Node Attributes:
color
,shape
,label
, and basic styles likefillcolor
andstyle
. - Edge Attributes:
weight
,color
, andlabel
. - Basic Layout: SocNetV handles most node-edge relations and visualizes the graphs as directed or undirected, but does not support subgraphs, clusters, or certain complex attributes.
Pros and Cons of GraphViz files
Advantages:
- Human-Readable Syntax: Easy to create and edit with any text editor.
- Customizable: Allows detailed customization of nodes and edges using attributes.
- Widely Supported: Used in many visualization tools, including SocNetV.
Disadvantages:
- Partial Support: SocNetV does not support all GraphViz features, such as subgraphs and clusters.
- Formatting Limitations: Requires properly formatted input; unstructured files may not be imported successfully.
- Visualization Complexity: Large graphs with extensive attributes can become difficult to manage and interpret.
Reference:
- GraphViz Official Documentation: GraphViz DOT Language
Adjacency Matrix Files
The adjacency matrix format is a widely used format for representing one-mode networks, where each element in an NxN matrix represents the connection between nodes (or the weight). This format is simple to use and is typically employed for analyzing networks based on node relationships.
SocNetV supports adjacency matrix files in files with extensions such as .txt
, .sm
, .adj
, and .csv
.
Key Features
- Each line in the matrix corresponds to a row, and the entries are separated by a delimiter: a space, a comma, a tab etc.
- The value at position
(i, j)
in the matrix represents the weight of the edge from nodei
to nodej
. For unweighted networks, use1
for edges and0
for no connection. - Diagonal elements (self-loops) can be included but are not mandatory.
- Directed networks can be represented by an asymmetric adjacency matrix, where the entry
(i, j)
differs from(j, i)
.
Node Labels
In addition to numeric node identifiers, SocNetV supports labeled nodes. This feature enables users to define custom node labels, making the matrix more descriptive.
Labels can be provided as follows:
- A header comment line at the top of the file lists the node labels, separated by spaces or tabs.
- The subsequent rows represent the adjacency matrix, where the order of rows matches the order of node labels.
Examples
Example 1: Adjacency Matrix File with Numeric Node Identifiers
For a simple network:
Explanation:
- Matrix Representation: Each row and column in the matrix represents a node. The element
(i, j)
represents the connection between nodei
and nodej
. For example, the first row (0 1 0
) means node 1 is connected to node 2 - Unweighted: In this example, the matrix is unweighted, with binary values (0 and 1).
- Symmetry: The adjacency matrix is symmetric for undirected graphs, meaning
A[i][j] = A[j][i]
.
Example 2: With Labeled Nodes
Using labels instead of numeric identifiers:
Here:
Alice
is connected toBob
.Bob
is connected to bothAlice
andCarol
.Carol
is connected toAlice
.
Note that this matrix is not symmetric.
This format is especially useful for social networks where nodes represent entities like people, organizations, or websites, and meaningful labels enhance readability.
Example 3: A Weighted Network
Explanation:
- Matrix Representation: Each row and column in the matrix represents a node. The element
(i, j)
represents the connection between nodei
and nodej
. For example, the first row (0, 1, 1, 2
) means node 1 is connected to node 2 (with weight 1), and node 3 (with weight 1), while node 4 is connected to node 1 (with weight 2). - Weighted: In this example, the matrix is weighted, with values like 1 and 2 indicating the strength of the connections. Unweighted matrices would use binary values (0 and 1).
- Symmetry: The adjacency matrix is symmetric for undirected graphs, meaning
A[i][j] = A[j][i]
.
Mermaid Visualization:
This Mermaid diagram visualizes the graph represented by the adjacency matrix. The nodes 1, 2, 3, and 4 are connected with weighted edges, as indicated by the matrix.
How to Use Adjacency Matrix Files in SocNetV
- Loading Files: In SocNetV, you can load an adjacency matrix file by selecting the File > Import > Adjacency Matrix menu option. and choosing the appropriate file. SocNetV will parse the matrix and display the graph, representing each node and its connections.
- Delimiters: When importing an adjacency matrix, SocNetV allows you to specify the delimiter used in the file (comma, space, or tab) for correct parsing.
- Exporting as Matrix: SocNetV allows exporting the graph as an adjacency matrix file, which can be shared and used in other tools for further analysis.
Supported Elements in SocNetV
SocNetV supports the following elements of adjacency matrix files:
- Basic Structure: The matrix should be a square matrix (NxN), where
N
is the number of nodes in the network. - Weighted and Unweighted Networks: SocNetV can handle both weighted (non-zero values) and unweighted (binary values 0 and 1) networks.
- Node Labels: SocNetV supports the inclusion of node labels, either in the comment line (e.g.,
# Alice, Bob, Charlie
) or as part of the matrix input.
Pros and Cons of Adjacency Matrix Files
The Adjacency Matrix Format is widely used in research and applications where the network structure is predefined, such as importing datasets from external tools or simulations. The inclusion of labels further enhances its versatility for descriptive and real-world social network data.
Advantages:
- Simple and Efficient: Adjacency matrices are straightforward to use and understand, especially for small networks or those without complex structures.
- Widely Supported: The matrix format is widely accepted and can be used across various graph analysis and network tools.
- Direct Representation of Connections: The matrix provides an explicit representation of node connections, which is useful for mathematical and algorithmic analysis of networks.
Disadvantages:
- Scalability Issues: For large networks, the adjacency matrix can become inefficient in terms of both storage and processing. Sparse networks can lead to a lot of wasted space in memory.
- Limited Metadata: The adjacency matrix format is not ideal for representing additional information (e.g., node attributes) beyond connections. While it supports weights, more complex node and edge attributes are difficult to handle in this format.
- Not Ideal for Directed Graphs: While the matrix can represent directed graphs, it becomes more difficult to distinguish between directed and undirected edges without additional context.
Reference:
- Adjacency Matrix Files: Adjacency Matrix Explanation (Wikipedia)
Pajek Files
SocNetV supports importing Pajek project files, which usually have a .net
or .paj
extension. It is important to note that SocNetV supports a limited subset of the features of this format.
Key Features
- Network Definition: Pajek files start with the
*Network
tag to declare the network and provide its name. - Vertices and Edges/Arcs: The
*Vertices
section lists nodes, and the*Edges
or*Arcs
section defines relationships between nodes. - ArcsList: The
*ArcsList
tag is used to represent arcs from one node to others in a more compact form. SocNetV supports this tag for defining relationships in a simplified way. - Matrix Representation: SocNetV can also handle Pajek files that use the
*Matrix
tag to represent relationships in an adjacency matrix format. - Multiple Matrix Declarations: SocNetV supports files with multiple
*Matrix
tags, which will each be parsed as a separate relation within the same network.
Examples
Example 1: A Simple Pajek Network
Explanation:
- Network Declaration: The
*Network
tag declares the name of the network, in this case, “Simple Network.” - Vertices: The
*Vertices
section defines 6 nodes, each with an ID, a label (node name), color (LightGreen
,LightYellow
), and additional properties such as shape (box
,ellipse
, etc.) and position coordinates (0.5 0.5
for the X and Y coordinates). - Arcs: The
*Arcs
section defines directed edges between nodes. Each line specifies a source node, target node, weight (e.g.,1
,-1
), and an optional color. - Edges: The
*Edges
section defines undirected edges, similar to*Arcs
, with the same format.
Mermaid Visualization:
This Mermaid diagram visualizes the graph described in the Pajek file, showing the directed and undirected relationships between nodes.
Handling Multiple Matrices in Pajek Files
SocNetV also supports Pajek files with multiple *Matrix
declarations. Each matrix defines a new relation between the nodes. Below is an example of a Pajek file with multiple matrices:
Explanation:
- Multiple Matrices: Each
*Matrix
section represents an adjacency matrix that defines a relation between nodes. In this example, there are 4 matrices, each representing a different relation between the 5 nodes. - Matrix Format: Each matrix is a square matrix (NxN), where
N
is the number of nodes in the network. The values inside the matrix represent the connections between the nodes, where0
indicates no connection and any non-zero number (such as1
or-1
) represents a connection between nodes. - Multiple Relations: Each matrix is treated as a separate relation within the same network. For example, the first matrix represents one relation, and the second matrix represents a different relation.
How to Use Pajek Files in SocNetV
- Loading Files: To import a Pajek file, navigate to File > Import and select the
.net
or.paj
file. SocNetV will parse the file and display the network structure. - Limitations: SocNetV does not support advanced Pajek features like vectors, permutations, or event-based files.
Supported Pajek Elements in SocNetV
SocNetV supports the following elements of Pajek files:
- Network Declaration:
*Network
header to define the start of the network file. - Node Definitions:
*Vertices
followed by a list of nodes with their labels, colors, shapes, and optional positions. - Edge Definitions:
*Arcs
followed by pairs of node IDs indicating directed connections, with optional weights and colors. - Edge Definitions:
*Edges
for undirected relationships between nodes. - ArcsList: SocNetV supports the
*ArcsList
tag to represent multiple outgoing edges from a node in a more compact list format. - Matrix Representation: The
*Matrix
tag can represent multiple adjacency matrices, with each matrix corresponding to a new relation in the network.
Pros and Cons of Pajek Files (in SocNetV)
Advantages:
- Simple Structure: Pajek files have a straightforward text-based format, making them easy to create and understand.
- Efficient for Large Networks: Pajek is specifically designed for handling large networks, and the format can efficiently represent large graphs.
- Widely Used: Pajek is a widely recognized tool in network analysis, making this file format compatible with many graph analysis tools.
Disadvantages:
- Limited Attribute Support: While Pajek files support some basic attributes, SocNetV does not handle advanced attributes like node colors, shapes, or other visual properties.
- Partial Feature Support: SocNetV only supports a subset of the Pajek format, excluding more complex features like vectors, permutations, or event-based files.
- No Schema Validation: Pajek files are not validated automatically, so incorrect or unformatted files may cause import issues.
References
- Pajek File Format Documentation: Pajek Documentation
UCINET DL Files
UCINET’s DL (Data Language) format is widely used for representing social networks and other types of relational data. It is a simple and flexible format, primarily designed for use with the UCINET software. SocNetV supports importing DL files in specific formats, including full matrices and edge lists.
Key Features
- File Header: DL files start with the
DL
header, followed by metadata describing the number of nodes, whether diagonals are included, and the data format. - Matrix or Edge List Representation: SocNetV supports DL files in both adjacency matrix and edge list formats, making it versatile for different types of network data.
- Node Labels: DL files can include labels for nodes, making it easier to identify nodes in the graph.
Examples
Example 1: Full Matrix Format
Explanation:
- Header: The file starts with
DL
, indicating the UCINET format, and specifies the number of nodes (N=4
), the format (FULLMATRIX
), and whether the diagonal is present. - Labels: The
LABELS
section lists the names of the nodes (e.g., Alice, Bob, Charlie, David). - Data: The
DATA
section provides the adjacency matrix for the network. For instance, the value1
at position(1, 2)
indicates a connection between Alice and Bob, with a weight of1
.
Mermaid Visualization:
This diagram visualizes the relationships described in the full matrix format, showing both the weights and the direction of edges.
Example 2: Edge List Format
Explanation:
- Header: Similar to the full matrix format, the file begins with
DL
, followed by metadata. The format here isEDGELIST1
, which indicates an edge list. - Labels: The
LABELS
section lists node names, just like in the full matrix example. - Data: The
DATA
section lists edges in the formsource target weight
. For instance,1 2 1
means Alice (node 1) is connected to Bob (node 2) with a weight of 1.
Mermaid Visualization:
This diagram visualizes the network described in the edge list format, where the weights of connections are shown.
How to Use UCINET DL Files in SocNetV
- Loading Files: To import a UCINET DL file, go to File > Import and select the
.dl
file. SocNetV will parse the file and display the network. - Full Matrix or Edge List: Depending on the file format (
FULLMATRIX
orEDGELIST1
), SocNetV will either render a matrix-based visualization or a graph-based edge list. - Visualization Options: Node labels and edge weights will be displayed if included in the DL file.
Supported UCINET DL Features in SocNetV
SocNetV supports the following key features of UCINET DL files:
- Full Matrix Format:
FULLMATRIX
representation with or without diagonals. - Edge List Format:
EDGELIST1
for defining edges between nodes. - Node Labels: SocNetV recognizes the
LABELS
section and assigns these labels to the corresponding nodes in the graph. - Weighted Edges: SocNetV supports weights for both edge list and matrix formats.
Pros and Cons of UCINET DL Files (in SocNetV)
Advantages:
- Simplicity: DL files are easy to read and write, especially for small networks.
- Versatility: Supports both adjacency matrix and edge list representations, making it flexible for different types of network data.
- Compatibility: Widely used in network analysis, particularly with UCINET, and compatible with many graph analysis tools.
Disadvantages:
- Limited Advanced Features: In SocNetV, DL files with advanced attributes such as colors, shapes, or styles for nodes and edges are not supported.
- Manual Editing Risks: Errors in formatting (e.g., mismatched rows in matrices) can lead to import failures or incorrect visualizations.
- Scalability Issues: While adequate for small to medium-sized networks, very large networks may be cumbersome to represent in a text-based format.
References
- UCINET DL Format Documentation: UCINET Official Website
Edge List Files
The edge list format is one of the simplest and most commonly used formats for representing graphs. It describes a graph in terms of its edges, where each line represents a connection (edge) between two nodes. SocNetV supports two types of edge list formats: unvalued and valued (weighted).
Key Features
- Simple Representation: Edge list files are straightforward, with each line representing a directed or undirected edge between two nodes.
- Unvalued and Valued Edges: The edge list can be unweighted (binary values, where
1
represents a connection) or weighted, where the third column represents the weight of the edge. - Compatibility: The edge list format is compatible with a wide variety of graph analysis tools, making it a universal format for network data.
Examples
Example 1: Unvalued Edge List
Explanation:
- Nodes: Each pair of numbers in the edge list represents an edge between two nodes. For instance,
1 2
indicates an edge from node1
to node2
. - Unvalued: This example is unvalued, meaning the edges have no weight associated with them. A
1
is implicitly understood as the connection between the nodes.
Mermaid Visualization:
This Mermaid diagram visualizes the unvalued edge list. It shows a simple directed graph where each edge connects two nodes.
Example 2: Valued Edge List
Explanation:
- Nodes and Weights: Each line in the edge list represents a connection between two nodes, with the third column representing the weight of the edge. For example,
1 2 0.5
means there is an edge from node1
to node2
with a weight of0.5
. - Valued Edges: This example is valued, meaning the edges are weighted with numerical values.
Mermaid Visualization:
This diagram visualizes the valued edge list. The edges are weighted, as indicated by the labels on each arrow.
How to Use Edge List Files in SocNetV
- Loading Files: To import an edge list file, navigate to File > Import and select the
.lst
or.list
file. SocNetV will parse the file and display the graph structure. - Unvalued and Valued Edge Lists: SocNetV can handle both unvalued and valued edge lists. It will automatically interpret the file based on whether it includes weights for the edges.
- Visualization Options: SocNetV will render the graph with the appropriate layout, and it will display edge weights if they are provided in the file.
Supported Edge List Features in SocNetV
SocNetV supports the following elements of edge list files:
- Unvalued Edges: Simple binary edges, where
1
indicates a connection between nodes. - Valued Edges: Edges with numerical weights. SocNetV supports edge weights and will display them in the visualization.
- Directed and Undirected Graphs: Edge lists can represent both directed and undirected graphs, depending on whether the file includes one or more directed edges.
Pros and Cons of Edge List Files
Advantages:
- Simplicity: Edge list files are one of the simplest graph formats, making them easy to read, write, and manipulate.
- Flexibility: Supports both unweighted and weighted graphs, as well as directed and undirected edges.
- Wide Compatibility: Edge lists are widely supported by many graph analysis tools, making them a standard format for graph representation.
Disadvantages:
- Limited Attributes: The edge list format does not support advanced node or edge attributes, such as colors, shapes, or other metadata. It is limited to node connections and, optionally, weights.
- Scalability: For very large graphs, edge list files can become cumbersome to manage, especially when dealing with graphs with millions of edges.
- No Topological Information: Unlike adjacency matrices or other formats, edge lists do not inherently encode the structure or topology of the graph in a way that makes it easy to retrieve specific information about the graph’s properties.
Reference:
- Edge List Format Documentation: Wikipedia on Edge List