# Getting Started with Network Data

In [None]:
import networkx as nx
import matplotlib.pyplot as plt

NetworkX reference page: https://networkx.org/documentation/stable/index.html

## Create a graph yourself

In [None]:
g = nx.Graph()

# g.add_node(1)             # Add a single node
g.add_nodes_from([1,2,3])   # Add a list of nodes

g.add_edge(1,2)
g.add_edge(3,1)

In [None]:
nx.draw_networkx(g)
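Besides drawing the graph, it can help to inspect it programmatically. The snippet below is a minimal sketch that rebuilds the same three-node graph and queries it with standard NetworkX methods.

```python
import networkx as nx

# Rebuild the same small graph: three nodes, two edges.
g = nx.Graph()
g.add_nodes_from([1, 2, 3])
g.add_edge(1, 2)
g.add_edge(3, 1)

print(g.number_of_nodes())     # 3
print(g.number_of_edges())     # 2
print(sorted(g.neighbors(1)))  # [2, 3]
print(dict(g.degree()))        # {1: 2, 2: 1, 3: 1}
```

Because `g` is an undirected `nx.Graph`, the edge added as `(3, 1)` also makes node 3 a neighbor of node 1.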

## Import a classic network

In [None]:
g_kk = nx.krackhardt_kite_graph()

nx.draw_networkx(g_kk)
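A quick way to check what a built-in generator returned is to count its nodes and edges. The sketch below does this for the Krackhardt kite graph and also looks up the node with the highest degree; the variable name `top` is only illustrative.

```python
import networkx as nx

g_kk = nx.krackhardt_kite_graph()

print(g_kk.number_of_nodes())  # 10
print(g_kk.number_of_edges())  # 18

# Each item of g_kk.degree is a (node, degree) pair; pick the pair with the largest degree.
top = max(g_kk.degree, key=lambda pair: pair[1])
print(top)  # (3, 6)
```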

## Import a graph file

In this example, we import a dataset about collaborations in network science. Read more about the network [here](https://networks.skewed.de/net/netscience). 

First, download the zip file containing the data from [this page](https://websites.umich.edu/~mejn/netdata/). The dataset you need is [Coauthorships in network science](https://websites.umich.edu/~mejn/netdata/netscience.zip). After downloading, unzip the file into a directory that you can access.

If you run the notebook locally, change `PATH_TO` below so that it points to the directory containing the .gml file.

If you use Google Colab, read [this page](https://saturncloud.io/blog/uploading-local-files-using-google-colab/) to understand how to upload local files to Colab. The simplest solution is probably to use `files.upload()` from the `google.colab` module.
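If you prefer to unzip the archive from Python rather than by hand, the standard-library `zipfile` module can do it. This is only a sketch: the helper name `extract_gml` and the `data` output directory are assumptions made for illustration, and it presumes the zip file has already been downloaded.

```python
import zipfile
from pathlib import Path


def extract_gml(zip_path, out_dir="."):
    """Extract every .gml member of zip_path into out_dir and return their paths."""
    out_dir = Path(out_dir)
    extracted = []
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            # Skip readme files and anything else that is not a GML file.
            if name.lower().endswith(".gml"):
                extracted.append(Path(zf.extract(name, path=out_dir)))
    return extracted


# Example usage (assumes netscience.zip is in the working directory):
# paths = extract_gml("netscience.zip", out_dir="data")
# print(paths)
```

The returned paths can then be passed to `nx.read_gml` in place of the `PATH_TO` placeholder.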

In [None]:
G = nx.read_gml("PATH_TO/netscience.gml") # change PATH_TO

In [None]:
list(G.nodes(data=True))

In [None]:
list(G.edges(data=True))
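To see how GML fields turn into node names and attribute dictionaries, it can be useful to parse a tiny example by hand. The GML text below is made up for illustration (it is not taken from netscience.gml), and `G_small` is just an illustrative name.

```python
import networkx as nx

# A tiny, made-up GML document with two labeled nodes and one weighted edge.
gml_text = """graph [
  node [
    id 0
    label "ALICE"
  ]
  node [
    id 1
    label "BOB"
  ]
  edge [
    source 0
    target 1
    value 2.5
  ]
]"""

# By default, parse_gml (like read_gml) names nodes by their "label" field.
G_small = nx.parse_gml(gml_text)

print(list(G_small.nodes(data=True)))
print(list(G_small.edges(data=True)))
```

With `data=True`, each node or edge is returned together with a dictionary of its remaining attributes, such as the `value` field on the edge here.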

In [None]:
nx.draw_networkx(G, with_labels=False)

## Import an edge list file

In [None]:
import pandas as pd

In [None]:
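# Read the tab-separated edge list straight from the URL
df = pd.read_csv("https://raw.githubusercontent.com/eehh-stanford/SNA-workshop/master/hp5edgelist.txt", delimiter='\t')
# Rename the curly-quoted column names to plain "From" and "To"
df = df.rename(columns={"“From”": "From", '“To”': "To"})
df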

In [None]:
DG = nx.from_pandas_edgelist(df, 'From', 'To', create_using=nx.DiGraph())
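The same call can be tried on a small, hand-made table to see what `create_using=nx.DiGraph()` changes. The DataFrame below is invented for illustration; only the column names `From` and `To` match the notebook.

```python
import networkx as nx
import pandas as pd

# A made-up edge list with three directed edges: a->b, a->c, b->c.
edges = pd.DataFrame({"From": ["a", "a", "b"], "To": ["b", "c", "c"]})

DG_small = nx.from_pandas_edgelist(edges, "From", "To", create_using=nx.DiGraph())

print(DG_small.has_edge("a", "b"))  # True
print(DG_small.has_edge("b", "a"))  # False
```

In a directed graph each row is read as an edge from the `From` node to the `To` node, so the reverse edge exists only if the table lists it separately.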

In [None]:
nx.draw_networkx(DG)

In [None]:
nx.density(DG)
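For a directed graph with n nodes and m edges, density is m / (n(n - 1)), the share of all possible ordered node pairs that are connected. The sketch below checks this by hand on a small made-up graph; `DG_small` and `manual` are illustrative names.

```python
import networkx as nx

# A made-up directed graph with 3 nodes and 3 edges.
DG_small = nx.DiGraph([("a", "b"), ("a", "c"), ("b", "c")])

n = DG_small.number_of_nodes()
m = DG_small.number_of_edges()

# Directed density: edges present divided by the n * (n - 1) possible ordered pairs.
manual = m / (n * (n - 1))

print(manual)                # 0.5
print(nx.density(DG_small))  # 0.5
```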

In [None]:
list(nx.degree(DG))
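In a directed graph, the degree reported here is the sum of a node's in-degree and out-degree. A small made-up graph makes the distinction visible; `DG_small` is an illustrative name.

```python
import networkx as nx

# A made-up directed graph: a->b, a->c, b->c.
DG_small = nx.DiGraph([("a", "b"), ("a", "c"), ("b", "c")])

print(dict(DG_small.in_degree()))   # {'a': 0, 'b': 1, 'c': 2}
print(dict(DG_small.out_degree()))  # {'a': 2, 'b': 1, 'c': 0}
print(dict(DG_small.degree()))      # {'a': 2, 'b': 2, 'c': 2}
```

All three nodes have the same total degree even though their roles differ: `a` only sends edges and `c` only receives them.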

## Other datasets

If you want to explore more datasets, there are many places to find network data. Here are some sites to start with:

- [Gephi datasets](https://github.com/gephi/gephi/wiki/Datasets)
- [Netzschleuder](https://networks.skewed.de/)
- [UCI Network Data Repository](https://networkdata.ics.uci.edu/)
- [Stanford Large Network Dataset Collection](https://snap.stanford.edu/data/)

Have fun!