Spotify Graph Dashboard (Part I): Creating a Spotify Graph with TigerGraph

How to Create a Graph Modelling Spotify Data from Kaggle with TigerGraph

Shreya Chaudhary
7 min read · Jul 13, 2021
Image from Pixabay

Overview

Objective

Spotify is a digital music, podcast, and video service used by many people worldwide. Songs on Spotify have several characteristics, such as danceability, loudness, and tempo. Using graph technology, specifically TigerGraph, this data can be mapped onto a graph database and then visualised. In this first blog, we’ll focus on creating the Spotify graph; in subsequent blogs, we’ll build Spotify dashboards on top of it.

Tools

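In this blog, we’ll be using TG Cloud to host the graph, pyTigerGraph to create the schema and load the data, pandas to read the CSVs, and the Spotify dataset from Kaggle.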

Part I: Set Up Your Solution on TG Cloud

First, we’ll set up our solution on TG Cloud. (See a thorough walkthrough here.) To do so, navigate to https://tgcloud.io/, log in or sign up, then click the “My Solutions” tab. Finally, click on the blue “Create Solution” button on the top right.

Log in or sign up then go to the “My Solutions” tab.
Click on the blue “Create Solution” button.

On the first page, press “Blank.” Since we’ll be loading in our own data, we won’t use a Starter Kit. Press “Next,” then press “Next” again on the second page. Leaving the second page unchanged creates a free solution on TigerGraph.

Press blank then continue. On the second page, don’t change anything and continue.

On the third page, fill in the details for your solution. The subdomain must be unique. Keep note of your subdomain and password, as we’ll be using them in the Python portion.

Edit the credentials appropriately.

Finally, double-check that everything looks good in the final step, then press “Submit” to provision the solution! This might take a few minutes.

Double-check everything then press submit.

Wait till the “Status” of the solution says “Ready.”

Status as ready.

Once it’s green, you’re ready to create your schema and load data.

Part II: Connect to the Solution and Create the Schema

Step I: Install and Import pyTigerGraph and Connect to your Solution

Open Google Colab (or a Python file). First, we’ll need to connect to the solution we just created. To do this, we first need to install and import pyTigerGraph. In a Colab notebook, type:

!pip install pyTigerGraph

If you’re running a normal Python file, type this into your terminal:

pip install pyTigerGraph

Once it’s installed, import it.

import pyTigerGraph as tg

Finally, we’ll create a TigerGraph connection. Replace SUBDOMAIN and PASSWORD with your subdomain and password.

conn = tg.TigerGraphConnection(host="https://SUBDOMAIN.i.tgcloud.io/", password="PASSWORD")

Since I used the default password, my connection would look like this:

conn = tg.TigerGraphConnection(host="https://spotify.i.tgcloud.io/", password="tigergraph")
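Hard-coding a password in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch, the connection arguments can instead be assembled from the subdomain and an environment variable. The helper name `connection_kwargs` and the variable name `TG_PASSWORD` are assumptions made for this illustration; they are not part of pyTigerGraph.

```python
import os


def connection_kwargs(subdomain, env_var="TG_PASSWORD"):
    """Build the keyword arguments for tg.TigerGraphConnection.

    The password is read from an environment variable (hypothetical name
    TG_PASSWORD) so that it never appears in the notebook itself.
    """
    password = os.environ.get(env_var)
    if password is None:
        raise KeyError(f"Set the {env_var} environment variable first")
    return {
        # Same host pattern as the connection shown above
        "host": f"https://{subdomain}.i.tgcloud.io/",
        "password": password,
    }


# Usage (requires pyTigerGraph and a provisioned solution):
#   import pyTigerGraph as tg
#   conn = tg.TigerGraphConnection(**connection_kwargs("spotify"))
```

In Colab, the variable could be set in a separate cell that is cleared before sharing the notebook.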

After running this, congrats! You’re now connected to your graph. Next, let’s add a schema.

Step II: Create a Schema

Looking at the Kaggle Spotify dataset, I decided to create four vertices: Genre, Song, Playlist, and Artist. I’ll connect these with edges like the following:

print(conn.gsql('''
CREATE VERTEX Genre(PRIMARY_ID name STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true"
CREATE VERTEX Song(PRIMARY_ID id STRING, name STRING, popularity INT, dancibility DOUBLE, energy_level DOUBLE, energy DOUBLE, key_id INT, loudness DOUBLE, mode INT, speechiness DOUBLE, acousticness DOUBLE, instrumentalness DOUBLE, liveness DOUBLE, valence DOUBLE, tempo DOUBLE, uri STRING, track_href STRING, analysis_url STRING, duration_ms INT, time_signature INT) WITH PRIMARY_ID_AS_ATTRIBUTE="true"
CREATE VERTEX Playlist(PRIMARY_ID name STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true"
CREATE VERTEX Artist(PRIMARY_ID name STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true"
CREATE UNDIRECTED EDGE SONG_ARTIST(FROM Song, TO Artist)
CREATE UNDIRECTED EDGE SONG_PLAYLIST(FROM Song, TO Playlist)
CREATE UNDIRECTED EDGE SONG_GENRE(FROM Song, TO Genre)
'''))
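Writing long CREATE VERTEX statements by hand invites typos. One option, sketched here, is to keep the attribute names and types in a Python dictionary and generate the statement from it. The helper `create_vertex_statement` is a hypothetical name used only for this illustration, and it assumes a STRING primary ID, as in all four vertices above.

```python
def create_vertex_statement(vertex, primary_id, attributes=None):
    """Return a GSQL CREATE VERTEX statement with a STRING primary ID.

    `attributes` maps attribute names to GSQL type names, for example
    {"popularity": "INT", "tempo": "DOUBLE"}.
    """
    columns = [f"PRIMARY_ID {primary_id} STRING"]
    columns += [f"{name} {gsql_type}" for name, gsql_type in (attributes or {}).items()]
    return (
        f"CREATE VERTEX {vertex}({', '.join(columns)}) "
        'WITH PRIMARY_ID_AS_ATTRIBUTE="true"'
    )


# A vertex with only a primary ID, and one with two extra attributes
print(create_vertex_statement("Playlist", "name"))
print(create_vertex_statement("Song", "id", {"name": "STRING", "popularity": "INT"}))
```

The generated strings can then be joined with newlines and passed to `conn.gsql` in the same way as the hand-written schema.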

Fantastic! Finally, I’ll create the graph, calling it SpotifyGraph and passing as a parameter all the vertices and edges we just created.

print(conn.gsql('''CREATE GRAPH SpotifyGraph(Genre, Song, Playlist, Artist,SONG_ARTIST, SONG_PLAYLIST, SONG_GENRE)'''))
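Every name passed to CREATE GRAPH has to match a vertex or edge type declared earlier, and a one-letter mismatch is easy to miss in a long schema string. The following is a rough sketch of a pre-flight check using regular expressions. The function names are hypothetical, and the patterns only cover the statement forms used in this blog.

```python
import re


def declared_names(schema_gsql):
    """Collect the type names declared by CREATE VERTEX / CREATE ... EDGE."""
    pattern = r"CREATE\s+(?:VERTEX|(?:UNDIRECTED|DIRECTED)\s+EDGE)\s+(\w+)"
    return set(re.findall(pattern, schema_gsql))


def graph_members(create_graph_gsql):
    """Return the names listed inside CREATE GRAPH name(...)."""
    match = re.search(r"CREATE\s+GRAPH\s+\w+\s*\((.*?)\)", create_graph_gsql, re.S)
    if match is None:
        raise ValueError("No CREATE GRAPH statement found")
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def missing_types(schema_gsql, create_graph_gsql):
    """Names used in CREATE GRAPH that were never declared in the schema."""
    declared = declared_names(schema_gsql)
    return [name for name in graph_members(create_graph_gsql) if name not in declared]


# Shortened example with a deliberately misspelled edge name
schema = """
CREATE VERTEX Song(PRIMARY_ID id STRING)
CREATE VERTEX Genre(PRIMARY_ID name STRING)
CREATE UNDIRECTED EDGE SONG_SGENRE(FROM Song, TO Genre)
"""
graph = "CREATE GRAPH SpotifyGraph(Genre, Song, SONG_GENRE)"
print(missing_types(schema, graph))  # ['SONG_GENRE']
```

An empty list means every name in the CREATE GRAPH call has a matching declaration.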

With this, you should be able to see your graph on GraphStudio! Nice job!

Step III: Updating Credentials

Before we proceed, we need to update the connection credentials, adding the graph name and the API token.

conn.graphname = "SpotifyGraph"
conn.apiToken = conn.getToken(conn.createSecret())

Great! Now we’re set.

Part III: Load Data

First, let’s read the CSVs located in the train folder using pandas.

import pandas as pd

alternative = pd.read_csv("train/alternative_music_data.csv")
indie_alt = pd.read_csv("train/indie_alt_music_data.csv")
rock = pd.read_csv("train/rock_music_data.csv")
blues = pd.read_csv("train/blues_music_data.csv")
metal = pd.read_csv("train/metal_music_data.csv")
hiphop = pd.read_csv("train/hiphop_music_data.csv")
pop = pd.read_csv("train/pop_music_data.csv")
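The seven file names follow one pattern, `train/<genre>_music_data.csv`, so they can also be generated rather than typed out. A minimal sketch, where `GENRES` and `genre_csv_paths` are hypothetical names introduced for this example:

```python
from pathlib import Path

# Genre prefixes taken from the file names read above
GENRES = ["alternative", "indie_alt", "rock", "blues", "metal", "hiphop", "pop"]


def genre_csv_paths(folder="train", genres=GENRES):
    """Map each genre prefix to the path of its CSV inside `folder`."""
    return {genre: Path(folder) / f"{genre}_music_data.csv" for genre in genres}


# Usage with pandas (assumes the CSVs exist in the train folder):
#   import pandas as pd
#   frames = {genre: pd.read_csv(path) for genre, path in genre_csv_paths().items()}
```

Keeping the DataFrames in a dictionary keyed by genre also makes it straightforward to loop over them in the loading step.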

Next, we’ll upsert the data using pyTigerGraph’s upsertVertexDataFrame and upsertEdgeDataFrame methods.

frames = [alternative, indie_alt, rock, blues, metal, hiphop, pop]

# Maps each Song attribute in the schema (key) to its DataFrame column (value)
song_attributes = {"id": "id", "name": "Track Name", "popularity": "Popularity", "dancibility": "danceability", "energy": "energy", "key_id": "key", "loudness": "loudness", "mode": "mode", "speechiness": "speechiness", "acousticness": "acousticness", "instrumentalness": "instrumentalness", "liveness": "liveness", "valence": "valence", "tempo": "tempo", "uri": "uri", "track_href": "track_href", "analysis_url": "analysis_url", "duration_ms": "duration_ms", "time_signature": "time_signature"}

for df in frames:
    # Playlist and Artist vertices, each identified by a single column
    conn.upsertVertexDataFrame(df, "Playlist", "Playlist", attributes={"name": "Playlist"})
    conn.upsertVertexDataFrame(df, "Artist", "Artist Name", attributes={"name": "Artist Name"})
    # Edges from each song to its artist, playlist, and genre
    conn.upsertEdgeDataFrame(df, "Song", "SONG_ARTIST", "Artist", "id", "Artist Name", attributes={})
    conn.upsertEdgeDataFrame(df, "Song", "SONG_PLAYLIST", "Playlist", "id", "Playlist", attributes={})
    conn.upsertEdgeDataFrame(df, "Song", "SONG_GENRE", "Genre", "id", "Genre", attributes={})
    # Song vertices with their audio features
    conn.upsertVertexDataFrame(df, "Song", "id", attributes=song_attributes)

Great! Once this is completed, our graph is officially set up! We can now start exploring visualisations on top of our graph.
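In the upsert calls above, each `attributes` dictionary maps a vertex attribute name (key) to a DataFrame column name (value). When adapting the load step to other CSVs, a quick check of that mapping can catch a misspelled column or a schema attribute that never receives data. The sketch below uses plain Python. `missing_columns` and `unmapped_attributes` are hypothetical helper names, and the short column list stands in for `df.columns`.

```python
def missing_columns(columns, attributes):
    """Mapped DataFrame columns that are not present in `columns`."""
    available = set(columns)
    return [col for col in attributes.values() if col not in available]


def unmapped_attributes(schema_attributes, attributes):
    """Schema attributes that have no entry in the mapping."""
    return [name for name in schema_attributes if name not in attributes]


# Shortened stand-ins for df.columns and the Song mapping
columns = ["id", "Track Name", "Popularity", "danceability"]
mapping = {
    "id": "id",
    "name": "Track Name",
    "popularity": "Popularity",
    "dancibility": "danceability",
    "energy": "energy",
}

print(missing_columns(columns, mapping))  # ['energy']
print(unmapped_attributes(["name", "energy_level", "energy"], mapping))  # ['energy_level']
```

With a real DataFrame, `missing_columns(df.columns, mapping)` returning an empty list indicates that every mapped column exists.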

Part IV: Congrats!

Congrats! Great work on creating this Spotify Graph. Look out for the next blogs to create some awesome visualisations on top of it!

If you have any questions or would like to learn more, feel free to join the TigerGraph Discord.

Good luck with your TigerGraph adventures!
