Creating a Google Drive Notes Graph using TigerGraph and SpaCy

How to Create a Notes Graph with TigerGraph and SpaCy

Shreya Chaudhary
Jul 26, 2021

Overview

Welcome! Here, we’ll create a notes graph with TigerGraph. We’ll extract entities from our notes using SpaCy and load them into a TigerGraph graph, which we can then build cool projects on top of.

Part I: Set Up your Solution

First, set up a solution on TigerGraph. To find full steps on how to create your solution, follow this blog here:

In short, do the following:

  1. Go to https://tgcloud.io/app/solutions.
  2. Press the “Create Solution” button.
  3. On the first page, press “Blank” then press “Next.”
  4. On the second page, don’t change anything and press “Next.”
  5. On the third page, customise it to your solution. Keep note of your subdomain and password!
  6. On the fourth page, verify that everything is correct and press “Submit.”
  7. Finally, wait for the Status of your Solution to say “Ready.”

Part II: Create your Graph

Step I: Install and Import pyTigerGraph and Create a Connection

First, you’ll need to install and import pyTigerGraph, TigerGraph’s Python library.

!pip install pyTigerGraph
import pyTigerGraph as tg

Perfect! Next, you’ll need to create a TigerGraphConnection. Be sure to replace SUBDOMAIN and PASSWORD with the values you noted when creating your solution!

conn = tg.TigerGraphConnection(host="https://SUBDOMAIN.i.tgcloud.io/", password="PASSWORD")

For example, my connection would look like this:

conn = tg.TigerGraphConnection(host="https://tigergraphnlp.i.tgcloud.io/", password="tigergraph")
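If you’d rather not hard-code your password in a notebook, one option is to read the values from environment variables. This is only a minimal sketch: the helper name connection_settings and the variable names TG_SUBDOMAIN and TG_PASSWORD are made up for illustration and are not part of pyTigerGraph.

```python
import os


def connection_settings(env=os.environ):
    """Build the keyword arguments for tg.TigerGraphConnection.

    TG_SUBDOMAIN and TG_PASSWORD are hypothetical variable names;
    use whatever names fit your own setup.
    """
    subdomain = env["TG_SUBDOMAIN"]
    password = env["TG_PASSWORD"]
    return {
        # Same host pattern as the connection shown above.
        "host": f"https://{subdomain}.i.tgcloud.io/",
        "password": password,
    }


# Usage (needs pyTigerGraph installed and a running solution):
#   import pyTigerGraph as tg
#   conn = tg.TigerGraphConnection(**connection_settings())
```

Passing a plain dictionary instead of os.environ makes the helper easy to try out without touching your real environment.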

Great job! Now that we’re connected to our solution, let’s create our graph schema.

Step II: Create the Schema

Next, we’ll create the schema. We’ll create vertex types for classes, documents, entity names, and entity types, along with the edges that connect them.

conn.gsql('''CREATE VERTEX Class(PRIMARY_ID class STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true"
CREATE VERTEX Document(PRIMARY_ID document STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true"
CREATE VERTEX Entity(PRIMARY_ID entity STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true"
CREATE VERTEX Entity_Name(PRIMARY_ID entity_name STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true"
CREATE UNDIRECTED EDGE CLASS_DOCUMENT(FROM Class, TO Document)
CREATE UNDIRECTED EDGE DOCUMENT_ENTITY_NAME(FROM Document, TO Entity_Name)
CREATE UNDIRECTED EDGE ENTITY_NAME_ENTITY(FROM Entity_Name, TO Entity)
''')

Perfect! If you navigate to GraphStudio, you’ll be able to see your schema.

Schema in GraphStudio

Part III: Extract Entities from the Documents

Step I: Import Libraries

First, let’s import the necessary libraries, including SpaCy.

import spacy
import en_core_web_sm

Step II: Extract Entities from the Document

Now, we’ll extract the entities from the document. First, we’ll load the en_core_web_sm model into nlp. This nlp object is what we’ll use to run Named Entity Recognition.

nlp = en_core_web_sm.load()

Next, we’ll run nlp on the notes. Here, notes is a string containing the text of the document.

article = nlp(notes)
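Before loading anything into the graph, it can help to look at what was actually extracted. Each item in article.ents has a text attribute (the matched words) and a label_ attribute (the entity type). The sketch below collects them into unique pairs; entity_pairs is a hypothetical helper name, not something SpaCy provides.

```python
def entity_pairs(doc):
    """Return unique (text, label) pairs from doc.ents, in first-seen order."""
    seen = set()
    pairs = []
    for ent in doc.ents:
        pair = (ent.text.strip(), ent.label_)
        # Skip empty strings and pairs we have already recorded.
        if pair[0] and pair not in seen:
            seen.add(pair)
            pairs.append(pair)
    return pairs


# Usage with the objects defined above:
#   for text, label in entity_pairs(article):
#       print(text, "->", label)
```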

Step III: Upsert Data into Graph

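Finally, we’ll upsert the data into the graph. For each extracted entity, we’ll upsert its Entity (the entity type), then the specific Entity_Name, and then the edges that link the Entity_Name to its Entity and to the Document. For example, “the Whigs” is the Entity_Name, and its Entity is “Person.”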

if article.ents:
    for ent in article.ents:
        conn.upsertVertex("Entity", ent.label_, attributes={"entity": ent.label_})
        conn.upsertVertex("Entity_Name", ent.text, attributes={"entity_name": ent.text})
        conn.upsertEdge("Entity_Name", ent.text, "ENTITY_NAME_ENTITY", "Entity", ent.label_)
        conn.upsertEdge("Document", "Final Review Worksheet", "DOCUMENT_ENTITY_NAME", "Entity_Name", ent.text)
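The loop above sends four requests per entity, which is part of why it can be slow. One possible alternative is to de-duplicate first and send each vertex and edge type in a single call. This is a sketch under a few assumptions: build_payloads and upsert_batch are hypothetical helper names, and the code assumes pyTigerGraph’s upsertVertices takes a list of (id, attributes) tuples and upsertEdges takes a list of (source id, target id, attributes) tuples. Check both against the version you have installed.

```python
def build_payloads(pairs, document_id):
    """Turn (text, label) pairs into de-duplicated vertex and edge lists."""
    unique_pairs = sorted(set(pairs))
    labels = sorted({label for _, label in unique_pairs})
    names = sorted({text for text, _ in unique_pairs})
    return {
        "entities": [(label, {"entity": label}) for label in labels],
        "entity_names": [(name, {"entity_name": name}) for name in names],
        # The edges carry no attributes, so the third item is an empty dict.
        "name_to_entity": [(text, label, {}) for text, label in unique_pairs],
        "document_to_name": [(document_id, name, {}) for name in names],
    }


def upsert_batch(conn, payloads):
    """Send each list in one request (assumed pyTigerGraph signatures)."""
    conn.upsertVertices("Entity", payloads["entities"])
    conn.upsertVertices("Entity_Name", payloads["entity_names"])
    conn.upsertEdges("Entity_Name", "ENTITY_NAME_ENTITY", "Entity",
                     payloads["name_to_entity"])
    conn.upsertEdges("Document", "DOCUMENT_ENTITY_NAME", "Entity_Name",
                     payloads["document_to_name"])


# Usage with the objects defined above:
#   pairs = [(ent.text, ent.label_) for ent in article.ents]
#   upsert_batch(conn, build_payloads(pairs, "Final Review Worksheet"))
```

Keeping the payload building separate from the network calls means you can print and inspect the lists before anything is written to the graph.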

This may take some time to run. Once it’s completed, you can check out your graph in GraphStudio. For example, we can look at the Person Entity vertex and find all of the Entity_Name vertices connected to it, such as Jefferson.

All of the vertices connected to the Person Entity.
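The same lookup can also be done from Python instead of GraphStudio. The sketch below is a rough illustration rather than a tested recipe: names_for_label is a hypothetical helper, and it assumes that conn.getEdges returns a list of dictionaries with a "to_id" key for each edge, and that your connection already has a graph name and any required token set. Note that SpaCy writes its labels in upper case, so the id of the Person vertex would be "PERSON".

```python
def names_for_label(conn, label):
    """List the Entity_Name ids linked to one Entity vertex, e.g. "PERSON"."""
    # Assumed call shape: getEdges(sourceVertexType, sourceVertexId, edgeType)
    edges = conn.getEdges("Entity", label, "ENTITY_NAME_ENTITY")
    # Assumed result shape: one dict per edge, with the neighbour in "to_id".
    return sorted({edge["to_id"] for edge in edges})


# Usage with the connection defined above:
#   print(names_for_label(conn, "PERSON"))
```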

Fantastic! Now that our graph is ready, we can write queries and create applications on top of it.

Part IV: Congrats + Resources

Congrats! You were officially able to create a graph using the power of SpaCy. Look out for future blogs creating applications on top of this!

In the meantime, if you have any questions, ask in the TigerGraph Discord or on the Community Page.

Thank you!
