Creating a Google Drive Notes Graph using TigerGraph and SpaCy
How to Create a Notes Graph with TigerGraph and SpaCy
Overview
Welcome! Here, we’ll build a notes graph with TigerGraph. We’ll extract named entities from our notes using SpaCy and load them into a TigerGraph graph. We can then build projects on top of this graph.
Part I: Set Up your Solution
First, set up a solution on TigerGraph. To find full steps on how to create your solution, follow this blog here:
In short, do the following:
- Go to https://tgcloud.io/app/solutions.
- Press the “Create Solution” button.
- On the first page, press “Blank” then press “Next.”
- On the second page, don’t change anything and press “Next.”
- On the third page, customise it to your solution. Keep note of your subdomain and password!
- On the fourth page, verify that everything is correct and press “Submit.”
- Finally, wait for the Status of your Solution to say “Ready.”
Part II: Create your Graph
Step I: Install and Import pyTigerGraph and Create a Connection
First, you’ll need to install and import pyTigerGraph, TigerGraph’s Python library for interacting with TigerGraph in Python.
!pip install pyTigerGraph
import pyTigerGraph as tg
Perfect! Next, you’ll need to create a connection with TigerGraphConnection. Be sure to replace SUBDOMAIN and PASSWORD with the subdomain and password you noted when creating your solution!
conn = tg.TigerGraphConnection(host="https://SUBDOMAIN.i.tgcloud.io/", password="PASSWORD")
For example, my connection would look like this:
conn = tg.TigerGraphConnection(host="https://tigergraphnlp.i.tgcloud.io/", password="tigergraph")
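If you’d rather not hardcode your password in a notebook, one option is to read it from environment variables. This is only a minimal sketch: the variable names TG_SUBDOMAIN and TG_PASSWORD and both helper functions are hypothetical and not part of pyTigerGraph.

```python
import os


def build_host(subdomain: str) -> str:
    """Build the TigerGraph Cloud host URL used in this post from a subdomain."""
    return f"https://{subdomain}.i.tgcloud.io/"


def connection_settings(env=None) -> dict:
    """Collect the keyword arguments passed to tg.TigerGraphConnection.

    TG_SUBDOMAIN and TG_PASSWORD are hypothetical variable names.
    """
    env = os.environ if env is None else env
    return {
        "host": build_host(env["TG_SUBDOMAIN"]),
        "password": env["TG_PASSWORD"],
    }


# Usage, once both variables are set and pyTigerGraph is imported as tg:
# conn = tg.TigerGraphConnection(**connection_settings())
```

The connection is created exactly as before; only the source of the two values changes.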
Great job! Now that we’re connected to our solution, let’s create our graph schema.
Step II: Create the Schema
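Next, we’ll create the schema. It has four vertex types (Class, Document, Entity, and Entity_Name) and three undirected edge types that connect classes to documents, documents to entity names, and entity names to their entities.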
conn.gsql('''CREATE VERTEX Class(PRIMARY_ID class STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true"
CREATE VERTEX Document(PRIMARY_ID document STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true"
CREATE VERTEX Entity(PRIMARY_ID entity STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true"
CREATE VERTEX Entity_Name(PRIMARY_ID entity_name STRING) WITH PRIMARY_ID_AS_ATTRIBUTE="true"
CREATE UNDIRECTED EDGE CLASS_DOCUMENT(FROM Class, TO Document)
CREATE UNDIRECTED EDGE DOCUMENT_ENTITY_NAME(FROM Document, TO Entity_Name)
CREATE UNDIRECTED EDGE ENTITY_NAME_ENTITY(FROM Entity_Name, TO Entity)''')
Perfect! If you navigate to GraphStudio, you’ll be able to see your schema.
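If you’d like to see the shape of the schema at a glance, here’s a rough sketch that generates the same seven GSQL statements from two small Python structures. The names VERTICES, EDGES, and build_schema are made up for illustration; the generated text is meant to match the statements passed to conn.gsql.

```python
# Vertex types and the name of each one's primary ID, as in the schema.
VERTICES = {
    "Class": "class",
    "Document": "document",
    "Entity": "entity",
    "Entity_Name": "entity_name",
}

# Undirected edge types as (edge name, FROM vertex, TO vertex).
EDGES = [
    ("CLASS_DOCUMENT", "Class", "Document"),
    ("DOCUMENT_ENTITY_NAME", "Document", "Entity_Name"),
    ("ENTITY_NAME_ENTITY", "Entity_Name", "Entity"),
]


def build_schema(vertices, edges) -> str:
    """Return one GSQL statement per line for the given vertex and edge types."""
    lines = []
    for name, primary_id in vertices.items():
        lines.append(
            f"CREATE VERTEX {name}(PRIMARY_ID {primary_id} STRING) "
            'WITH PRIMARY_ID_AS_ATTRIBUTE="true"'
        )
    for name, source, target in edges:
        # Catch typos early: both endpoints must be declared vertex types.
        if source not in vertices or target not in vertices:
            raise ValueError(f"{name} references an undeclared vertex type")
        lines.append(f"CREATE UNDIRECTED EDGE {name}(FROM {source}, TO {target})")
    return "\n".join(lines)


schema = build_schema(VERTICES, EDGES)
print(schema)
# Usage with the connection from Step I: conn.gsql(schema)
```

Reading the EDGES list top to bottom gives the path through the graph: Class to Document, Document to Entity_Name, Entity_Name to Entity.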
Part III: Extract Entities from the Documents
Step I: Import Libraries
First, let’s import the necessary libraries, including SpaCy.
import spacy
import en_core_web_sm
Step II: Extract Entities from the Document
Now, we’ll extract the entities from the document. First, we’ll load the en_core_web_sm model and assign it to nlp. This nlp object is what we’ll use to run Named Entity Recognition.
nlp = en_core_web_sm.load()
Next, we’ll run nlp on the notes. Here, notes is a string containing the text of the document.
article = nlp(notes)
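The notes variable needs to hold the document’s text before nlp can run on it. As a minimal sketch, assuming the notes have been exported to a plain-text file, something like this could load them. The load_notes helper and the file name are hypothetical, and collapsing whitespace is just one possible cleanup choice.

```python
from pathlib import Path


def load_notes(path) -> str:
    """Read a plain-text notes file and collapse runs of whitespace."""
    text = Path(path).read_text(encoding="utf-8")
    # Splitting on whitespace and rejoining with single spaces removes
    # hard line breaks and repeated blanks left over from an export.
    return " ".join(text.split())


# Usage (hypothetical file name):
# notes = load_notes("final_review_worksheet.txt")
# article = nlp(notes)
```

After running nlp, each item in article.ents has a text attribute (the entity as written) and a label_ attribute (its type).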
Step III: Upsert Data into Graph
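Finally, we’ll upsert the data into the graph. For each entity SpaCy finds, we’ll upsert its Entity (the label SpaCy assigned), then its Entity_Name (the text itself), and then the edges linking the Entity_Name to its Entity and to the document. For example, “the Whigs” is the Entity_Name, and the Entity is “Person.” Note that SpaCy writes its labels in uppercase, so the people label is stored as PERSON.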
if article.ents:
    for ent in article.ents:
        conn.upsertVertex("Entity", ent.label_, attributes={"entity": ent.label_})
        conn.upsertVertex("Entity_Name", ent.text, attributes={"entity_name": ent.text})
        conn.upsertEdge("Entity_Name", ent.text, "ENTITY_NAME_ENTITY", "Entity", ent.label_)
        conn.upsertEdge("Document", "Final Review Worksheet", "DOCUMENT_ENTITY_NAME", "Entity_Name", ent.text)
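Since the same entity often appears many times in one document, the loop above can send the same upsert repeatedly. Here’s a rough sketch of one way to cut that down: collect the unique vertices and edges first, then send each once with the same upsertVertex and upsertEdge calls. The plan_upserts and apply_plan helpers are hypothetical names, not pyTigerGraph functions.

```python
def plan_upserts(entities, document_id):
    """Group (text, label) pairs into the unique vertices and edges to upsert."""
    plan = {
        "Entity": set(),                # entity labels
        "Entity_Name": set(),           # entity texts
        "ENTITY_NAME_ENTITY": set(),    # (text, label) pairs
        "DOCUMENT_ENTITY_NAME": set(),  # (document, text) pairs
    }
    for text, label in entities:
        plan["Entity"].add(label)
        plan["Entity_Name"].add(text)
        plan["ENTITY_NAME_ENTITY"].add((text, label))
        plan["DOCUMENT_ENTITY_NAME"].add((document_id, text))
    return plan


def apply_plan(conn, plan):
    """Send each unique vertex and edge once, with the same calls as the loop."""
    for label in sorted(plan["Entity"]):
        conn.upsertVertex("Entity", label, attributes={"entity": label})
    for name in sorted(plan["Entity_Name"]):
        conn.upsertVertex("Entity_Name", name, attributes={"entity_name": name})
    for name, label in sorted(plan["ENTITY_NAME_ENTITY"]):
        conn.upsertEdge("Entity_Name", name, "ENTITY_NAME_ENTITY", "Entity", label)
    for doc, name in sorted(plan["DOCUMENT_ENTITY_NAME"]):
        conn.upsertEdge("Document", doc, "DOCUMENT_ENTITY_NAME", "Entity_Name", name)


# Usage with the objects from the steps above:
# pairs = [(ent.text, ent.label_) for ent in article.ents]
# apply_plan(conn, plan_upserts(pairs, "Final Review Worksheet"))
```

The resulting graph should be the same either way, because upserting an existing vertex or edge a second time doesn’t create a duplicate.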
This may take some time to run. Once it’s completed, you can check out your graph in GraphStudio. For example, we can look at the Person Entity vertex and find all of the Entity_Name vertices connected to it, such as Jefferson.
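The same lookup can also be done from Python. The sketch below assumes that pyTigerGraph’s getEdges returns a list of dictionaries with from_id, to_id, and to_type keys; that shape is an assumption here, so inspect one result before relying on it. The connected_names helper is a hypothetical name.

```python
def connected_names(edges):
    """Return the sorted, unique Entity_Name IDs found in a list of edge dicts."""
    names = set()
    for edge in edges:
        # Keep only edges whose far end is an Entity_Name vertex.
        if edge.get("to_type") == "Entity_Name":
            names.add(edge["to_id"])
    return sorted(names)


# Usage with the connection from Part II (PERSON is SpaCy's label for people):
# edges = conn.getEdges("Entity", "PERSON", "ENTITY_NAME_ENTITY")
# print(connected_names(edges))
```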
Fantastic! Now that our graph is ready, we can write queries and create applications on top of it.
Part IV: Congrats + Resources
Congrats! You’ve created a notes graph using SpaCy and TigerGraph. Look out for future blogs that build applications on top of it!
In the meantime, if you have any questions, ask in the TigerGraph Discord and Community Page.
Thank you!