Back to all blogs
Project Blog

Hyperloo: Building a Knowledge Graph for University Courses

How I built an algorithmically generated knowledge graph for UWaterloo programs and courses

NLPWeb ScrapingKnowledge GraphNext.js
Hyperloo: Building a Knowledge Graph for University Courses

Hyperloo is a knowledge graph that maps out all topics, courses, and degrees at the University of Waterloo. Traditional degree structures tend to be abstract, making it difficult to visualize the interconnected nature of knowledge. I wanted to change that. What if you could represent an entire degree visually - showing how concepts interlink and build on one another? That question led me deep into knowledge graphs, which I found incredibly powerful as a tool for structuring and exploring complex domains.

Why Knowledge Graphs?

Concepts are inherently nested objects. Take nuclear physics - understanding the entire field seems daunting. But break it down, and it's really just a collection of interconnected subtopics. Each of those subtopics can be further divided, creating a layered structure that makes even the most complex subjects feel approachable. Knowledge graphs embody this philosophy: no topic is truly out of reach if broken down correctly. Seeing how subjects connect makes learning more intuitive, empowering students to explore topics they once thought were beyond them.

One of my key inspirations was the Socratica graph matching from the 2023 Socratica Symposium. As graph tooling continues to improve, I believe we'll see knowledge graphs become a standard way to represent and navigate information.

Building Hyperloo

Creating Hyperloo was a long and technically challenging process. The first step was data collection: scraping every Waterloo syllabus to form a base knowledge corpus. We used Selenium to automate the scraping, parsing programs, courses, and subtopics iteratively. This required multiple browser automation scripts to extract structured data from unstructured web pages. The process was labor-intensive, but necessary.

Once we had a structured corpus, the next challenge was transforming raw syllabus text into meaningful graph data. We trained a custom NLP model using SpaCy to extract key information. The NLP pipeline was relatively simple - a classification model trained with labeled examples to recognize important syllabus components. After multiple iterations, we achieved ~92% accuracy, which was sufficient for our needs. The model's purpose wasn't perfect precision but rather rough approximation to filter syllabus content into useful knowledge nodes.

With the extracted information, we structured the data into a nested JSON format that could be easily visualized as a graph. To generate these structured JSON objects, we used OpenAI's Batch API, processing large volumes of text and distilling them into a structured hierarchy. The result was a massive JSON-L file representing Waterloo's academic knowledge as a network of interconnected topics.

Finally, we built the front end using React, leveraging the GraphForce component to render the knowledge graph. While some customization was required to optimize the visualization, the hardest part of the project was the data transformation itself - getting from raw syllabi to structured knowledge nodes.

Impact and Utility

Hyperloo has already gained traction. After sharing it on LinkedIn and Twitter, it received over 250 likes on LinkedIn and 100+ on Twitter. More importantly, it was bookmarked 17 times on Twitter - an indicator that people actually intend to use it as a reference.

For Waterloo students, the utility is clear. Hyperloo provides a structured, visual way to explore degree programs, understand prerequisite relationships, and dive into any topic of interest. Even if only a few dozen students actively use it, that's a meaningful outcome for me.

But the implications go beyond Waterloo. With Hyperloo, anyone, anywhere in the world, can effectively trace the structure of a Waterloo degree and use it as a self-learning roadmap. Even though it's not a complete curriculum, the ability to map out an entire field and navigate it freely is incredibly powerful. In theory, a student in a third world nation or any remote region could use Hyperloo, coupled with Perplexity AI and other online resources, to pursue an entire degree's worth of knowledge for free.

What's Next?

Before starting any project, I ask myself: if I were to disappear tomorrow, what would I leave behind? Hyperloo is one of those projects that feels genuinely useful - not just to me, but to the broader world. If it grows, it could be a foundational resource for structured, open-access education. That's the kind of impact worth building for.

Check out the production version of Hyperloo here.