What is a knowledge graph?

The SOLI project is more than just a knowledge graph, but in order to understand how you can best use SOLI, it’s important to truly understand what a knowledge graph is.

In this article, we’ll cover the basics of knowledge graphs, including what they are, how they work, and why they’re so important, especially in the context of software that uses generative AI.

By the time you’re done, you will be able to answer the following questions:

  • What is a knowledge graph?
  • What is a taxonomy?
  • What is an ontology?
  • How are knowledge graphs represented?
  • What are common examples of knowledge graph applications in legal tech?
  • What are some challenges and considerations specific to legal knowledge graphs?

What is a knowledge graph?

A knowledge graph is a structured collection of entities, their properties, and the relationships between them, represented in a way that both humans and machines can understand and process.

If you’ve ever scribbled boxes and arrows on paper or a whiteboard to map out ideas or concepts, then you’re already familiar with the basic idea of a knowledge graph.

Knowledge graphs are powerful tools for thinking about, organizing, and interconnecting complex information, making them particularly useful in fields like the law where relationships between concepts are crucial.

Because SOLI is focused on the standardized representation and communication of legal information, knowledge graphs are a natural fit for the project. But there are many different ways to build and use knowledge graphs, so it’s important to understand the basics.

What is a taxonomy?

You may have heard the term “taxonomy” before, especially in the context of biology or library systems.

Taxonomies are hierarchical classification systems that organize concepts into categories and subcategories. Most taxonomies are tree-like structures in which each child node has exactly one parent node, much like an organizational chart, although some allow a child to sit under more than one parent.

In the legal world, taxonomies are often used to classify legal concepts, document types, practice areas, and more. SOLI incorporates taxonomies to provide a structured way of organizing legal information.

For example, a legal taxonomy might categorize ‘Intellectual Property Law’ into subcategories like ‘Patent Law’, ‘Trademark Law’, and ‘Copyright Law’.

Some people refer to taxonomies as knowledge graphs, and while taxonomies can be part of knowledge graphs, they are not knowledge graphs themselves. Taxonomies have limitations. Because they only capture one type of relationship between concepts (usually a hierarchical “is-a” or “part-of” relationship), they can be rigid and inflexible, making it difficult to represent complex relationships between concepts.
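To make the tree structure concrete, here is a minimal Python sketch of the Intellectual Property example above. It assumes a strict tree (one parent per category) and adds a hypothetical root category called "Law"; the labels are illustrative and are not SOLI identifiers.

```python
# Hypothetical taxonomy: each category maps to its single parent.
# "Law" is an assumed root added for illustration only.
PARENT_OF = {
    "Intellectual Property Law": "Law",
    "Patent Law": "Intellectual Property Law",
    "Trademark Law": "Intellectual Property Law",
    "Copyright Law": "Intellectual Property Law",
}


def ancestors(category: str) -> list[str]:
    """Walk up the tree from a category to the root, nearest parent first."""
    path = []
    while category in PARENT_OF:
        category = PARENT_OF[category]
        path.append(category)
    return path


def children(category: str) -> list[str]:
    """Return the direct subcategories of a category, sorted alphabetically."""
    return sorted(c for c, parent in PARENT_OF.items() if parent == category)


print(ancestors("Patent Law"))
print(children("Intellectual Property Law"))
```

Notice that the only question this structure can answer is "what sits above or below this category?". That is the limitation described above: there is nowhere to record any other kind of relationship.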

What is an ontology?

An ontology goes beyond a simple taxonomy by not only classifying concepts but also defining arbitrary relationships between them.

Ontologies allow for more complete, formal representations of concepts within a domain and the relationships between those concepts. In SOLI, the ontology helps to not just list concepts, but to explain how they are related to each other.

Ontologies form the structural framework of knowledge graphs, defining the types of entities and relationships that can exist within the graph. For example, an ontology in SOLI might define that a ‘Court Decision’ is issued by a ‘Judge’, relates to a specific ‘Case’, cites ‘Legal Precedents’, and interprets certain ‘Statutes’.

This allows us to describe a Real Estate Lease not just as a Contract or Real Estate Contract, but as a type of Document between a Lessor and a Lessee that conveys a Property for a Period of Time in exchange for Rent.

As the capitalized terms in the example above suggest, ontologies are ideal for capturing how nouns and verbs relate to each other in a certain context. This makes them well suited to legal data, where carefully expressing relationships in words is crucial.
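One way to picture an ontology is as a small schema that names each relationship and says which types of entities it may connect. The sketch below encodes the Court Decision example in Python. The class name, field names, and the idea of using the schema as a validity check are assumptions made for illustration; OWL itself uses this kind of information for inference rather than as strict validation rules, and this is not how SOLI is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationType:
    """A named relationship and the entity types it is allowed to connect."""
    name: str
    subject_type: str
    object_type: str


# Hypothetical schema built from the 'Court Decision' example in the text.
ONTOLOGY = {
    rel.name: rel
    for rel in [
        RelationType("is issued by", "Court Decision", "Judge"),
        RelationType("relates to", "Court Decision", "Case"),
        RelationType("cites", "Court Decision", "Legal Precedent"),
        RelationType("interprets", "Court Decision", "Statute"),
    ]
}


def is_valid(subject_type: str, relation: str, object_type: str) -> bool:
    """Check whether the schema allows this relationship between these types."""
    rel = ONTOLOGY.get(relation)
    return (
        rel is not None
        and rel.subject_type == subject_type
        and rel.object_type == object_type
    )


print(is_valid("Court Decision", "is issued by", "Judge"))  # allowed by the schema
print(is_valid("Judge", "is issued by", "Court Decision"))  # wrong direction
```

Unlike the taxonomy, this structure can hold any number of relationship types, and each one carries its own meaning and direction.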

How are knowledge graphs represented?

Every knowledge graph can be expressed as a simple list of sentences with a special pattern:

  • a subject, which is a noun or noun phrase that represents the entity being described
  • a predicate, which is a verb or verb phrase that represents the relationship between the subject and object
  • an object, which is a noun or noun phrase that represents the entity being related to the subject

This pattern is known as a “triple” and is the basic building block of a knowledge graph.

Each triple represents a relationship between two concepts, with the subject and object representing the entities being related and the predicate representing the relationship between them.

For example:

Subject: Court Decision
Predicate: is issued by
Object: Judge
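In code, a triple can be as simple as a three-part tuple, and a knowledge graph can be a list of them. The following Python sketch stores a few triples taken from the examples in this article and looks them up by any combination of subject, predicate, and object. The `Triple` type and `match` function are hypothetical names chosen for this illustration.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# A tiny graph using relationships mentioned in this article.
GRAPH = [
    Triple("Court Decision", "is issued by", "Judge"),
    Triple("Court Decision", "relates to", "Case"),
    Triple("Court Decision", "interprets", "Statute"),
    Triple("Real Estate Lease", "is a type of", "Document"),
]


def match(
    graph: list[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> list[Triple]:
    """Return triples matching every part that was given; None is a wildcard."""
    return [
        t
        for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Everything the graph says about a Court Decision.
for triple in match(GRAPH, subject="Court Decision"):
    print(triple.subject, triple.predicate, triple.obj)
```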

In practice, because knowledge graphs are often large and complex structures, they are typically represented in condensed, machine-readable formats that make these triples easier to store and search at scale.

For example, SOLI’s knowledge graph is stored in the Web Ontology Language (OWL), a W3C standard for representing ontologies and knowledge graphs on the web. This allows SOLI’s knowledge graph to be easily shared, queried, and integrated with other systems.
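To give a feel for what "machine-readable" means here, the sketch below writes triples out as N-Triples, one of the simplest standard text formats for graph data in the same family of web standards as OWL. It is only an illustration: the `https://example.org/legal/` namespace and the helper names are made up, and this is not the format or the identifiers that SOLI itself uses.

```python
BASE = "https://example.org/legal/"  # hypothetical namespace, not SOLI's


def iri(label: str) -> str:
    """Turn a human-readable label into an IRI wrapped in angle brackets."""
    return "<" + BASE + label.strip().replace(" ", "_") + ">"


def to_ntriples(triples: list[tuple[str, str, str]]) -> str:
    """Serialize (subject, predicate, object) labels, one statement per line."""
    return "\n".join(f"{iri(s)} {iri(p)} {iri(o)} ." for s, p, o in triples)


triples = [
    ("Court Decision", "is issued by", "Judge"),
    ("Court Decision", "interprets", "Statute"),
]

print(to_ntriples(triples))
```

Each output line is one triple: three identifiers followed by a period. Giving every concept a globally unique identifier, instead of a bare label, is what lets different systems agree that they are talking about the same thing.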

Knowledge graphs can be queried using specialized languages like SPARQL (SPARQL Protocol and RDF Query Language), which allows for complex queries across the graph structure.
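The core idea behind a SPARQL query is pattern matching: you write triples with blanks (variables) in them, and the query engine finds every way to fill in the blanks. The Python sketch below imitates that idea on a toy graph. It is not a SPARQL engine, and the decisions, judges, and statutes in it are made-up placeholders.

```python
# The query this sketch imitates, written for a hypothetical "ex:" vocabulary:
#   SELECT ?decision ?judge WHERE {
#     ?decision ex:interprets ex:StatuteA .
#     ?decision ex:isIssuedBy ?judge .
#   }

GRAPH = [
    ("Decision 1", "interprets", "Statute A"),
    ("Decision 1", "is issued by", "Judge X"),
    ("Decision 2", "interprets", "Statute B"),
    ("Decision 2", "is issued by", "Judge Y"),
    ("Decision 3", "interprets", "Statute A"),
    ("Decision 3", "is issued by", "Judge Y"),
]


def is_var(term: str) -> bool:
    """Variables are written with a leading question mark, as in SPARQL."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend a binding so the pattern equals the triple, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            # Bind the variable, or fail if it is already bound to another value.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def query(graph, patterns):
    """Find every variable binding that satisfies all patterns at once."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


results = query(
    GRAPH,
    [
        ("?decision", "interprets", "Statute A"),
        ("?decision", "is issued by", "?judge"),
    ],
)
for row in results:
    print(row["?decision"], "was issued by", row["?judge"])
```

Because `?decision` appears in both patterns, only decisions that satisfy both are returned. This is how a single query can follow several relationships across the graph.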

What are common examples of knowledge graph applications in legal tech?

  1. Contract Management: SOLI’s knowledge graph can be used to standardize data within Contract Lifecycle Management (CLM) systems. It can annotate contract text, making contracts machine-readable and directly linked to other concepts in the SOLI knowledge graph.

  2. Case Management: The knowledge graph can standardize data in case or matter management systems, either at the record level or within the text of legal documents like complaints, motions, or orders.

  3. Legal Research: Knowledge graphs can power advanced legal research tools, helping lawyers find relevant cases, statutes, and legal concepts more efficiently.

  4. Compliance: By mapping relationships between laws, regulations, and business processes, knowledge graphs can help organizations stay compliant with complex legal requirements.

  5. RAG in Generative AI: SOLI’s rich, expert-vetted knowledge graph can be used in Retrieval-Augmented Generation (RAG) for generative AI to produce more accurate and consistent outputs.

  6. Legal Reasoning and Decision Support: Knowledge graphs can assist in legal reasoning by providing a structured representation of legal concepts and their relationships. This can aid in tasks such as case outcome prediction, identifying relevant precedents, or assessing the strength of legal arguments.
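Two of the applications above, annotating text and supplying context for Retrieval-Augmented Generation, can be sketched in a few lines of Python. This is a deliberately simplified illustration: the concept identifiers, the "is party to" and "pays" relationships, and the function names are all hypothetical, and a production system would use far more robust matching than a word lookup.

```python
import re

# Hypothetical labels and identifiers; real SOLI identifiers differ.
CONCEPTS = {
    "lease": "concept:Lease",
    "lessor": "concept:Lessor",
    "lessee": "concept:Lessee",
    "rent": "concept:Rent",
}

# Hypothetical triples loosely based on the Real Estate Lease example.
TRIPLES = [
    ("concept:Lease", "is a type of", "concept:Document"),
    ("concept:Lessor", "is party to", "concept:Lease"),
    ("concept:Lessee", "is party to", "concept:Lease"),
    ("concept:Lessee", "pays", "concept:Rent"),
]


def annotate(text: str) -> list[tuple[str, str]]:
    """Return (matched word, concept id) pairs for known labels, in text order."""
    found = []
    for m in re.finditer(r"[A-Za-z]+", text):
        concept = CONCEPTS.get(m.group().lower())
        if concept:
            found.append((m.group(), concept))
    return found


def retrieve(concept_ids: list[str]) -> list[tuple[str, str, str]]:
    """Return every triple that mentions one of the given concepts."""
    wanted = set(concept_ids)
    return [t for t in TRIPLES if t[0] in wanted or t[2] in wanted]


def build_prompt(question: str) -> str:
    """Prepend graph facts about the concepts in a question to the question."""
    concept_ids = [concept for _, concept in annotate(question)]
    facts = "\n".join(f"- {s} {p} {o}" for s, p, o in retrieve(concept_ids))
    return f"Known facts:\n{facts}\n\nQuestion: {question}"


print(build_prompt("Who pays rent?"))
```

The resulting prompt would then be passed to a generative model, so that its answer is grounded in vetted relationships from the graph instead of relying only on what the model happens to remember.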

What are some challenges and considerations specific to legal knowledge graphs?

Creating and maintaining legal knowledge graphs comes with unique challenges:

  1. Frequent Updates: Laws, regulations, and legal precedents are constantly evolving. Knowledge graphs need to be regularly updated to reflect these changes accurately.

  2. Jurisdictional Complexity: Legal systems vary significantly across different jurisdictions. Representing these differences accurately in a knowledge graph requires careful modeling.

  3. Provenance and Authority: In law, the source and authority of information are crucial. Knowledge graphs need to maintain clear links to authoritative sources and indicate the relative weight or precedential value of different pieces of information.

  4. Ambiguity and Interpretation: Legal language can often be ambiguous or open to interpretation. Representing this nuance in a structured knowledge graph can be challenging.

SOLI addresses these challenges by leveraging expert curation, automated data extraction, machine learning, and community feedback to build and maintain a high-quality legal knowledge graph.
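Several of these challenges come down to the fact that a bare triple is not enough: you also need to know where a statement came from, where it applies, and when. The Python sketch below shows one possible way to attach that information to each statement. The field names are assumptions for illustration, and the jurisdictions, sources, and rules are invented placeholders, not real law and not SOLI’s data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """A triple plus the context needed to trust and scope it."""
    subject: str
    predicate: str
    obj: str
    jurisdiction: str
    source: str                      # where the statement comes from
    valid_from: date
    valid_to: Optional[date] = None  # None means still in force


# Invented placeholder data for illustration only.
STATEMENTS = [
    Statement("Notice Period", "is at least", "30 days",
              "Jurisdiction A", "Placeholder Statute 1",
              date(2020, 1, 1), date(2022, 12, 31)),
    Statement("Notice Period", "is at least", "60 days",
              "Jurisdiction A", "Placeholder Statute 1 (amended)",
              date(2023, 1, 1)),
    Statement("Notice Period", "is at least", "14 days",
              "Jurisdiction B", "Placeholder Regulation 7",
              date(2019, 6, 1)),
]


def in_force(statements, jurisdiction: str, on: date):
    """Return statements that apply in a jurisdiction on a given date."""
    return [
        s
        for s in statements
        if s.jurisdiction == jurisdiction
        and s.valid_from <= on
        and (s.valid_to is None or on <= s.valid_to)
    ]


for s in in_force(STATEMENTS, "Jurisdiction A", date(2024, 1, 1)):
    print(s.subject, s.predicate, s.obj, "| source:", s.source)
```

Keeping superseded statements with an end date, instead of overwriting them, means the graph can still answer questions about what the rule was at an earlier point in time.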

Interoperability and SOLI

One of the key benefits of knowledge graphs is their potential to improve interoperability between different systems and datasets. By providing a standardized representation of legal concepts and relationships, SOLI’s knowledge graph can serve as a common language for legal data.

This standardization can facilitate:

  • Easier data exchange between different legal software systems
  • More accurate and consistent legal analytics across different datasets
  • Improved integration of AI and machine learning tools in legal workflows
  • Enhanced collaboration between different legal entities, such as law firms, courts, and regulatory bodies
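Here is a small Python sketch of what a "common language" can look like in practice. Two systems use different local labels for the same practice areas, and each label is mapped to a shared concept identifier so that the records can be combined. The systems, labels, records, and identifiers are all hypothetical.

```python
from collections import Counter

# Each system maps its own local labels to shared (hypothetical) concept ids.
FIRM_LABELS = {
    "IP - Patents": "concept:PatentLaw",
    "IP - TM": "concept:TrademarkLaw",
}
COURT_LABELS = {
    "Patent": "concept:PatentLaw",
    "Trademark": "concept:TrademarkLaw",
}

# Made-up records from two systems with different field names.
firm_matters = [
    {"id": "F-1", "area": "IP - Patents"},
    {"id": "F-2", "area": "IP - TM"},
]
court_cases = [
    {"id": "C-1", "type": "Patent"},
    {"id": "C-2", "type": "Patent"},
]


def normalize(records, field, mapping):
    """Replace a local label with the shared concept id (None if unmapped)."""
    return [{"id": r["id"], "concept": mapping.get(r[field])} for r in records]


combined = (
    normalize(firm_matters, "area", FIRM_LABELS)
    + normalize(court_cases, "type", COURT_LABELS)
)

# Once both datasets share identifiers, they can be analyzed together.
counts = Counter(record["concept"] for record in combined)
print(counts)
```

Neither system has to change how it stores its own data. Each one only needs a mapping to the shared identifiers, which is what makes exchange and combined analytics practical.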

Conclusion

Understanding knowledge graphs is crucial for leveraging the full potential of SOLI in legal technology applications. By providing a standardized, interconnected representation of legal concepts, SOLI’s knowledge graph enables more efficient, accurate, and interoperable legal tech solutions.

Whether you’re a law firm looking to standardize information across your systems, a legal tech provider aiming to improve interoperability, or a statistician trying to understand trends in our court systems, SOLI’s knowledge graph provides a robust foundation for your work.

Further Reading

Now that you understand knowledge graphs broadly, you can dive into the specifics of how the SOLI knowledge graph is structured.

Continue learning with our SOLI Knowledge Graph Overview.