A world-first German AI-based recommendation system for lawyers

How Austrian legal publisher Manz leveraged deepset Cloud to significantly reduce legal research efforts through semantic search

Lawyers, judges, and other legal professionals must build a well-rounded picture of a case by looking at related incidents, making sure that no precedent escapes attention. Depending on the complexity of the case, this can require searching hundreds or even thousands of documents – a considerable investment in time and resources.

Manz, a legal publishing house based in Austria, leveraged deepset Cloud to build a recommendation system that uses a language model to measure semantic document similarity, speeding up research workflows and reducing effort and the associated cost.

Information management

Manz’s online legal database, RDB Rechtsdatenbank, is a paid offering that assists legal professionals in handling cases. Home to over three million documents, from legal rulings and contractual clauses to other published content, it is updated daily.

The basic search function in RDB lets users input a query and then returns all documents from the database that match the search terms. However, as Alexander Feldinger, product manager at Manz, explained: "Usually one document doesn’t answer your question. It’s important to look at all the relevant documents — you don’t want to miss anything because that could have a negative effect on your work."

Modern language models can determine similarity on the basis of semantics — if you find a document that is relevant for your use case, you receive recommendations of contextually related documents. A semantic search-based system returns a greater number of high-quality results at a lower cost than traditional approaches like knowledge graphs. Not only that, it can also easily handle new documents – processing them with the underlying language model and seamlessly incorporating them into the similarity search feature, doing away with the need to annotate each new document manually.
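To make the idea concrete, here is a minimal sketch of embedding-based document similarity using the open-source sentence-transformers library. It is not Manz’s production setup: the model name and the example documents are placeholders, and the real system relies on a fine-tuned German legal model running inside deepset Cloud.

```python
# Illustrative sketch of embedding-based document similarity
# (not Manz's production pipeline): each document is encoded once,
# and related documents are found by comparing embedding vectors.
from sentence_transformers import SentenceTransformer, util

# Any pre-trained sentence embedding model could stand in here;
# the model name below is just an example.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

documents = [
    "Urteil zur Haftung des Vermieters bei Wasserschaden",
    "Entscheidung über Schadenersatz nach Rohrbruch in Mietwohnung",
    "Vertragsklausel zur Kündigungsfrist bei befristeten Mietverträgen",
]

# Encode the whole corpus into dense vectors.
doc_embeddings = model.encode(documents, convert_to_tensor=True)

# Given the document a user is currently reading, rank all others
# by cosine similarity and recommend the closest matches.
query_embedding = doc_embeddings[0]
scores = util.cos_sim(query_embedding, doc_embeddings)[0]
ranked = sorted(
    ((float(s), d) for s, d in zip(scores, documents)),
    reverse=True,
)
for score, doc in ranked[1:]:  # skip the document itself
    print(f"{score:.2f}  {doc}")

# A new document only needs to be encoded and added to the index;
# no manual annotation or rule writing is required.
```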

Language models used in semantic search learn a notion of semantic similarity by being trained on large collections of in-domain documents. If trained properly, a model will perform much better than even the most sophisticated keyword-based approaches.

“It’s all about speed, it’s all about efficiency — if lawyers need less time to research their cases, they have more time to acquire new clients,” says Feldinger.

Adapting a general-purpose language model to a specific domain

Legal terminology and text structure differ from ordinary language, and for a model to successfully handle legal language, it needs to account for these idiosyncrasies. Natural language processing (NLP) uses a process known as fine-tuning, or domain adaptation, to specialize a general-purpose, pre-trained language model like BERT for domain-specific jargon. While LEGAL-BERT provided such a model for English, there was no precedent for a German legal language model.

Manz chose deepset Cloud specifically for deepset’s experience in applied NLP and its expertise with German language models. deepset also provided guidance for model fine-tuning and annotation – especially important for legal language, where it is nearly impossible for laypeople to correctly identify what constitutes similarity between documents.

Prototyping, evaluation, and quick demoing in deepset Cloud

Adapting a freely available pre-trained model through fine-tuning may require a significant amount of high-quality annotation and labeling, so the expertise of Manz’s legal professionals was indispensable throughout the process. To provide the data needed for fine-tuning the German BERT model to the legal domain, three professional annotators from Manz labeled 11,000 data points. Once the data was annotated, Manz started building the application in deepset Cloud. The team continuously refined the legal language model in a two-step process: first training on a larger, publicly available dataset by LAVIS-NLP, then on a subset of the Manz data points.
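The sketch below illustrates what such a two-stage fine-tuning could look like with the open-source sentence-transformers library. The base model, file names, data format, and loss function are assumptions made for the example; the article does not disclose deepset’s exact training setup.

```python
# Hedged sketch of two-stage similarity fine-tuning with sentence-transformers.
# The dataset files, column layout, base model, and loss are assumptions for
# illustration; the article only states that a public dataset was used first,
# followed by a subset of Manz's 11,000 annotated data points.
import csv
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

def load_pairs(path):
    """Read (text_a, text_b, similarity score in [0, 1]) triples from a CSV file."""
    examples = []
    with open(path, newline="", encoding="utf-8") as f:
        for text_a, text_b, score in csv.reader(f):
            examples.append(InputExample(texts=[text_a, text_b], label=float(score)))
    return examples

# Start from a general-purpose German BERT model (example choice).
model = SentenceTransformer("deepset/gbert-base")
loss = losses.CosineSimilarityLoss(model)

# Stage 1: fine-tune on a larger, publicly available similarity dataset.
public_loader = DataLoader(load_pairs("public_legal_pairs.csv"), shuffle=True, batch_size=16)
model.fit(train_objectives=[(public_loader, loss)], epochs=1, warmup_steps=100)

# Stage 2: continue training on the expert-annotated in-house pairs.
manz_loader = DataLoader(load_pairs("manz_annotated_pairs.csv"), shuffle=True, batch_size=16)
model.fit(train_objectives=[(manz_loader, loss)], epochs=1, warmup_steps=100)

model.save("legal-german-similarity-model")
```

The idea behind the two stages is that the smaller, expert-annotated set only needs to refine a model that has already learned a general notion of similarity from the larger public dataset.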

Manz used deepset Cloud’s capabilities for testing and evaluating the system before moving to production. Quantitative evaluation revealed that the returned results were 20% more accurate than existing baselines. To evaluate the system qualitatively, Manz presented its results to a selection of legal professionals, who were asked to decide whether, in the context of a given query, they would click on a suggested document or not.

While qualitative evaluation is often overlooked, it provides a more comprehensive insight into how real-world end users will perceive and interact with the final product. deepset Cloud’s built-in end-user search interface made it easy to test and demonstrate the models after each training round, and to maintain an overview of how the quality of the model improved with each iteration. Feldinger explained that it allowed him to “show my superiors what we had done so far and how it worked, before we integrated it into our system. I could clearly demonstrate the advantage of the language model over our previous system.”

Once Feldinger and his team were satisfied with the quality of the fine-tuned model, they moved towards integrating it into RDB. The integration, using deepset Cloud REST APIs, was “very straightforward and easy,” says Feldinger. “If you have a little bit of experience with software development, then it’s going to work.”
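As an illustration of what such a REST integration can look like, here is a hedged sketch of querying a deployed pipeline from Python. The workspace name, pipeline name, endpoint path, and payload shape are assumptions made for the example and should be checked against the current deepset Cloud API documentation.

```python
# Hedged sketch of calling a deployed pipeline over a REST API.
# The endpoint path, payload shape, and response format below are
# illustrative assumptions, not a definitive description of the
# deepset Cloud API.
import os
import requests

API_KEY = os.environ["DEEPSET_CLOUD_API_KEY"]      # hypothetical env variable
BASE_URL = "https://api.cloud.deepset.ai/api/v1"   # assumed base URL
WORKSPACE = "manz-demo"                            # hypothetical workspace name
PIPELINE = "legal-similarity"                      # hypothetical pipeline name

def find_similar(query_text: str, top_k: int = 30):
    """Send a query to the deployed pipeline and return the parsed JSON response."""
    response = requests.post(
        f"{BASE_URL}/workspaces/{WORKSPACE}/pipelines/{PIPELINE}/search",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"queries": [query_text], "params": {"Retriever": {"top_k": top_k}}},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    results = find_similar("Haftung des Vermieters bei Wasserschaden")
    print(results)
```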

Faster, more accurate results

Upon viewing a document, users now receive recommendations for thirty semantically related documents from the database, making for a faster and more enjoyable research process. What’s more, the similarity-based recommendation system returns far more pertinent results than the leading competitor’s implementation — though both have access to the same pool of documents.

“NLP is the way to go — everything else is time-consuming and costs a lot of money. It’s far easier to use a modern language model,” says Feldinger. And while data annotation currently still constitutes the bulk of an applied NLP project, he’s convinced that this will change soon: “Techniques to facilitate both annotation and training will make the process easier and faster in the near future.”

What’s next for Manz and NLP?

Manz’s ultimate goal is to become a leader in applying cutting-edge NLP technologies to the German-language legal domain. Not only is Manz planning to improve the similarity feature by retraining and updating it regularly, it also intends to collect more user feedback about the quality of the search and to place the similarity recommendations more prominently within RDB.

Thanks to the composable nature of modern NLP systems, the work on semantic document similarity can serve as a basis for a range of other products for Manz. In addition to licensing the model to other organizations, Manz can use it as a foundation for a question answering feature on its site, or to implement more fine-grained, paragraph-based recommendation systems.