
Software development is now widely understood as a cyclical process, with frequent iterations of planning, coding, testing, and deploying leading to the best results. This is because the way a system is designed to work can differ vastly from how it’s used in the real world. Developers can therefore only get a realistic picture of their product’s ability to solve an actual problem by subjecting it to frequent tests. Once they have evidence of the system’s real-world behavior, they can use it to tweak and tune the system.
There’s no reason not to apply these same principles to projects that incorporate natural language processing (NLP). But because developers can find it hard to wrap their heads around the complexities of modern NLP as a whole, they might not follow best practices when implementing NLP projects. At deepset, we’ve seen many teams develop large-scale NLP systems in a linear fashion, only to realize, after months of development, that their application didn’t solve the right problem. Far too often, those projects fizzle out, wasting budget and discouraging anyone involved from ever trying again.
The solution? Deploy often, test early, and prioritize involving real-world users from the very beginning. To help you think about the NLP development process from start to finish, we have developed this handy guide to the implementation cycle in applied NLP.
Unlike theoretical NLP, applied NLP focuses on providing developers with the tools they need to leverage pre-trained language models to benefit their organization in practice. People just learning about theoretical NLP are sometimes intimidated by the complexity of large, Transformer-based language models. These complicated fundamentals are less of a barrier in applied NLP: you will rarely be training a language model entirely from scratch.
Thanks to model sharing, you can go to centralized locations like the Hugging Face model hub, where tens of thousands of pre-trained models are freely available. You can also use an interface like OpenAI’s API to access their language models without even leaving your IDE. Due to this ease of sharing, everyone can benefit from the huge leaps that research in NLP has made since the inception of the Transformer architecture for language modeling.
The NLP implementation process consists of two interlocking and overlapping phases: prototyping and machine learning operations (MLOps). In the prototyping phase, the developer sets up a working prototype pipeline and experiments with different configurations. In applied NLP, we usually opt for rapid prototyping: a workflow in which a prototype system gets developed and deployed quickly, to make sure that we collect user feedback very early on in the process. This way, we can iterate through prototypes quickly, constantly improving and refining our system.
Once the cycle of prototyping, deployment, and user feedback has produced a satisfactory system, the second phase, MLOps, starts. In MLOps, the system is deployed and integrated into the final product. But it doesn’t stop there. Rather, MLOps provides the framework for the regular monitoring, updating, and improvement of a system in production. Because modern NLP is so driven by data, you need to make sure that your language models are regularly re-trained on textual data that captures your real-world use case.
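As a minimal illustration of this monitoring principle, the sketch below flags a deployed model for re-training once its rolling accuracy on freshly annotated production samples drops below a quality bar. The threshold, window size, and class name are all hypothetical choices for this example, not part of any particular MLOps tool.

```python
from collections import deque

# Hypothetical quality bar and window size; tune these for your use case.
RETRAIN_THRESHOLD = 0.85
WINDOW = 100

class DriftMonitor:
    """Tracks accuracy on a rolling window of freshly annotated
    production samples and flags the model for re-training."""

    def __init__(self):
        self.results = deque(maxlen=WINDOW)

    def record(self, prediction, gold_label):
        # Store whether the live prediction matched the human annotation.
        self.results.append(prediction == gold_label)

    def needs_retraining(self):
        if not self.results:
            return False
        rolling_accuracy = sum(self.results) / len(self.results)
        return rolling_accuracy < RETRAIN_THRESHOLD

monitor = DriftMonitor()
for pred, gold in [("positive", "positive"), ("negative", "positive"),
                   ("negative", "negative"), ("positive", "negative")]:
    monitor.record(pred, gold)
print(monitor.needs_retraining())  # True: accuracy 0.5 is below the bar
```

In a real deployment, the annotated samples would come from the user-feedback loop described above, and a `True` flag would trigger a fine-tuning run rather than a print statement.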
Both the prototyping and MLOps phases therefore heavily feature the testing and evaluation of models, both quantitatively and qualitatively. Only by periodically testing your system on real-world data and with real users can you make sure that it remains up to date.
Some NLP models are harder to evaluate than others. For example, if a sentiment classifier labels a text as positive while the correct label is negative, then we can safely say that it made a mistake. However, when it comes to generating or extracting text, it isn't always so easy to say what's wrong and what isn't. Your summarization model may output a summary that's vastly different from the "correct" text — but it could be just as good, or even better. That's why it's important to evaluate NLP models not only quantitatively, but also qualitatively — by having real-world users test them.
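The contrast can be made concrete in a few lines of plain Python. Exact-match accuracy is a reasonable quantitative metric for a classifier, but applied naively to generated text it scores a perfectly good summary as a total miss. The example strings below are invented for illustration.

```python
def exact_match_accuracy(predictions, gold):
    """Fraction of predictions that exactly match the gold labels."""
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)

# Classification: exact match is a fair quantitative metric.
gold_labels = ["positive", "negative", "positive", "negative"]
predictions = ["positive", "positive", "positive", "negative"]
print(exact_match_accuracy(predictions, gold_labels))  # 0.75

# Generation: exact match breaks down. This candidate summary is
# arguably as good as the reference, yet scores zero.
reference = ["The company reported record profits this quarter."]
candidate = ["Quarterly earnings reached an all-time high."]
print(exact_match_accuracy(candidate, reference))  # 0.0
```

This is why generative tasks are typically scored with overlap metrics such as ROUGE alongside human review: only qualitative feedback from real users catches the cases that any string-matching metric misses.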
Rather than following the sequential data science model of finalizing a system before deploying and sharing it with end users, applied NLP makes user feedback an essential element of the production process. In the old model, it could very well happen that after months of development, you would realize that your final product wasn’t solving your users’ actual problem. By involving an example group of end users early on, you can analyze their feedback to improve your system — for instance, by collecting and annotating new, more representative data to fine-tune your models.
Let’s now take a bird’s-eye view of the different phases of the NLP implementation process.
So, in a way, the final phase consists of periodically repeating the steps outlined earlier — only now, you’re performing them on a system that’s already been deployed to production.
The above diagram illustrates why, contrary to most people’s perceptions, building an NLP system has little to do with training a Transformer language model from scratch. Rather, the hard work in applied NLP is about making sure that you have all the parts you need: high-quality, annotated data, and a framework that will help you with setting up pipelines, connecting to the resources where pre-trained models are stored, and fine-tuning those models on your own data.
While the NLP implementation process may seem a bit overwhelming in its entirety, it helps to break it down into neatly defined steps, as we have done here.
This blog post has been adapted from our ebook “NLP for Developers” — a detailed guide to the entire NLP implementation process that fills in the gaps for developers aiming to implement an NLP system successfully. In the ebook, we address the most common questions developers face along the way.
Plus, we talk in depth about topics like data collection and annotation, evaluation, and the modularity of modern NLP systems. If that sounds good to you, follow this link to download the ebook for free.
To learn more about the state of NLP today, check out our blog, and have a look at our first ebook “NLP for Product Managers,” which is lighter on the technical content and includes many real-world use cases of applied NLP.
When we’re not sharing knowledge about applied NLP through our blog or ebooks, we’re developing Haystack, an open source framework for applied NLP that aims to make the process as smooth and easy as possible, no matter your background. Have a look at the GitHub repository or the Haystack documentation page.