Build Scalable Search Systems with Haystack and Weaviate
Weaviate is a vector database optimized for production-ready vector search.
Weaviate is a database and search engine optimized for handling vectorized data. Vector-optimized databases like Weaviate are a recent innovation in the field of data storage solutions. They are particularly efficient at storing unstructured data (such as text, images, and audio) as vectors in a continuous, high-dimensional space.
Previously, unstructured data could only be made searchable through metadata or inverted indexes. Vector representations, on the other hand, allow us to perform searches on the basis of semantic similarity functions. Through nearest-neighbor algorithms (KNN) and cosine similarity, we can project our queries onto the high-dimensional search space and retrieve the results that are closest to them. This is known as semantic search (also, sometimes 'neural search').
What is Haystack?
Haystack is our open source NLP framework. Our goal is to bring modular NLP systems to the fingertips of any developer. With Haystack, you can mix and match the latest Transformer models in a pipeline object, allowing you to find the perfect configuration for your particular use case.
Weaviate and Haystack: better together
These days, text and other unstructured data is routinely converted into high-dimensional vectors for further handling by powerful Transformer-based models like BERT. This is what makes vector-optimized databases so attractive for anyone looking to extract value from large document collections in record time.
Once you’ve set up your Weaviate data storage and indexed your documents, you can use it as your document store for any kind of state-of-the-art NLP system. Whether you want to implement an extractive question answering pipeline or are interested in natural language generation, summarization, or even question answering on tabular data — we’ve got you covered.