Haystack is a composable neural search framework

Experiment, build and scale a question answering system to millions of documents. Transform the way users search.

# Build a simple Question Answering pipeline
p = ExtractiveQAPipeline(
    retriever=dpr_retriever,
    reader=roberta_reader
)
p.run(
    query="What is the outlook for the US market?"
)

# Returned answer:
{'answers':
    [{'answer': '+22% growth in revenue',
      'context': "For the US market, we expect a strong Q4 "
                 "with +22% growth in revenue.",
      'probability': 0.989,
      ....},
    ....
[Pipeline diagram: Query "What is the outlook for the US market?" → Retriever (dpr_retriever) → Reader (roberta_reader) → Answer: "For the US market, we expect a strong Q4 with +22% growth in revenue."]
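
The snippet above assumes that a document store, a DPR retriever, and a RoBERTa reader already exist. A minimal setup sketch, assuming Haystack 1.x import paths, an Elasticsearch instance on localhost, and example model names from the Hugging Face Model Hub (exact import paths and result formats vary across Haystack versions):

# Minimal setup sketch (assumes Haystack 1.x and a local Elasticsearch instance)
from haystack.document_stores import ElasticsearchDocumentStore
from haystack.nodes import DensePassageRetriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline

# Document store holding the indexed documents
document_store = ElasticsearchDocumentStore(host="localhost", index="document")

# Dense Passage Retrieval (DPR) retriever that fetches candidate documents
dpr_retriever = DensePassageRetriever(document_store=document_store)

# Extractive reader based on a RoBERTa model from the Hugging Face Model Hub
roberta_reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")

p = ExtractiveQAPipeline(reader=roberta_reader, retriever=dpr_retriever)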

Start exploring Haystack!

Check repo
  • 2.3k GitHub stars

  • 67 Contributors

  • Latest Transformer Models

    Load any Transformer model from Hugging Face's Model Hub, experiment, and find the one that works best (see the first sketch after this list).

  • Flexible Document Store

    Use Haystack on top of Elasticsearch, OpenDistro, or plain SQL.

  • Vector Databases

    Improve search performance with Milvus, FAISS, or Weaviate vector databases and DPR.

  • Scalable

    Build language-aware applications that scale to millions of documents.

  • End-to-end

    Building blocks for the entire dev cycle: file converters, indexing functions, models, labeling tools, domain adaptation modules, and a REST API (see the indexing sketch after this list).

  • Pipelines

    It's not one-size-fits-all! Combine nodes into flexible, scalable pipelines and launch a powerful neural search system (see the pipeline sketch after this list).
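
Swapping the reader model or the storage backend is mostly a configuration change. A minimal sketch, assuming Haystack 1.x, a local FAISS index, and example model names from the Model Hub (not the only valid choices):

# Sketch: choose a storage backend and a reader model (assumes Haystack 1.x)
from haystack.document_stores import FAISSDocumentStore
from haystack.nodes import DensePassageRetriever, FARMReader

# FAISS-backed store for dense vectors; DPR produces 768-dimensional embeddings
document_store = FAISSDocumentStore(embedding_dim=768, faiss_index_factory_str="Flat")

retriever = DensePassageRetriever(document_store=document_store)
# After writing documents, compute their DPR embeddings:
# document_store.update_embeddings(retriever)

# Any extractive QA model from the Model Hub can be dropped in here
reader = FARMReader(model_name_or_path="deepset/minilm-uncased-squad2")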
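
For the end-to-end building blocks, here is a sketch of converting a file and writing it to a document store, again assuming Haystack 1.x; the file name is a placeholder, and the PDF converter relies on the external pdftotext utility being installed:

# Sketch: convert a file and index it (assumes Haystack 1.x; file name is a placeholder)
from haystack.nodes import PDFToTextConverter, PreProcessor
from haystack.document_stores import ElasticsearchDocumentStore

# Requires the pdftotext command-line tool to be installed
converter = PDFToTextConverter(remove_numeric_tables=True)
docs = converter.convert(file_path="annual_report.pdf", meta=None)

# Split long documents into passages of ~200 words before indexing
preprocessor = PreProcessor(split_by="word", split_length=200)
docs = preprocessor.process(docs)

document_store = ElasticsearchDocumentStore(host="localhost", index="document")
document_store.write_documents(docs)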
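
And for custom pipelines, a sketch that wires the retriever and reader from the first example into an explicit pipeline graph, assuming Haystack 1.x:

# Sketch: an explicit pipeline graph built node by node (assumes Haystack 1.x)
from haystack.pipelines import Pipeline

pipe = Pipeline()
pipe.add_node(component=dpr_retriever, name="Retriever", inputs=["Query"])
pipe.add_node(component=roberta_reader, name="Reader", inputs=["Retriever"])

result = pipe.run(
    query="What is the outlook for the US market?",
    params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 3}},
)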

When to add NLP and QA to your applications

From an idea to natural language search: address the varying demands of business users with ease. Below are some examples of use-case-driven scenarios for question answering with Haystack.

Ready to get started?

Start learning Haystack

Frequently asked questions

  • Haystack is a flexible open-source Python framework for building neural search systems on top of large document collections. It enables developers to apply Transformer models to real-world applications such as question answering, information extraction, semantic search, summarization, and chatbots. Haystack is developed by deepset.

  • To quote Wikipedia: "Question answering (QA) is a computer science discipline within the fields of information retrieval and natural language processing (NLP), which is concerned with building systems that automatically answer questions posed by humans in a natural language."

  • The Transformer is a deep learning architecture introduced by Google in 2017. It sparked a wave of interest from researchers and developers alike because of its groundbreaking performance. Thanks to a community of researchers and NLP practitioners, many new variants of the Transformer model have appeared since.

  • Haystack is designed to be a very practical, down-to-earth framework. A decent knowledge of Python is a prerequisite, as are some fundamentals of machine learning. We are working on lowering the barrier to entry, but we do expect you to bring some curiosity. For inference, you will also need a GPU-enabled system.

  • Our community consists of NLP researchers and experts, but also of NLP practitioners and even full-stack developers who have built Haystack into their applications. We welcome everyone who is interested in NLP and is solving a real-world use case involving question answering and neural search.

  • The Haystack framework is completely open source, licensed under the Apache License 2.0. As deepset, the company behind Haystack, we have already helped some of the largest European companies and public-sector organizations to implement neural search, and we are also building an enterprise SaaS product to complement Haystack. We would be happy to discuss this in more detail.