Datasets are at the core of performant AI systems. We use our industry experience to create and share datasets that directly enable users to solve their business problems.
GermanQuAD builds on insights from existing datasets and on our experience labeling data with enterprise customers. We combine the strengths of SQuAD with self-sufficient questions, meaning questions that contain all the relevant information needed for open-domain QA. It is a human-labeled dataset of 13,722 questions and answers.