New Article – Small Data, Big Challenges

A new article by Arianna Betti, project leader at the RPA Human(e) AI, and other researchers was presented during the 28th International Conference on Computational Linguistics (COLING 2020), within the framework of the seed-funded project “Small Data, Big Challenges”.

“It is regularly said that the Humanities can deliver provocations, but not ground truths – datasets that can be considered as the desirable result against which to evaluate AI systems. In this recent paper we show that the opposite is the case. Ground truths in the Humanities can and should be constructed – even in concept-heavy domains such as philosophy and intellectual history.”


Paper takeaways:

  • We invented a six-step semi-automatic method to enable experts to construct ground truths for corpora in domains that focus on concepts (e.g., philosophy, history of ideas).
  • We make precise the concept of an expert-controlled, concept-focused ground truth (‘ground truthxc’).
  • We validated the method by asking experts to build a ground truthxc for the concept of naturalized epistemology (QuiNE-GT). The corpus used is of outstanding quality: it comprises the English-language academic output of the 20th-century American philosopher W. V. O. Quine, enriched and cleaned up for the purpose.

Read the full article >>