T2K (Text-To-Knowledge) automatically extracts linguistic and domain-specific information from texts. It organises the extracted knowledge into a structured form and indexes your texts with respect to it. T2K relies on a battery of tools for Natural Language Processing (NLP), statistical text analysis and machine learning, which are dynamically integrated to represent the linguistic information and the domain-specific content of English and Italian text corpora from different domains. You can upload your texts and corpora, store them in your repository and manage them there. T2K performs the following knowledge extraction steps:

  • linguistic annotation carried out at increasingly complex levels of analysis, i.e. sentence splitting, tokenization, Part-Of-Speech tagging, dependency parsing;
  • extraction of the linguistic profile of texts with respect to lexical, morpho-syntactic and syntactic features;
  • extraction of domain-specific terminology and phrases;
  • extraction of a domain-specific glossary;
  • organization and structuring of the set of extracted terms and phrases into taxonomical chains;
  • Named Entity Recognition and Classification;
  • indexing of the text with respect to the extracted terminology, phrases and Named Entities;
  • extraction of relations between terms, phrases and Named Entities;
  • construction and visualisation of the relation graph.
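
To make the first and the indexing steps above more concrete, the following toy sketch splits a text into sentences, tokenizes it, and indexes it against a list of terms. It is an illustration only and does not reproduce T2K's tools or its REST API: the function names (split_sentences, tokenize, index_terms), the regular expressions and the sample text are all assumptions made for this example, and the term list is supplied by hand instead of being extracted.

```python
import re
from collections import defaultdict


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased word tokens; punctuation is dropped.
    return re.findall(r"\w+", sentence.lower())


def index_terms(text, terms):
    # Map each term to the indices of the sentences in which it occurs
    # as a sequence of whole tokens.
    index = defaultdict(list)
    term_tokens = {term: tokenize(term) for term in terms}
    for i, sentence in enumerate(split_sentences(text)):
        tokens = tokenize(sentence)
        for term, needle in term_tokens.items():
            n = len(needle)
            if n and any(tokens[j:j + n] == needle
                         for j in range(len(tokens) - n + 1)):
                index[term].append(i)
    return dict(index)


# Hypothetical sample text and hand-picked terms.
text = ("Dependency parsing builds a tree. "
        "The tree links each token to its head. "
        "Tokenization comes first.")
terms = ["dependency parsing", "tree", "token"]
index = index_terms(text, terms)
print(index)
```

Matching is done on whole tokens, so the term "token" is not found inside "Tokenization". A real pipeline would work on lemmas and Part-Of-Speech tags instead of raw strings.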

An account is needed to access the service.

Language Coverage
English (Latin), Italian (Latin)

Get Started with the service

: Free of charge

Support

Helpdesk: italianlp@ilc.cnr.it

Documentation: T2K REST API Documentation
