Most of the Web of Data is limited to a large compendium of encyclopedic knowledge describing entities. The timely, large-scale extraction of RDF facts from unstructured data remains a major challenge. The speaker addresses this problem by presenting an approach for extracting RDF triples from unstructured data streams. The approach combines statistical methods with de-duplication, disambiguation, and both unsupervised and supervised machine learning techniques to create a knowledge base that reflects the content of the input streams.

URL: http://videolectures.net/iswc2013_ngonga_ngomo_data_streams/
Keywords: Streaming data, Unstructured data, Named-entity extraction, Part-of-speech (POS) tagging, Machine learning
Author: Ngonga Ngomo, Axel-Cyrille
Date created: 2013-11-28 05:00:00.000
Language: http://id.loc.gov/vocabulary/iso639-2/eng
Time required: PT15M
Educational use: professionalDevelopment
Educational audience: professional
Interactivity type: expositive