Cleans a dataset by finding and correcting errors, removing duplicates and unwanted data. – Linked Data for Professional Education https://ld4pe.dublincore.org Learning resources tagged by competency Thu, 19 Nov 2020 14:45:03 +0000 Open Refine 101 https://ld4pe.dublincore.org/learning_resource/open-refine-101/ Fri, 25 Aug 2017 08:34:44 +0000 https://ld4pe.dublincore.org/learning_resource/open-refine-101/ This free online course explains that data cleaning, preparation and enrichment take up an enormous amount of time but are nevertheless a crucial stage in the data science methodology. However, data transformation tools haven’t fully caught up with the popularity of data analysis. Learn why domain experts need powerful yet easy-to-use interfaces to explore new data sets, normalize them and process them via innovative services that are often available only via an API. The instructor demonstrates the strengths of OpenRefine: a self-service, agile and iterative interface for data discovery and preparation, and an easy-to-learn scripting language.

URL: https://cognitiveclass.ai/courses/introduction-to-openrefine/
Keywords: Google Refine, General Refine Expression Language (GREL), Data enrichment, Data cleansing
Author: Magdinier, Martin
Publisher: Cognitive Class
Language: http://id.loc.gov/vocabulary/iso639-2/eng
Time required: P7H
Educational use: instruction
Educational audience: teacher-educationSpecialist
Interactivity type: mixed

Free Your Metadata: Clean up your metadata https://ld4pe.dublincore.org/learning_resource/free-your-metadata-clean-up-your-metadata/ Tue, 23 May 2017 07:03:37 +0000 https://ld4pe.dublincore.org/learning_resource/free-your-metadata-clean-up-your-metadata/ A brief tutorial containing both a screencast and text instructions for cleaning an example dataset (from the Powerhouse Museum) using Open Refine (formerly Google Refine). The walk-through includes the following steps: 1) Loading the data; 2) Inspecting the data; 3) Removing blank rows; 4) Removing duplicate rows; 5) Splitting cells with multiple values; 6) Removing blank cells; 7) Clustering values; 8) Removing double category values. Links to the sample data files and the tool itself are provided.

URL: http://freeyourmetadata.org/cleanup/
Keywords: Open Refine, Google Refine, General Refine Expression Language (GREL), Data cleaning
Author: Verborgh, Ruben
Publisher: MaSTIC
Date created: 2016-01-01 05:00:00.000
Language: http://id.loc.gov/vocabulary/iso639-2/eng
Time required: P1H
Interactivity type: active
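
The cleanup steps listed in this walk-through can also be sketched outside OpenRefine. The following Python example is only an illustration, not the tutorial's method: the sample rows, the `Categories` column name and the `|` separator are assumptions made up for the example, and `fingerprint` is a simplified stand-in for the idea behind key-collision clustering.

```python
import string
from collections import defaultdict


def fingerprint(value):
    """Build a normalized key so that near-identical spellings collide."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(cleaned.split())))


def clean_rows(rows, multi_column="Categories", separator="|"):
    """Apply steps 3 to 6 and 8 to a list of dict rows (hypothetical layout)."""
    seen = set()
    result = []
    for row in rows:
        # Step 3: skip rows in which every cell is blank.
        if all(not (cell or "").strip() for cell in row.values()):
            continue
        # Step 4: skip rows that exactly duplicate an earlier row.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        # Step 5: split the multi-valued cell into separate values.
        parts = [p.strip() for p in (row.get(multi_column) or "").split(separator)]
        # Steps 6 and 8: drop blank values and values repeated within the cell.
        unique = []
        for part in parts:
            if part and part not in unique:
                unique.append(part)
        result.append({**row, multi_column: unique})
    return result


def cluster(values):
    """Step 7: group values whose fingerprints match; keep groups of 2 or more."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Made-up sample data, loosely modelled on a museum export.
rows = [
    {"Title": "Teapot", "Categories": "Ceramics|Ceramics||Tableware"},
    {"Title": "Teapot", "Categories": "Ceramics|Ceramics||Tableware"},
    {"Title": "", "Categories": ""},
    {"Title": "Vase", "Categories": "ceramics.|Glass"},
]
cleaned = clean_rows(rows)
candidates = cluster([c for row in cleaned for c in row["Categories"]])
```

The clusters returned in `candidates` are only suggestions; as in OpenRefine, a person still decides which spelling each group should be merged into.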

Joining the Linked Data Cloud in a Cost-Effective Manner https://ld4pe.dublincore.org/learning_resource/joining-the-linked-data-cloud-in-a-cost-effective-manner/ Tue, 23 May 2017 07:03:37 +0000 https://ld4pe.dublincore.org/learning_resource/joining-the-linked-data-cloud-in-a-cost-effective-manner/ Linked Data holds the promise to derive additional value from existing data throughout different sectors, but practitioners currently lack a straightforward methodology and the tools to experiment with Linked Data. This article gives a pragmatic overview of how general purpose Interactive Data Transformation tools (IDTs) can be used to perform the two essential steps to bring data into the Linked Data cloud: data cleaning and reconciliation. These steps are explained with the help of freely available data (Cooper-Hewitt National Design Museum, New York) and tools (Google Refine), making the process repeatable and understandable for practitioners.

URL: http://dx.doi.org/10.3789/isqv24n2-3.2012.04
Keywords: Linked Open Data (LOD), Data cleaning, Atomization, Clustering, Data reconciliation
Author: Van de Walle, Rik
Publisher: ISQ (Information Standards Quarterly)
Date created: 2012-05-01 04:00:00.000
Language: http://id.loc.gov/vocabulary/iso639-2/eng
Time required: P10M
Educational use: instruction
Educational audience: professional
Interactivity type: expositive

Linking Lives: Creating an End-User Interface Using Linked Data https://ld4pe.dublincore.org/learning_resource/linking-lives-creating-an-end-user-interface-using-linked-data/ Mon, 22 May 2017 07:03:30 +0000 https://ld4pe.dublincore.org/learning_resource/linking-lives-creating-an-end-user-interface-using-linked-data/ This article describes how LOCAH, a JISC-funded project working to make data from the Archives Hub available as Linked Data, continued on in a new form as "Linking Lives". Biographical data is presented on pages which are populated entirely by Linked Data from various authoritative sources (e.g., VIAF, DBpedia). One challenge was deciding whether to collect data via the application's server or the client's web browser. Another was whether to reconcile multiple source URIs by creating and persisting a new URI or to map multiple URIs using the "owl:sameAs" property.

URL: http://www.niso.org/publications/isq/2012/v24no2-3/stevenson/
Keywords: Linked Open Data (LOD), Libraries, Archives, and Museums (LAMs), HTTP URIs
Author: Stevenson, Jane
Publisher: ISQ (Information Standards Quarterly)
Date created: 2012-05-01 04:00:00.000
Language: http://id.loc.gov/vocabulary/iso639-2/eng
Time required: P20M
Educational use: instruction
Educational audience: professional
Interactivity type: expositive
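
The choice between mapping source URIs to each other and minting a new URI can be made concrete with a small sketch. This is an illustration under assumptions, not the Linking Lives implementation: the `example.org` URIs are hypothetical, and relating the minted URI to its sources with `owl:sameAs` is just one possible way to model the second option. The output lines follow N-Triples syntax.

```python
from itertools import combinations

# The owl:sameAs property URI from the OWL vocabulary.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_links(uris):
    """Option 1: map the source URIs to each other, one triple per pair."""
    return [
        f"<{a}> <{OWL_SAME_AS}> <{b}> ."
        for a, b in combinations(sorted(uris), 2)
    ]


def hub_links(new_uri, uris):
    """Option 2: mint one new URI and link it to every source URI."""
    return [f"<{new_uri}> <{OWL_SAME_AS}> <{u}> ." for u in sorted(uris)]


# Hypothetical identifiers for the same person in two sources.
sources = [
    "http://example.org/viaf/123",
    "http://example.org/dbpedia/Jane_Doe",
]
pairwise = same_as_links(sources)
hub = hub_links("http://example.org/person/jane-doe", sources)
```

With n source URIs the pairwise mapping needs n(n-1)/2 statements, while the minted URI needs only n, at the cost of having to persist and maintain that new identifier.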

Linked Data for Libraries, Archives and Museums: How to clean, link and publish your metadata https://ld4pe.dublincore.org/learning_resource/linked-data-for-libraries-archives-and-museums-how-to-clean-link-and-publish-your-metadata/ Mon, 22 May 2017 07:03:30 +0000 https://ld4pe.dublincore.org/learning_resource/linked-data-for-libraries-archives-and-museums-how-to-clean-link-and-publish-your-metadata/ This handbook teaches how to unlock the value of existing metadata through cleaning, reconciliation, enrichment and linking, as well as how to streamline the process of new metadata creation. It introduces the key concepts related to metadata standards and Linked Data and how they can be practically applied to existing metadata. Chapters are dedicated to modeling, cleaning, reconciling, enriching, and publishing one's data.

URL: http://book.freeyourmetadata.org/
Keywords: Libraries, Archives, and Museums (LAMs), Linked Open Data (LOD), HTTP URIs, Controlled vocabulary, Simple Knowledge Organization System (SKOS), Semantic Web
Author: Verborgh, Ruben
Publisher: Facet Publishing
Date created: 2014-06-19 04:00:00.000
Language: http://id.loc.gov/vocabulary/iso639-2/eng
Time required: P5H
Educational use: professionalDevelopment
Educational audience: teacher-educationSpecialist
Interactivity type: mixed

Methodological Guidelines for Publishing Government Linked Data https://ld4pe.dublincore.org/learning_resource/methodological-guidelines-for-publishing-government-linked-data/ Wed, 21 Oct 2015 20:15:07 +0000 https://ld4pe.dublincore.org/learning_resource/methodological-guidelines-for-publishing-government-linked-data/ Publishing Linked Data is a process that involves many steps, design decisions and technologies. Some initial guidelines have been provided by Linked Data publishers, but these are still far from covering all the steps that are necessary. This chapter, from the book "Linking Government Data" (Springer, 2011), proposes a set of methodological guidelines for the activities involved in the publication process. These guidelines are the result of the authors' experience in the production of Linked Data in several Governmental contexts and are validated by the GeoLinkedData and AEMETLinkedData use cases.

URL: https://www.lri.fr/~hamdi/datalift/tuto_inspire_2012/Suggestedreadings/egovld.pdf
Keywords: Linked Open Data, Government Open Data, HTTP URIs
Author: Gomez-Perez, Asuncion
Publisher: Springer
Date created: 2011-01-01 05:00:00.000
Language: http://id.loc.gov/vocabulary/iso639-2/eng
Time required: P25M
Educational use: professionalDevelopment
Educational audience: professional
Interactivity type: expositive

Knowledge Graph Identification https://ld4pe.dublincore.org/learning_resource/knowledge-graph-identification/ Tue, 15 Sep 2015 02:33:09 +0000 https://ld4pe.dublincore.org/learning_resource/knowledge-graph-identification/ Large-scale information processing systems are able to extract massive collections of interrelated facts, but unfortunately transforming these candidate facts into useful knowledge is a formidable challenge. In this paper, the authors show how uncertain extractions about entities and their relations can be transformed into a knowledge graph. They demonstrate the power of their method on a synthetic Linked Data corpus derived from the MusicBrainz music community and a real-world set of extractions from the NELL (Never-Ending Language Learner) project. NOTE: Also available as a PDF: http://videolectures.net/site/normal_dl/tag=817827/iswc2013_pujara_graph_identification_01.pdf

URL: http://videolectures.net/iswc2013_pujara_graph_identification/
Keywords: Probabilistic SoftLogic (PSL), Ontology, Named entity extraction, Knowledge graph
Author: Pujara, Jay
Publisher: videolectures.net
Date created: 2013-11-28 05:00:00.000
Language: http://id.loc.gov/vocabulary/iso639-2/eng
Time required: P15M
Educational use: instruction
Educational audience: generalPublic
Interactivity type: expositive

Providing Linked Data https://ld4pe.dublincore.org/learning_resource/providing-linked-data-2/ Tue, 15 Sep 2015 02:33:09 +0000 https://ld4pe.dublincore.org/learning_resource/providing-linked-data-2/ This video presentation covers the whole spectrum of Linked Data production and exposure. It begins with a grounding in Linked Data principles and best practices, with special emphasis on the VoID vocabulary. It then covers R2RML (for operating on relational databases), Open Refine (for operating on spreadsheets), and GATECloud (for operating on natural language). Finally, the presentation describes means to increase inter-linkage between datasets, focusing on tools like Silk. NOTE: This is a three-part lecture, with all three videos hosted at the same URL. These videos represent material from several lessons that comprised a larger "module" of the EUCLID Project. As such, they cover a wider range of topics than most resources.

URL: http://videolectures.net/eswc2013_norton_acosta_linked_data/
Keywords: Vocabulary of Interlinked Datasets (VOID), Data extraction, Link discovery, Validation, Simple Knowledge Organization System (SKOS), Linked Open Vocabularies (LOV)
Author: Norton, Barry
Publisher: EUCLID Project
Date created: 2013-11-05 05:00:00.000
Language: http://id.loc.gov/vocabulary/iso639-2/eng
Time required: P3H
Educational use: instruction
Educational audience: professional
Interactivity type: expositive

Preserving Linked Data: Challenges and Opportunities https://ld4pe.dublincore.org/learning_resource/preserving-linked-data-challenges-and-opportunities/ Tue, 15 Sep 2015 02:33:09 +0000 https://ld4pe.dublincore.org/learning_resource/preserving-linked-data-challenges-and-opportunities/ In this video, the speaker begins by discussing the Web of Data and Linked Data Principles, then switches focus to the challenges of publishing and maintaining Linked Data datasets over time. These include data quality issues, such as incompleteness, redundancy, inconsistency, and incorrectness, as well as the difficulty of employing an entity-relationship model.

URL: http://videolectures.net/eswc2013_christophides_linked_data/
Keywords: Entity resolution, Digital preservation, Similarity functions, Web of Data
Author: Christophides, Vassilis
Publisher: videolectures.net
Date created: 2013-11-05 05:00:00.000
Language: http://id.loc.gov/vocabulary/iso639-2/eng
Time required: P1H
Educational use: professionalDevelopment
Educational audience: generalPublic
Interactivity type: expositive

DBpedia: Visualising Linked Data – Graph of Members of Punk Rock Bands https://ld4pe.dublincore.org/learning_resource/dbpedia-visualising-linked-data-graph-of-members-of-punk-rock-bands/ https://ld4pe.dublincore.org/learning_resource/dbpedia-visualising-linked-data-graph-of-members-of-punk-rock-bands/#respond Thu, 13 Aug 2015 14:33:32 +0000 https://ld4pe.dublincore.org/learning_resource/dbpedia-visualising-linked-data-graph-of-members-of-punk-rock-bands/ Using DBpedia, Google Refine, R, and Gephi to play with Linked Data. This tutorial uses data from DBpedia with the aforementioned tools to create visualizations of musicians who performed in different bands together.

URL: https://www.youtube.com/watch?v=qAVGpb8KMpk
Keywords: R (programming language), Google Refine, Data visualization, SPARQL, DBpedia, Gephi
Author: Sherlock, David
Date created: 2013-05-31 07:00:00.000
Language: http://id.loc.gov/vocabulary/iso639-2/eng
Time required: P30M
