Tutorials
The role of the tutorials is to provide a platform for more intensive scientific exchange among researchers interested in a particular topic and to serve as a meeting point for the community. Tutorials complement the depth-oriented technical sessions by providing participants with broad overviews of emerging fields. A tutorial can be scheduled for 1.5 or 3 hours.
Tutorial on
Knowledge Discovery and Information Retrieval using the Shell
Instructor
Andreas Schmidt
University of Applied Sciences & Karlsruhe Institute of Technology
Germany
Abstract
A shell environment such as bash gives access to a wealth of useful programs for examining, filtering, transforming and analyzing data. Combined with the underlying filter and pipe architecture, these programs allow powerful data transformations to be performed interactively and iteratively within a very short time. This can, for example, support the knowledge discovery process alongside dedicated tools such as Mathematica or R. In this tutorial, the most useful command-line tools from the GNU coreutils and their interaction are introduced on the basis of a continuous scenario and reinforced by two in-depth practical exercises, in which the participants have to convict a murderer using a series of available police documents - exciting!
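As a rough illustration of the filter and pipe idea described above, the following sketch counts how often each city occurs in a small list of records. The data, the function names `sample_records` and `city_counts`, and the choice of task are assumptions made for this example; they are not taken from the tutorial material. Only `cut`, `sort`, `uniq` and `tr` are used.

```shell
#!/bin/sh
# Hypothetical sample data (invented for this sketch): one "name,city" record per line.
sample_records() {
  cat <<'EOF'
Miller,Karlsruhe
Schulz,Berlin
Weber,Karlsruhe
Becker,Hamburg
Koch,Berlin
Braun,Karlsruhe
EOF
}

# Count records per city, most frequent first.
city_counts() {
  sample_records |
    cut -d, -f2 |        # keep only the second comma-separated field (the city)
    sort |               # bring identical cities onto adjacent lines
    uniq -c |            # collapse each run into "count city" (padded with blanks)
    sort -k1,1nr |       # order numerically by count, largest first
    tr -s ' ' |          # squeeze the padding down to single blanks
    cut -d' ' -f2,3      # drop the empty leading field, keep "count city"
}

city_counts
```

Each stage reads standard input and writes standard output, so any stage can be removed or replaced while the rest of the pipeline is left unchanged.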
Website of the tutorial
Keywords
Filters and Pipes, Unix Shell, GNU coreutils.
Aims and Learning Objectives
After completing the tutorial, the participants will be able to create their own filter pipes for various tasks. They will have internalized the idea that composing complex programs from small, well-defined components allows rapid prototyping, incremental iteration and easy experimentation.
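The incremental style mentioned above might look like the following sketch, in which each step reuses the previous one and adds a single filter. The visit log, the names in it, and the function names are hypothetical and invented purely for illustration; `grep` is used here as well, although it is a separate GNU tool and not part of coreutils.

```shell
#!/bin/sh
# Hypothetical visit log (invented for this sketch): "time person room" per line.
visits() {
  cat <<'EOF'
21:05 Weber library
21:10 Koch kitchen
21:40 Weber kitchen
22:15 Braun library
22:30 Koch library
EOF
}

# Iteration 1: keep only the visits to the library.
in_library() { visits | grep ' library$'; }

# Iteration 2: reuse iteration 1 and extract the person column.
library_people() { in_library | cut -d' ' -f2; }

# Iteration 3: reuse iteration 2 and produce a sorted list without duplicates.
library_people_unique() { library_people | sort -u; }

library_people_unique
```

Because every intermediate step can be run and inspected on its own, a mistake in one filter is easy to spot before the next filter is appended.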
Target Audience
Data analysts, database developers, and anyone interested in finding, filtering and transforming information.
Prerequisite Knowledge of Audience
Intermediate - Participants should be familiar with (or at least interested in) using a shell such as bash, csh or the DOS shell. A basic knowledge of regular expressions is helpful but not required.
Detailed Outline
• Introduction
• The Basic Building Blocks (text utilities from coreutils)
• Practical Exercise I
• Composition using the Filter and Pipe Architecture
• Practical Exercise II
• Summary and Outlook