Internal Preview! The data shown below is not valid for students! Please refer to the official Module Descriptions at the Examination Office.
Big Data Engineering BDE

General

study semester
4-5
standard study semester
6
cycle
every summer semester
duration
1 semester
SWS
4
ECTS
6
teaching language
English

People

responsible
Prof. Dr. Jens Dittrich
lectures
Prof. Dr. Jens Dittrich

Assessment & Grades

entrance requirements

Programming 1, Programming 2, Software Engineering Lab, Mathematics for Computer Scientists 1, as well as Fundamentals of Algorithms and Data Structures (all recommended)

assessment / exams

Successful participation in the exercises/project entitles the student to take part in the final exam.

grade

Will be determined from performance in exams, exercises, and (optionally) practical tasks. The exact modalities will be announced at the beginning of the module.

Workload

course type / weekly hours
  2 h lectures
+ 2 h tutorial
= 4 h (weekly)
total workload
   60 h of classes
+ 120 h private study
= 180 h (= 6 ECTS)

Aims / Competences to be developed

The lecture provides basic knowledge of fundamental concepts of data management and data analysis in Big Data Engineering.

As part of the exercises, a project can be carried out during the semester. This can be, for example, a social network (Facebook style) or any other project in which data management techniques can be practiced (e.g., natural science data, image data, or other web applications). The project is first modeled as an E/R diagram and then implemented as a database schema. It is subsequently extended to manage and analyze unstructured data as well. In this way, all fundamental techniques for managing and analyzing data are demonstrated on a single project.
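To make the step from an E/R model to a database schema concrete, the following is a minimal sketch of what a tiny social-network project might look like. It is not taken from the course material: the table names (`users`, `friendships`), the columns, the sample rows, and the helper functions are all illustrative assumptions, and SQLite (via Python's standard `sqlite3` module) is used only so that the example runs without a database server, whereas the course itself lists PostgreSQL.

```python
import sqlite3

# Hypothetical schema for a small social network. All table and column
# names are illustrative assumptions, not part of the module description.
SCHEMA = """
CREATE TABLE users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE friendships (
    user_id   INTEGER NOT NULL REFERENCES users(id),
    friend_id INTEGER NOT NULL REFERENCES users(id),
    PRIMARY KEY (user_id, friend_id),
    CHECK (user_id <> friend_id)
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the schema and a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        [(1, "Ada"), (2, "Bob"), (3, "Cy")],
    )
    # Each row is a directed edge: user_id lists friend_id as a friend.
    conn.executemany(
        "INSERT INTO friendships (user_id, friend_id) VALUES (?, ?)",
        [(1, 2), (1, 3), (2, 3)],
    )
    conn.commit()
    return conn


def friend_counts(conn: sqlite3.Connection) -> list:
    """Return (name, number of outgoing friendship rows) for every user."""
    rows = conn.execute(
        """
        SELECT u.name, COUNT(f.friend_id) AS n
        FROM users AS u
        LEFT JOIN friendships AS f ON f.user_id = u.id
        GROUP BY u.id, u.name
        ORDER BY u.id
        """
    )
    return rows.fetchall()


if __name__ == "__main__":
    print(friend_counts(build_demo_db()))
```

In this sketch the entity type "user" becomes a table, and the many-to-many relationship type "is friends with" becomes a separate table whose composite primary key consists of two foreign keys. The `LEFT JOIN` keeps users who have no friendship rows in the result, with a count of zero.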

Content

1 Introduction and classification 
    Classification and delimitation: "Big Data"
    Value of Data: The gold of the 21st century 
    Importance of database systems 
    What is data? 
    Modeling vs Reality 
    Costs of inadequate modeling 
    Using a database system vs developing it yourself 
    Positive examples for apps 
    Requirements 
    References 
    Lecture mode 

2 Data modeling 
    Motivation 
    E/R 
    Relational Model 
    domains, attributes
    entity type vs entity
    relation type vs relation
    Hierarchical Data 
    keys, foreign keys
    inheritance
    Redundancy, normalization, denormalization 

3 Query languages 
    Relational Algebra 
    Graph-oriented query languages 

4 SQL 
    Basics 
    Relationship to relational algebra 
    CRUD-style vs analytical SQL
    SQL standards
    joins, grouping, aggregation, having
    PostgreSQL 
    Integrity constraints 
    Transaction concept 
    ACID
    Views

5 Basic query optimization
    Overview
    from WHAT to HOW 
    Costs of different operations
    EXPLAIN 
    Physical Design 
    Indexes, Tuning 
    Database tuning 
    Rule-based query optimization 
    Cost-based query optimization

6 Automatic Concurrency control 
    Serializability theory
    Isolation levels 
    Pessimistic concurrency control
    lock-based approaches, 2PL-variants

7 Graphical Data
    recursion in SQL, WITH RECURSIVE
    graph-oriented query languages, e.g., Cypher (Neo4j)

8 Database Security
    SQL injection
    passwords
    salt and pepper

9 Ethical Aspects of Big Data
    mass surveillance
    NSA
    the "big data arithmetic"
    countermeasures

Literature & Reading

Will be announced on the course web page before the start of the course.

Additional Information

This module was formerly also known as Informationssysteme. This module is identical in content to the German language module Big Data Engineering.

Curriculum

This module is part of the following study programmes:

Computer Science BSc (English): Fundamentals of Computer Science (Grundlagen der Informatik)
study semester: 4 / standard study semester: 6
Cybersecurity BSc (English): Complementary Topics in Cybersecurity (Komplementäre Themen der Cybersicherheit)
study semester: 4-5 / standard study semester: 6