TH Köln

Master Digital Sciences

Documents for Study Program Accreditation

Module »Seminar Knowledge Discovery« (SKD)

Organizational Details

Responsible for the module
Prof. Dr. Philipp Schaer (Faculty F03)
Lecturer(s)
Prof. Dr. Philipp Schaer (Faculty F03), Prof. Dr. Gernot Heisenberg (Faculty F03), Prof. Dr. Klaus Lepsky (Faculty F03), Prof. Dr. Konrad Förstner (Faculty F03)
Language
English
Offered in
Each Semester (Duration 1 Semester)
Location
Campus Köln Süd, or remote
Number of participants
minimum 5, maximum 20
Precondition
none
Recommendation
basic knowledge of one of the fields of knowledge discovery (e.g. text or data mining, information retrieval, NLP)
ECTS
3
Effort
Total effort 90h
Total contact time
30h (30h seminar)
Time for self-learning
60h
Exam
Scientific paper in conjunction with a presentation
Competences taught by the module
Develop Visions, Analyze Domains, Model Systems
General criteria covered by the module
Internationalization, Interdisciplinarity, Digitization

Mapping to Focus Areas

Below, you find the module's mapping to the study program's focus areas. The mapping states the module's contribution to each relevant focus area, both in ECTS and in content. It also places the module in relation to other modules and indicates to what extent the module might be part of other study programs.

Focus Area
Generating and Accessing Knowledge
ECTS (prop.)
3
Module Contribution to Focus Area
In this seminar students will work on the most recent trends and topics related to Knowledge Discovery.

Learning Outcome

Knowledge Discovery describes the automated search for patterns in large amounts of data, where these patterns can be regarded as knowledge about the domain and use cases under investigation. These use cases can originate from fields such as business, economics, the social sciences and many others. Ideally, knowledge discovery takes advantage of structured data (e.g. customer data, buying behavior). Most often, however, only unstructured, heterogeneous data is available. Knowledge discovery can therefore be seen as a holistic approach to generating knowledge from unstructured data and information sources. Its methods and approaches have evolved from data mining and are closely related to it, both methodologically and terminologically. The process generates an abstraction of the input data, which in turn can lead to new data, information and knowledge.

In this seminar, students will work on the most recent trends and topics related to Knowledge Discovery. They will read recent scientific papers in the field to get to know the current state of the art. By analysing the state of the art and later presenting the results of this analysis, they will learn and practice how to communicate and discuss these topics.

The independent acquisition of specialized knowledge is a core competence of a Master’s student. Reading, analysing, generating a comprehensive overview, and finally presenting and discussing the results of this knowledge acquisition are skills that transfer to all other areas of research and form a cornerstone of scientific work.

Module Content

  • Data acquisition including crawling and scraping
  • Data preparation including cleansing, reduction and extraction
  • Relevance evaluation
  • Ranking by relevance criteria
  • Modern modelling techniques such as specialized word embeddings and deep sequence modelling (LSTM, GRU, transformer-based models)

Forms of Teaching and Learning

  • Meetings to present and discuss papers

Learning Material Provided by Lecturer

  • List of selected literature and web resources

Literature

  • Chengxiang Zhai and Sean Massung (2016): Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining. Association for Computing Machinery and Morgan & Claypool. https://doi.org/10.1145/2915031