Foundation Courses, 2019-2020 Academic Year

Master of Science (Data Science) Programme

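The four foundation data science courses offered by FST enable students to acquire the following:

– the ability to use a programming language to process data, produce data visualizations, and interpret those visualizations;

– an understanding of the database and data mining concepts and techniques used for big data analytics and development in different domains;

– an understanding of the significance of data visualization in data science and big data analytics, together with the knowledge and skills to present large amounts of data using data visualization tools;

– basic deep learning methods;

– Bayesian networks;

– CNNs (convolutional neural networks);

– representation learning and reinforcement learning.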

Introduction to Data Science Programming – 3 credits

Course Description

This course is designed for students who are new to the world of data science. After an introduction to basic arithmetic, variables, and data structures in Python, students learn how to collect and extract data from real datasets. Data analysis skills using control flow and Python packages (e.g., NumPy, SciPy, and Pandas) are then introduced. To address the needs of big data processing, distributed computing frameworks (e.g., Spark) and Python visualization tools are discussed. Students may also apply basic learning algorithms from Python packages (e.g., scikit-learn) to extract knowledge from data.

Course Objectives

  • apply the Python language fundamentals, including basic syntax, variables, and process flows, to write their first program
  • apply functions and import packages to work with complex and/or large data sets
  • apply scientific packages (e.g., NumPy and SciPy) to perform useful computations
  • process text files using external packages (e.g., tabula)
  • apply data visualization tools to visualize large data sets
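
As an illustration of the first objectives above (basic syntax, variables, control flow, and functions), here is a minimal sketch that reads a small table and averages one column per group. It is not taken from the course materials: the dataset, the column names, and the helper names `load_rows` and `mean_by_station` are hypothetical, and only the Python standard library is used so that it runs without extra packages.

```python
import csv
import io
import statistics

# Hypothetical in-memory dataset standing in for a real CSV file.
RAW_CSV = """station,temperature_c
North,28.5
South,29.0
North,30.1
South,27.4
North,
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, skipping rows with a missing value."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        if not record["temperature_c"]:
            continue  # control flow: ignore incomplete records
        rows.append(
            {
                "station": record["station"],
                "temperature_c": float(record["temperature_c"]),
            }
        )
    return rows


def mean_by_station(rows):
    """Group temperatures by station and return the mean of each group."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["station"], []).append(row["temperature_c"])
    return {station: statistics.mean(values) for station, values in grouped.items()}


rows = load_rows(RAW_CSV)
averages = mean_by_station(rows)
for station, average in sorted(averages.items()):
    print(f"{station}: {average:.2f}")
```

In the course itself, packages such as NumPy or Pandas would likely replace the manual grouping loop; the standard-library version is shown only to keep the sketch self-contained.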

Textbook

Learning Python, 5th Edition, by Mark Lutz

Note: This course will also be offered to senior undergraduate (UG) students (Year 4) as an elective.

Data Science and Data Visualization – 3 credits

Course Description

This course is designed to enable students to learn the significance of data visualization in data science and big data analytics, and to develop the knowledge and skills needed to present quantitative data using data visualization tools. The course emphasizes the practical aspects of data science, with a focus on using the R or Python programming language to process data, produce visualizations, and interpret these visualizations. Students will practice data cleaning, data reshaping, basic tabulation, aggregation, and visual representation in order to improve their understanding of complex data and models.

Course Objectives
  • Describe the development and principles of data analytics and data visualization
  • Identify different types of data (qualitative vs. quantitative) and use appropriate analysis techniques (probabilistic, regression, cluster, etc.) to explore them
  • Draw conclusions and formulate hypotheses from data presented graphically
  • Apply theories of data analytics and data visualization, and demonstrate competence in using software (Python, R/RStudio, Excel, etc.) for data visualization and data analytics
  • Analyze, critique, and revise data visualizations
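
The cleaning, tabulation, and visual representation steps named in the course description can be sketched roughly as follows. This is an assumption-laden illustration, not course material: the survey responses and the helper names `clean`, `tabulate`, and `text_bar_chart` are hypothetical, and a plain-text bar chart stands in for a real plotting library so that the example needs only the standard library.

```python
from collections import Counter

# Hypothetical survey responses; several entries need cleaning.
raw_responses = [" Yes", "no", "YES", "Maybe ", "no", "yes", "", "No"]


def clean(responses):
    """Normalize whitespace and case, then drop empty responses."""
    cleaned = [response.strip().lower() for response in responses]
    return [response for response in cleaned if response]


def tabulate(values):
    """Count how often each distinct value occurs."""
    return Counter(values)


def text_bar_chart(counts):
    """Render counts as a plain-text bar chart, most frequent first."""
    width = max(len(label) for label in counts)
    lines = []
    for label, count in counts.most_common():
        lines.append(f"{label.ljust(width)} | {'#' * count} ({count})")
    return "\n".join(lines)


counts = tabulate(clean(raw_responses))
chart = text_bar_chart(counts)
print(chart)
```

With Python or R plotting tools, the last step would produce a graphical bar chart instead; the interpretation step (which answer dominates, how many responses were discarded) stays the same.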

Database and Data Mining Techniques – 3 credits

Course Description

This course is designed to enable students to learn the database and data mining concepts and techniques used for big data analytics and development in different domains. The course concentrates on the practical issues of using databases and data mining to solve big data problems. The content includes data modeling for databases and data warehouses, SQL, Python programming for databases, and Python and R programming for data mining applications. Students will learn the skills of database modeling, querying, and programming, as well as programming techniques for data mining.

Course Objectives

  • Model data in a relational database using ER techniques
  • Construct and develop database applications using SQL and Python
  • Perform data warehouse analysis
  • Construct and perform data mining tasks using Python or R
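
To make the combination of relational modeling, SQL, and Python concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The schema (a `student` entity, and an `enrollment` table carrying a grade), the sample rows, and the query are hypothetical and chosen only for illustration; the course may use a different database system.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one entity table plus a table for its enrollments.
conn.executescript(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course     TEXT NOT NULL,
        grade      REAL NOT NULL,
        PRIMARY KEY (student_id, course)
    );
    """
)

# Parameterized inserts avoid building SQL strings by hand.
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(1, "Ana"), (2, "Ben")],
)
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [(1, "Databases", 85.0), (1, "Visualization", 91.0), (2, "Databases", 78.0)],
)

# Join the two tables and aggregate grades per student.
query = """
    SELECT s.name, COUNT(*) AS n_courses, AVG(e.grade) AS avg_grade
    FROM student AS s
    JOIN enrollment AS e ON e.student_id = s.student_id
    GROUP BY s.student_id
    ORDER BY s.name
"""
results = conn.execute(query).fetchall()
for name, n_courses, avg_grade in results:
    print(f"{name}: {n_courses} course(s), average grade {avg_grade:.1f}")

conn.close()
```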

Note: This course will also be offered to senior undergraduate (UG) students (Year 4) as an elective.

Machine Learning Tools – 3 credits

Course Description

The course starts from the foundations of machine learning. First, basic concepts from linear algebra, probability and information theory, and numerical methods are introduced. Next come an overview of machine learning, inductive learning, and representation learning. The basic deep learning topics covered are artificial neural networks, Bayesian networks and learning, deep learning and deep neural networks, and convolutional neural networks. Throughout the course, the practical methodology of using tools such as TensorFlow, Keras, or scikit-learn is emphasized.

Course Objectives

  • Understand the fundamentals of machine learning, including basic techniques for learning from big datasets and learning process flows
  • Use machine learning tools (e.g., Keras, scikit-learn, and TensorFlow) on datasets
  • Design basic learning approaches, such as Bayesian networks, inductive learning, and representation learning, with the tools
  • Design basic artificial neural networks, including feedforward neural networks, the BP (backpropagation) algorithm, and deep models such as convolutional neural networks, with the tools

Prerequisite

  • Introduction to Data Science Programming
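
As a rough illustration of the gradient-based training that underlies the neural network objectives above, the sketch below trains a single sigmoid neuron on the logical OR function. This is a deliberately simplified stand-in, not the course's method: one neuron is the smallest case of a feedforward network, the update rule is plain stochastic gradient descent on cross-entropy loss, and the names `train`, `predict`, and `OR_DATA` are hypothetical. In practice, the tools named in the course description (Keras, TensorFlow, scikit-learn) would handle these steps.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Forward pass: weighted sum of inputs followed by the sigmoid."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, epochs=2000, learning_rate=0.5):
    """Fit one sigmoid neuron with stochastic gradient descent."""
    weights = [0.0, 0.0]  # deterministic start so runs are repeatable
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            output = predict(weights, bias, x)
            # For cross-entropy loss with a sigmoid output, the gradient with
            # respect to the pre-activation is simply (output - target).
            error = output - target
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


# Truth table of logical OR: ((input_1, input_2), target).
OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights, bias = train(OR_DATA)
predictions = [round(predict(weights, bias, x)) for x, _ in OR_DATA]
print(predictions)
```

A multi-layer network applies the same idea layer by layer, propagating the error term backward through each layer with the chain rule.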

Textbook

Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press, 2016.