Data Science

Code 14479
Year 1
Semester S2
ECTS Credits 6
Workload PL(30H)/T(30H)
Scientific Area Informatics
Learning outcomes This course introduces students to the fundamental topics of data science. At the end of the course, the student should be able to: (1) list the steps involved in a data science project and explain the role of each one; (2) be familiar with the data science toolkit; (3) apply data acquisition methods to obtain information from web pages and the social web using Python packages, APIs, web scraping and web crawling; (4) import, manipulate, transform, relate, analyze and store numerical data, namely vectors and matrices, using NumPy; (5) import, clean, transform, manipulate, filter, aggregate and sort data, and conduct exploratory data analysis, using Pandas; (6) communicate results through data visualization using Matplotlib, Plotly, Seaborn and Streamlit; (7) understand what Generative AI is and know how to use large language models; (8) discuss ethical, privacy and transparency concerns associated with obtaining, using and manipulating data in data science projects.
Syllabus 1. Introduction to Data Science
2. Data Science Toolkit
3. Data Acquisition
4. Numerical Data Manipulation and Analysis with NumPy
5. Data Manipulation and Analysis with Pandas
6. Data Visualization with Matplotlib, Plotly, Seaborn and Streamlit
7. Introduction to Large Language Models (LLMs) for Natural Language Processing (NLP)
8. Ethics and Data Privacy
Teaching Methodologies and Assessment Criteria Classes are predominantly lecture-based, divided between presentations given by the teacher and presentations given by students on different subjects related to data science. Each presentation is followed by discussion and critique of the material presented, in keeping with the level of scientific maturity expected in a 2nd cycle course. The practical classes are divided between the implementation of an individual project and the solving of practical exercises by the students. Assessment has two components: theoretical, by written exams (10 points); and practical, by presentation and discussion of the individual projects (10 points).
Main Bibliography - Belorkar, A., Guntuku, S., Hora, S. & Kumar, A. (2020). Interactive Data Visualization with Python.
- Provost, F. & Fawcett, T. (2013). Data Science for Business.
- VanderPlas, J. (2017). Python Data Science Handbook.
- Loukides, M., Mason, H. & Patil, D. (2018). Ethics and Data Science.
- Molin, S. (2019). Hands-On Data Analysis with Pandas: Efficiently perform data collection, wrangling, analysis, and visualization using Python.
- Blair, S. (2019). Python Data Science: The Ultimate Handbook for Beginners on How to Explore NumPy for Numerical Data, Pandas for Data Analysis, IPython, Scikit-Learn and Tensorflow for Machine Learning and Business.
- McKinney, W. (2017). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython.
- Gomes, D., Demidova, E., Winters, J. & Risse, T. (2021). The Past Web: Exploring Web Archives.
- Alammar, J. & Grootendorst, M. (2024). Hands-On Large Language Models.
- Rodriguez, C. (2024). Generative AI Foundations in Python.
Language Portuguese. Tutorial support is available in English.
Image of Hugo Pedro Martins Carriço Proença

Course

Engenharia Informática
Last updated: 2025-02-26