Data Science

Code 15643
Year 1
Semester S2
ECTS Credits 6
Workload PL(30H)/T(30H)
Scientific area Informatics
Entry requirements N/A
Learning outcomes This course introduces students to the fundamental topics of data science. At the end of the course, the student should be able to:
(1) list the steps involved in a data science project and explain the role of each one;
(2) know the data science toolkit;
(3) apply data acquisition methods to obtain information from web pages and the social web using Python packages, APIs, web scraping and web crawling;
(4) import, manipulate, transform, relate, analyze and store numerical data, namely vectors and matrices, using NumPy;
(5) import, clean, transform, manipulate, filter, aggregate and sort data, and conduct exploratory data analysis, using Pandas;
(6) communicate results through data visualization using Matplotlib, Plotly, Seaborn and Streamlit;
(7) understand what Generative AI is and know how to use large language models;
(8) discuss the ethical concerns associated with the acquisition and use of data in data science projects.
Syllabus 1. Introduction to Data Science
2. Data Science Toolkit
3. Data Acquisition
4. Manipulation and Numerical Data Analysis with NumPy
5. Manipulation and Data Analysis with Pandas
6. Data Visualization with Matplotlib, Plotly, Seaborn and Streamlit
7. Introduction to Large Language Models (LLMs) for Natural Language Processing (NLP)
8. Ethics and Data Privacy
Main Bibliography - Abha Belorkar, Sharath Guntuku, Shubhangi Hora, Anshu Kumar (2020). Interactive Data Visualization with Python. 2nd edition. Packt Publishing.
- Foster Provost, Tom Fawcett (2013). Data Science for Business. O'Reilly.
- Jake VanderPlas (2017). Python Data Science Handbook. O'Reilly.
- Mike Loukides, Hilary Mason, DJ Patil (2018). Ethics and Data Science. O'Reilly.
- Stefanie Molin (2019). Hands-On Data Analysis with Pandas: Efficiently perform data collection, wrangling, analysis, and visualization using Python. Packt Publishing.
- Steve Blair (2019). Python Data Science: The Ultimate Handbook for Beginners on How to Explore NumPy for Numerical Data, Pandas for Data Analysis, IPython, Scikit-Learn and Tensorflow for Machine Learning and Business
- Wes McKinney (2017). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly.
- Daniel Gomes, Elena Demidova, Jane Winters, Thomas Risse (2021). The Past Web: Exploring Web Archives. Springer.
Teaching Methodologies and Assessment Criteria Teaching/Learning Assessment
- MP1 - Mini Project I (individual): 15%
- MP2 - Mini Project II (individual): 15%
- MP3 - Mini Project III (individual): 10%
- P - Project (groups of 3 students): 60%

The final grade for the course is the weighted average of the grades obtained in the assessment components listed above. A student who obtains a final grade of 9.5 points or higher passes the course and is exempt from the exam.

Evaluation by Exam
- Exam: 100% (computer-based test without access to course materials)

Admission to the Teaching/Learning Assessment and to the Exam:
- Minimum of 70% class attendance during the teaching-learning period (working students are exempt);
- Minimum score of 6 points in AE, where AE = ((MP1 * 15%) + (MP2 * 15%) + (MP3 * 10%) + (P * 60%))

Failure to comply with any of these requirements (including submitting any of the projects after the deadline) prevents the student from passing the course.
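
The weighted average AE and the two grade thresholds can be illustrated with a short Python sketch. It is only one reading of the rules: it assumes all components are graded on the same scale, that the thresholds apply to the unrounded average, and it ignores the attendance and deadline requirements. The function names and the sample grades are hypothetical and are not part of the official regulations.

```python
# Weights of the assessment components, as listed in the assessment criteria.
WEIGHTS = {"MP1": 0.15, "MP2": 0.15, "MP3": 0.10, "P": 0.60}

MIN_AE = 6.0      # minimum AE required for admission
PASS_GRADE = 9.5  # final grade needed to pass and be exempt from the exam


def weighted_average(grades):
    """Return AE = MP1*15% + MP2*15% + MP3*10% + P*60%."""
    return sum(WEIGHTS[name] * grades[name] for name in WEIGHTS)


def outcome(grades):
    """Classify a student from the component grades (hypothetical helper).

    Assumption: attendance and submission deadlines have already been met.
    """
    ae = weighted_average(grades)
    if ae >= PASS_GRADE:
        return "passed, exempt from exam"
    if ae >= MIN_AE:
        return "admitted to exam"
    return "not admitted to exam"


# Hypothetical grades, for illustration only.
example = {"MP1": 12, "MP2": 10, "MP3": 14, "P": 11}
print(round(weighted_average(example), 2), outcome(example))
```

Because the project carries 60% of the weight, it dominates the result: in this sketch a student with strong mini-project grades but a weak project grade can still fall below either threshold.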
Language Portuguese. Tutorial support is available in English.
Last updated on: 2024-02-22