Multimedia Data Processing

Code 16683
Year 3
Semester S1
ECTS Credits 6
Workload PL(30H)/T(30H)
Scientific area Informatics
Entry requirements N/A
Learning outcomes The course aims to introduce the fundamental concepts of image and video processing and analysis, covering everything from digital representation to modern techniques based on deep neural networks and generative models.

At the end of the course, students should be able to:
a. Understand the fundamentals of digital image and video representation and storage;
b. Apply basic and advanced image processing techniques to improve quality and extract relevant information;
c. Understand how image feature extraction methods work;
d. Apply object recognition techniques using classical and machine learning approaches;
e. Understand and use convolutional neural networks for image analysis;
f. Understand the principles of generative image methods, including GANs and diffusion models;
g. Apply Transformer-based architectures for image analysis and information extraction.
Syllabus A. Image and Video Fundamentals: Basic concepts of digital images: pixels, resolution, color depth. Representation and storage: file formats, compression.
B. Image Processing Techniques: Basic transformations: brightness/contrast adjustment, histogram, normalization. Spatial and frequency filtering: smoothing, enhancement, edge detection. Mathematical morphology: dilation, erosion, opening, closing. (See the first sketch after the syllabus.)
C. Image Analysis and Description Techniques: Feature detection: corners, edges, points of interest. Image descriptors. Object recognition.
D. Convolutional Neural Networks (CNNs). How the CNN model works. Fundamental architectures for classification and object detection. (See the second sketch after the syllabus.)
E. Generative Image Methods. Introduction to generative methods: discriminative vs. generative, latent space. Generative Adversarial Networks (GANs). Diffusion Models.
F. Transformers for Images: Vision Transformer (ViT) Model. Vision-Language Models.
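
For illustration only, the two short Python sketches below exercise some of the syllabus topics; they are not part of the official course sheet. The first assumes OpenCV and NumPy with a placeholder image path, and touches the operations named in topics B and C: brightness/contrast adjustment, histogram equalization, smoothing and edge detection, morphological opening, and ORB feature detection.

    # Illustrative sketch only (not part of the official course sheet).
    # Assumptions: OpenCV and NumPy are installed; "input.png" is a placeholder path.
    import cv2
    import numpy as np

    img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)  # 8-bit grayscale image
    assert img is not None, "replace the placeholder path with a real image"

    # Topic B -- basic transformation: linear brightness/contrast adjustment
    adjusted = cv2.convertScaleAbs(img, alpha=1.2, beta=10)  # 1.2*img + 10, saturated to [0, 255]

    # Topic B -- histogram equalization spreads intensities over the full range
    equalized = cv2.equalizeHist(adjusted)

    # Topic B -- spatial filtering: Gaussian smoothing, then Canny edge detection
    smoothed = cv2.GaussianBlur(equalized, (5, 5), 1.0)
    edges = cv2.Canny(smoothed, 100, 200)

    # Topic B -- mathematical morphology: opening (erosion then dilation) removes small specks
    kernel = np.ones((3, 3), np.uint8)
    opened = cv2.morphologyEx(edges, cv2.MORPH_OPEN, kernel)
    cv2.imwrite("edges_opened.png", opened)

    # Topic C -- feature detection and description with ORB keypoints
    orb = cv2.ORB_create()
    keypoints, descriptors = orb.detectAndCompute(equalized, None)
    print(f"{len(keypoints)} ORB keypoints detected")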
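
The second is a minimal sketch of the CNN model referred to in topic D, assuming PyTorch; the architecture, layer sizes, and names are illustrative assumptions, not prescribed by the course.

    # Illustrative sketch only: a tiny CNN classifier; PyTorch, sizes, and names are assumptions.
    import torch
    import torch.nn as nn

    class TinyCNN(nn.Module):
        """Two convolutional blocks followed by a linear classifier."""
        def __init__(self, num_classes: int = 10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=3, padding=1),   # RGB input -> 16 feature maps
                nn.ReLU(),
                nn.MaxPool2d(2),                              # halves spatial resolution
                nn.Conv2d(16, 32, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(2),
            )
            self.classifier = nn.Linear(32 * 8 * 8, num_classes)  # assumes 32x32 input images

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            x = self.features(x)                  # (N, 32, 8, 8) for 32x32 inputs
            return self.classifier(x.flatten(1))  # class scores (logits)

    # Forward pass on a random batch of four 32x32 RGB images
    logits = TinyCNN()(torch.randn(4, 3, 32, 32))
    print(logits.shape)  # torch.Size([4, 10])
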
Main Bibliography Gonzalez, R. C., & Woods, R. E. (2018). Digital Image Processing (4th ed.). Pearson.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
O'Shea, K., & Nash, R. (2015). An Introduction to Convolutional Neural Networks. arXiv.
Szeliski, R. (2022). Computer Vision: Algorithms and Applications (2nd ed.). Springer.
Russell, B., & Torralba, A. (2021). Computer Vision: Foundations and Applications.
Teaching Methodologies and Assessment Criteria Teaching methodologies:
• Theoretical classes;
• Practical laboratory classes;
• Individual projects;
• Tutorial sessions to answer questions and to support students in developing their projects.

Assessment methods and criteria:
The theoretical and practical components are assessed using two main elements:
- a written test (T) to assess knowledge, accounting for 70% of the final grade;
- an individual practical assignment (TP), with a report on its execution and a presentation, accounting for 30% of the final grade.
Teaching-Learning Classification (CEA) = 0.7 × T + 0.3 × TP
Admission to the final exam: CEA >= 6 points (UBI regulations).
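For illustration only (grades on the 0–20 scale): a student with T = 14 and TP = 16 obtains CEA = 0.7 × 14 + 0.3 × 16 = 14.6, above the 6-point threshold for admission to the exam.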
Language Portuguese. Tutorial support is available in English.
Last updated on: 2025-09-25
