Multimedia Data Processing

Code 16683
Year 3
Semester S1
ECTS Credits 6
Workload PL(30H)/T(30H)
Scientific area Informatics
Entry requirements N/A
Learning outcomes The course introduces the fundamental concepts of image and video processing and analysis, from digital representation to modern techniques based on deep neural networks and generative models.

At the end of the course, students should be able to:
a. Understand the fundamentals of digital image and video representation and storage;
b. Apply basic and advanced image processing techniques to improve quality and extract relevant information;
c. Understand how image feature extraction methods work;
d. Apply object recognition techniques using classical and machine learning approaches;
e. Understand and use convolutional neural networks for image analysis;
f. Understand the principles of generative image methods, including GANs and diffusion models;
g. Apply Transformer-based architectures for image analysis and information extraction.
Syllabus A. Image and Video Fundamentals: Basic concepts of digital images: pixels, resolution, color depth. Representation and storage: file formats, compression.
B. Image Processing Techniques: Basic transformations: brightness/contrast adjustment, histogram, normalization. Spatial and frequency filtering: smoothing, enhancement, edge detection. Mathematical morphology: dilation, erosion, opening, closing.
C. Image Analysis and Description Techniques: Feature detection: corners, edges, points of interest. Image descriptors. Object recognition.
D. Convolutional Neural Networks (CNNs): How the CNN model works. Fundamental architectures for classification and object detection.
E. Generative Image Methods: Introduction to generative methods: discriminative vs. generative, latent space. Generative Adversarial Networks (GANs). Diffusion Models.
F. Transformers for Images: Vision Transformer (ViT) Model.
G. Vision-Language Models (CLIP).
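Illustrative example (not part of the official course materials): the mathematical morphology operations named in topic B (dilation, erosion, opening, closing) can be sketched in a few lines of plain Python. The sketch below is a minimal one under stated assumptions: the image is binary and stored as a list of lists of 0/1, the structuring element is a 3x3 square, and pixels outside the image count as 0. The function names are hypothetical and chosen only for this example.

```python
# Minimal sketch of binary morphology. Assumptions (not from the syllabus):
# - the image is a list of rows, each a list of 0/1 values
# - the structuring element is a 3x3 square
# - pixels outside the image are treated as 0

def dilate(img):
    """Set a pixel to 1 if any pixel in its 3x3 neighbourhood is 1."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(any(
                img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ))
    return out

def erode(img):
    """Keep a pixel at 1 only if its whole 3x3 neighbourhood is 1."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(all(
                # Bounds check first, so out-of-image neighbours count as 0.
                0 <= ny < h and 0 <= nx < w and img[ny][nx]
                for ny in range(y - 1, y + 2)
                for nx in range(x - 1, x + 2)
            ))
    return out

def opening(img):
    """Erosion followed by dilation: removes small bright specks."""
    return dilate(erode(img))

def closing(img):
    """Dilation followed by erosion: fills small dark holes."""
    return erode(dilate(img))
```

Under these assumptions, opening an image that contains only a single isolated pixel yields an all-zero image, while a solid 3x3 block inside a 5x5 image survives both opening and closing unchanged. Practical work would normally use a library such as OpenCV or scikit-image rather than hand-written loops.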
Main Bibliography Gonzalez, R. C., & Woods, R. E. (2018). Digital Image Processing (4th ed.). Pearson.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
O'Shea, K., & Nash, R. (2015). An Introduction to Convolutional Neural Networks. arXiv.
Szeliski, R. (2022). Computer Vision: Algorithms and Applications (2nd ed.). Springer.
Russell, B., & Torralba, A. (2021). Computer Vision: Foundations and Applications.
Language Portuguese. Tutorial support is available in English.
Last updated on: 2025-10-28
