Multimedia Data Processing

Code	16683
Year	3
Semester	S1
ECTS Credits	6
Workload	PL(30H)/T(30H)
Scientific area	Informatics
Entry requirements	N/A
Learning outcomes	The course introduces the fundamental concepts of image and video processing and analysis, from digital representation to modern techniques based on deep neural networks and generative models. At the end of the course, students should be able to: a. Understand the fundamentals of digital image and video representation and storage; b. Apply basic and advanced image processing techniques to improve quality and extract relevant information; c. Understand how image feature extraction methods work; d. Apply object recognition techniques using classical and machine learning approaches; e. Understand and use convolutional neural networks for image analysis; f. Understand the principles of generative image methods, including GANs and diffusion models; g. Apply Transformer-based architectures for image analysis and information extraction.
Syllabus	A. Image and Video Fundamentals: Basic concepts of digital images: pixels, resolution, color depth. Representation and storage: file formats, compression. B. Image Processing Techniques: Basic transformations: brightness/contrast adjustment, histogram, normalization. Spatial and frequency filtering: smoothing, enhancement, edge detection. Mathematical morphology: dilation, erosion, opening, closing. C. Image Analysis and Description Techniques: Feature detection: corners, edges, points of interest. Image descriptors. Object recognition. D. Convolutional Neural Networks (CNNs): How the CNN model works. Fundamental architectures for classification and object detection. E. Generative Image Methods: Introduction to generative methods: discriminative vs. generative, latent space. Generative Adversarial Networks (GANs). Diffusion Models. F. Transformers for Images: Vision Transformer (ViT) Model. G. Vision-Language Models: CLIP.
Main Bibliography	Gonzalez, R. C., & Woods, R. E. (2018). Digital Image Processing (4th ed.). Pearson. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. O'Shea, K., & Nash, R. (2015). An Introduction to Convolutional Neural Networks. arXiv. Szeliski, R. (2022). Computer Vision: Algorithms and Applications (2nd ed.). Springer. Russell, B., & Torralba, A. (2021). Computer Vision: Foundations and Applications.
Language	Portuguese. Tutorial support is available in English.