{"@attributes":{"version":"2.0"},"channel":{"title":"Computer-Vision on Oriol Al\u00e0s Cerc\u00f3s","link":"https:\/\/oriolac.github.io\/tags\/computer-vision\/","description":"Recent content in Computer-Vision on Oriol Al\u00e0s Cerc\u00f3s","generator":"Hugo -- 0.150.0","language":"en-us","copyright":"Oriol Al\u00e0s Cerc\u00f3s","lastBuildDate":"Sat, 25 Apr 2026 20:10:23 +0100","item":[{"title":"Reviewing YOLO: You Only Look Once","link":"https:\/\/oriolac.github.io\/posts\/20260501-yolo\/","pubDate":"Sat, 25 Apr 2026 20:10:23 +0100","guid":"https:\/\/oriolac.github.io\/posts\/20260501-yolo\/","description":"<p>Object detection is one of the most popular tasks in computer vision, since it can be applied to a wide range of\napplications: robotics, autonomous driving or fault detection. In this post, we give a brief overview of\nthe YOLO algorithm and the components that make it work.<\/p>\n<p>To do that, I have classified the main components of the algorithm into three categories:<\/p>\n<ul>\n<li>Characteristics based on the <strong>model architecture<\/strong>: how YOLO-based models improved performance by using a new\narchitecture, and which improvements were made.<\/li>\n<li>Strategies based on the <strong>model training<\/strong>, such as the loss function or data augmentation.<\/li>\n<li>Methods for <strong>post-processing the output<\/strong> of the model, such as non-maximum suppression (NMS) and the\nconfidence threshold.<\/li>\n<\/ul>\n<h2 id=\"two-stage-vs-one-stage-detectors\">Two-stage vs One-stage Detectors<\/h2>\n<p>Before YOLO, state-of-the-art detectors were built as <strong>two-stage detectors<\/strong>: the first stage detects the bounding\nboxes,\nand the second stage classifies them. 
These models are called region-based detectors,\nbecause they need a region before they can run the classification.<\/p>"},{"title":"The Generative Trilemma: A quick overview","link":"https:\/\/oriolac.github.io\/posts\/20250710-starting-diffusion\/","pubDate":"Thu, 10 Jul 2025 12:13:48 +0100","guid":"https:\/\/oriolac.github.io\/posts\/20250710-starting-diffusion\/","description":"<p>Generative models are a class of machine learning models that learn a representation of the data they are trained on and model the\ndata itself.<\/p>\n<p>Ideally, generative models should satisfy the following key requirements in a real environment:<\/p>\n<ul>\n<li><strong>High quality samples<\/strong> refers to samples that capture the underlying patterns and\nstructures present in the data, making them indistinguishable from real data to human observers.<\/li>\n<li><strong>Fast Sampling<\/strong> concerns the efficiency of image generation and the computational overhead\nthat generative models can incur.<\/li>\n<li><strong>Mode Coverage\/Diversity<\/strong> describes how well the model can generate the full range of\nmodes and diverse patterns present in the training data.<\/li>\n<\/ul>\n<p>\n<figure>\n<img loading=\"lazy\" src=\"https:\/\/oriolac.github.io\/posts\/2025\/gen_tril\/gen_tril.png#center\" alt=\"alt text\" title=\"Fig. 1. The Generative Learning Trilemma\" \/>\n<figcaption\nstyle=\"\nfont-size: 15px;\ncolor: #7a7a7a;\nmargin-top: 0.5em;\ntext-align: center;\nfont-weight: 100;\n\"\n>\nFig. 1. The Generative Learning Trilemma\n<\/figcaption>\n<\/figure>\n<\/p>"},{"title":"Reconeixement de Vehicles i Reconstrucci\u00f3 de Tr\u00e0nsit amb NebulOus","link":"https:\/\/oriolac.github.io\/talks\/techmeeting-nebulous\/","pubDate":"Thu, 01 May 2025 00:00:00 +0000","guid":"https:\/\/oriolac.github.io\/talks\/techmeeting-nebulous\/","description":"<h2 id=\"overview\">Overview<\/h2>\n<p>Presentation on vehicle recognition and traffic reconstruction using the NebulOus cloud platform. 
The talk covers the application of computer vision techniques for traffic analysis and the deployment of ML models in cloud infrastructure.<\/p>\n<h2 id=\"key-topics\">Key Topics<\/h2>\n<ul>\n<li>Vehicle detection and tracking<\/li>\n<li>Traffic pattern reconstruction<\/li>\n<li>NebulOus cloud platform architecture<\/li>\n<li>Real-time processing challenges<\/li>\n<\/ul>\n<h2 id=\"event\">Event<\/h2>\n<p>TechMeeting is a technical meetup in Lleida focused on emerging technologies and their practical applications.<\/p>"},{"title":"Thresholding, filtering and morphological operations","link":"https:\/\/oriolac.github.io\/posts\/cv-techniques\/20240615-cv-techniques\/","pubDate":"Fri, 25 Oct 2024 17:51:55 +0200","guid":"https:\/\/oriolac.github.io\/posts\/cv-techniques\/20240615-cv-techniques\/","description":"Traditional computer vision techniques involve methods and algorithms that do not rely on deep learning or neural networks. Instead, these methods are not data-driven and use classical algorithms to process and analyze images. So, in this post, we will explore three thresholding techniques"}]}}