A curated list of free, high-quality, university-level courses with video lectures related to the field of Computer Vision.
Signals and Systems 6.003 (MIT), Prof. Dennis Freeman
Signals and Systems 6.003 covers the fundamentals of signal and system analysis, focusing on representations of discrete-time and continuous-time signals (singularity functions, complex exponentials and geometrics, Fourier representations, Laplace and Z transforms, sampling) and representations of linear, time-invariant systems (difference and differential equations, block diagrams, system functions, poles and zeros, convolution, impulse and step responses, frequency responses). Applications are drawn broadly from engineering and physics, including feedback and control, communications, and signal processing.
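The convolution view of LTI systems mentioned in the description can be sketched in a few lines of plain Python (a minimal illustration, not from the course materials; the impulse-response values are made up):

```python
def convolve(x, h):
    """Discrete convolution y[n] = sum_k x[k] * h[n - k]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for k in range(len(x)):
            if 0 <= n - k < len(h):
                y[n] += x[k] * h[n - k]
    return y

# An LTI system is fully characterized by its impulse response h:
# the response to any input x is the convolution of x with h.
h = [1.0, 0.5, 0.25]      # hypothetical impulse response
x = [1.0, 0.0, 0.0, 0.0]  # a unit impulse as input
print(convolve(x, h))     # recovers h, padded with zeros
```

Feeding the system a unit impulse returns the impulse response itself, which is exactly why the impulse response characterizes the system.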
This course provides a comprehensive treatment of the theory, design, and implementation of digital signal processing algorithms. In the first half of the course, we emphasize frequency-domain and Z-transform analysis. In the second half of the course, we investigate advanced topics in signal processing, including multirate signal processing, filter design, adaptive filtering, quantizer design, and power spectrum estimation. The course is fairly application-independent, to provide a strong theoretical foundation for future study in communications, control, or image processing. This course was originally offered at the graduate level but retooled in 2009 to be senior-level.
Digital Signal Processing (EPFL), Paolo Prandoni, Martin Vetterli
In this series of four courses, you will learn the fundamentals of Digital Signal Processing from the ground up. Starting from the basic definition of a discrete-time signal, we will work our way through Fourier analysis, filter design, sampling, interpolation and quantization to build a DSP toolset complete enough to analyze a practical communication system in detail. Hands-on examples and demonstrations will be routinely used to close the gap between theory and practice.
In this course, you will learn the science behind how digital images and video are made, altered, stored, and used. We will look at the vast world of digital imaging, from how computers and digital cameras form images to how digital special effects are used in Hollywood movies to how the Mars Rover was able to send photographs across millions of miles of space.
The course starts by looking at how the human visual system works and then teaches you about the engineering, mathematics, and computer science that makes digital images work. You will learn the basic algorithms used for adjusting images, explore JPEG and MPEG standards for encoding and compressing video images, and go on to learn about image segmentation, noise removal and filtering. Finally, we will end with image processing techniques used in medicine.
An introduction to the field of image processing, covering both analytical and implementation aspects. Topics include the human visual system, cameras and image formation, image sampling and quantization, spatial- and frequency-domain image enhancement, filter design, image restoration, image coding and compression, morphological image processing, color image processing, image segmentation, and image reconstruction. Real-world examples and assignments drawn from consumer digital imaging, security and surveillance, and medical image processing. This course forms a good basis for our extensive graduate image processing and computer vision courses.
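As a toy illustration of the spatial-domain enhancement topic listed above, here is a 3x3 mean (box) filter in plain Python (not course material; the filter choice and the tiny test image are assumptions for the sketch):

```python
def box_blur(img):
    """3x3 mean filter; img is a list of rows of grayscale values.
    Border pixels are left unchanged for simplicity."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            s = sum(img[y + dy][x + dx]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = s / 9.0
    return out

# A single bright pixel is spread evenly over its 3x3 neighborhood.
img = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
print(box_blur(img))  # center value becomes 9/9 = 1.0
```

Replacing the uniform weights with other kernels (Gaussian, sharpening, Sobel) gives the other classic spatial filters a course like this covers.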
Fundamentals of Digital Image and Video Processing (Northwestern University), Prof. Aggelos K. Katsaggelos
This course will cover the fundamentals of image and video processing. We will provide a mathematical framework to describe and analyze images and videos as two- and three-dimensional signals in the spatial, spatio-temporal, and frequency domains. In this class you will not only learn the theory behind fundamental processing tasks, including image/video enhancement, recovery, and compression, but also learn how to perform these key processing tasks in practice using state-of-the-art techniques and tools. We will introduce and use a wide variety of such tools, from optimization toolboxes to statistical techniques. Special emphasis will be given to the role sparsity plays in modern image and video processing. In all cases, example images and videos pertaining to specific application domains will be utilized.
This course provides students with the theoretical background to apply state-of-the-art image and multi-dimensional signal processing techniques. The course teaches students to solve practical problems involving the processing of multidimensional data such as imagery, video sequences, and volumetric data. The types of problems students are expected to solve are automated mensuration from multidimensional data, and the restoration, reconstruction, or compression of multidimensional data. The tools used in solving these problems include a variety of feature extraction methods, filtering techniques, segmentation techniques, and transform methods.
Techniques for automated extraction of high-level information from images generated by cameras, three-dimensional surface sensors, and medical devices. Typical applications include detection of objects in various types of images and describing populations of biological specimens appearing in medical imagery.
Digital Image Processing EE225B (UC Berkeley), Prof. Avideh Zakhor
This course covers the following topics: 2-D sequences and systems, separable systems, projection slice theorem, reconstruction from projections and partial Fourier information, Z transform, difference equations, recursive computability, 2D DFT and FFT, 2D FIR filter design; human eye, perception, psychophysical vision properties, photometry and colorimetry, optics and image systems; image enhancement, image restoration, geometrical image modification, morphological image processing, halftoning, edge detection; image compression: scalar quantization, lossless coding, Huffman coding, arithmetic coding, dictionary techniques, waveform and transform coding (DCT, KLT, Hadamard), multiresolution coding (pyramid and subband coding), fractal coding, vector quantization, motion estimation and compensation; standards: JPEG, MPEG, H.xxx; pre- and post-processing, scalable image and video coding, and image and video communication over noisy channels.
Introduction to digital image processing techniques for enhancement, compression, restoration, reconstruction, and analysis. Lecture and laboratory experiments covering a wide range of topics including 2-D signals and systems, image analysis, image segmentation; achromatic vision, color image processing, color imaging systems, image sharpening, interpolation, decimation, linear and nonlinear filtering, printing and display of images; image compression, image restoration, and tomography.
The lecture focuses on the challenging task of extracting robust, quantitative metrics from imaging data and is intended to bridge the gap between pure signal processing and the experimental science of imaging. The course will focus on techniques, scalability, and science-driven analysis.
First Principles of Computer Vision is a lecture series presented by Shree Nayar who is faculty in the Computer Science Department, School of Engineering and Applied Sciences, Columbia University. Computer Vision is the enterprise of building machines that “see.” This series focuses on the physical and mathematical underpinnings of vision and has been designed for students, practitioners, and enthusiasts who have no prior knowledge of computer vision.
The course is introductory level. It will cover the basic topics of computer vision, and introduce some fundamental approaches for computer vision research.
Computer Vision EENG 512 (Colorado School of Mines), William Hoff
This course provides an overview of this field, starting with image formation and low level image processing. We then go into detail on the theory and techniques for extracting features from images, measuring shape and location, and recognizing objects.
3D Computer Vision CS4277/CS5477 (National University of Singapore), Gim Hee Lee
This is an introductory course on 3D Computer Vision which was recorded for online learning at NUS due to COVID-19. The topics covered include: Lecture 1: 2D and 1D projective geometry. Lecture 2: Rigid body motion and 3D projective geometry. Lecture 3: Circular points and Absolute conic. Lecture 4: Robust homography estimation. Lecture 5: Camera models and calibration. Lecture 6: Single view metrology. Lecture 7: The fundamental and essential matrices. Lecture 8: Absolute pose estimation from points or lines. Lecture 9: Three-view geometry from points and/or lines. Lecture 10: Structure-from-Motion (SfM) and bundle adjustment. Lecture 11: Two-view and multi-view stereo. Lecture 12: Generalized cameras. Lecture 13: Auto-Calibration.
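Several of the lectures listed above (notably Lectures 4 and 5) revolve around homographies. As a minimal sketch of how a 3x3 homography acts on 2D points in homogeneous coordinates (the translation example is hypothetical, not from the course):

```python
def apply_homography(H, pt):
    """Map a 2D point through a 3x3 homography using homogeneous coordinates:
    (x, y) -> (x, y, 1), multiply by H, then divide by the last coordinate."""
    x, y = pt
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w  = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xh / w, yh / w)

# A pure translation by (2, 3), written as a homography:
H = [[1, 0, 2],
     [0, 1, 3],
     [0, 0, 1]]
print(apply_homography(H, (1, 1)))  # (3.0, 4.0)
```

The division by `w` is what lets the same 3x3 matrix also express perspective effects that no 2x3 affine matrix can represent.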
Multiple View Geometry in Computer Vision (IT Sligo), Sean Mullery
Computer Vision (IIT Kanpur), Prof. Jayanta Mukhopadhyay
The course will have a comprehensive coverage of theory and computation related to imaging geometry and scene understanding. It will also provide exposure to clustering, classification, and deep learning techniques applied in this area.
Computer Vision CS-442 (EPFL), Pascal Fua
The students will be introduced to the basic techniques of the field of Computer Vision. They will learn to apply image processing techniques where appropriate. We will concentrate on black-and-white and color images acquired using standard video cameras. We will introduce basic processing techniques, such as edge detection, segmentation, texture characterization, and shape recognition.
In this course, we will cover many of the basic concepts and algorithms of computer vision: single-view and multi-view geometry, lighting, linear filters, texture, interest points, tracking, RANSAC, K-means clustering, segmentation, EM algorithm, recognition, and so on. In homeworks, you will put many of these concepts into practice. As this is a survey course, we will not go into great depth on any topic, but at the end of the course, you should be prepared for any further vision-related investigation and application.
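One of the algorithms listed above, K-means clustering, can be sketched in plain Python (Lloyd's algorithm on scalars with naive initialization; the data values are made up for illustration):

```python
def kmeans_1d(points, k, iters=20):
    """Lloyd's algorithm on scalars: alternately assign each point to its
    nearest center, then move each center to the mean of its cluster."""
    centers = points[:k]  # naive initialization from the first k points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda j: (p - centers[j]) ** 2)
            clusters[i].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Two well-separated groups around 1 and 10:
print(kmeans_1d([1.0, 1.2, 0.8, 10.0, 10.2, 9.8], 2))  # ≈ [1.0, 10.0]
```

The same assignment/update loop extends directly to image feature vectors, where it is a standard baseline for segmentation.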
This course emphasizes research topics that underlie the advanced visual effects that are becoming increasingly common in commercials, music videos and movies. Topics include classical computer vision algorithms used on a regular basis in Hollywood (such as blue-screen matting, structure from motion, optical flow, and feature tracking) and exciting recent developments that form the basis for future effects (such as natural image matting, multi-image compositing, image retargeting, and view synthesis). We also discuss the technologies behind motion capture and three-dimensional data acquisition. Analysis of behind-the-scenes videos and in-depth interviews with Hollywood visual effects artists tie the mathematical concepts to real-world filmmaking.
Image processing and Computer Vision (CBCSL), Aleix M. Martinez
This class is a general introduction to computer vision. It covers standard techniques in image processing like filtering, edge detection, stereo, flow, etc. (old-school vision), as well as newer, machine-learning based computer vision.
This is an advanced computer vision course that will expose graduate students to cutting-edge research. In each class we will discuss one recent research paper related to active areas of current research, in particular employing deep learning. Computer vision has been a very active area of research for many decades, and researchers have been working on solving important, challenging problems. During the last few years, deep learning involving artificial neural networks has been a disruptive force in computer vision. Employing deep learning, tremendous progress has been made in a very short time, and very impressive results have been obtained in image and video classification, localization, semantic segmentation, and more. New techniques, datasets, hardware, and software libraries are emerging almost every day. Deep computer vision is impacting research in robotics, natural language understanding, computer graphics, multi-modal analysis, etc.
Variational methods are among the most classical techniques for optimization of cost functions in higher dimensions. Many challenges in Computer Vision and in other domains of research can be formulated as variational problems. Examples include denoising, deblurring, image segmentation, tracking, optical flow estimation, depth estimation from stereo images, and 3D reconstruction from multiple views.
In this class, I will introduce the basic concepts of variational methods, the Euler-Lagrange calculus and partial differential equations. I will discuss how respective computer vision and image analysis challenges can be cast as variational problems and how they can be efficiently solved. Towards the end of the class, I will discuss convex formulations and convex relaxations which make it possible to compute optimal or near-optimal solutions in the variational setting.
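As a toy instance of the variational approach, a quadratic (Tikhonov-style) 1D denoising energy can be minimized by plain gradient descent (a hypothetical sketch, not the course's formulation; the step size, weight, and signal are made up):

```python
def denoise_1d(f, lam=1.0, step=0.1, iters=200):
    """Gradient descent on E(u) = sum_i (u_i - f_i)^2 + lam * sum_i (u_{i+1} - u_i)^2.
    The first term keeps u close to the data f; the second penalizes roughness.
    Setting the gradient to zero is the discrete analogue of the
    Euler-Lagrange condition, and the descent converges to that minimizer."""
    n = len(f)
    u = list(f)
    for _ in range(iters):
        g = [2.0 * (u[i] - f[i]) for i in range(n)]   # data-term gradient
        for i in range(n - 1):                        # smoothness-term gradient
            d = 2.0 * lam * (u[i + 1] - u[i])
            g[i] -= d
            g[i + 1] += d
        u = [u[i] - step * g[i] for i in range(n)]
    return u

noisy = [0.0, 0.0, 5.0, 0.0, 0.0]  # an impulsive noise spike
print(denoise_1d(noisy))           # the spike is smoothed out
```

Swapping the quadratic penalty for the absolute difference gives the total-variation energies the lecture series builds toward, where convexity still guarantees a global optimum.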
The lecture introduces the basic concepts of image formation: perspective projection and camera motion. The goal is to reconstruct the three-dimensional world and the camera motion from multiple images. To this end, one determines correspondences between points in various images and respective constraints that allow us to compute motion and 3D structure. A particular emphasis of the lecture is on mathematical descriptions of rigid body motion and of perspective projection. For estimating camera motion and 3D geometry we will make use of both spectral methods and methods of nonlinear optimization.
Advanced Computer Vision (CBCSL), Aleix M. Martinez
Graduate Summer School on Computer Vision (IPAM at UCLA)
Mobile Sensing And Robotics I (University of Bonn), Cyrill Stachniss
The lecture will cover different topics and techniques in the context of environment modeling with mobile robots. We will cover techniques such as SLAM with the family of Kalman filters, information filters, particle filters. We will furthermore investigate graph-based approaches, least-squares error minimization, techniques for place recognition and appearance-based mapping, and data association.
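A scalar Kalman filter, the simplest member of the filter family listed above, can be sketched as follows (a toy random-walk model of a static state with hypothetical noise parameters, not the course's SLAM formulation):

```python
def kalman_1d(measurements, q=0.01, r=1.0):
    """Scalar Kalman filter: predict inflates the variance by process noise q,
    update blends in each measurement z (variance r) via the gain k."""
    x, p = 0.0, 1.0          # state estimate and its variance
    for z in measurements:
        p = p + q            # predict step (random-walk motion model)
        k = p / (p + r)      # Kalman gain: how much to trust the measurement
        x = x + k * (z - x)  # update step: correct the estimate
        p = (1.0 - k) * p    # updated variance always shrinks here
    return x, p

x, p = kalman_1d([1.2, 0.8, 1.1, 0.9, 1.0])
print(x, p)  # estimate pulled toward the measurements; variance shrinks
```

In SLAM the same predict/update cycle runs over a joint state of robot pose and landmark positions, with matrices replacing the scalars.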
Introduction to biometric traits and their aims; image processing basics: basic image operations, filtering, enhancement, sharpening, edge detection, smoothing, thresholding, localization; Fourier series, DFT and its inverse. Biometric systems: identification and verification, FAR/FRR, system design issues, positive/negative identification. Biometric system security: authentication protocols, matching score distribution, ROC curve, DET curve, FAR/FRR curve, expected overall error, EER, biometric myths and misrepresentations. Selection of a suitable biometric; biometric attributes, Zephyr charts, types of multi-biometrics. Verification in multimodal systems: normalization strategies, fusion methods, multimodal identification. Biometric system vulnerabilities: circumvention, covert acquisition, quality control, template generation, interoperability, data storage. Recognition systems: face, signature, fingerprint, ear, iris, etc.
This course is a deep dive into details of the deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. The final assignment will involve training a multi-million parameter convolutional neural network and applying it on the largest image classification dataset (ImageNet). We will focus on teaching how to set up the problem of image recognition, the learning algorithms (e.g. backpropagation), practical engineering tricks for training and fine-tuning the networks and guide the students through hands-on assignments and a final course project. Much of the background and materials of this course will be drawn from the ImageNet Challenge.
Deep Learning for Computer Vision (University of Michigan), Justin Johnson
This course is a deep dive into details of neural-network based deep learning methods for computer vision. During this course, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. We will cover learning algorithms, neural network architectures, and practical engineering tricks for training and fine-tuning networks for visual recognition tasks.
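The core of the learning algorithms such courses cover, gradient descent with derivatives obtained via the chain rule (backpropagation), can be shown on the smallest possible "network": a single weight (a toy sketch, not course material; the data and learning rate are made up):

```python
def train_linear(data, lr=0.1, epochs=100):
    """Fit y = w * x by stochastic gradient descent on squared loss.
    The gradient dL/dw = 2 * (w*x - y) * x is the chain rule that
    backpropagation generalizes layer-by-layer to deep networks."""
    w = 0.0
    for _ in range(epochs):
        for x, y in data:
            pred = w * x                  # forward pass
            grad = 2.0 * (pred - y) * x   # backward pass (chain rule)
            w -= lr * grad                # gradient step
    return w

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated by y = 2x
print(train_linear(data))                     # converges toward w = 2.0
```

Deep networks repeat exactly this forward/backward/update loop, only with millions of weights and the chain rule applied through many composed layers.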
Convolutional Neural Networks, Prof. Andrew Ng
This course will teach you how to build convolutional neural networks and apply them to image data. Thanks to deep learning, computer vision is working far better than just two years ago, and this is enabling numerous exciting applications ranging from safe autonomous driving, to accurate face recognition, to automatic reading of radiology images.
Convolutional Networks, Ian Goodfellow
This course examines the neural bases of sensory perception. The focus is on physiological and anatomical studies of the mammalian nervous system as well as behavioral studies of animals and humans. Topics include visual pattern, color and depth perception, auditory responses and sound localization, and somatosensory perception.
Visual Perception and the Brain (Duke University), Dale Purves
Learners will be introduced to the problems that vision faces, using perception as a guide. The course will consider how what we see is generated by the visual system, what the central problem for vision is, and what visual perception indicates about how the brain works. The evidence will be drawn from neuroscience, psychology, the history of vision science and what philosophy has contributed. Although the discussions will be informed by visual system anatomy and physiology, the focus is on perception. We see the physical world in a strange way, and the goal is to understand why.
High-level Vision (CBCSL)
This course provides a broad introduction to machine learning and statistical pattern recognition. Topics include: supervised learning (generative/discriminative learning, parametric/non-parametric learning, neural networks, support vector machines); unsupervised learning (clustering, dimensionality reduction, kernel methods); learning theory (bias/variance tradeoffs; VC theory; large margins); reinforcement learning and adaptive control. The course will also discuss recent applications of machine learning, such as to robotic control, data mining, autonomous navigation, bioinformatics, speech recognition, and text and web data processing.
This is an introductory course by Caltech Professor Yaser Abu-Mostafa on machine learning that covers the basic theory, algorithms, and applications. Machine learning (ML) enables computational systems to adaptively improve their performance with experience accumulated from the observed data. ML techniques are widely applied in engineering, science, finance, and commerce to build systems for which we do not have full mathematical specification (and that covers a lot of systems). The course balances theory and practice, and covers the mathematical as well as the heuristic aspects.
This course covers advanced machine learning methods allowing for so-called "structured prediction". The goal is to make multiple predictions that interact in a nontrivial way; and we take these interactions into account both during training and at test time.
In this lecture, the students will be introduced to the most frequently used machine learning methods in computer vision and robotics applications. The major aim of the lecture is to obtain a broad overview of existing methods, and to understand their motivations and main ideas in the context of computer vision and pattern recognition.
The goal of this course is to give an introduction to the field of machine learning. The course will teach you basic skills to decide which learning algorithm to use for what problem, code up your own learning algorithm and evaluate and debug it.
Introduction to Machine Learning and Pattern Recognition (CBCSL), Aleix M. Martinez
This class offers a hands-on approach to machine learning and data science. The class discusses the application of machine learning methods like SVMs, Random Forests, Gradient Boosting and neural networks on real-world datasets, including data preparation, model selection and evaluation. This class complements COMS W4721 in that it relies entirely on available open-source implementations in scikit-learn and TensorFlow for all implementations. Apart from applying models, we will also discuss software development tools and practices relevant to productionizing machine learning models.
The focus of the lecture is on both algorithmic and theoretic aspects of machine learning. We will cover many of the standard algorithms and learn about the general principles and theoretic results for building good machine learning algorithms. Topics range from well-established results to very recent results.
Taught by Jeremy Howard (Kaggle's #1 competitor two years running, and founder of Enlitic). Learn the most important machine learning models, including how to create them yourself from scratch, as well as key skills in data preparation, model validation, and building data products. There are around 24 hours of lessons, and you should plan to spend around 8 hours a week for 12 weeks to complete the material. The course is based on lessons recorded at the University of San Francisco for the Master of Science in Data Science program. We assume that you have at least one year of coding experience, and either remember what you learned in high school math or are prepared to do some independent study to refresh your knowledge.
Deep Learning is one of the most highly sought after skills in AI. In this course, you will learn the foundations of Deep Learning, understand how to build neural networks, and learn how to lead successful machine learning projects. You will learn about Convolutional networks, RNNs, LSTM, Adam, Dropout, BatchNorm, Xavier/He initialization, and more.
Deep Learning Specialization, Prof. Andrew Ng, Kian Katanforoosh
In five courses, you will learn the foundations of Deep Learning, understand how to build neural networks, and learn how to lead successful machine learning projects. You will learn about Convolutional networks, RNNs, LSTM, Adam, Dropout, BatchNorm, Xavier/He initialization, and more. You will work on case studies from healthcare, autonomous driving, sign language reading, music generation, and natural language processing. You will master not only the theory, but also see how it is applied in industry. You will practice all these ideas in Python and in TensorFlow, which we will teach.
Deep Learning EE-559 (EPFL), François Fleuret
This course is a thorough introduction to deep-learning, with examples in the PyTorch framework: machine learning objectives and main challenges, tensor operations, automatic differentiation, gradient descent, deep-learning specific techniques (batchnorm, dropout, residual networks), image understanding, generative models, adversarial generative models, recurrent models, attention models, NLP.
MIT's introductory course on deep learning methods with applications to computer vision, natural language processing, biology, and more! Students will gain foundational knowledge of deep learning algorithms and get practical experience in building neural networks in TensorFlow. Course concludes with a project proposal competition with feedback from staff and panel of industry sponsors. Prerequisites assume calculus (i.e. taking derivatives) and linear algebra (i.e. matrix multiplication), we'll try to explain everything else along the way! Experience in Python is helpful but not necessary.
Deep Learning for Coders with fastai and PyTorch: AI Applications Without a PhD.
This course will expose students to cutting-edge research — starting from a refresher in basics of neural networks, to recent developments.
In this course we will learn about the basics of deep neural networks, and their applications to various AI tasks. By the end of the course, it is expected that students will have significant familiarity with the subject, and be able to apply Deep Learning to a variety of tasks. They will also be positioned to understand much of the current literature on the topic and extend their knowledge through further study.
Lecture videos for the introductory Computer Graphics class at Carnegie Mellon University.
Computer Graphics (Utrecht University), Wolfgang Huerst
Recordings from an introductory lecture about computer graphics given by Wolfgang Hürst, Utrecht University, The Netherlands, from April 2012 till June 2012.
Computer Graphics ECS175 (UC Davis), Prof. Kenneth Joy
Computer Graphics (ECS175) teaches the basic principles of 3-dimensional computer graphics. The focus will be the elementary mathematics techniques for positioning objects in three dimensional space, the geometric optics necessary to determine how light bounces off surfaces, and the ways to utilize a computer system and methods to implement the algorithms and techniques necessary to produce basic 3-dimensional illustrations. Detailed topics will include the following: transformational geometry, positioning of virtual cameras and light sources, hierarchical modeling of complex objects, rendering of complex models, shading algorithms, and methods for rendering and shading curved objects.
Computer Graphics CS184 (UC Berkeley), Ravi Ramamoorthi
This course is an introduction to the foundations of 3-dimensional computer graphics. Topics covered include 2D and 3D transformations, interactive 3D graphics programming with OpenGL, shading and lighting models, geometric modeling using Bézier and B-Spline curves, computer graphics rendering including ray tracing and global illumination, signal processing for anti-aliasing and texture mapping, and animation and inverse kinematics. There will be an emphasis on both the mathematical and geometric aspects of graphics, as well as the ability to write complete 3D graphics programs.
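The homogeneous transformations such a course builds on can be sketched briefly in their 2D form (illustrative only, not course material: composing a rotation with a translation as 3x3 matrices, the same pattern used with 4x4 matrices in 3D):

```python
import math

def mat_mul(a, b):
    """Multiply two 3x3 matrices (row-major lists of lists)."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translate(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def rotate(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def transform(m, pt):
    """Apply a homogeneous 2D transform to a point (w component is 1)."""
    x, y = pt
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

# Rotate 90 degrees about the origin, then translate by (1, 0).
# Matrix order is right-to-left: the rightmost factor is applied first.
M = mat_mul(translate(1, 0), rotate(math.pi / 2))
print(transform(M, (1, 0)))  # ≈ (1.0, 1.0)
```

Hierarchical modeling, as listed in the course topics, is just this composition applied down a scene graph: each node multiplies its local transform onto its parent's.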
This course aims to give an overview of basic and state-of-the-art methods of rendering. Offline methods such as ray and path tracing, photon mapping, and many other algorithms are introduced, and various refinements are explained. The basics of the involved physics, such as geometric optics, surface and media interaction with light, and camera models, are outlined. The apparatus of Monte Carlo methods, which is heavily used in several algorithms, is introduced, and its refinements in the form of stratified sampling and the Metropolis-Hastings method are explained.