Steve is the Head of Data Science and AI at Australian Computer Society, a proactive social media contributor and LinkedIn influencer.
Computers back in the day were good at understanding numbers yet failed miserably at reasoning about visual data. Over the years, many researchers and engineers have worked to make computers reason better about the image data that humans perceive so naturally. Image data is now generated at a rate of petabytes per minute, and humans cannot process it all. Hence, computer vision should be a crucial driver of intelligent technology in the coming years.
Computer vision has become ubiquitous in our society, with applications in search, medicine, image understanding, apps, mapping, drones and self-driving cars. Core to many of these applications are visual recognition tasks, such as image classification, localization and detection. Recent research developments have significantly advanced the performance of these visual recognition systems.
Computer vision, a subdomain of artificial intelligence, is one of the most in-demand skills for jobs, according to LinkedIn. Every year, thousands of scientists contribute to it, and there has been an exponential rise in research work over the last decade. Computer vision technology should also drive some of the most exciting innovations of the 21st century, like autonomous vehicles, medical imaging diagnosis and military applications.
Hence, it is an excellent prospect for anyone looking for a well-paid career in an exciting and cutting-edge field. Given the wealth of education available online, you don’t necessarily need an Ivy League education to learn computer vision. Kick-start your learning today with best-in-class online courses that take your understanding of machine vision to the next level.
These courses can cater to varied audiences. Maybe you want to learn how to design and code AI algorithms. Perhaps you want to mess around with the tools and frameworks that are available, or perhaps you need to understand the business side of computer vision in your company. Whatever your goals are, you are likely to find something that will expand your horizons.
For example, Stanford offers a course called “Convolutional Neural Networks for Visual Recognition,” run by the school’s Vision Lab under the supervision of Fei-Fei Li. It dives deep into deep learning architectures, with a focus on learning end-to-end models for these tasks, particularly image classification. Over the 10 weeks, students learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. The course teaches how to set up the problem of image recognition, the learning algorithms (e.g., backpropagation) and practical engineering tricks for training and fine-tuning the networks, and it guides students through hands-on assignments and a final project.
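To give a flavor of the kind of learning algorithm such a course covers, here is a minimal sketch of backpropagation for a single sigmoid neuron. It is an illustration only and is not taken from the Stanford course material; the function names, the toy data, the squared-error loss and the learning rate are all assumptions chosen to keep the example short.

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, lr=0.5, epochs=2000):
    """Fit one sigmoid neuron p = sigmoid(w * x + b) with gradient descent.

    Hypothetical helper for illustration: the loss is the squared error
    0.5 * (p - y) ** 2, and the gradient comes from the chain rule.
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            p = sigmoid(w * x + b)        # forward pass
            dz = (p - y) * p * (1.0 - p)  # backward pass: dLoss/dz
            w -= lr * dz * x              # dLoss/dw = dLoss/dz * x
            b -= lr * dz                  # dLoss/db = dLoss/dz
    return w, b


# Toy data (assumed): negative inputs are class 0, positive inputs are class 1.
samples = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
w, b = train_neuron(samples)
```

After training, `sigmoid(w * 2.0 + b)` is close to 1 and `sigmoid(w * -2.0 + b)` is close to 0. Real networks repeat the same forward and backward steps across many layers and millions of weights.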
There are elementary-level courses aimed at anyone who wants to play around with the nuts and bolts of building computer vision applications and see what they can be used for, without getting involved in the underlying mathematics and statistics. For instance, Computer Vision I by OpenCV teaches fundamental computer vision concepts, like image operations, image and video processing, deep learning and more, using the OpenCV toolkit. It can help boost your hands-on understanding with in-depth explanations of simple code samples, and it also covers a wide range of real-world systems, like document scanners, human pose estimation, selfie applications and face detection, to name a few. You can also learn how to approach almost any computer vision task using the OpenCV framework, which offers a collection of both image and video processing methods.
Other courses can help you get things up and running and solve a problem without diving into theory. OpenCV’s Computer Vision II covers the robust real-time object detection algorithm YOLO as a case study, as well as how to implement Snapchat-style filters. If rapid prototyping interests you, it is worth the time. You should gain a considerable empirical understanding of building real-world applications using the techniques learned in the first part. In contrast to Computer Vision I, deep learning is the main focus. The course also teaches how to deploy a computer vision application on the web using AWS services.
There are also courses designed for lovers of math and theory. For instance, world-renowned computer vision research institution Georgia Tech offers an “Introduction to Computer Vision” course that is less focused on the machine learning aspect of CV. The program avoids the use of high-level APIs and instead teaches low-level primitives for analyzing an image and extracting structural information. It focuses heavily on the fundamentals of computer vision and the mathematics behind them, along with a few applications like depth recovery from stereo, camera calibration, image stabilization, automated alignment, tracking and action recognition.
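As a rough idea of what a low-level primitive looks like, the sketch below slides a small filter over an image to find vertical edges, using plain Python lists instead of a library. It is not taken from the Georgia Tech course; the helper name `convolve2d` and the toy image are assumptions made for illustration, while the Sobel kernel itself is a standard edge filter.

```python
def convolve2d(image, kernel):
    """Slide a kernel over an image given as a list of rows.

    Hypothetical helper for illustration. It computes the sliding-window sum
    that vision libraries usually call 2D filtering (cross-correlation), and
    only at positions where the kernel fits entirely inside the image.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Standard horizontal Sobel kernel: responds to left-to-right brightness changes.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

# Toy 5x5 image (assumed): dark on the left, bright on the right.
image = [[0, 0, 0, 10, 10] for _ in range(5)]

edges = convolve2d(image, SOBEL_X)
# edges == [[0, 40, 40], [0, 40, 40], [0, 40, 40]]
```

The output is zero where the image is flat and large where the filter window covers the jump from dark to bright, which is the structural information an edge detector extracts.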
Some courses, such as one from Udemy titled “Deep Learning: Advanced Computer Vision,” jump right into building well-known computer vision models with artificial neural networks through high-level APIs. These are perfect for those who have decent Python programming experience and a good grasp of computer vision and deep learning fundamentals. No heavy mathematics is required. You will learn advanced concepts in deep learning, like generative adversarial networks (GANs), and real-time object detectors, like the single shot detector (SSD). You’ll also apply transfer learning and gain an understanding of state-of-the-art CNN architectures, like ResNet, Inception and VGG. The focus is on how to use high-level APIs to build real-world AI systems. You should learn how to quickly prototype a working computer vision system using available tools and frameworks.
Lastly, there are courses for those who aren’t novices and have a theoretical background in signals and systems and linear algebra. NPTEL’s “Computer Vision” course is offered by one of the oldest and most prestigious IITs (Indian Institutes of Technology) in India and is taught by professors with more than two decades of experience in computer vision research. The program has some of the best explanations of signal processing fundamentals. Although not very interactive, it touches upon important aspects of image processing that you might not find in any other class. It teaches the fundamentals of 2D and 3D image processing techniques, color fundamentals, camera geometry and sparse representation based on signal processing.
Online courses have democratized the power of computer vision, contributing to building a positive shared future for all. Take the plunge, learn from anywhere and be at the frontier of vision technology.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.