MACHINE LEARNING: THE SILENT REVOLUTION OF PHOTOGRAPHY
Hardware players in a software world
Every photographer on YouTube, every photography blogger and every photography magazine is talking about the “mirrorless revolution”: how Sony is winning the race, how Canon and Nikon are falling behind and how the DSLR is about to die. They are missing the point. Mirrorless cameras are certainly innovative, but I don’t see them as a revolution so much as an evolutionary step that was bound to happen. The camera industry innovates slowly compared to the technology industry: consumers replace their smartphones, tablets and other gadgets every 18 to 24 months, while professional photographers stick with the same camera system for 5 to 10 years. So I understand why the photography industry thinks that removing the mirror from DSLRs feels like the biggest leap since the move to digital. If that is the case, just wait until Artificial Intelligence makes its way into every single aspect of photography. It is already happening, and it will have a profound impact on every aspect of a photographer’s workflow: from the moment the shutter button is pressed, to how editing software works, to the way images are stored and catalogued.
Photography and computer science are both near and dear to my heart, and seeing these two fields converge is as fascinating and exciting to me as the introduction of the internet and the smartphone. However, since I started my photographic journey four years ago, I have noticed that the camera industry is a hardware-first industry in a software-first world. As Netscape co-founder and technology visionary Marc Andreessen wrote in The Wall Street Journal in 2011, “software is eating the world”. That assertion is more accurate today than ever before, and Sony, Nikon, Canon and the rest of the camera manufacturers had better realize it.
Going back to the topic of Artificial Intelligence, here are three areas of photography where advances in Machine Learning are changing the game.
Easier Photo Management
“Computer Vision” is the ability of a computer program to extract relevant and contextual information from a digital photograph or video. In recent years it has become mainstream thanks to a subfield of Machine Learning called Deep Learning. Without getting too technical, Deep Learning is the ability of a software algorithm (a neural network) to identify and classify patterns in digital data after being trained on large amounts of pre-existing data. In plain English, the algorithm learns by example (lots of them): you can “teach” it to recognize dogs by showing it many images containing dogs until, eventually, it learns what a dog looks like.
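To make that concrete, here is a minimal sketch of learning-by-example put to work: a network pre-trained on millions of labelled images guessing what is in a photo. The model choice (torchvision’s ResNet-50) and the file name dog.jpg are my own illustrative assumptions, not a reference to any product mentioned above.

```python
# A minimal sketch of learning-by-example put to work: a network pre-trained
# on the labelled ImageNet dataset guessing what is in a photo. Assumes
# torchvision >= 0.13 and a local file "dog.jpg" (both illustrative choices).
from PIL import Image
import torch
from torchvision import models, transforms

weights = models.ResNet50_Weights.IMAGENET1K_V2
model = models.resnet50(weights=weights)
model.eval()  # inference mode: we are only classifying, not training

# The same resize/normalize pipeline the network saw during training.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

batch = preprocess(Image.open("dog.jpg")).unsqueeze(0)  # add batch dimension

with torch.no_grad():
    probs = torch.softmax(model(batch)[0], dim=0)  # 1000 class probabilities

top = torch.topk(probs, 3)
labels = weights.meta["categories"]
for p, idx in zip(top.values.tolist(), top.indices.tolist()):
    print(f"{labels[idx]}: {p:.1%}")  # e.g. "golden retriever: 87.2%"
```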
For us, the end consumers, it means that applications like Facebook, Google Photos, Microsoft’s OneDrive or Apple’s Photos can auto-tag and auto-categorize thousands of our photos for us. In an era where we amass thousands of digital photos per year, computer vision makes it a lot easier to find that photo we took of our kids five years ago at a birthday party.
As the accompanying screenshot shows, the Microsoft OneDrive app has automatically tagged thousands of my photos, recognizing scenes (“Outdoor”, “Sky”, “Airshow”) and objects (“Building”, “Animal”, “People”) without my intervention. In the next screenshot, Google Photos finds photos of kangaroos without any previous tagging. Try it yourself: search for “pet”, “babies” or “beach” and see Computer Vision in action.
In-camera Computational Photography
When M.L. is applied to the process of taking photos and videos, it is often referred to as computational photography. Wikipedia has a great definition of the term: “Computational photography refers to digital image capture and processing techniques that use digital computation instead of optical processes”.
This is, to me, the most revolutionary use of M.L. in the photography workflow, and the one that could put traditional camera manufacturers to bed. Why? Because all the innovation in computational photography is software-based, and it is happening in smartphone cameras, not in traditional DSLRs or mirrorless cameras. Nikon, Canon, Sony, Fuji, Hasselblad, Leica… these are not software companies, and while some are better than others at designing their in-camera software, they simply don’t have the software engineering pedigree of a Google, Microsoft or Apple. Sony vaguely mentioned at Photokina that it would use Machine Learning to improve eye-detection autofocus, but that is a laughable effort compared to the innovation Google is bringing to the Pixel 3 camera software.
And while Apple, Samsung, Huawei and other smartphone manufacturers strive for better image quality by adding more camera lenses and sensors to the smartphone, Google achieves comparable results (and beats them) with a single camera lens combined with the power of machine learning.
Rather than have me explain it, you can enjoy the video below.
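If you’d also like to see a few lines of code, here is a toy illustration of one idea behind burst pipelines such as Google’s HDR+: averaging several frames of the same scene reduces noise roughly with the square root of the frame count. Real pipelines additionally align frames and merge raw sensor data; this sketch assumes a static camera and hypothetical, pre-aligned JPEGs.

```python
# A toy illustration of burst merging: averaging N noisy frames of the same
# scene cuts noise by roughly sqrt(N). Real pipelines such as HDR+ also align
# frames and merge raw data; this assumes a static camera and hypothetical,
# pre-aligned JPEGs named burst_0.jpg .. burst_7.jpg.
import numpy as np
from PIL import Image

frames = [np.asarray(Image.open(f"burst_{i}.jpg"), dtype=np.float32)
          for i in range(8)]

merged = np.mean(frames, axis=0)  # per-pixel average across the burst
Image.fromarray(merged.clip(0, 255).astype(np.uint8)).save("merged.jpg")
```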
Smarter and faster photo editing software
This is another very exciting application of Machine Learning. The example that comes to mind is Adobe’s Artificial Intelligence platform, Sensei. Back in December 2017, Adobe announced it would leverage Sensei to change how the “Auto” setting works in Lightroom. Before then, no sane photographer would use the “Auto” button in Lightroom to edit their photos because the results were erratic, mediocre at best. That changed dramatically once Adobe brought Sensei into the loop. Now, when the “Auto” setting is used, Lightroom analyzes the photo’s exposure, highlights, shadows, colours, temperature (…), compares it to “tens of thousands of professionally edited photos” and suggests the appropriate adjustments for a more pleasing image. And while it doesn’t get you 100% of the way there, it significantly speeds up editing by providing an amazing starting point.
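Adobe hasn’t published how Sensei’s “Auto” works, but the general shape of the idea, measure the image and then suggest adjustments that push its statistics toward those of well-edited photos, can be sketched crudely. The 0.45 mid-tone target and the formula below are my own illustrative guesses, not Adobe’s learned model.

```python
# A crude stand-in for an "Auto" adjustment: measure where the mid-tones,
# shadows and highlights sit, then suggest an exposure shift toward an
# assumed "pleasing" target. The 0.45 target and the formula are my own
# illustrative guesses, not Adobe's learned model.
import numpy as np
from PIL import Image

# Work on luminance in the 0..1 range.
img = np.asarray(Image.open("raw_photo.jpg").convert("L"),
                 dtype=np.float32) / 255.0

mid = float(np.median(img))          # where the mid-tones currently sit
target_mid = 0.45                    # assumed target mid-tone level
exposure_ev = np.log2(target_mid / max(mid, 1e-6))  # suggested EV shift

shadows = np.percentile(img, 5)      # how crushed the darkest tones are
highlights = np.percentile(img, 95)  # how close to clipping we are

print(f"Suggested exposure: {exposure_ev:+.2f} EV")
print(f"Shadows sit at {shadows:.2f}, highlights at {highlights:.2f}")
```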
In Adobe Photoshop, Sensei is being used to help humans make complex object selections. Manually selecting or isolating an object with the mouse or a digitizer can be one of the most excruciating and boring tasks in photo editing. With the help of Computer Vision algorithms it becomes much easier, reducing the process from minutes to seconds. The video below illustrates this particular use case.
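For the curious, here is a hedged sketch of the same idea using an off-the-shelf semantic segmentation network from torchvision, not Adobe’s actual Select Subject model: it returns a per-pixel mask for the subject in seconds instead of minutes of manual lassoing. The file name portrait.jpg is hypothetical.

```python
# A sketch of ML-assisted selection with an off-the-shelf segmentation
# network (torchvision's DeepLabV3, trained on Pascal VOC classes). It
# returns a per-pixel "person" mask, at the network's working resolution,
# in seconds. This is not Adobe's actual Select Subject model.
import numpy as np
import torch
from PIL import Image
from torchvision import models

weights = models.segmentation.DeepLabV3_ResNet50_Weights.DEFAULT
model = models.segmentation.deeplabv3_resnet50(weights=weights).eval()

image = Image.open("portrait.jpg").convert("RGB")
batch = weights.transforms()(image).unsqueeze(0)  # resize + normalize

with torch.no_grad():
    scores = model(batch)["out"][0]  # per-pixel class scores

person = weights.meta["categories"].index("person")
mask = (scores.argmax(0) == person).numpy().astype(np.uint8) * 255
Image.fromarray(mask).save("selection_mask.png")  # white = selected pixels
```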
I can only assume Adobe is not the only company using Machine Learning in its image editing software, but it’s the one I use and am familiar with. In any case, software companies that aren’t investing in infusing their tools with M.L. will be left behind, at risk of becoming obsolete.
The Outlook
The introduction of Artificial Intelligence into the field of photography is an opportunity to reduce the friction between the tools and the art itself. A.I. can lower barriers to entry such as the learning curve of image editing software, significantly speed up a professional’s editing workflow, reduce the time spent categorizing and tagging images, and improve the image quality of entry-level cameras, including smartphones, for those who can’t afford or don’t want to buy a professional camera.
If you’re not a good photographer and don’t have a good eye for it, none of this will turn you into Ansel Adams. However, if you have the eye and the sensitivity to identify what makes a beautiful image, but have been held back by the complexity of the tools (camera, editing software...), then A.I. is here to rock your world.
And while traditional camera manufacturers like Sony, Nikon and Canon fight the mirrorless battle, the real revolution is happening in software and computational photography, which makes me wonder whether Apple or Google are thinking of entering the camera industry and turning it on its head.
What do you think? Where else can A.I. have an impact on photography? Printing, maybe? Please share your thoughts below!
Disclosure: I work for Microsoft.