The star feature of the iPhone 7 Plus is its dual-camera system. The typical 28mm-equivalent iPhone camera is joined by a 56mm-equivalent lens. This allows for a 2x optical zoom at the touch of a virtual button, but, more importantly, it also opens the door to some interesting computational photography.
Two cameras don't really make for a light field camera, where a computer model is built of the captured light rays, allowing them to be projected onto a virtual image capture plane, through a synthetic aperture. That's what Lytro is doing with their plenoptic Cinema Camera (see previous post), and more analogously, what Light is promising with the 16-lensed L16 camera (two posts on that one so far).
Computational Photography is Here (and Has Been for a While, Actually)
I'm pretty sure that with only two cameras, you can't build a useful light field. But can you do computational photography? That's a trick question, as the iPhone, and many other mobile phone cameras, are already doing computational photography. Already the iPhone will automatically perform an HDR merge of two exposures, for example. But even when the iPhone snaps a single, non-HDR exposure, the amount of post-processing it does is considerable.
We've gotten to test this firsthand recently, with Apple opening up raw capture to developers. Adobe jumped on this right away with Lightroom Mobile, having already implemented raw support in their Android version. The first thing you notice when shooting raw DNG files with your iPhone is how noisy the images are. Turns out Apple's been doing a ton of noise reduction on their photos for a few generations now. It's entirely possible that they are using multiple exposures to aid in this process, but I don't know if anyone's ever confirmed that.
Portrait Mode, Depth Effect
Apple calls their initial two-lens computational photo offering Portrait Mode, and the most recent developer beta of iOS 10.1 includes a beta version of it. Under the right circumstances, this mode enables a so-called "Depth Effect," where both cameras fire simultaneously, and a depth map is built based on the subtle stereo disparity between the captured images. This nine-level depth map is used to drive a variable-radius blur. The result is a photo with simulated shallow depth of field.
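Apple hasn't published exactly how that pipeline works, but as a rough mental model, here's a minimal sketch of how a nine-level depth map could drive a variable-radius blur. The gaussian here is just a stand-in for whatever blur Apple actually uses (more on that below), and the whole thing is my illustration, not Apple's code:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def depth_effect(image, depth_levels, radii):
    """Illustrative only: composite per-level blurred copies of the image.

    image:        H x W x 3 float array
    depth_levels: H x W integer map, 0 (focus plane) through 8
    radii:        blur radius for each of the nine levels (0 = stays sharp)
    """
    out = image.copy()
    for level, radius in enumerate(radii):
        if radius == 0:
            continue  # this level is in focus
        mask = depth_levels == level
        # Blur a full copy at this level's radius (used loosely as sigma here),
        # then keep only the pixels belonging to this depth level.
        out[mask] = gaussian_filter(image, sigma=(radius, radius, 0))[mask]
    return out
```

A real implementation has to be far smarter about the seams between levels, which, as we'll see, is exactly where the artifacts show up.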
This process can never be perfect, but can it be good enough?
Oh hell yes it can.
Why Do We Care?
When I first started testing Portrait Mode, I was alone in my backyard, with only inanimate props. I took some shots where the Depth Effect shined, and some where it flopped. I posted some samples on Instagram, using an unforgiving split-screen effect that dramatically highlights the imperfections of the processing.
Most notably, the processing gives the foreground a bit of a haircut, which you can see clearly in this example.
This stands to reason. The depth map is very likely computed at a reduced resolution, and I bet it’s noisy. Any smoothing is going to also eliminate certain edge details, and Apple's engineers have, I'm surmising, estimated that eating into the edges a bit overall is better than seeing a halo of crisp background between the foreground subject and the blurred background.
The next night, my family came over for a cookout. As we ate and drank into the evening, reveling in global warming, I remembered that I had a new toy to play with. I pulled out my phone, toggled over to Portrait Mode, and snapped a few shots of my brother-in-law and his adorable son.
This is the photo that convinced me that Portrait Mode is a real thing. Here it captured a fast-moving, uncooperative subject, in lighting dim enough to call for ISO 500, and produced results that are not just good, but actually a photo I cherish.
Is it perfect? No. The effect ate some of the little guy's flyaway hairs. But "perfection" would be a strange goal for a process designed to simulate an artifact. Without a side-by-side comparison, no one will miss those hairs.
So don't ask if Depth Effect is perfect. A better question is whether its failures are distracting. And I have certainly taken some test photos where they are. But the funny thing about test photos is that there's often nothing worth photographing in them, so you just stare at the problems. In my own testing, whenever I've pointed Portrait Mode at something I actually care about, the results have been solid.
So back to the question of whether we should care about a fake blur applied in post to a telephone photo. When I tweeted the above shot, someone replied with a reasonable question: wouldn't I love the photo just as much without the effect? I replied no, and added:
Composition matters, and focus is composition in depth.
Portrait Mode photos aren't just photos with a blur applied. They have the potential to be photos that are more about what they are photos of. It gets back to one of the oldest, most durable posts on this site: Less is More. We frame our shots carefully, and shallow depth of field allows us to frame our shots in depth as well.
Sometimes that makes the photo prettier. Often, it can make the photo.
As an example, here's another photo of something I care about: my son, and his uncharacteristic (for a 7-year-old) love of raw oysters.
In the non-depth-effect version, the background is so distracting that I probably wouldn't have shared this photo, but the shallow-depth-of-field version not only looks better, it succeeds in communicating my feelings at the moment of capture.
See, our eyes actually have very deep focus, but our brains and our hearts fire at ƒ/0.95.
The instrument of our eyeballs "sees" everything, but we don't see with our eyes. We see with our brains. And our brains ignore stuff that doesn't matter (to a hilarious fault). This is maybe the number one failing of amateur photography, and certainly mobile photography—we take a ton of photos that wind up being more about weird ugly background details than the reason we wanted to take a picture.
So I posted this oyster shot on Instagram, and someone pointed out how the white blob on the left is eating into my son's elbow a little. This is true, and it's definitely a technical failing of the depth effect. It also reminded me of something that VFX master Dennis Muren once said in dailies when I was a young artist at ILM. Someone pointed out a very real flaw with a shot. But instead of demanding that it be fixed, Dennis brushed the concern aside and declared the shot final. The flaw was not near the subject of the shot, so it didn't concern him. "If they're looking there," he said, "we've lost them."
I sent this shot to a photographer friend who knows how I took it and what problems to look for. His review of the shot is the one that resonates with me: "Your son is growing up too fast, and I want some oysters."
I have a bunch of shots that show Portrait Mode failing. Small foreground objects getting blurred along with the background, edges being eroded noticeably. But I'm not going to post them here because it's not really very fun, and there's a whole internet happy to bash Apple's beta 1 drop of anything. The edge problems will be worth fixating on if they don't improve.
What Are You Giving Up?
It so happens that I also have a full-frame 35mm ƒ/2.0 shot of my kid eating oysters at that same restaurant, earlier this year:
Some obvious differences:
The wider 35mm angle of view, vs the iPhone's 56mm equivalent. Shallow DOF at wide AOVs is an expensive look.
Bloom! The boke of the background lights blooms and pops as one would expect.
Foreground blur: Apple's Depth Effect mostly blurs things behind the point of focus, only very subtly softening anything in front of it. That's either a magic cheat or a huge failing, depending on your point of view. I think it's brilliant that Apple decided to call this "Portrait Mode" and describe it as blurring the background, rather than a more ambitious, and ultimately inaccurate, description of it as a shallow depth of field mode.
All the little frizzy hairs rendered in tack-sharp 42 megapixel glory.
And then there's raw. No, not the food, the photo. This photo was shot with my Sony RX1R II, in raw. I edited both it and the iPhone shot in Lightroom, using the same presets from my Prolost Graduated Presets set, but this one took the edits way better, because it's raw. Slurp!
Of course, the RX1R II is expensive, at almost $4,000. It's tiny for a full-frame camera, and I do bring it with me almost everywhere I go, but despite being smaller than the iPhone 7 Plus in two out of three dimensions, it's not something you slip into your jeans pocket.
Big cameras that shoot raw from a big sensor and fast glass are still king, no question. It's not useful to think of Portrait Mode as competition for them. I look at it this way: Despite all my badass cameras, I will continue to take photos with my iPhone, as I always have. And now they can sometimes look a lot better.
Okay Wanna Nerd Out for a Bit?
Now that we've celebrated that emotional impact matters more than technical accuracy for like the billionth Prolost time, I thought it might be fun to geek out on the images a little and consider just how mighty the task is of producing emotionally resonant fake depth of field on a telephone.
There's been a lot of conjecture about the specific type of blur Apple is doing. I've heard Apple pundits say authoritatively that it's a "gaussian blur." Spoiler: It's not, at least not in any way I've ever seen.
The "right" way to fake a focus blur is with a blur kernel that looks like the boke shape you're going for. Here's a simple example:
As you can see, the gaussian blur (approximated here with five box blurs, which is how Photoshop does it!) looks like mush, and in no way resembles an out-of-focus image. The focus blur kernel gives better results, at the expense of much longer processing time. And the results are still not great. We all know what an out-of-focus Christmas tree looks like, and that is not it.
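To make that concrete, here's a minimal sketch of a boke-shaped kernel, with scipy doing the convolution. This is my own illustration of the technique, not Apple's or Photoshop's code:

```python
import numpy as np
from scipy.signal import fftconvolve

def disk_kernel(radius):
    """A flat, circular kernel: roughly the shape a defocused point of light takes on."""
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    kernel = (x * x + y * y <= radius * radius).astype(np.float64)
    return kernel / kernel.sum()

def focus_blur(image, radius):
    """Convolve each channel with the disk kernel (slow, but shape-correct)."""
    kernel = disk_kernel(radius)
    return np.dstack([fftconvolve(image[..., c], kernel, mode="same")
                      for c in range(image.shape[-1])])
```

Swap a gaussian in for the disk and the bright points smear into soft mush instead of snapping into crisp discs, which is the difference you see above.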
There are two more things we need to do to truly simulate focus blur. If we do them, our simulation becomes very, very accurate. I've heard other Apple pundits claim that a synthetic blur can never convincingly match the real thing. That is simply not true. But you gotta do it right.
To do it right, you need to do three things:
Use a properly-shaped boke kernel.
Process the blur in linear-light color space, not gamma-encoded space.
Have an HDR source.
What's linear light? It's something I've talked about a lot here over the years. This video explains it well (though it simplifies gamma down to a power of 2).
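For the code-minded, the gist is just undoing the gamma encoding before you do any pixel math, and redoing it afterward. Here's a minimal sketch using the standard sRGB transfer functions (the real curve is piecewise, not a pure power of 2):

```python
import numpy as np

def srgb_to_linear(srgb):
    """Decode sRGB-encoded values (0 to 1) into linear light."""
    return np.where(srgb <= 0.04045, srgb / 12.92, ((srgb + 0.055) / 1.055) ** 2.4)

def linear_to_srgb(linear):
    """Re-encode linear-light values back to sRGB for display."""
    linear = np.clip(linear, 0.0, None)
    return np.where(linear <= 0.0031308, linear * 12.92,
                    1.055 * linear ** (1 / 2.4) - 0.055)

# Blur in linear light: decode, blur with the disk kernel from above, re-encode.
# blurred = linear_to_srgb(focus_blur(srgb_to_linear(image), radius=12))
```

The payoff: in gamma-encoded space a bright highlight averaged with its dim neighbors turns to mud, while in linear light the highlight dominates the average, which is why real out-of-focus highlights stay bright.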
Here's what it looks like when you take these three steps:
Okay! That was a lot of work, and the fans on my iMac spun up a bit, but we got there.
Why is it a lot of work? Let's go back to that list:
Use a properly-shaped boke kernel. This is considerably more computationally expensive than a regular soft blur, because you can't iterate a cheap function or easily parallelize the processing for GPU acceleration. There are some optimizations you can do if you keep the center of the blur flat, but for the most part, you just gotta brute force it.
Linear light: This requires high bit depths to avoid banding, and high bit depths are expensive in both memory footprint and processing.
HDR requires not just high bit depths, but possibly even floating-point color. It also requires that you have enough overhead in your single-exposure raw to count as HDR, because you're probably not going to do an exposure merge with only two cameras.
I did my blurs in 32bpc floating point, using the cleverly optimized software lens blur in After Effects. It took several seconds to process each image, and my 4 GHz Core i7 iMac's fans spun up audibly.
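If you string the earlier sketches together, the whole workflow boils down to something like this. It's a rough stand-in for what I did, not After Effects' actual lens blur algorithm:

```python
import numpy as np

def simulate_focus_blur(srgb_image, radius):
    """Rough end-to-end sketch of the three steps, in 32-bit float.
    Reuses srgb_to_linear, linear_to_srgb, and focus_blur from the sketches above."""
    linear = srgb_to_linear(srgb_image.astype(np.float32))  # step 2: linear light
    blurred = focus_blur(linear, radius)                     # step 1: boke-shaped kernel
    return linear_to_srgb(blurred)                           # back to display encoding
```

Step 3, the HDR source, is just a matter of feeding this pixel values that can go well above 1.0 (from a raw file or an exposure merge) instead of clipped 8-bit ones.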
This problem of realistic depth of field has been the subject of much research. With GPU processing, which one must imagine is how Apple is going about this, it's entirely possible that a kernel-based blur isn't the fastest approach. They could be tracing rays, or, maybe more likely, doing a recursive approximation of a disk-shaped blur.
It's heavy math, and Apple is doing it in real-time while you're framing the shot. Impressive.
Now we know what it takes to compute a realistic lens blur. So how do Apple's results compare?
Apple's blur is not a perfect match for any of mine, but that makes sense. However Apple is doing the blur, it's probably not the last thing in the image processing pipeline.
To my eye, Apple's blur is obviously not gaussian, or even gaussian-esque. It's some kind of sharp-edged circular blur kernel, maybe computed at a lower spatial resolution than the final image, which would account for some of the softness—and the miraculous speed with which the iPhone 7 Plus can do the job. Their blur is neither as flat as my gamma-space example, nor as distinct as my linear-space simulation, so I can't quite tell where Apple's doing the blur in their order of operations. But it looks closer to my gamma-space version.
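Purely to illustrate why a lower-resolution kernel would read as soft (this is a guess at the kind of shortcut involved, not anything Apple has documented), you can do the blur on a downscaled copy and scale the result back up:

```python
import numpy as np
from scipy.ndimage import zoom

def fast_soft_blur(image, radius, scale=0.5):
    """Illustrative shortcut: do the expensive disk blur on a downscaled copy,
    then upsample. Much cheaper, but edges and boke shapes come back slightly
    soft. Reuses focus_blur from the earlier sketch."""
    small = zoom(image, (scale, scale, 1), order=1)
    blurred = focus_blur(small, max(1, int(radius * scale)))
    return zoom(blurred, (1 / scale, 1 / scale, 1), order=1)
```

That tradeoff, a quarter of the pixels to convolve at the cost of some crispness, is exactly the kind of bargain you'd make to get this running in real time on a phone.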
This is an area of possible improvement: If Apple could operate on pure linear-light pixels, the blur could become more realistic and pleasing.
They might even get a hint of HDR-esque highlights, as raw files are effectively (modestly) HDR images.
They'll never be as poppy as a real photo of bright, out-of-focus lights though, because there's just not enough headroom to hold all that exposure in an iPhone's raw capture. So why not artificially boost the highlights like I did in my final example up there? Again, I think Apple is being conservative here. Highlight boosting can go horribly wrong, resulting in glowing teeth or eyeballs. It requires manual dialing-in to look acceptable, and even then, it never looks perfect.
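To make that risk concrete, here's the naive version of a highlight boost. It's just a threshold, and a threshold can't tell a streetlight from a bright incisor (my illustration, with arbitrary numbers, not anything Apple ships):

```python
import numpy as np

def naive_highlight_boost(linear, threshold=0.85, gain=4.0):
    """Push near-clipped linear-light values well above 1.0 so they stay punchy
    after the blur. The catch: anything bright crosses the threshold, including
    teeth, catchlights in eyes, and white shirts, which is how you get the
    glowing-teeth look. Threshold and gain are arbitrary illustration values."""
    return np.where(linear > threshold, linear * gain, linear)
```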
Another reason that Apple's blurred highlights won't look as punchy as photographed boke is that Apple is usually applying their blur to an image that is already slightly out of focus. This means the boke shape, even from a perfectly crisp kernel, will always be a little mushy, because the source was not a pinpoint. Again, Apple could try to account for this, but it would be nearly impossible to do that in an unattended algorithm.
So in conclusion, here's my assessment of the Portrait Mode blur processing:
It's good. Apple is doing it right. It's not a gaussian blur.
There's opportunity for improvement.
It's a miracle that a telephone can do all this processing as fast as the iPhone 7 Plus does.
I didn't geek out on the depth map generation and segmenting, but I could. There's a lot going on there too. And I imagine, somewhat optimistically, that this is an area where we will see improvements in future builds.
The Power of Low Expectations
I didn't think the results would be this good. Apple, uncharacteristically, undersold them. And this created room for a delightful surprise when Portrait Mode turned out to be something I will most certainly use.
One might fairly take me to task for being soft on Apple when I have been so hard on others. I gave Light.co a pretty hard time about their (still unproven) claims of being able to match the multi-dimensional, layered boke from a Canon 7D shot.
One of the things I love about photography is that I simply never know what's going to get me excited about it all over again. It could be something strange and stupid, like an app that makes you wait an hour before viewing the photos it takes, or one that takes pictures that look like a Macintosh Plus screen. It could be a strange old lens with fungus growing in it, or a broken chunk of glass held up in front of a matte box. I reserve the right to be delighted by things, even things that were specifically designed to delight me.
Well done, Apple.