もっと詳しく

Michael Zhang writes via PetaPixel: In a post titled “High Fidelity Image Generation Using Diffusion Models” published on the Google AI Blog (and spotted by DPR), Google researchers in the company’s Brain Team share about new breakthroughs they’ve made in image super-resolution. […] The first approach is called SR3, or Super-Resolution via Repeated Refinement. Here’s the technical explanation: “SR3 is a super-resolution diffusion model that takes as input a low-resolution image, and builds a corresponding high resolution image from pure noise,” Google writes. “The model is trained on an image corruption process in which noise is progressively added to a high-resolution image until only pure noise remains.” “It then learns to reverse this process, beginning from pure noise and progressively removing noise to reach a target distribution through the guidance of the input low-resolution image.” SR3 has been found to work well on upscaling portraits and natural images. When used to do 8x upscaling on faces, it has a “confusion rate” of nearly 50% while existing methods only go up to 34%, suggesting that the results are indeed photo-realistic.

Once Google saw how effective SR3 was in upscaling photos, the company went a step further with a second approach called CDM, a class-conditional diffusion model. “CDM is a class-conditional diffusion model trained on ImageNet data to generate high-resolution natural images,” Google writes. “Since ImageNet is a difficult, high-entropy dataset, we built CDM as a cascade of multiple diffusion models. This cascade approach involves chaining together multiple generative models over several spatial resolutions: one diffusion model that generates data at a low resolution, followed by a sequence of SR3 super-resolution diffusion models that gradually increase the resolution of the generated image to the highest resolution.” “With SR3 and CDM, we have pushed the performance of diffusion models to state-of-the-art on super-resolution and class-conditional ImageNet generation benchmarks,” Google researchers write. “We are excited to further test the limits of diffusion models for a wide variety of generative modeling problems.”

Read more of this story at Slashdot.