“Nano Banana,” Google’s artificial intelligence (AI) model that generates and edits images, has also..
https://www.mk.co.kr/en/it/12027493
Publish Date: 2026-04-26 01:29:00
Source Domain: www.mk.co.kr
-
Integration of Image Generation and Comprehension: Google DeepMind’s ‘Vision Banana’ builds on the image model ‘Nano Banana’ to handle both image generation and image comprehension in a single model, rather than relying on a specialized model for each visual task.
-
Object Recognition and Depth Estimation: The model can distinguish and categorize the objects in an image and estimate their depth. In a beach scene, for example, it can classify people by what they are doing (sitting, walking, standing).
-
RGB Value Processing: Vision Banana can edit an image according to a prompt, for example by marking a cat’s ears with specified RGB values, which demonstrates its flexibility in image processing.
-
Performance and Development Stage: The model performs on par with or better than existing specialized models at understanding 2D and 3D images, but it remains an experimental project and has not been commercialized.
-
Computational Efficiency and Future Goals: The researchers acknowledge that generative models are more compute-intensive than specialized models and say that faster inference and lower costs are needed before commercialization is feasible.
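
The RGB-marking behaviour mentioned above can be illustrated with a small sketch. It is not Vision Banana's actual interface, which the article does not describe. It assumes only that a region such as the cat's ears has been painted with one exact RGB colour, and shows how that region could then be read back as a mask by colour matching. The function names (`paint_region`, `mask_from_color`), the tolerance parameter, and the toy image are all hypothetical.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
Image = List[List[RGB]]  # rows of (r, g, b) pixels


def paint_region(image: Image, region: List[Tuple[int, int]], color: RGB) -> Image:
    """Return a copy of `image` with the (row, col) pixels in `region` set to `color`."""
    out = [row[:] for row in image]  # copy each row so the input is left untouched
    for r, c in region:
        out[r][c] = color
    return out


def mask_from_color(image: Image, color: RGB, tol: int = 0) -> List[List[bool]]:
    """Mark every pixel whose three channels are all within `tol` of `color`."""
    return [
        [all(abs(p[i] - color[i]) <= tol for i in range(3)) for p in row]
        for row in image
    ]


# Toy 3x4 grey image; assume the two pixels below stand in for the cat's ears.
grey: RGB = (128, 128, 128)
image: Image = [[grey] * 4 for _ in range(3)]
ears = [(0, 1), (0, 2)]

# Stand-in for the model's output: the requested region painted pure red.
marked = paint_region(image, ears, (255, 0, 0))

# Reading the marking back as a boolean mask.
mask = mask_from_color(marked, (255, 0, 0))
```

A non-zero `tol` is useful when the marked colour is only approximately the requested value, for instance after lossy compression of the output image.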