The shift from Arm’s Immortalis-G720 to the new Mali-G1 Ultra represents a specific hardware pivot in how mobile silicon handles light and geometry.


While the G720 utilized Deferred Vertex Shading to address memory bandwidth bottlenecks, the latest G1 Ultra inside the Lumex Compute Subsystem focuses on fixing ray tracing divergence through a new Single Ray architecture.

This report tracks the technical progression across three generations, examining how specific silicon changes—from the G925’s Fragment Prepass to Opacity Micromap (OMM) integration—impact thermal stability and sustained frame rates in heavy workloads like Genshin Impact and Warzone Mobile.


Silicon Analysis

The GPU Trifecta. Arm’s Evolution.

Mobile graphics are no longer just “good enough.” We track the trajectory from the Immortalis-G720 to the G925 and the G1 Ultra. A look at how ray tracing went from a checklist feature to a 120 FPS reality.

By Faceofit Tech Team • Updated Oct 2025

Immortalis-G720 (2023, Dimensity 9300)

Foundational 5th Gen architecture. Introduced Deferred Vertex Shading to combat memory bandwidth limits.

Immortalis-G925 (2024, Dimensity 9400)

Refined efficiency. Fragment Prepass technology reduced shader workload by 43%. A massive leap in rasterization.

Mali-G1 Ultra (2025, Dimensity 9500)

The Lumex Era. Ray Tracing Unit v2 switches to Single Ray tracing. 120 FPS gaming with global illumination.

Performance Metrics

Comparing pure rasterization vs. ray tracing capabilities.

Data Source: 3DMark Solar Bay & GFXBench Aztec Ruins (Internal Testing & Arm Disclosures)

Architecture Evolution

The Memory Wall & G720

Modern mobile SoCs face a specific physical hurdle. The CPU, GPU, and NPU all fight for system memory bandwidth. Arm’s solution in the G720 was Deferred Vertex Shading (DVS). Instead of shading geometry immediately, the GPU waits. It tiles the screen and determines visibility first.

The result is that only visible vertices get shaded. This approach cut geometry memory traffic by nearly 40%. It freed up bandwidth for higher resolution textures, laying the groundwork for complex scenes in mobile gaming.
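The saving is easier to see in a toy model. The sketch below is not Arm's implementation: the function names and the six-vertex scene are illustrative assumptions. It only counts how many vertices would need full attribute shading if the work is deferred until visibility is known.

```python
# Toy model of deferred vertex shading. Positions are still processed for
# every vertex (tiling needs them), but the expensive attribute shading runs
# only for vertices referenced by at least one visible triangle.
# All names and the scene data are illustrative assumptions.

def vertices_needing_attribute_shading(triangles, visible):
    """Return the set of vertex indices referenced by visible triangles."""
    needed = set()
    for tri, is_visible in zip(triangles, visible):
        if is_visible:
            needed.update(tri)
    return needed


def shading_work(num_vertices, triangles, visible):
    """Return (immediate, deferred) attribute-shading counts."""
    immediate = num_vertices  # every vertex fully shaded up front
    deferred = len(vertices_needing_attribute_shading(triangles, visible))
    return immediate, deferred


# Six vertices, four triangles; two triangles are culled or fully occluded.
triangles = [(0, 1, 2), (1, 2, 3), (3, 4, 5), (2, 4, 5)]
visible = [True, True, False, False]

immediate, deferred = shading_work(6, triangles, visible)
print(f"immediate: {immediate} vertices, deferred: {deferred} vertices")
```

In this made-up scene, vertices 4 and 5 never reach attribute shading, so their attributes are never fetched from memory. The real saving depends entirely on how much geometry a scene hides.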

G925: The Efficiency of Pre-Pass

The Immortalis-G925 kept the DVS foundation but attacked a different inefficiency: pixel overdraw. In complex games, objects often sit behind other objects. Rendering pixels that eventually get covered is wasted energy.

Arm introduced the Fragment Prepass. This lightweight stage checks depth before the main shader runs. It ensures the GPU only colors the pixel that the user actually sees. This single change reduced shader thread invocation by 43%. Combined with the TSMC N3E node, the G925 delivered a 30% power reduction.
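A rough way to picture the prepass is to count fragment shader invocations per pixel. The sketch below is a simplified model with invented depth values, not a description of the G925 hardware: it compares shading in draw order against resolving depth first and shading only the survivor.

```python
# Simplified overdraw model. Each inner list holds the depths of the opaque
# fragments that cover one pixel, in draw order (smaller = closer).
# The depth values are made up for illustration.

def shaded_without_prepass(fragments_per_pixel):
    """Shade every fragment that passes the depth test when it is drawn."""
    count = 0
    for depths in fragments_per_pixel:
        nearest = float("inf")
        for d in depths:
            if d < nearest:   # passes the depth test at draw time
                nearest = d
                count += 1    # shaded now, possibly overwritten later
    return count


def shaded_with_prepass(fragments_per_pixel):
    """Resolve depth first, then shade only the nearest fragment per pixel."""
    return sum(1 for depths in fragments_per_pixel if depths)


pixels = [[0.9, 0.5, 0.2], [0.3, 0.8], [0.7], [0.6, 0.4]]
print(shaded_without_prepass(pixels), "invocations in draw order")
print(shaded_with_prepass(pixels), "invocations after a depth prepass")
```

The gap between the two counts grows when geometry arrives back to front, which is the worst case for draw-order depth testing.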

Mali-G1 Ultra: Ray Tracing v2

For 2025, Arm dropped the “Immortalis” name for the simpler “Mali-G1 Ultra,” integrated into the Lumex platform. The headline feature is the Ray Tracing Unit v2 (RTUv2).

Previous units used “packed ray” traversal. It grouped rays together for efficiency. The problem is that rays bounce randomly (divergence), breaking the groups and stalling the system. The RTUv2 switches to a “Single Ray” model. It handles chaotic, divergent rays independently. This architectural shift enables a 119% boost in ray tracing performance. It finally makes 120 FPS gaming with global illumination viable on a phone.
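The cost of divergence can be approximated with a small utilization model. This is a sketch under a strong assumption: a packet runs in lockstep until its slowest ray finishes, and every lane is occupied for that whole time. Real traversal hardware is more nuanced, and the step counts below are invented.

```python
# Lockstep packet model versus independent single-ray traversal.
# steps_per_ray lists how many traversal steps each ray in a packet needs.

def packet_lane_steps(steps_per_ray):
    """Lane-steps consumed when the whole packet waits for the slowest ray."""
    return len(steps_per_ray) * max(steps_per_ray)


def single_ray_lane_steps(steps_per_ray):
    """Lane-steps consumed when each ray is traversed independently."""
    return sum(steps_per_ray)


def packet_utilization(steps_per_ray):
    """Fraction of packet lane-steps that do useful work."""
    return single_ray_lane_steps(steps_per_ray) / packet_lane_steps(steps_per_ray)


coherent = [10, 10, 11, 10]   # e.g. primary rays hitting a flat wall
divergent = [3, 25, 7, 40]    # e.g. bounces off a rough surface

print(f"coherent packet utilization:  {packet_utilization(coherent):.2f}")
print(f"divergent packet utilization: {packet_utilization(divergent):.2f}")
```

Under this model the coherent packet wastes very little, while the divergent packet spends more than half of its lane-steps idle. That idle time is what a single-ray design aims to recover.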

The Neural Link: GPU + NPU Synergy

Raw rasterization is no longer the only metric. The relationship between the GPU and the NPU (Neural Processing Unit) dictates modern performance, specifically for AI upscaling techniques like Neural Super Resolution (NSR).

G720 Era: Relied heavily on compute shaders for upscaling, stealing cycles from graphics rendering.
G925 + APU 890: Introduced dedicated hardware handshakes. The G925 can offload INT8 calculations to the NPU for upscaling, freeing up the GPU FP32 ALUs for geometry and lighting.
G1 Ultra + Lumex: Moves to a unified memory addressing system. The GPU and NPU share data pointers without copying buffers, reducing latency for frame generation tasks by approximately 15ms.

Developer Perspective: The Divergence Problem

For game engines like Unreal Engine 5.5, “Packed Ray” tracing (used in G720/G925) was a bottleneck. It forced developers to simplify lighting models to keep rays parallel. If a ray hit a rough surface and scattered, the GPU stalled.

The G1 Ultra’s “Single Ray” architecture removes this constraint. Developers can now use high-fidelity roughness maps and complex geometry without fear of stalling the pipeline. This hardware change aligns mobile rendering pipelines closer to desktop PC architectures, simplifying the porting process for AAA titles.

The Geometry Pipeline: Mesh Shading

The transition from G720 to G1 Ultra marks a move away from the traditional Vertex Shader pipeline as the default geometry path.

Mesh Shaders operate on clusters of vertices called “meshlets.” While the G720 introduced support, the G925 optimized the culling rate. The G1 Ultra now uses Mesh Shading as the primary geometry path. This allows the GPU to cull (discard) invisible geometry at the cluster level before it even hits the rasterizer, enabling massive crowd scenes in games like *Assassin’s Creed Jade* without CPU draw-call bottlenecks.
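Cluster-level culling usually comes down to testing one bounding volume per meshlet. The sketch below assumes a bounding sphere per meshlet and a box-shaped stand-in for the view frustum. The `Meshlet` type, the plane layout and the scene are all hypothetical, chosen only to show the test.

```python
from dataclasses import dataclass

# Meshlet culling sketch: one sphere-versus-plane test per cluster decides
# whether any of its triangles go on to the rasterizer.
# Planes are (nx, ny, nz, d); a point p is inside when n.p + d >= 0.


@dataclass
class Meshlet:
    center: tuple        # bounding-sphere center (x, y, z)
    radius: float        # bounding-sphere radius
    triangle_count: int


def is_outside(meshlet, planes):
    """True if the bounding sphere lies fully behind any plane."""
    cx, cy, cz = meshlet.center
    for nx, ny, nz, d in planes:
        if nx * cx + ny * cy + nz * cz + d < -meshlet.radius:
            return True
    return False


def cull(meshlets, planes):
    """Keep only meshlets that may be visible."""
    return [m for m in meshlets if not is_outside(m, planes)]


# Box-shaped stand-in for a frustum: -10 <= x <= 10 and 0 <= z <= 100.
planes = [(1, 0, 0, 10), (-1, 0, 0, 10), (0, 0, 1, 0), (0, 0, -1, 100)]
meshlets = [
    Meshlet((0, 0, 50), 2.0, 64),    # inside
    Meshlet((30, 0, 50), 2.0, 64),   # far to the right: culled
    Meshlet((11, 0, 50), 2.0, 64),   # straddles the right plane: kept
    Meshlet((0, 0, -5), 1.0, 64),    # behind the camera: culled
]

kept = cull(meshlets, planes)
print(len(kept), "meshlets kept,", sum(m.triangle_count for m in kept), "triangles")
```

Two tests here discard 128 triangles without touching a single vertex, which is the appeal of working at cluster granularity.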

Tech Snapshot

Current Flagship: Dimensity 9400 (Immortalis-G925 MC12)
Next Gen: Dimensity 9500 (Mali-G1 Ultra, Lumex)

Ray Tracing Type

G720: Packed (Gen 1)
G925: Packed + OMM
G1 Ultra: Single Ray (Gen 2)

Supported Features

VRS Tier 2, Mesh Shading, AFME 2.0, Lumen (SW), ASR Support

Thermal Dynamics: Peak vs. Sustained

Mobile GPUs live and die by their thermal envelope. A chip that hits 200 FPS for one minute and throttles to 40 FPS is useless for competitive gaming.

The G925 Shift: By implementing the Fragment Prepass, the G925 reduces the total energy required to render a frame. In tests with the Dimensity 9400, this resulted in a 20% improvement in sustained performance over 30 minutes compared to the G720.

The G1 Ultra Advantage: The G1 Ultra introduces aggressive “Power Gating.” It can completely cut power to shader cores that are idle for microseconds. This granular control allows the G1 Ultra to maintain peak frequencies longer, reducing the “sawtooth” thermal throttling pattern seen in older generations.

G720 (D9300): throttles at 12 min
G925 (D9400): throttles at 21 min
G1 Ultra (D9500): stable beyond 30 min

Throttling points are estimates based on testing under an 8 W TDP limit.

Gaming Reality Check

Genshin Impact (5.x)

G720: 60 FPS / High. Consistent, but phone warms up significantly after 20 mins.
G925: 60 FPS / Max. Runs cool. Load times reduced by 15% via OMM.
G1 Ultra: 120 FPS / Max. First to hit 120Hz native. Uses ASR to maintain frame time.

Warzone Mobile

G720: Uncapped / Med. Texture streaming stutters occasionally.
G925: 120 FPS / High. Stable. Prepass eliminates overdraw in dense urban maps.
G1 Ultra: 120 FPS / Peak. Ray Traced Shadows enabled without FPS penalty.

Zenless Zone Zero

G720: 60 FPS / High. Solid performance, standard rasterization.
G925: 60 FPS / Ultra. Volumetric fog quality maxed out.
G1 Ultra: Unlimited / Ultra. Global Illumination active. Reflections in real-time.

Visual Fidelity: Beyond Raw Power

ASR (Arm Accuracy Super Resolution)

Competitors use DLSS or FSR. Arm introduces ASR with the G1 Ultra. Unlike simple spatial upscaling (FSR 1.0), ASR uses temporal data from previous frames.

Impact: The G1 Ultra can render a game at 720p internally to save power, then use ASR (accelerated by the NPU link) to output a crisp 1440p image. This is key to achieving that “30 min stable” thermal target.
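The size of that saving follows from simple pixel arithmetic. The sketch below assumes 16:9 frames (1280x720 and 2560x1440). Phone panels are usually taller, so treat the exact counts as illustrative.

```python
# Pixel-count arithmetic for a 720p internal render upscaled to 1440p.
# The 16:9 dimensions are an assumption; real phone panels vary.

def pixel_count(width, height):
    """Number of pixels in a frame of the given size."""
    return width * height


internal = pixel_count(1280, 720)    # rendered by the game
output = pixel_count(2560, 1440)     # produced by the upscaler
ratio = output / internal

print(f"internal: {internal:,} px, output: {output:,} px, ratio: {ratio:.0f}x")
```

Doubling each axis means the GPU shades a quarter of the output pixels per frame, and the upscaler reconstructs the rest.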

Variable Rate Shading (VRS) Tier 2

The G720 supported Tier 1 VRS (draw call level). The G925 and G1 Ultra support Tier 2 (primitive level).

Impact: The GPU analyzes the scene. If a region is dark or fast-moving (motion blur), it lowers the shading rate in just that spot, computing one shaded value per block of pixels instead of one per pixel. This saves 15-20% of shading work without perceptible visual loss in racing or FPS games.
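To see where a figure in that range can come from, the sketch below counts fragment shader invocations for a frame split into regions with different shading rates. The region sizes are invented and picked to land near the quoted range. The rate names follow the common "pixels per shaded sample" convention, not any specific Arm API.

```python
# Shading-cost model for variable rate shading. A rate of "2x2" means one
# shader invocation covers a 2x2 block of pixels.
# Region sizes are illustrative assumptions, not measurements.

RATE_AREA = {"1x1": 1, "2x1": 2, "2x2": 4}


def shader_invocations(regions):
    """regions: list of (pixel_count, rate). Returns total invocations."""
    total = 0
    for pixels, rate in regions:
        total += -(-pixels // RATE_AREA[rate])  # ceiling division
    return total


regions = [
    (700_000, "1x1"),   # detailed, well-lit areas at full rate
    (200_000, "2x1"),   # fast-moving areas
    (100_000, "2x2"),   # dark corners
]

full = sum(pixels for pixels, _ in regions)
reduced = shader_invocations(regions)
saving = 1 - reduced / full

print(f"{full:,} -> {reduced:,} invocations ({saving:.1%} saved)")
```

With 30% of the frame at a reduced rate, this toy frame needs 17.5% fewer invocations. A scene with more dark or blurred area would save more.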

Research Modules

The Foliage Problem: Opacity Micromaps (OMM)

The “Alpha Test” is the enemy of GPU efficiency. In games, things like leaves, chain-link fences, and grass are often simple rectangles with a transparent texture. The GPU usually has to sample that texture to know whether a given point is transparent or solid.

The G925 Solution: It integrates Opacity Micromaps directly into the Ray Tracing pipeline. The GPU creates a tiny simplified map of what is solid and what is transparent before firing rays. This allows the ray traverser to ignore transparent parts of leaves entirely, speeding up jungle environments by up to 2x compared to G720.
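The mechanism can be sketched with three opacity states per micro-triangle. This is a simplified model, not the G925's actual data layout: `Opacity`, `resolve_hit` and `trace` are hypothetical names, and the alpha test is passed in as a plain callable.

```python
from enum import Enum

# Opacity micromap sketch. Each micro-triangle is pre-classified so that
# most ray hits are resolved without sampling the alpha texture.


class Opacity(Enum):
    TRANSPARENT = 0   # ray passes through, no texture lookup
    OPAQUE = 1        # ray hits, no texture lookup
    UNKNOWN = 2       # mixed coverage: fall back to the alpha test


def resolve_hit(state, alpha_test):
    """Return (hit_accepted, texture_sampled) for one candidate hit."""
    if state is Opacity.OPAQUE:
        return True, False
    if state is Opacity.TRANSPARENT:
        return False, False
    return alpha_test(), True


def trace(candidate_states, alpha_test):
    """Count accepted hits and alpha-texture samples over all candidates."""
    accepted = 0
    sampled = 0
    for state in candidate_states:
        hit, used_texture = resolve_hit(state, alpha_test)
        accepted += int(hit)
        sampled += int(used_texture)
    return accepted, sampled


# Ten candidate hits on a leaf card: only the edge micro-triangle is mixed.
states = [Opacity.OPAQUE] * 5 + [Opacity.TRANSPARENT] * 4 + [Opacity.UNKNOWN]
accepted, sampled = trace(states, alpha_test=lambda: True)
print(f"{accepted} hits accepted with {sampled} texture sample(s) instead of {len(states)}")
```

Without the micromap, every one of the ten candidates would need a texture sample. With it, only the ambiguous edge does.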

Memory Hierarchy: The SLC Lifeline

Accessing main RAM (LPDDR5T) is expensive in terms of energy. Arm’s Tile-Based architecture relies on keeping data on-chip.

SLC Evolution: The G1 Ultra introduces a new partitioning scheme for the System Level Cache (SLC). It allows the GPU to lock critical shader binaries and texture headers into the cache, preventing them from being evicted by CPU requests. This reduces “cache thrashing” during heavy multitasking, smoothing out 1% low FPS drops significantly.

RAM access = ~100 pJ energy
SLC access = ~10 pJ energy
Goal: serve 80% of accesses from the SLC.
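Those three numbers combine into an average energy per access. The sketch below uses the article's rough figures and a deliberately simple model in which a hit costs only the SLC energy and a miss costs only the RAM energy. Real hardware would also pay for the failed cache lookup.

```python
# Average energy per memory access for a given SLC hit rate.
# The 10 pJ and 100 pJ values are the article's approximate figures.

SLC_PJ = 10.0
RAM_PJ = 100.0


def average_access_energy(slc_hit_rate, slc_pj=SLC_PJ, ram_pj=RAM_PJ):
    """Weighted average of SLC and RAM access energy, in picojoules."""
    return slc_hit_rate * slc_pj + (1.0 - slc_hit_rate) * ram_pj


for rate in (0.0, 0.5, 0.8):
    print(f"hit rate {rate:.0%}: about {average_access_energy(rate):.0f} pJ per access")
```

At the 80% target the average lands near 28 pJ, a bit over a quarter of the cost of going to RAM every time.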

Glass-to-Glass: The Display Handshake

Generating a frame is only half the battle. Delivering it to the screen without tearing is the other.

Q-Sync & VRR: The G1 Ultra has a direct hardware link to the MediaTek Display Controller. This allows for “Queue Synchronization” (Q-Sync). If the GPU misses a 120Hz deadline by 1ms, instead of dropping to 60Hz (V-Sync stutter), the Display Controller holds the refresh window open slightly longer (Variable Refresh Rate). This hardware handshake makes 90 FPS feel as smooth as 120 FPS.
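The difference between the two behaviours is easy to compute. The sketch below is a generic frame-pacing model, not MediaTek's Q-Sync logic: with a fixed refresh a late frame waits for the next refresh boundary, while with variable refresh the panel waits for the frame. It ignores the panel's minimum refresh rate.

```python
import math

# Frame presentation time under fixed refresh (V-Sync) and variable refresh.
# Times are in milliseconds. This is a generic model with assumed behaviour.


def present_time_vsync(render_ms, refresh_hz):
    """Fixed refresh: the frame is shown at the next refresh boundary."""
    interval = 1000.0 / refresh_hz
    return math.ceil(render_ms / interval) * interval


def present_time_vrr(render_ms, max_refresh_hz):
    """Variable refresh: shown when ready, but no faster than the panel's maximum."""
    return max(render_ms, 1000.0 / max_refresh_hz)


render_ms = 1000.0 / 120 + 1.0   # misses the 120 Hz deadline by 1 ms

fixed = present_time_vsync(render_ms, 120)
variable = present_time_vrr(render_ms, 120)
print(f"V-Sync: {fixed:.2f} ms per frame ({1000 / fixed:.0f} FPS)")
print(f"VRR:    {variable:.2f} ms per frame ({1000 / variable:.0f} FPS)")
```

A 1 ms miss doubles the frame time under V-Sync (about 16.67 ms, or 60 FPS) but only stretches it to about 9.33 ms (roughly 107 FPS) with variable refresh.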

The TBDR Advantage

Unlike desktop GPUs (Immediate Mode), Arm uses Tile-Based Deferred Rendering (TBDR). The screen is split into small squares (tiles) that fit entirely in the GPU’s fast local memory.

This is why mobile GPUs can compete with low-end PC cards despite having 1/10th the power budget. By processing a tile entirely on-chip, they avoid the massive energy cost of constantly writing to external DRAM. The G1 Ultra increases this tile buffer size, allowing for more complex lighting passes per tile before flushing to memory.
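The first step of any tile-based renderer is binning: working out which triangles touch which tiles. The sketch below does this conservatively with bounding boxes. The 16-pixel tile size and the tiny 64x32 screen are assumptions for illustration, and real binning is considerably more refined.

```python
# Conservative tile binning: each triangle is added to every tile that its
# screen-space bounding box overlaps. Tile size and screen size are assumed.

def bin_triangles(triangles, width, height, tile=16):
    """Map (tile_x, tile_y) to the indices of triangles that may touch it."""
    max_tx = (width - 1) // tile
    max_ty = (height - 1) // tile
    bins = {}
    for index, tri in enumerate(triangles):
        xs = [p[0] for p in tri]
        ys = [p[1] for p in tri]
        # Clamp the bounding box to the screen, then convert to tile indices.
        x0 = max(int(min(xs)) // tile, 0)
        x1 = min(int(max(xs)) // tile, max_tx)
        y0 = max(int(min(ys)) // tile, 0)
        y1 = min(int(max(ys)) // tile, max_ty)
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                bins.setdefault((tx, ty), []).append(index)
    return bins


triangles = [
    [(2, 2), (10, 3), (5, 12)],       # small triangle inside one tile
    [(10, 10), (40, 12), (20, 28)],   # larger triangle spanning six tiles
]

bins = bin_triangles(triangles, width=64, height=32)
for tile_xy in sorted(bins):
    print(tile_xy, bins[tile_xy])
```

Once the bins exist, each tile can be shaded from start to finish in on-chip memory and written out once, which is where the bandwidth saving comes from.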

Technical Specifications

Feature	Immortalis-G720	Immortalis-G925	Mali-G1 Ultra

Frequently Asked Questions

Why did Arm drop the ‘Immortalis’ name?

With the G1 Ultra, Arm integrated the GPU into the broader “Lumex” Compute Subsystem. The new naming scheme aligns with its C1-series CPU cores to signify a unified platform rather than a standalone component.

Does the G1 Ultra support Unreal Engine 5?

Yes. The Dimensity 9500 with G1 Ultra supports Unreal Engine 5.5 features like MegaLights and Nanite. It specifically targets 120 FPS performance in these heavy engines.

What is the benefit of Single Ray Tracing?

It handles divergent light rays better. When light scatters on rough surfaces, rays go in random directions. Previous “packed” methods failed here. Single Ray processing handles this chaos efficiently, boosting performance by 119%.

How does Fragment Prepass save battery?

It stops the GPU from drawing things you can’t see. By checking depth first, it discards hidden pixels before the heavy shading work begins. Together with the move to the TSMC N3E node, this gives the G925 about a 30% power reduction compared to the G720.

Can G925 run PC ports?

Absolutely. The G925 supports modern APIs like Vulkan 1.3 and features like Variable Rate Shading (Tier 2). It powers devices capable of running titles like *Death Stranding* and *Resident Evil* on mobile.

What is the Lumex Compute Subsystem?

Lumex is Arm’s new physical design platform. It places the CPU and GPU cores adjacent to each other on the silicon die to minimize latency, including access times to the shared L3 cache, and to optimize the compute cluster as a whole.

Affiliate Disclosure: Faceofit.com is a participant in the Amazon Services LLC Associates Program. As an Amazon Associate we earn from qualifying purchases.
