New York: In a major leap towards enhancing its artificial intelligence (AI) capabilities without compromising user privacy, Apple has introduced a new AI training strategy that leverages on-device data while upholding its longstanding commitment to user confidentiality. The development was recently highlighted in a blog post by the Apple Machine Learning Research team and further reported by Bloomberg.
Traditionally, Apple has relied heavily on synthetic data (artificially generated content) to train its AI models. While this approach has safeguarded user privacy, it has limitations, particularly when training models for more complex tasks such as long-form summarization.
To overcome these challenges, Apple is implementing a privacy-preserving system that uses small, anonymous samples of recent user emails—but only from devices that have opted in to Device Analytics. The company emphasizes that the system does not allow access to user identities or specific email content.
Instead, Apple’s new method relies on embeddings, compact numerical representations that capture attributes of each email such as language, topic, and length. These embeddings are compared against synthetic messages directly on the device. With the help of differential privacy techniques, Apple can evaluate how closely its synthetic messages mimic actual user behavior, all without ever seeing the real emails or knowing which devices participated.
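To make the idea concrete, the sketch below illustrates one common way an on-device comparison with local differential privacy can work. It is not Apple's code: the toy embedding, the synthetic candidate messages, and the epsilon parameter are assumptions made purely for demonstration. The device embeds a message, finds the closest synthetic candidate, and reports only a randomized index, never the email text itself.

```swift
import Foundation

// Illustrative sketch only, not Apple's implementation. The toy embedding,
// the synthetic messages, and the `epsilon` value are assumptions.

// A toy "embedding": a small feature vector built from letter frequencies
// and message length. Real systems use learned, high-dimensional embeddings.
func toyEmbedding(_ text: String) -> [Double] {
    let lower = text.lowercased()
    var features = "abcdefghijklmnopqrstuvwxyz".map { ch in
        Double(lower.filter { $0 == ch }.count)
    }
    features.append(Double(text.count) / 100.0)  // crude length feature
    return features
}

// Cosine similarity between two equal-length vectors.
func cosine(_ a: [Double], _ b: [Double]) -> Double {
    let dot = zip(a, b).reduce(0.0) { $0 + $1.0 * $1.1 }
    let normA = sqrt(a.reduce(0.0) { $0 + $1 * $1 })
    let normB = sqrt(b.reduce(0.0) { $0 + $1 * $1 })
    return (normA == 0 || normB == 0) ? 0 : dot / (normA * normB)
}

// On the device: find which synthetic message is closest to the user's email,
// then report that index via randomized response, so any single report may be
// random noise and reveals little about the individual message.
func privatizedNearestIndex(userEmail: String,
                            syntheticMessages: [String],
                            epsilon: Double = 1.0) -> Int {
    let userVec = toyEmbedding(userEmail)
    let trueIndex = syntheticMessages.indices.max { i, j in
        cosine(userVec, toyEmbedding(syntheticMessages[i])) <
        cosine(userVec, toyEmbedding(syntheticMessages[j]))
    } ?? 0

    // Randomized response over k candidates: keep the true answer with
    // probability exp(epsilon) / (exp(epsilon) + k - 1), otherwise report
    // a uniformly chosen different index.
    let k = Double(syntheticMessages.count)
    let keepProbability = exp(epsilon) / (exp(epsilon) + k - 1)
    if Double.random(in: 0..<1) < keepProbability {
        return trueIndex
    }
    var other = Int.random(in: 0..<syntheticMessages.count)
    while other == trueIndex && syntheticMessages.count > 1 {
        other = Int.random(in: 0..<syntheticMessages.count)
    }
    return other
}

// Example: the device would transmit only the noisy index, never the email.
let synthetic = ["Lunch tomorrow at noon?",
                 "Quarterly report attached for review.",
                 "Your flight to Denver is confirmed."]
let report = privatizedNearestIndex(userEmail: "Can we move lunch to 1pm?",
                                    syntheticMessages: synthetic)
print("Device reports synthetic candidate #\(report)")
```

Aggregated across many opted-in devices, such noisy reports can indicate which synthetic messages most resemble real usage, which is the kind of signal that can guide refinement of synthetic training data without exposing any individual email.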
Ultimately, this approach will enhance AI features such as content summarization in Mail, Notes, and other Apple applications, while helping the company continually refine its synthetic training data.
This breakthrough is expected to be part of upcoming beta releases, including macOS 15.5 and iOS 18.5, though an official rollout schedule has not yet been announced.