Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement

Oct 7, 2024

Zhehao Huang, Xinwen Cheng, Jinghao Zheng, Haoran Wang, Zhengbao He, Tao Li, Xiaolin Huang
Overview of our Saliency Forgetting in the Remain-preserving manifold online (SFR-on).
Abstract
Machine unlearning (MU) has emerged to enhance the privacy and trustworthiness of deep neural networks. Approximate MU is a practical method for large-scale models. Our investigation into approximate MU starts with identifying the steepest descent direction, minimizing the output Kullback-Leibler divergence to exact MU inside a parameters' neighborhood. This probed direction decomposes into three components: weighted forgetting gradient ascent, fine-tuning retaining gradient descent, and a weight saliency matrix. Such a decomposition, derived from the Euclidean metric, encompasses most existing gradient-based MU methods. Nevertheless, adhering to Euclidean space may result in sub-optimal iterative trajectories due to the overlooked geometric structure of the output probability space. We suggest embedding the unlearning update into a manifold rendered by the remaining geometry, incorporating the second-order Hessian from the remaining data. It helps prevent effective unlearning from interfering with the retained performance. However, computing the second-order Hessian for large-scale models is intractable. To efficiently leverage the benefits of Hessian modulation, we propose a fast-slow parameter update strategy to implicitly approximate the up-to-date salient unlearning direction. Free from specific modal constraints, our approach is adaptable across computer vision unlearning tasks, including classification and generation. Extensive experiments validate our efficacy and efficiency. Notably, our method successfully performs class-forgetting on ImageNet using DiT and forgets a class on CIFAR-10 using DDPM in just 50 steps, compared to thousands of steps required by previous methods.
Publication

In Neural Information Processing Systems (NeurIPS) 2024 (Spotlight)

Background

Machine Unlearning (MU) aims to remove the influence of samples from a pre-trained model, ensuring the model behaves as if it has never encountered those samples.

Contributions

  • We provide a novel perspective to unify previous approaches by decomposing the vanilla gradient descent direction of approximate MU into three components: weighted forgetting gradient ascent, remaining gradient descent, and a weight saliency matrix.
  • We derive the steepest descent direction for approximate MU on the remain-preserved manifold.
  • We propose a fast-slow weight method to implicitly approximate online Hessian-modulated salient forgetting updates.
  • We conduct experiments on a wide range of CV unlearning tasks across multiple datasets and models of different architectures, verifying the effectiveness and efficiency of our method.

Key Findings

Approximate MU Problem Formulation

The optimization problem of approximate MU at each step can be formulated as:

$$ \begin{align*} \theta_{t+1} &=\mathop{\arg\min}\limits_{\theta_{t+1}} D_{\text{KL}}\left(p_z(\theta_*)||p_z(\theta_{t+1})\right)+\frac{1}{\alpha_t}\rho(\theta_t,\theta_{t+1})\\ &=\mathop{\arg\min}\limits_{\theta_{t+1}} \underbrace{D_{\text{KL}}\left(p_{z^f}(\theta_*)||p_{z^f}(\theta_{t+1})\right)}_{(a)}p^f+ \underbrace{D_{\text{KL}}\left(p_{z^r}(\theta_*)||p_{z^r}(\theta_{t+1})\right)}_{(b)}p^r+ \frac{1}{\alpha_t}\underbrace{\rho(\theta_t,\theta_{t+1})}_{(c)}. \end{align*} $$

($a$) seeks to eliminate the influence of the target forgetting samples.

($b$) aims to maintain the performance on the remaining samples.

($c$) employs the metric $\rho$ to constrain the magnitude of each update.
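
To make the three terms concrete, below is a minimal PyTorch-style sketch of this per-step objective. It is only a sketch: the exact-MU model $\theta_*$ is unavailable in practice, so `target_forget_logits` and `target_remain_logits` are hypothetical stand-ins for whatever surrogate of $p_{z^f}(\theta_*)$ and $p_{z^r}(\theta_*)$ a concrete method chooses, and the proximal term $(c)$ is written with the Euclidean metric discussed in the next subsection.

```python
import torch
import torch.nn.functional as F


def approximate_mu_objective(model, theta_t, x_f, target_forget_logits,
                             x_r, target_remain_logits, p_f, p_r, alpha_t):
    """Per-step objective of approximate MU: terms (a), (b), and (c).

    target_*_logits are hypothetical surrogates for the exact-MU model
    p_z(theta_*), which is not available in practice.
    """
    # (a) match the surrogate exact-MU distribution on the forgetting samples
    kl_forget = F.kl_div(F.log_softmax(model(x_f), dim=-1),
                         F.softmax(target_forget_logits, dim=-1),
                         reduction="batchmean")
    # (b) match the surrogate exact-MU distribution on the remaining samples
    kl_remain = F.kl_div(F.log_softmax(model(x_r), dim=-1),
                         F.softmax(target_remain_logits, dim=-1),
                         reduction="batchmean")
    # (c) proximal term rho(theta_t, theta_{t+1}); Euclidean metric shown here
    theta_next = torch.nn.utils.parameters_to_vector(model.parameters())
    proximal = 0.5 * (theta_next - theta_t).pow(2).sum()
    return p_f * kl_forget + p_r * kl_remain + proximal / alpha_t
```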

Vanilla Gradient Descent

Under the Euclidean manifold metric, $\rho(\theta_t,\theta_{t+1})=\frac12\lVert \theta_t-\theta_{t+1} \rVert^2$. The steepest descent direction that minimizes approximate MU is:

$$ \begin{equation*} \theta_{t+1}-\theta_t\approx-\alpha_t \left[ \underbrace{H^f_*(H_*^r)^{-1}}_{(S)} \underbrace{[-\nabla \mathcal L^f(\theta_t;\boldsymbol{\varepsilon}_t)]}_{(F)}p^f+ \underbrace{\nabla \mathcal L^r(\theta_t)}_{(R)}p^r \right]. \end{equation*} $$
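
Spelled out, one such update combines the forgetting gradient ascent $(F)$, the remaining gradient descent $(R)$, and the saliency term $(S)$. The sketch below assumes the two gradients have already been flattened into vectors and that $(S)$ is approximated by a precomputed diagonal saliency vector, a common simplification; the full matrix product $H^f_*(H_*^r)^{-1}$ is never formed.

```python
import torch


@torch.no_grad()
def vanilla_unlearning_step(params, grad_forget, grad_remain,
                            saliency, p_f, p_r, alpha_t):
    """One Euclidean steepest-descent step of approximate MU.

    grad_forget / grad_remain: gradients of L^f and L^r at theta_t, flattened
    into vectors; `saliency` is a hypothetical diagonal approximation of
    (S) = H^f_* (H^r_*)^{-1}.
    """
    forget_dir = saliency * (-grad_forget) * p_f  # (S) applied to (F), weighted by p^f
    remain_dir = grad_remain * p_r                # (R), weighted by p^r
    # theta_{t+1} = theta_t - alpha_t [ (S)(F) p^f + (R) p^r ]
    return params - alpha_t * (forget_dir + remain_dir)
```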

Approximate MU in Remain-preserving Manifold

Employing the Euclidean distance treats all parameter coordinates as equally important. A more practical objective is to constrain parameter updates during unlearning to a manifold that minimally impacts the retained performance.

We instead take the model output KL divergence on the remaining set as the manifold metric, $\rho(\theta_t,\theta_{t+1})=D_{\text{KL}}\left(p_{z^r}(\theta_t)||p_{z^r}(\theta_{t+1})\right)$.
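
To see where the Hessian enters, expand this metric to second order around $\theta_t$ (the first-order term vanishes because the KL divergence is minimized at $\theta_{t+1}=\theta_t$; this is the standard natural-gradient argument):

$$ \begin{equation*} \rho(\theta_t,\theta_{t+1})\approx\frac12(\theta_{t+1}-\theta_t)^\top H_t^r(\theta_{t+1}-\theta_t). \end{equation*} $$

Movement along directions that matter for the remaining data is therefore penalized more heavily, and the steepest descent direction that minimizes approximate MU becomes: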

$$ \begin{equation*} \theta_{t+1}-\theta_t\approx-\tilde\alpha_t \underbrace{(H_t^r)^{-1}}_{(R)} \left[ \underbrace{H^f_*(H_*^r)^{-1}}_{(S)} \underbrace{[-\nabla \mathcal L^f(\theta_t;\boldsymbol{\varepsilon}_t)]}_{(F)} \right]. \end{equation*} $$

($F$) represents the gradient ascent direction of the weighted forgetting loss, which directs the model to discard the information of the forgetting samples.

($R$) corresponds to fine-tuning on the remaining set, modulating the update so as to sustain the model's general capabilities.

($S$) amplifies the parameter updates crucial for forgetting while dampening those important for maintaining performance.
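
Computing and inverting $H_t^r$ explicitly is intractable for large models, which motivates the fast-slow parameter update. The sketch below is one illustrative reading of that idea, not the exact SFR-on algorithm: fast weights take the salient forgetting step, a few descent steps on the remaining data stand in for the $(H_t^r)^{-1}$ modulation, and the slow weights then move part of the way toward the fast weights. `forget_step`, `remain_loss_fn`, `k`, `lr_fast`, and `beta` are hypothetical names and hyper-parameters.

```python
import copy
import torch


def fast_slow_unlearning_step(slow_model, forget_step, remain_loss_fn,
                              remain_loader, k=5, lr_fast=1e-3, beta=0.5):
    """Illustrative fast-slow update approximating the Hessian-modulated
    salient forgetting direction.

    forget_step(model): applies one salient forgetting update in place.
    remain_loss_fn(model, batch): returns the remaining loss L^r on a batch.
    """
    fast_model = copy.deepcopy(slow_model)          # fast weights start at theta_t
    forget_step(fast_model)                         # salient forgetting move, (S)(F)
    opt = torch.optim.SGD(fast_model.parameters(), lr=lr_fast)
    for batch, _ in zip(remain_loader, range(k)):   # k remaining-data descent steps
        opt.zero_grad()                             # acting as a stand-in for (H_t^r)^{-1}
        remain_loss_fn(fast_model, batch).backward()
        opt.step()
    with torch.no_grad():                           # slow weights track the fast weights
        for p_slow, p_fast in zip(slow_model.parameters(),
                                  fast_model.parameters()):
            p_slow.add_(beta * (p_fast - p_slow))
    return slow_model
```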

Citation

If you find this work useful for your research, please consider citing:

@article{huang2024unified,
  title={Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement},
  author={Huang, Zhehao and Cheng, Xinwen and Zheng, JingHao and Wang, Haoran and He, Zhengbao and Li, Tao and Huang, Xiaolin},
  journal={arXiv preprint arXiv:2409.19732},
  year={2024}
}