A Sample Paper Review that I attempted during my Ph.D. interviews!!!
I reviewed many papers during my Ph.D. interviews. Though I didn't make it to the final round, I learnt a lot about reviewing: understanding the core contributions, the missing pieces in the research, potential improvements, and a paper's relation to my ongoing research.
This is one such paper that I liked reading and dissecting into different segments.
Unleashing the Tiger: Inference Attacks on Split Learning
The name itself was very confusing for me initially.
This manuscript investigates two threat scenarios in split learning (1. the server is malicious; 2. a client is malicious) and provides a detailed security analysis of the split learning protocol, which attains strong performance while consistently using fewer resources. The study is significant because split learning is gaining momentum in both academia and industry. The authors draw attention to vulnerabilities in the protocol and illustrate its inherent insecurity by proving that a motivated adversary can subvert the defenses of the training framework. They highlight the most pervasive vulnerability of the framework: the server's entrusted ability to control the learning process of the clients' network. To demonstrate this, the authors implement a general attack strategy that allows a malicious server to reconstruct private training instances. Unlike previous attacks on collaborative learning, here the adversary can recover individual training instances from the clients, rather than only prototypical examples. The authors have done a great job devising a novel attack strategy called the Feature-Space Hijacking Attack (FSHA), in which the malicious server hijacks the clients' learning process and drives their models into an insecure functional state that can be exploited to recover the clients' private data. The attack is performed in a realistic scenario where it may be difficult for the server to obtain details about the clients' models. However, the attack requires the server to possess a dataset that follows a distribution similar to that of the clients' training sets. The attack is carried out in the image domain and applied to various split learning variants. The results seem to support the authors' claim of domain independence, since their adversarial models perform well on different datasets.
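To make the setting concrete, here is a minimal sketch of one round of the vanilla split learning protocol the paper attacks; it reflects the general protocol, not the paper's code, and the tiny linear layers, dimensions, and learning rate are illustrative assumptions on my part.

```python
import numpy as np

rng = np.random.default_rng(0)

# Client holds the first part of the network; server holds the rest.
W_client = rng.normal(size=(8, 4)) * 0.1   # client layer: 8 inputs -> 4 features
W_server = rng.normal(size=(4, 1)) * 0.1   # server layer: 4 features -> 1 output

def client_forward(x):
    # Client computes intermediate features ("smashed data") and sends them.
    return x @ W_client

def server_step(z, y, lr=0.1):
    # Server finishes the forward pass, computes the loss, updates its own
    # weights, and returns the gradient w.r.t. the received features.
    global W_server
    pred = z @ W_server
    err = (pred - y) / len(z)          # gradient of 0.5 * mean squared error
    grad_z = err @ W_server.T          # gradient sent back to the client
    W_server -= lr * (z.T @ err)
    return grad_z

def client_step(x, grad_z, lr=0.1):
    # Client blindly applies whatever gradient the server returns.
    global W_client
    W_client -= lr * (x.T @ grad_z)

# One round of the protocol on toy data.
x = rng.normal(size=(16, 8))
y = rng.normal(size=(16, 1))
z = client_forward(x)
grad_z = server_step(z, y)
client_step(x, grad_z)
```

The client's update depends entirely on the gradient the server chooses to send back; nothing in the protocol forces that gradient to come from the honest task loss, which is exactly the structural weakness the attack exploits.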
The authors claim that recently proposed defense mechanisms such as distance correlation minimization are ineffective against FSHA: the learning objective injected by the adversary naturally negates the distance correlation minimization, circumventing its effect.
The authors also carry out a property inference attack and propose it as a more efficient option for an adversary interested in inferring only a few specific properties of the private training instances rather than reconstructing them entirely. Finally, the authors draw attention to the protocol's inherent insecurity against client-side attacks previously defined on the federated learning framework. They demonstrate this by extending a previously proposed inference attack to split learning and reconstructing the prototypical training instances of another honest client. Overall, the authors highlight various structural vulnerabilities in the split learning protocol and demonstrate its insecurity against both a malicious server and a malicious client.
1. Novel algorithm to recover the private training instances: The FSHA proposed by the authors can exactly recover the clients' private training instances without being detected during the training phase. Previous approaches either operated in a white-box setting where the adversary was aware of the model architecture, carried out only membership or property inference attacks, or reconstructed only prototypical examples of the private training set.
2. Successfully executing the client-side attack when the client cannot control the server's weight updates: During the client-side attack, the malicious client has no control over the weight update operation performed by the server, which makes it difficult to extract the prototypical examples of the honest client. To overcome this limitation, the authors scale the gradients exchanged with the server during the split learning protocol. This gradient-scaling trick prevents any functional change to the server's weights, so the server's actual learning process is not hampered while the malicious client recovers the honest client's prototypical examples.
3. Ability to circumvent the recently proposed defense techniques: Interestingly, the proposed FSHA can circumvent the effect of distance correlation minimization. Here, the attacker is able to create an "adversarial" feature space that minimizes the distance correlation yet still allows the attacker to obtain a precise reconstruction of the input.
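The gradient-scaling trick from strength 2 can be sketched numerically: the malicious client scales down the error signal it drives through the server, so the weight update it induces on the server is numerically negligible, then rescales the returned gradient to recover a full-strength learning signal for itself. The scale factor, toy linear server layer, and learning rate below are my own illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
gamma = 1e-8                          # illustrative down-scaling factor

W_server = rng.normal(size=(4, 1))
W_before = W_server.copy()
lr = 0.1

z = rng.normal(size=(16, 4))          # smashed data sent by the malicious client
err = rng.normal(size=(16, 1))        # the client's adversarial error signal
err_scaled = gamma * err              # scaled down before it reaches the server

# Server runs its usual backward pass and update, but on a vanishing signal.
grad_z = err_scaled @ W_server.T      # gradient returned to the client
W_server = W_server - lr * (z.T @ err_scaled) / len(z)

# Client undoes the scaling to recover a full-strength gradient for itself.
grad_z_client = grad_z / gamma
```

The server's weights stay numerically where they were, while the client still receives a usable gradient, matching the review's description of the trick.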
1. Dependence on the client's data distribution: The proposed FSHA assumes that the adversary has an auxiliary training dataset that follows a distribution similar to that of the client's training set. For generic tasks like gender classification or MNIST, auxiliary datasets are easy to obtain, but the same may not be true for highly targeted tasks like identifying diabetic retinopathy. In real life, most distributed learning occurs for highly specific tasks (e.g., text typed on the Google keyboard), which limits this attack. To support this statement, consider the paper's reconstruction of the digit 0 when 0 is missing from the adversary's MNIST data: even though the adversary has the rest of the MNIST dataset, which follows a similar distribution, it was not able to accurately recover private training instances of the missing class. In scenarios like diabetic retinopathy or tumor detection, even a minor reconstruction error can make the recovered instance useless for any kind of inference. The authors aim to demonstrate a data-agnostic property; however, the results do not fully support this conclusion and seem like a slight oversimplification of the problem. Although the authors have done a great job making FSHA model agnostic, being data agnostic would make it more suitable for realistic scenarios.
2. Differential Privacy: The authors have missed the aspect of Differential Privacy (DP), which has been a crucial component in almost all recent literature targeting attacks on collaborative learning. Related work has even considered the privacy-budget threshold up to which an adversarial attack still works. The current study would be technically more sound if the authors had considered DP as one of their test cases. Since they are trying to recover the exact private training instances, it becomes instrumental to study the outcomes under differentially private training.
3. Low-entropy datasets: The authors are off to a good start; however, this study requires additional experiments on more diverse datasets. The authors currently consider:
- MNIST: single channel and low entropy
- Fashion-MNIST: single channel and low entropy
- Omniglot: high variance, but still single channel
- CelebA: a large number of samples, but every instance is a face sharing the same global features (eyes, nose, mouth, etc.)
The samples in these datasets are highly correlated, so to provide robust results, the authors could have explored other datasets that align with realistic scenarios, like ImageNet or a diabetic retinopathy dataset.
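Returning to the differential privacy gap in weakness 2: one concrete way to set up such a test case would be a standard DP-SGD-style update on the client side, i.e., per-example gradient clipping followed by calibrated Gaussian noise. This is a generic sketch of that mechanism, not anything from the paper; the clipping norm and noise multiplier are placeholder values.

```python
import numpy as np

rng = np.random.default_rng(2)

def dp_sgd_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.1):
    # Clip each example's gradient to at most clip_norm in L2 norm...
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # ...then average over the batch and add calibrated Gaussian noise.
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(per_example_grads),
                       size=per_example_grads.shape[1])
    return clipped.mean(axis=0) + noise

# Toy per-example gradients with large, potentially attack-revealing magnitudes.
grads = rng.normal(size=(32, 10)) * 5.0
private_grad = dp_sgd_gradient(grads)
```

Repeating the attack across a sweep of noise multipliers would give exactly the privacy-budget-versus-reconstruction-quality curve this weakness asks for.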
- Differential Privacy: Exploring novel Differential Privacy (DP) techniques and analyzing the trade-off between record-level and participant-level DP. One major problem in DP is the constant tug of war between data privacy and model utility. Trying to solve this problem with an ensemble approach, generating differentially private datasets using a GAN, or developing data-aware/gradient-aware DP are some interesting open research problems. We could also try adding skewed (non-normal) noise in a controlled way to explore whether it can prevent such adversarial attacks.
- Making the attack data agnostic: Currently, the attack depends on the client's data distribution. Future research could look into developing similar approaches without this dependence.
- Exploring other defense mechanisms: Developing novel defenses like selective gradient sharing and feature anonymization to overcome this attack. We could also try using the principles of disentangled autoencoders to produce a highly uncorrelated forward vector and see if that prevents such attacks, since the adversary would observe less informative gradients. We could also explore combining Differential Privacy with techniques like knowledge distillation, network pruning, or model quantization to see (1) whether we can achieve the desired privacy without compromising much model utility, (2) exactly what information is present in the forward vector, and (3) whether we can prevent such adversarial attacks. Novel gradient clipping mechanisms could also reduce the effect of such attacks. Finally, we should evaluate the attack's effectiveness against recent defenses like DISCO, Prediction Purification, and Shredder.
- Exploring other metrics for quantitative evaluation: This paper uses MSE to assess how similar the actual and reconstructed images are, but to quantify the results more thoroughly, we could also use PSNR, SSIM, the Inception score, and CNN-based re-identification to obtain both subjective and objective evaluations. Similarly, we could explore other GAN loss functions, like the min-max loss.
- Zero-shot learning: Exploring zero-shot learning for quicker convergence of the Adversary network.
- Unsupervised learning: During the property inference attack, we could try using unsupervised learning methods to find other unintended feature leakages to reverse engineer better defense mechanisms.
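The skewed-noise idea above could be prototyped by perturbing the client's forward (smashed-data) vectors with centered exponential noise, which has mean zero but is right-skewed, unlike the Gaussian noise typical DP mechanisms use. The distribution and scale here are exploratory assumptions, not a vetted defense.

```python
import numpy as np

rng = np.random.default_rng(3)

def add_skewed_noise(features, scale=0.1):
    # Exponential noise shifted to mean zero: non-normal and right-skewed.
    noise = rng.exponential(scale, size=features.shape) - scale
    return features + noise

z = rng.normal(size=(100, 50))   # toy batch of forward (smashed-data) vectors
z_noisy = add_skewed_noise(z)
```

Whether skewed perturbations actually hinder a feature-space hijacking objective is an open question; the point is only that the mechanism is trivial to plug in and test.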
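Of the image-similarity metrics suggested above, PSNR is the easiest to add on top of the paper's existing MSE numbers. A minimal sketch, assuming 8-bit images with a peak value of 255:

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    # Peak signal-to-noise ratio in dB; higher means a closer reconstruction.
    mse = np.mean((np.asarray(original, float) - np.asarray(reconstructed, float)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(4)
img = rng.integers(0, 256, size=(28, 28)).astype(float)
close = img + rng.normal(0, 2, size=img.shape)     # small reconstruction error
far = img + rng.normal(0, 30, size=img.shape)      # large reconstruction error
```

SSIM and the Inception score need more machinery (windowed statistics, a pretrained classifier), but would complement PSNR with perceptual and distributional views of reconstruction quality.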
- The abstract is clear, but it misses the paper's prominent quantitative results.
- Though the paper is novel, the introduction fails to capture the essence of the work: exactly recovering private training instances, unlike prototypical examples.
- The use of the Wasserstein loss to train the discriminator is a great approach, but it would have been good if the authors had justified choosing this loss function over other options.
- While the study appears to be sound, the authors have missed citations in a few places, including the citation of the seminal paper in this field.
- The authors could provide the reconstruction error for the client-side attack carried out in the split learning framework.
- Though the authors provide the number of iterations the network takes to converge, it would be more practical to also report the wall-clock time to convergence, since per-iteration time varies from dataset to dataset, and a time metric would also help in evaluating network latency.
- The flow of gradients from the server to the client especially in the client-side attack of the private label scenario requires a pictorial depiction to improve the flow and readability of the content.
- The usage of the Omniglot dataset is a good approach to prove few-shot learning.
I thank Abbas Alif for helping me.