We study the problem of robust community recovery: efficiently recovering communities in sparse stochastic block models in the presence of adversarial corruptions. In the absence of adversarial corruptions, there are efficient algorithms when the signal-to-noise ratio exceeds the Kesten–Stigum (KS) threshold, widely believed to be the computational threshold for this problem. The question we study is: does the computational threshold for robust community recovery also lie at the KS threshold? We answer this question affirmatively, providing an algorithm for robust community recovery for arbitrary stochastic block models on any constant number of communities, generalizing the work of Ding, d’Orsi, Nasser & Steurer on an efficient algorithm above the KS threshold in the case of 2-community block models. There are three main ingredients to our work: (1) The Bethe Hessian of the graph is defined as HG(t) ≜ (DG−I)t2 − AGt + I where DG is the diagonal matrix of degrees and AG is the adjacency matrix. Empirical work suggested that the Bethe Hessian for the stochastic block model has outlier eigenvectors corresponding to the communities right above the Kesten-Stigum threshold. We formally confirm the existence of outlier eigenvalues for the Bethe Hessian, by explicitly constructing outlier eigenvectors from the community vectors. (2) We develop an algorithm for a variant of robust PCA on sparse matrices. Specifically, an algorithm to partially recover top eigenspaces from adversarially corrupted sparse matrices under mild delocalization constraints. (3) A rounding algorithm to turn vector assignments of vertices into a community assignment, inspired by the algorithm of Charikar & Wirth for 2XOR.
我们研究鲁棒社区恢复问题:在存在对抗性破坏的情况下,在稀疏随机块模型中高效地恢复社区。在没有对抗性破坏时,当信噪比超过凯斯滕 - 斯蒂格姆(KS)阈值时存在高效算法,人们普遍认为该阈值是此问题的计算阈值。我们所研究的问题是:鲁棒社区恢复的计算阈值是否也位于KS阈值处?我们肯定地回答了这个问题,针对任意数量的常数个社区的任意随机块模型,提供了一种鲁棒社区恢复算法,推广了丁(Ding)、多尔西(d’Orsi)、纳赛尔(Nasser)和施特勒尔(Steurer)在2 - 社区块模型情况下关于高于KS阈值的高效算法的工作。我们的工作有三个主要部分:(1)图的贝特黑塞矩阵(Bethe Hessian)定义为$H_G(t) \triangleq (D_G - I)t^2 - A_Gt + I$,其中$D_G$是度的对角矩阵,$A_G$是邻接矩阵。实证研究表明,随机块模型的贝特黑塞矩阵在刚好高于凯斯滕 - 斯蒂格姆阈值处有对应于社区的异常特征向量。我们通过从社区向量中明确构造异常特征向量,正式确认了贝特黑塞矩阵异常特征值的存在。(2)我们针对稀疏矩阵上的一种鲁棒主成分分析(PCA)变体开发了一种算法。具体来说,是一种在温和的离域约束下,从受对抗性破坏的稀疏矩阵中部分恢复顶部特征空间的算法。(3)一种舍入算法,用于将顶点的向量分配转换为社区分配,其灵感来自于查里卡尔(Charikar)和维尔特(Wirth)针对2XOR的算法。