Explicit Correlation Amplifiers for Finding Outlier Correlations in Deterministic Subquadratic Time
Algorithmica ( IF 1.1 ) Pub Date : 2020-06-20 , DOI: 10.1007/s00453-020-00727-1
Matti Karppa , Petteri Kaski , Jukka Kohonen , Padraig Ó Catháin

We derandomize Valiant’s (J ACM 62, Article 13, 2015) subquadratic-time algorithm for finding outlier correlations in binary data. This demonstrates that it is possible to perform a deterministic subquadratic-time similarity join of high dimensionality. Our derandomized algorithm gives deterministic subquadratic scaling essentially for the same parameter range as Valiant’s randomized algorithm, but the precise constants we save over quadratic scaling are more modest. Our main technical tool for derandomization is an explicit family of correlation amplifiers built via a family of zigzag-product expanders by Reingold et al. (Ann Math 155(1):157–187, 2002). We say that a function $$f:\{-1,1\}^d\rightarrow \{-1,1\}^D$$ is a correlation amplifier with threshold $$0\le \tau \le 1$$, error $$\gamma \ge 1$$, and strength $$p$$ an even positive integer if for all pairs of vectors $$x,y\in \{-1,1\}^d$$ it holds that (i) $$|\langle x,y\rangle |<\tau d$$ implies $$|\langle f(x),f(y)\rangle |\le (\tau \gamma )^pD$$; and (ii) $$|\langle x,y\rangle |\ge \tau d$$ implies $$\left(\frac{\langle x,y\rangle }{\gamma d}\right)^pD \le \langle f(x),f(y)\rangle \le \left(\frac{\gamma \langle x,y\rangle }{d}\right)^pD$$.
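To make the definition concrete, the sketch below checks conditions (i) and (ii) by brute force for the simplest possible amplifier: the full $$p$$-fold tensor power $$f(x)=x^{\otimes p}$$, which has $$D=d^p$$ and satisfies $$\langle f(x),f(y)\rangle =\langle x,y\rangle ^p$$ exactly, so it meets the definition with $$\gamma =1$$. This is an illustration only and is not the expander-based construction described in the abstract. The function names `tensor_power` and `is_amplifier_on`, the tolerance `tol`, and the parameter values in the demo are assumptions chosen for the example.

```python
import itertools
import numpy as np


def tensor_power(x, p):
    """Return the p-fold Kronecker (tensor) power of a +/-1 vector x.

    The output has length len(x) ** p and entries in {-1, 1}.
    """
    out = np.array([1], dtype=np.int64)
    for _ in range(p):
        out = np.kron(out, x)
    return out


def is_amplifier_on(f, vectors, d, D, tau, gamma, p, tol=1e-9):
    """Brute-force check of conditions (i) and (ii) on all pairs from `vectors`.

    `tol` is a small floating-point slack (an assumption of this sketch,
    not part of the definition).
    """
    for x, y in itertools.product(vectors, repeat=2):
        ip = int(np.dot(x, y))          # <x, y>
        fip = int(np.dot(f(x), f(y)))   # <f(x), f(y)>
        if abs(ip) < tau * d:
            # Condition (i): small correlations stay small after amplification.
            if abs(fip) > (tau * gamma) ** p * D + tol:
                return False
        else:
            # Condition (ii): large correlations are amplified to about
            # (<x, y> / d) ** p * D, up to the error factor gamma.
            lower = (ip / (gamma * d)) ** p * D
            upper = (gamma * ip / d) ** p * D
            if not (lower - tol <= fip <= upper + tol):
                return False
    return True


# Hypothetical demo parameters: d = 4, p = 2, so D = d ** p = 16.
d, p = 4, 2
all_vectors = [np.array(v) for v in itertools.product([-1, 1], repeat=d)]
ok = is_amplifier_on(lambda x: tensor_power(x, p), all_vectors,
                     d=d, D=d ** p, tau=0.5, gamma=1.0, p=p)
print(ok)  # prints True
```

The tensor power is exact but its output dimension $$d^p$$ grows quickly with $$p$$. An amplifier with error $$\gamma >1$$ trades some of that exactness for a looser two-sided bound in condition (ii).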

Updated: 2020-06-20