# Setup

Which of the following equations are incorrect according to the specification?

# Notation

A neural network is a function $F(x) = y$ that accepts an input $x \in \mathbb{R}^n$ and produces an output $y \in \mathbb{R}^m$. The model $F$ also implicitly depends on some model parameters $\theta$; in our work the model is fixed, so for convenience we don't show the dependence on $\theta$.

In this paper we focus on neural networks used as an $m$-class classifier. The output of the network is computed using the softmax function, which ensures that the output vector $y$ satisfies $0 \le y_i \le 1$ and $y_1 + \dots + y_m = 1$. The output vector $y$ is thus treated as a probability distribution, i.e., $y_i$ is treated as the probability that input $x$ has class $i$. The classifier assigns the label $C(x) = \arg\max_i F(x)_i$ to the input $x$. Let $C^*(x)$ be the correct label of $x$. The inputs to the softmax function are called \emph{logits}.

We use the notation from Papernot et al. \cite{distillation}: define $F$ to be the full neural network including the softmax function, $Z(x) = z$ to be the output of all layers except the softmax (so $z$ are the logits), and
\begin{equation*}
F(x) = \softmax(Z(x)) = y.
\end{equation*}
A neural network typically \footnote{Most simple networks have this simple linear structure, however other more sophisticated networks have more complicated structures (e.g., ResNet \cite{he2016deep} and Inception \cite{szegedy2015rethinking}). The network architecture does not impact our attacks.} consists of layers
\begin{equation*}
F = \softmax \circ F_n \circ F_{n-1} \circ \cdots \circ F_1
\end{equation*}
where
\begin{equation*}
F_i(x) = \sigma(\theta_i \cdot x) + \hat\theta_i
\end{equation*}
for some non-linear activation function $\sigma$, some matrix $\theta_i$ of model weights, and some vector $\hat\theta_i$ of model biases. Together $\theta$ and $\hat\theta$ make up the model parameters. Common choices of $\sigma$ are tanh \cite{mishkin2015all}, sigmoid, ReLU \cite{maas2013rectifier}, or ELU \cite{clevert2015fast}. In this paper we focus primarily on networks that use a ReLU activation function, as it currently is the most widely used activation function \cite{szegedy2015rethinking,springenberg2014striving,mishkin2015all,distillation}.

# Specification

We define an objective function $f$ such that $C(x+\delta) = t$ if and only if $f(x+\delta) \le 0$. There are many possible choices for $f$:

# The equations

\begin{align*}
f_{1}(x') &= -\loss_{F,t}(x') + 1\\
f_{2}(x') &= (\max_{i \ne t}( F(x')_i) - F(x')_t)^+\\
f_{3}(x') &= \text{softplus}(\max_{i \ne t}(F(x')_i) - F(x')_t)-\log(2)\\
f_{4}(x') &= (0.5 - F(x')_t)^+\\
f_{5}(x') &= -\log(2 F(x')_t - 2)\\
f_{6}(x') &= (\max_{i \ne t}( Z(x')_i) - Z(x')_t)^+\\
f_{7}(x') &= \text{softplus}(\max_{i \ne t}(Z(x')_i) - Z(x')_t)-\log(2)
\end{align*}
where $s$ is the correct classification, $(e)^+$ is short-hand for $\max(e,0)$, $\text{softplus}(x) = \log(1+\exp(x))$, and $\loss_{F,s}(x)$ is the cross entropy loss for $x$.

# Your task

Consider each equation one by one. End your answer with a python list of numbers [1,2,3,4,5,6,7] for those that are wrong. Specifically, make sure that $C(x+\delta) = t$ if and only if $f(x+\delta) \le 0$. Think step by step about if this is true for each equation and then give your answer as a python list.
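To make the candidate objectives concrete, here is a minimal numpy sketch (added for illustration; not part of the original prompt or answer) that evaluates each $f_k$ from a logit vector and a target class. The use of the plain cross-entropy $\loss_{F,t}(x') = -\log F(x')_t$ in $f_1$, and the NaN guard for the undefined logarithm in $f_5$, are assumptions made only for this sketch.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def objectives(z, t):
    """Evaluate the seven candidate objectives for logits z and target class t.

    Assumes loss_{F,t}(x') = -log F(x')_t (an illustrative choice; the prompt
    does not pin the loss down further).
    """
    F = softmax(z)
    best_other_F = np.max(np.delete(F, t))   # max_{i != t} F(x')_i
    best_other_Z = np.max(np.delete(z, t))   # max_{i != t} Z(x')_i
    softplus = lambda v: np.log1p(np.exp(v))
    return {
        1: np.log(F[t]) + 1,                                 # -loss_{F,t}(x') + 1
        2: max(best_other_F - F[t], 0.0),
        3: softplus(best_other_F - F[t]) - np.log(2),
        4: max(0.5 - F[t], 0.0),
        5: -np.log(2 * F[t] - 2) if 2 * F[t] - 2 > 0 else float('nan'),  # argument is never positive
        6: max(best_other_Z - z[t], 0.0),
        7: softplus(best_other_Z - z[t]) - np.log(2),
    }

# Example: logits where class 0 is the unique argmax, so C(x') = t for t = 0.
print(objectives(np.array([2.0, 1.0, -0.5]), t=0))
```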
Okay, let's analyze each equation step by step based on the specification: $C(x') = t \iff f(x') \le 0$.

The condition $C(x') = t$ means that the $t$-th output of the softmax function is strictly the largest: $C(x') = t \iff F(x')_t > F(x')_i$ for all $i \ne t$. This is equivalent to $F(x')_t > \max_{i \ne t} F(x')_i$. Alternatively, using the logits $Z(x')$, since softmax is monotonic: $C(x') = t \iff Z(x')_t > Z(x')_i$ for all $i \ne t$. This is equivalent to $Z(x')_t > \max_{i \ne t} Z(x')_i$.

Let's check each $f_k(x')$:

1. **$f_{1}(x') = -\loss_{F,t}(x') + 1$**
   * $\loss_{F,t}(x') = -\log(F(x')_t)$.
   * $f_1(x') = -(-\log(F(x')_t)) + 1 = \log(F(x')_t) + 1$.
   * $f_1(x') \le 0 \iff \log(F(x')_t) \le -1 \iff F(x')_t \le e^{-1} \approx 0.367$.
   * The condition $F(x')_t \le e^{-1}$ is not equivalent to $C(x') = t$. For example, if $F(x') = (0.5, 0.25, 0.25)$ and $t=1$, then $C(x')=1$, but $F(x')_1 = 0.5 > e^{-1}$, so $f_1(x') > 0$. Conversely, if $F(x') = (0.3, 0.4, 0.3)$ and $t=1$, then $C(x') \ne 1$, but $F(x')_1 = 0.3 < e^{-1}$, so $f_1(x') < 0$.
   * Therefore, $f_1$ is **incorrect**.

2. **$f_{2}(x') = (\max_{i \ne t}( F(x')_i) - F(x')_t)^+$**
   * Let $e = \max_{i \ne t}( F(x')_i) - F(x')_t$. $f_2(x') = \max(e, 0)$.
   * $f_2(x') \le 0 \iff \max(e, 0) \le 0$. Since $\max(e,0) \ge 0$, this requires $\max(e,0) = 0$.
   * $\max(e, 0) = 0 \iff e \le 0$.
   * So, $f_2(x') \le 0 \iff \max_{i \ne t}( F(x')_i) - F(x')_t \le 0 \iff F(x')_t \ge \max_{i \ne t} F(x')_i$.
   * The condition for $C(x') = t$ is $F(x')_t > \max_{i \ne t} F(x')_i$.
   * The condition $f_2(x') \le 0$ includes the boundary case where $F(x')_t = \max_{i \ne t} F(x')_i$. In this case, the $\arg\max$ is not uniquely $t$, so $C(x') \ne t$ according to the definition.
   * Therefore, $f_2$ is **incorrect**.

3. **$f_{3}(x') = \text{softplus}(\max_{i \ne t}(F(x')_i) - F(x')_t)-\log(2)$**
   * Let $e = \max_{i \ne t}( F(x')_i) - F(x')_t$. $f_3(x') = \log(1+\exp(e)) - \log(2)$.
   * $f_3(x') \le 0 \iff \log(1+\exp(e)) \le \log(2)$.
   * Since $\log$ is monotonic, $1+\exp(e) \le 2 \iff \exp(e) \le 1$.
   * Since $\exp$ is monotonic, $e \le \log(1) \iff e \le 0$.
   * So, $f_3(x') \le 0 \iff \max_{i \ne t}( F(x')_i) - F(x')_t \le 0 \iff F(x')_t \ge \max_{i \ne t} F(x')_i$.
   * This is the same condition as for $f_2$. It includes the boundary case and is therefore not strictly equivalent to $C(x') = t$.
   * Therefore, $f_3$ is **incorrect**.

4. **$f_{4}(x') = (0.5 - F(x')_t)^+$**
   * Let $e = 0.5 - F(x')_t$. $f_4(x') = \max(e, 0)$.
   * $f_4(x') \le 0 \iff \max(e, 0) \le 0 \iff \max(e, 0) = 0 \iff e \le 0$.
   * So, $f_4(x') \le 0 \iff 0.5 - F(x')_t \le 0 \iff F(x')_t \ge 0.5$.
   * The condition $F(x')_t \ge 0.5$ is not equivalent to $C(x') = t$. If $F(x') = (0.4, 0.3, 0.3)$ and $t=1$, then $C(x')=1$, but $F(x')_1 = 0.4 < 0.5$, so $f_4(x') > 0$. If $F(x')=(0.5, 0.5, 0)$ and $t=1$, then $F(x')_1 = 0.5$, so $f_4(x')=0$, but $C(x') \ne 1$ (not unique max).
   * Therefore, $f_4$ is **incorrect**.

5. **$f_{5}(x') = -\log(2 F(x')_t - 2)$**
   * The argument of the logarithm is $2 F(x')_t - 2 = 2(F(x')_t - 1)$.
   * Since $0 \le F(x')_t \le 1$, the term $F(x')_t - 1$ is always $\le 0$.
   * The logarithm is only defined for positive arguments. $2(F(x')_t - 1) > 0$ implies $F(x')_t > 1$, which is impossible.
   * The function $f_5$ is ill-defined for any valid output $F(x')$.
   * Therefore, $f_5$ is **incorrect**.

6. **$f_{6}(x') = (\max_{i \ne t}( Z(x')_i) - Z(x')_t)^+$**
   * Let $e = \max_{i \ne t}( Z(x')_i) - Z(x')_t$. $f_6(x') = \max(e, 0)$.
   * $f_6(x') \le 0 \iff \max(e, 0) \le 0 \iff \max(e, 0) = 0 \iff e \le 0$.
   * So, $f_6(x') \le 0 \iff \max_{i \ne t}( Z(x')_i) - Z(x')_t \le 0 \iff Z(x')_t \ge \max_{i \ne t} Z(x')_i$.
   * The condition for $C(x') = t$ is $Z(x')_t > \max_{i \ne t} Z(x')_i$.
   * Similar to $f_2$, the condition $f_6(x') \le 0$ includes the boundary case where $Z(x')_t = \max_{i \ne t} Z(x')_i$, where $C(x')$ is not uniquely $t$.
   * Therefore, $f_6$ is **incorrect**.

7. **$f_{7}(x') = \text{softplus}(\max_{i \ne t}(Z(x')_i) - Z(x')_t)-\log(2)$**
   * Let $e = \max_{i \ne t}( Z(x')_i) - Z(x')_t$. $f_7(x') = \log(1+\exp(e)) - \log(2)$.
   * $f_7(x') \le 0 \iff \log(1+\exp(e)) \le \log(2) \iff 1+\exp(e) \le 2 \iff \exp(e) \le 1 \iff e \le 0$.
   * So, $f_7(x') \le 0 \iff \max_{i \ne t}( Z(x')_i) - Z(x')_t \le 0 \iff Z(x')_t \ge \max_{i \ne t} Z(x')_i$.
   * This is the same condition as for $f_6$. It includes the boundary case and is therefore not strictly equivalent to $C(x') = t$.
   * Therefore, $f_7$ is **incorrect**.

All the proposed equations fail to meet the strict "if and only if" condition. Functions $f_2, f_3, f_6, f_7$ are close but include the boundary case where the target class score equals the maximum of other scores, which violates the strict inequality required for $C(x')=t$. Functions $f_1, f_4$ impose conditions unrelated or only partially related to $C(x')=t$. Function $f_5$ is ill-defined.
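As a quick numerical sanity check of the counterexamples used above (an illustration added here, not part of the model's original answer, and again assuming $\loss_{F,t}(x') = -\log F(x')_t$):

```python
import numpy as np

# Counterexample for f_1: class 1 (index 0) is the unique argmax, yet f_1 > 0.
F = np.array([0.5, 0.25, 0.25]); t = 0
f1 = np.log(F[t]) + 1                              # = -loss_{F,t}(x') + 1 with loss = -log F_t
print(f1 > 0)                                      # True: C(x') = t but f_1(x') > 0

# Boundary case for f_2 (and analogously f_3, f_6, f_7): a two-way tie.
F = np.array([0.5, 0.5, 0.0]); t = 0
f2 = max(np.max(np.delete(F, t)) - F[t], 0.0)
print(f2 <= 0)                                     # True: f_2 <= 0 although the argmax is not uniquely t
```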
| 1 | [1, 2, 3, 4, 5, 6, 7] |
Resulting in output: False