Mathematical Background
What is a polynomial-activation MLP, and what polynomials can it represent?
The Network Forward Pass
A polynomial-activation MLP with inputs $(x,y,z) \in \mathbb{R}^3$ sets $h^{(0)} = (x,y,z)$ and computes, for each hidden layer $l = 1,\dots,L$ and neuron $j$:
$$h^{(l)}_j = \Bigl(\sum_i W^{(l)}_{ji}\,h^{(l-1)}_i + b^{(l)}_j\Bigr)^{d_l}$$
where $d_l \in \mathbb{N}$ is the monomial activation degree of layer $l$. The final output is a linear readout:
$$f_\theta = \sum_j W^{\mathrm{out}}_j\,h^{(L)}_j + b^{\mathrm{out}}$$
Because each activation is a monomial $\sigma(t)=t^{d_l}$, the output $f_\theta \in \mathbb{R}[x,y,z]$ is a polynomial of total degree bounded by the degree product:
$$D = \prod_{l=1}^{L} d_l$$
The Realization Family $\mathcal{F}_A$
For a fixed architecture $A$, define the realization map:
$$\Phi_A : \theta \in \mathbb{R}^P \;\longrightarrow\; f_\theta \in \mathbb{R}[x,y,z]_{\leq D}$$
The realization family is the image $\mathcal{F}_A = \mathrm{Im}(\Phi_A)$, a semi-algebraic subset of the vector space of degree-$\leq D$ polynomials. We ask three questions:
- Family question: What is $\mathcal{F}_A \subseteq \mathbb{R}[x,y,z]_{\leq D}$?
- Exact representability: Is $p \in \mathcal{F}_A$?
- Training recovery: Does gradient descent find $\theta^*$ with $f_{\theta^*} = p$?
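The realization map $\Phi_A$ can be sketched symbolically with SymPy: substitute concrete weights, expand, and read off $f_\theta$ as a polynomial in $(x,y,z)$. The function name `realize`, its argument layout, and the one-hidden-layer example architecture are assumptions made for illustration, not the implementation used in the sweep.

```python
import math
import sympy as sp

x, y, z = sp.symbols("x y z")

def realize(weights, biases, degrees, w_out, b_out):
    """Return f_theta as a sympy.Poly in (x, y, z).

    Hypothetical layout: weights[l] is a list of rows (one per neuron),
    biases[l] a list of scalars, degrees[l] the activation degree d_{l+1}.
    """
    h = [x, y, z]  # h^{(0)}: the raw inputs
    for W, b, d in zip(weights, biases, degrees):
        # Affine pre-activation followed by the monomial activation t -> t**d.
        h = [
            (sum(w_ji * h_i for w_ji, h_i in zip(row, h)) + b_j) ** d
            for row, b_j in zip(W, b)
        ]
    # Linear readout.
    out = sum(w_j * h_j for w_j, h_j in zip(w_out, h)) + b_out
    return sp.Poly(sp.expand(out), x, y, z)

# Assumed example: one hidden layer, two neurons, degree 2.
degrees = [2]
f = realize(
    weights=[[[1, 1, 0], [1, -1, 0]]],
    biases=[[0, 0]],
    degrees=degrees,
    w_out=[sp.Rational(1, 4), sp.Rational(-1, 4)],
    b_out=0,
)
D = math.prod(degrees)
print(f.as_expr(), "| total degree", f.total_degree(), "<= D =", D)
```

Here the two hidden neurons compute $(x+y)^2$ and $(x-y)^2$, so the readout gives $xy$, a point of $\mathcal{F}_A$ whose total degree respects the bound $D = 2$.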
Target Polynomials
We study three targets $p : \mathbb{R}^3 \to \mathbb{R}$:
- $p_3(x,y,z) = z^3 - xy$ (degree 3)
- $p_4(x,y,z) = z^4 - xy$ (degree 4)
- $p_5(x,y,z) = z^5 - xy$ (degree 5, prime)
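The $z^4-xy$ case admits an explicit construction with two layers of degree-2 activations. The term $z^4 = (z^2)^2$ is obtained by squaring twice, and the double-squaring identity $(a+b)^2-(a-b)^2=4ab$ is applied twice: one way to read this is once with $(a,b)=(x,y)$ to form $xy$ in the first layer, and once with $(a,b)=(xy,1)$ to carry $xy$ through the second layer unchanged.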
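A minimal SymPy check of one such construction follows. The specific choice of three neurons per layer and the bias of $\pm 1$ in the second layer are assumptions; the source states only that the identity is applied twice through two degree-2 layers.

```python
import sympy as sp

x, y, z = sp.symbols("x y z")
quarter = sp.Rational(1, 4)

# Layer 1 (degree 2): three neurons, no biases.
h1 = [z**2, (x + y) ** 2, (x - y) ** 2]

# Linear combination of layer-1 outputs: u = xy, by the identity with (a, b) = (x, y).
u = quarter * (h1[1] - h1[2])

# Layer 2 (degree 2): square z^2, and wrap u with biases +1 and -1
# so that the identity with (a, b) = (u, 1) recovers u after squaring.
h2 = [h1[0] ** 2, (u + 1) ** 2, (u - 1) ** 2]

# Linear readout: z^4 - u.
f = h2[0] - quarter * (h2[1] - h2[2])

residual = sp.expand(f - (z**4 - x * y))
print(residual)
```

A zero residual means every coefficient of $f_\theta$ matches $p_4$, which is the symbolic coefficient matching that exact representability requires.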
Certificate Status — Mathematical Definitions
There exists $\theta^* \in \mathbb{R}^P$ such that $f_{\theta^*}(x,y,z) \equiv p(x,y,z)$ as polynomials (all coefficients match exactly). Proven by explicit algebraic construction or symbolic coefficient matching via SymPy.
It is mathematically impossible for this architecture to represent $p$. Proven by one of:
- Degree obstruction: $D = \prod_l d_l < \deg p$, so no polynomial in $\mathcal{F}_A$ can have degree $\deg p$
- Variable obstruction: a required variable cannot reach the output
- Monomial obstruction: no neuron path can produce the required monomial
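The degree obstruction is the simplest of these tests to state in code. The sketch below assumes an architecture is summarized by its list of layer degrees; the helper name `degree_obstruction` and the sample degree lists are illustrative only.

```python
import math

def degree_obstruction(degrees, target_degree):
    """True when D = prod(degrees) is strictly below the target degree."""
    return math.prod(degrees) < target_degree

# Degrees of the three targets p_3, p_4, p_5 from the text.
targets = {"p3": 3, "p4": 4, "p5": 5}

# Assumed sample architectures, given as per-layer activation degrees.
for degrees in ([2], [2, 2], [5], [2, 3]):
    blocked = [name for name, deg in targets.items()
               if degree_obstruction(degrees, deg)]
    print(degrees, "D =", math.prod(degrees), "blocked:", blocked)
```

Passing this check is only a necessary condition: an architecture with $D \geq \deg p$ may still fail to represent $p$ for other reasons.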
The architecture passes all necessary conditions ($D \geq \deg p$, variables reachable, etc.), but neither a constructive proof of $p \in \mathcal{F}_A$ nor an obstruction was found. The true answer is either YES or NO, but it is currently unknown.
Gradient descent found weights $\hat\theta$ achieving near-zero MSE. This is strong practical evidence for $p \in \mathcal{F}_A$ but not a mathematical proof: the solution may be approximate, or the landscape may have been favourable by chance.
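The gap between near-zero MSE and exact equality can be illustrated without running an optimiser. The sketch below hard-codes a two-layer degree-2 network for $z^4 - xy$ and perturbs one weight by `eps`, standing in for the residual error of a numerical fit. The network layout, the sampling box $[-1,1]^3$, and the sample count are all assumptions.

```python
import random

def net(x, y, z, eps=0.0):
    """Two degree-2 layers computing z^4 - xy exactly when eps == 0.

    eps perturbs a single first-layer weight (hypothetical stand-in for
    the small errors a numerical optimiser leaves behind).
    """
    a = ((1.0 + eps) * z) ** 2
    b = (x + y) ** 2
    c = (x - y) ** 2
    u = 0.25 * (b - c)  # equals x*y
    return a ** 2 - 0.25 * ((u + 1.0) ** 2 - (u - 1.0) ** 2)

def mse(eps, n=2000, seed=0):
    """Mean squared error against z^4 - xy on n uniform samples in [-1, 1]^3."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x, y, z = (rng.uniform(-1.0, 1.0) for _ in range(3))
        total += (net(x, y, z, eps) - (z**4 - x * y)) ** 2
    return total / n

print("exact weights:    ", mse(0.0))
print("perturbed weights:", mse(1e-4))
```

With `eps = 1e-4` the MSE is tiny, yet the $z^4$ coefficient of the perturbed network is $(1+\varepsilon)^4 \neq 1$, so the polynomial identity fails. This is why a low training loss counts as evidence rather than as a certificate.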
Sweep Overview
Statistics across 7230 architectures for each target polynomial.
Configuration Explorer
Filter and sort all swept architectures.
Custom Training
Define your own architecture and train it on demand against any target polynomial.