Implicit learning has shown significant promise, particularly for representing nearly discontinuous functions. For example, our recent work on ContactNets used implicit representations of geometry. Conversely, we've seen how unstructured, explicit approaches struggle to learn stiff or discontinuous functions. We set out to better understand why (and when) implicit representations are useful.

Most obviously, an implicit parameterization can better represent non-smoothness. Other authors have exploited this, for instance by embedding differentiable optimization into the learning process (Belbute-Peres et al., “End-to-end differentiable physics for learning and control”). However, representation is only part of the story: if the model is trained by minimizing prediction error on the output of the embedded problem, then any stiffness or discontinuity in the underlying function reappears in the loss landscape itself, producing large, poorly conditioned gradients.
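To make that concrete, here is a minimal sketch (a toy example of our own, not drawn from the paper): fit a steep explicit model to data from a step function and watch the prediction-error loss become just as stiff as the model itself.

```python
# Toy illustration (hypothetical example, not from the paper): when an explicit
# model must be stiff to fit the data, the prediction-error loss is stiff too.
import numpy as np

def loss(theta, k, x=0.0, y=1.0):
    # Squared prediction error of a steep explicit model tanh(k * (x - theta))
    # on a single sample (x, y) drawn from a step function.
    return (np.tanh(k * (x - theta)) - y) ** 2

thetas = np.linspace(-0.5, 0.5, 20001)
for k in (1.0, 10.0, 100.0):
    grads = np.gradient(loss(thetas, k), thetas)
    # The loss Lipschitz constant (max |dL/dtheta|) grows with the stiffness k,
    # so gradient-based training inherits the ill-conditioning.
    print(f"k = {k:6.1f}   max |dL/dtheta| ≈ {np.abs(grads).max():.1f}")
```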

Instead, we’ve investigated ContactNets-inspired implicit losses that balance optimality of the embedded problem against prediction error, and that perform better on near-discontinuous learning problems. In a new preprint, “Generalization Bounded Implicit Learning of Nearly Discontinuous Functions” by Bibit Bianchini et al., we show that this violation-implicit loss provably generalizes well to unseen data. The resulting loss landscape is also well-conditioned, with low Lipschitz constants despite the stiffness of the underlying function. Finally, we prove a connection between this loss and graph distance, a natural metric for evaluating steep or discontinuous functions.
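As a rough illustration of the idea (a toy construction in our own notation, not the exact loss from the preprint), the same step-function example can be written as an embedded optimization, and a violation-style loss that scores each data point by its optimality gap stays well-conditioned no matter how steep or discontinuous the represented function is.

```python
# Toy illustration (our own construction, not the loss from the preprint).
# The step z = sign(x - theta) can be written implicitly as
#     z = argmax_{z in [-1, 1]} z * (x - theta).
# A violation-implicit loss scores a data pair (x, y) by its optimality gap in
# that embedded problem, rather than by prediction error of the solved output.
import numpy as np

def violation_loss(theta, x, y):
    # Optimality gap of the observed output y for the embedded linear program:
    # max_z z * (x - theta) minus the objective value achieved by y.
    return np.abs(x - theta) - y * (x - theta)

thetas = np.linspace(-0.5, 0.5, 20001)
x, y = 0.0, 1.0  # one sample from the true step at theta = 0
grads = np.gradient(violation_loss(thetas, x, y), thetas)
# The loss is piecewise linear in theta with |dL/dtheta| <= 2, independent of
# how steep or discontinuous the represented function is.
print(f"max |dL/dtheta| ≈ {np.abs(grads).max():.1f}")
```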