The term "measurable" refers to measurable functions in measure theory: functions for which the pre-image of every measurable set (that is, every set in the sigma-algebra of the codomain) belongs to the sigma-algebra of the domain (https://en.wikipedia.org/wiki/Measurable_function). I do not know how to state it in simpler terms, sorry. When the measure of the whole domain is 1 (as in a probability space), measurable functions are called random variables, hence their relevance to this topic.
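For what it's worth, on a finite set the definition can be checked by brute force. Here is a minimal sketch; the sets, the two sigma-algebras, and the names `preimage`, `is_measurable`, `f_good` and `f_bad` are all made up for illustration and come from nowhere in the paper.

```python
def preimage(f, domain, target_set):
    """Return the set of points in `domain` that f maps into `target_set`."""
    return frozenset(x for x in domain if f(x) in target_set)


def is_measurable(f, domain, sigma_domain, sigma_codomain):
    """f is measurable if every preimage of a codomain-measurable set
    is itself a member of the domain's sigma-algebra."""
    return all(preimage(f, domain, B) in sigma_domain for B in sigma_codomain)


# Hypothetical finite example: the domain sigma-algebra cannot tell 1 from 2,
# nor 3 from 4.
domain = frozenset({1, 2, 3, 4})
sigma_domain = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), domain}

# Codomain {0, 1} with its full power set as sigma-algebra.
sigma_codomain = {frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})}


def f_good(x):
    # Constant on {1, 2} and on {3, 4}, so preimages stay inside sigma_domain.
    return 0 if x <= 2 else 1


def f_bad(x):
    # Parity splits {1, 2}: the preimage of {0} is {2, 4}, not in sigma_domain.
    return x % 2


print(is_measurable(f_good, domain, sigma_domain, sigma_codomain))
print(is_measurable(f_bad, domain, sigma_domain, sigma_codomain))
```

The first function only uses information the domain's sigma-algebra can "see", so it passes; the second needs to distinguish 1 from 2, which that sigma-algebra cannot do, so it fails.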
Now, tempered distributions are functions that assign a complex number to each very rapidly decaying function (a Schwartz space function) and satisfy linearity properties. So a tempered distribution is a function that takes functions and maps them to complex numbers. https://secure.math.ubc.ca/~feldman/m321/distributions.pdf
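In symbols, the standard textbook definition reads roughly as follows. This is the usual formulation, not something quoted from the paper, and the notation ($\varphi$, $\psi$, $T$, $\delta_x$) is my own choice.

```latex
% Schwartz space: smooth functions whose derivatives all decay faster
% than any polynomial grows.
\mathcal{S}(\mathbb{R}^n) = \Bigl\{ \varphi \in C^\infty(\mathbb{R}^n) :
  \sup_{x \in \mathbb{R}^n} \bigl| x^\alpha \, \partial^\beta \varphi(x) \bigr| < \infty
  \ \text{for all multi-indices } \alpha, \beta \Bigr\}

% A tempered distribution is a continuous linear functional on that space.
T : \mathcal{S}(\mathbb{R}^n) \to \mathbb{C}, \qquad
T(a\varphi + b\psi) = a\,T(\varphi) + b\,T(\psi)
\quad \text{for all } a, b \in \mathbb{C},\ \varphi, \psi \in \mathcal{S}(\mathbb{R}^n)

% Example: the Dirac delta at a point x evaluates the test function there.
\delta_x(\varphi) = \varphi(x)
```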
Are they talking about the Borel sigma-algebra generated by the open sets of a topological space? What topology do they have in mind?
Are tempered distributions functions? How does one compose two tempered distributions? (Hint: you can't, and they never actually use tempered distributions.)
This is just mathematical masturbation.
Everything is finite when implemented on a computer, so there is no need for such dainty mathematical niceties unless you are trying to lend credence to pedestrian observations about Chebyshev's inequality.
If not further specified, the topology is the one induced by the metric or norm of the space under consideration.
Tempered distributions are used in Subsection 3.1, resulting from the observation that the Fourier transform of a shallow neural network involves a Dirac delta.
Some mathematical concepts are needed in order to present rigorous results. While one can argue about the necessity and relevance of these results for real-world applications, they at least explain various aspects of deep learning in restricted settings, leading to a better general understanding and intuition.
Nyet, and nyet. This is why conscientious authors define the terms they use. A tempered distribution is a linear functional on a space of differentiable functions, for example, D_x(f) = f'(x), the derivative of f at x. This is why tempered distributions cannot be composed.
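To make the D_x example concrete, here is a rough numerical sketch. The central-difference approximation, the step size `h`, and the names `derivative_at`, `gaussian` and `shifted` are my own assumptions for illustration; the point is only that the functional eats a function and returns a number, and that it is linear.

```python
import math


def derivative_at(x, h=1e-5):
    """Return the functional D_x: f -> f'(x), approximated numerically
    by a central difference with (assumed) step size h."""
    def D(f):
        return (f(x + h) - f(x - h)) / (2 * h)
    return D


def gaussian(t):
    # A rapidly decaying test function.
    return math.exp(-t * t)


def shifted(t):
    # Another rapidly decaying test function.
    return math.exp(-(t - 1.0) ** 2)


D = derivative_at(0.5)

# Linearity: D(a*f + b*g) agrees with a*D(f) + b*D(g) up to rounding.
a, b = 2.0, -3.0
lhs = D(lambda t: a * gaussian(t) + b * shifted(t))
rhs = a * D(gaussian) + b * D(shifted)
print(abs(lhs - rhs) < 1e-6)

# The output is a plain number, not a function, so there is nothing
# to feed into a second functional.
print(type(D(gaussian)).__name__)
```

Since `D(gaussian)` is a float rather than a test function, an expression like `D(D(gaussian))` has no meaning, which is the composition problem in miniature.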