More corrections made

This commit is contained in:
Shakil Rafi 2024-03-19 11:24:24 -05:00
parent 86910c81e8
commit 3d09719471
16 changed files with 149 additions and 88 deletions

BIN
.DS_Store vendored

Binary file not shown.

View File

@ -88,7 +88,7 @@ We will present here some standard invariants of Brownian motions. The proofs ar
\begin{definition}[Of $\mathfrak{k}$]\label{def:1.17}
\begin{definition}[Of $\mathfrak{k}$, the modified Kahane\textendash Kintchine constant]\label{def:1.17}
Let $p \in [2,\infty)$. We denote by $\mathfrak{k}_p \in \R$ the real number given by $\mathfrak{k}_p:=\inf \{ c\in \R \}$ where it holds that for every probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and every random variable $\mathcal{X}: \Omega \rightarrow \R$ with $\E[|\mathcal{X}|] < \infty$ that $\lp \E \lb \lv \mathcal{X} - \E \lb \mathcal{X} \rb \rv^p \rb \rp ^{\frac{1}{p}} \leqslant c \lp \E \lb \lv \mathcal{X} \rv^p \rb \rp ^{\frac{1}{p}}.$
\end{definition}
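For instance, in the case $p=2$ the defining inequality already holds with $c=1$: for every such random variable $\mathcal{X}$ with $\E \lb \lv \mathcal{X}\rv^2\rb < \infty$ it is the case that:
\begin{align}
\E \lb \lv \mathcal{X} - \E \lb \mathcal{X}\rb \rv^2\rb = \E \lb \lv \mathcal{X}\rv^2 \rb - \lp \E \lb \mathcal{X}\rb \rp^2 \les \E \lb \lv \mathcal{X}\rv^2\rb
\end{align}
and the inequality is trivial when $\E \lb \lv \mathcal{X}\rv^2\rb = \infty$, whence $\mathfrak{k}_2 \les 1$.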

View File

@ -10,11 +10,11 @@ Our goal in this dissertation is threefold:
\begin{enumerate}[label = (\roman*)]
\item Firstly, we will take the Multi-Level Picard method, first introduced in \cite{e_multilevel_2019} and \cite{e_multilevel_2021}, and in particular the version of Multi-Level Picard that appears in \cite{hutzenthaler_strong_2021}. We show that dropping the drift term and substantially simplifying the process still results in convergence of the method, polynomial bounds on the number of computations required, and rather nice properties for the approximations, such as integrability and measurability.
\item We will then go on to show that a modified version of the heat equation has a solution that can be represented via a stochastic differential equation by Feynman-Kac, and further that a version of this can be realized by the modified Multi-Level Picard technique mentioned in Item (i), with certain simplifying assumptions since we dropped the drift term. A substantial amount of this is inspired by \cite{bhj20} and much earlier work in \cite{karatzas1991brownian} and \cite{da_prato_zabczyk_2002}.
\item By far, the most significant part of this dissertation is dedicated to expanding and building upon a framework of neural networks as appears in \cite{grohs2019spacetime}. We modify this definition highly and introduce several new neural network architectures to this framework ($\pwr, \pnm, \tun,\etr, \xpn, \csn, \sne, \mathsf{E},\mathsf{UE}, \mathsf{UEX}$, and $\mathsf{UEX}$, among others) and show, for all these neural networks, that the parameter count grows only polynomially as the accuracy of our model increases, thus beating the curse of dimensionality. This finally paves the way for giving neural network approximations to the techniques realized in Item (ii). We show that it is not too wasteful (defined on the polynomiality of parameter counts) to use neural networks to approximate MLP to approximate a stochastic differential equation equivalent to certain parabolic PDEs as Feynman-Kac necessitates.
\item By far, the most significant part of this dissertation is dedicated to expanding and building upon a framework of neural networks as appears in \cite{grohs2019spacetime}. We modify this definition substantially and introduce several new neural network architectures to this framework ($\pwr_n^{q,\ve}$, $\pnm_C^{q,\ve}$, $\tun^d_n$, $\etr^{N,h}$, $\xpn_n^{q,\ve}$, $\csn_n^{q,\ve}$, $\sne_n^{q,\ve}$, $\mathsf{E}^{N,h,q,\ve}_n$, $\mathsf{UE}^{N,h,q,\ve}_{n,\mathsf{G}_d}$, $\mathsf{UEX}^{N,h,q,\ve}_{n,\mathsf{G}_d,\omega_i}$, and $\mathsf{UES}^{N,h,q,\ve}_{n,\mathsf{G}_d,\Omega,\fn}$, among others) and show, for all these neural networks, that the parameter count grows only polynomially as the accuracy of our model increases, thus beating the curse of dimensionality. This finally paves the way for giving neural network approximations to the techniques realized in Item (ii). We show that it is not too wasteful (as measured by the polynomiality of parameter counts) to use neural networks to approximate MLP approximations of stochastic differential equations equivalent to certain parabolic PDEs, as Feynman-Kac necessitates.
\\~\\
We end this dissertation by proposing two avenues of further research: analytical and algebraic. This framework of understanding neural networks as ordered tuples of ordered pairs may be extended to give neural network approximations of classical PDE approximation techniques such as Runge-Kutta, Adams-Bashforth, and Adams-Moulton. We also propose three conjectures about neural networks, as defined in \cite{grohs2019spacetime}, among them that they form a bimodule and that realization is a functor.
\end{enumerate}
This dissertation is broken down into three parts. At the end of each part, we will encounter tent-pole theorems, which will eventually lead to the final neural network approximation outcome. These tentpole theorems are Theorem \ref{tentpole_1}, Theorem \ref{thm:3.21}, and Theorem \ref{ues}. Finally, the culmination of these three theorems is Corollary \ref{cor_ues}, the end product of the dissertation. We hope, you the reader will enjoy this.
This dissertation is broken down into three parts. At the end of each part, we will encounter tent-pole theorems, which will eventually lead to the final neural network approximation outcome. These tent-pole theorems are Theorem \ref{tentpole_1}, Theorem \ref{thm:3.21}, and Theorem \ref{ues}. Finally, the culmination of these three theorems is Corollary \ref{cor_ues}, the end product of the dissertation. We hope you, the reader, will enjoy this.
\section{Notation, Definitions, \& Basic Notions}
We introduce here basic notation that we will be using throughout this dissertation. Large parts are taken from, or inspired by, standard references such as \textit{Matrix Computations} by Golub \& Van Loan \cite{golub2013matrix}, \textit{Probability: Theory \& Examples} by Rick Durrett \cite{durrett2019probability}, and \textit{Concrete Mathematics} by Graham, Knuth \& Patashnik \cite{graham_concrete_1994}.

View File

@ -1,4 +1,11 @@
\chapter{ANN first approximations}
We will give here a few ANN representations of common functions. Specifically, we will posit the existence of a $1$-dimensional identity network and show that it acts as a compositional identity for neural networks with fixed end-widths. Thus, under composition, neural networks with fixed end-widths form a monoid with $\id_d$ as the compositional identity.
We will also posit two new neural networks, $\trp^h$ and $\etr^{n,h}$, for approximating trapezoidal rule integration.
We will then go on to posit the $\nrm_1^d$ and $\mxm^d$ networks, taken mainly from \cite[Chapter~3]{bigbook}; our contribution will be to add parameter estimates.
We will finally go on to show the $\mathsf{MC}^{N,d}_{x,y}$ neural network, which will perform the maximum convolution approximation for functions $f:\R^d \rightarrow \R$. Our contribution will be to show that the parameter counts are polynomial in the dimension $d$.
\section{ANN Representations for One-Dimensional Identity}
\begin{definition}[One Dimensional Identity Neural Network]\label{7.2.1}
@ -711,7 +718,7 @@ This completes the proof.
%\end{proof}
\section{Maximum Convolution Approximations for Multi-Dimensional Functions}
We will present here an approximation scheme for continuous functions called maximum convolution approximation. This derives mainly from Chapter 4 of \cite{bigbook}; our contribution is to show parameter bounds and convergence in the case of $1$-D approximation.
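Before constructing the networks themselves, a minimal numerical sketch may help fix ideas. Assuming the approximant takes the usual maximum-convolution form $x \mapsto \max_{0\les i\les N} \lp f(x_i) - L \lv x - x_i \rv \rp$ for an $L$-Lipschitz $f$, a sketch (in Python, with purely illustrative function and variable names) is:
\begin{verbatim}
import numpy as np

def max_conv(f, samples, L):
    """1-D maximum-convolution approximant x -> max_i ( f(x_i) - L*|x - x_i| )."""
    values = f(samples)
    return lambda x: np.max(values - L * np.abs(x - samples))

f = lambda x: np.abs(x - 0.3)      # a 1-Lipschitz test function on [0,1]
xs = np.linspace(0.0, 1.0, 11)     # sample points x_0 <= x_1 <= ... <= x_N
fhat = max_conv(f, xs, L=1.0)
grid = np.linspace(0.0, 1.0, 1001)
print(max(abs(f(t) - fhat(t)) for t in grid))
# the error is at most 2*L*(max distance to the nearest sample) = 0.1 here
\end{verbatim}
The $\mathsf{MC}^{N,d}_{x,y}$ networks of this section realize expressions of precisely this shape, presumably with $\nrm^d_1$ supplying the distances and $\mxm^d$ the maximum.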
\subsection{The $\nrm^d_1$ Networks}
\subsection{The $\nrm^d_1$ Neural Networks}
\begin{definition}[The $\nrm_1^d$ neural network]
We denote by $\lp \nrm_1^d \rp _{d\in \N} \subseteq \neu$ the family of neural networks that satisfy:
\begin{enumerate}[label = (\roman*)]
@ -1030,7 +1037,7 @@ Given $x\in \R$, it is straightforward to find the maximum; $ x$ is the maximum.
Item (vi) is a straightforward consequence of Item (i). This completes the proof of the lemma.
\end{proof}
\subsection{The $\mathsf{MC}^{N,d}_{x,y}$ Neural Network and Approximations via Maximum Convolutions }
\subsection{The $\mathsf{MC}^{N,d}_{x,y}$ Neural Networks}
Let $f: [a,b] \rightarrow \R$ be a continuous bounded function with Lipschitz constant $L$. Let $x_0 \les x_1 \les \cdots \les x_N$ be a set of sample points within $[a,b]$, with it possibly being the case that for all $i \in \{0,1,\hdots, N\}$, $x_i \sim \unif([a,b])$. Define functions $f_0,f_1,\hdots, f_N: [a,b] \rightarrow \R$ by requiring, for all $i \in \{0,1,\hdots, N\}$, that:
\begin{align}

View File

@ -1,4 +1,5 @@
\chapter{ANN Product Approximations}
\chapter{ANN Product Approximations and Their Consequences}\label{chp:ann_prod}
\section{Approximation for Products of Two Real Numbers}
We will build up the tools necessary to approximate $e^x$ via neural networks in the framework described in the previous sections. While much of the foundation comes from, e.g., \cite{grohs2019spacetime}, we will, along the way, encounter neural networks not seen in the literature, such as the $\tay$, $\pwr$, and $\tun$ networks, and finally a neural network approximant for $e^x$. For each of these neural networks, we will be concerned with at least the following:
\begin{enumerate}[label = (\roman*)]
@ -8,7 +9,7 @@ We will build up the tools necessary to approximate $e^x$ via neural networks in
\item The accuracy of our neural networks.
\end{enumerate}
\subsection{The squares of real numbers in $\lb 0,1 \rb$}
One of the most important operators we can
One of the most important operators we will approximate is the product operator $\times$ for two real numbers. The following section takes a streamlined version of the proof given in \cite[Section~3.1]{grohs2019spacetime}. In particular, we will assert the existence of the neural networks $\Phi$ and $\phi_d$ and work our way towards their properties.
\begin{definition}[The $\mathfrak{i}_d$ Network]\label{def:mathfrak_i}
For all $d \in \N$ we will define the following set of neural networks, the ``activation neural networks'', denoted $\mathfrak{i}_d$, as:
\begin{align}
@ -48,18 +49,44 @@ One of the most important operators we can
\end{align}
Let $\Phi_k \in \neu$, $k\in \N$, satisfy that $\Phi_1 = \lp \aff_{C_1,0} \bullet \mathfrak{i}_4 \rp \bullet \aff_{\mymathbb{e}_4,B}$, that for all $d \in \N$, $\mathfrak{i}_d = \lp \lp \mathbb{I}_d, \mymathbb{0}_d \rp, \lp \mathbb{I}_d, \mymathbb{0}_d \rp \rp$, and that for all $k \in [2,\infty) \cap \N$ it is the case that:
\begin{align}
\Phi_k =\lp \aff_{C_k,0}\bullet \mathfrak{i}_4 \rp \bullet \lp \aff_{A_{k-1},B} \bullet \mathfrak{i}_4\rp \bullet \cdots \bullet \lp \aff_{A_1,B} \bullet \mathfrak{i}_4 \rp \bullet \aff_{\mymathbb{e}_4,B}
\Phi_k =\lp \aff_{C_k,0}\bullet \mathfrak{i}_4 \rp \bullet \lp \aff_{A_{k-1},B} \bullet \mathfrak{i}_4\rp \bullet \cdots \bullet \lp \aff_{A_1,B} \bullet \mathfrak{i}_4 \rp \bullet \aff_{\mymathbb{e}_4,B} ,
\end{align}
It is then the case that:
\begin{enumerate}[label = (\roman*)]
\item for all $k \in \N$ we have $\real_{\rect}\lp \Phi_k\rp \in C \lp \R, \R \rp $
\item for all $k \in \N$ we have $\lay \lp \Phi_k \rp = \lp 1,4,4,...,4,1 \rp \in \N^{k+2}$
\item for all $k \in \N$, $x \in \R \setminus \lb 0,1 \rb $ that $\lp \real_{\rect} \lp \Phi_k \rp \rp \lp x \rp = \rect \lp x \rp$
\item for all $k \in \N$, $x \in \lb 0,1 \rb$, we have $\left| x^2 - \lp \real_{\rect} \lp \xi_k \rp \rp \lp x \rp \right| \les 2^{-2k-2}$, and
\item for all $k \in \N$, $x \in \lb 0,1 \rb$, we have $\left| x^2 - \lp \real_{\rect} \lp \Phi_k \rp \rp \lp x \rp \right| \les 2^{-2k-2}$, and
\item for all $k \in \N$, we have that $\param \lp \Phi_k \rp = 20k-7$
\end{enumerate}
\end{lemma}
\begin{proof}
Firstly note that Lemma \ref{aff_prop}, Lemma \ref{comp_prop}, and Lemma \ref{lem:mathfrak_i}
ensure that for all $k \in \N$ it is the case that $\real_{\rect}\lp \Phi_k\rp \in C\lp \R, \R\rp$. This proves Item (i).
Note next that Lemma \ref{aff_prop}, Lemma \ref{lem:mathfrak_i}, and Lemma \ref{comp_prop} tell us that:
\begin{align}
\lay \lp \Phi_1 \rp = \lay \lp \lp \aff_{C_1,0}\bullet \mathfrak{i}_4 \rp \bullet \aff_{\mymathbb{e}_4,B}\rp = \lp 1,4,1\rp
\end{align}
and for all $k \in \N$ it is the case that:
\begin{align}
\lay \lp \aff_{A_k,B} \bullet \mathfrak{i}_4\rp = \lp 4,4,4,4\rp
\end{align}
Whence, for $\Phi_k$ with $k \in \N \cap \lb 2,\infty \rp$, Lemma \ref{comp_prop} then tells us that:
\begin{align}
\lay \lp \Phi_k\rp &= \lay \lp \lp \aff_{C_k,0}\bullet \mathfrak{i}_4 \rp \bullet \lp \aff_{A_{k-1},B} \bullet \mathfrak{i}_4\rp \bullet \cdots \bullet \lp \aff_{A_1,B} \bullet \mathfrak{i}_4 \rp \bullet \aff_{\mymathbb{e}_4,B} \rp \nonumber\\
&= (1,\overbrace{4) \: \overbrace{( 4}^{merged},4,4,\overbrace{4) \:( 4}^{merged},4,4,\overbrace{4)\: }^{merged}\hdots \overbrace{\: ( 4}^{merged},4,4,\overbrace{4) \:}^{merged} (4}^{k-1 \text{ many}},1)
\end{align}
This finally yields that:
\begin{align}
\lay \lp \Phi_k\rp = \lp 1,4,4,\hdots, 4,1\rp \in \N^{k+2}
\end{align}
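This establishes Item (ii). As a quick sanity check on the parameter count claimed in the last item of the lemma, note that for $k=2$ the architecture $\lp 1,4,4,1\rp$ gives, counting weights and biases layer by layer:
\begin{align}
\param \lp \Phi_2\rp = 4\lp 1+1\rp + 4\lp 4+1 \rp + 1 \lp 4+1\rp = 33 = 20 \cdot 2 - 7
\end{align}
consistent with $\param \lp \Phi_k \rp = 20k-7$.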
Let $g_k: \R \rightarrow \lb 0,1 \rb$, $k \in \N$ be the functions satisfying for all $k \in \N$, $x \in \R$ that:
\begin{align}\label{(6.0.3)}
g_1 \lp x \rp &= \begin{cases}
@ -69,7 +96,7 @@ One of the most important operators we can
\end{cases} \\
g_{k+1} &= g_1(g_{k}) \nonumber
\end{align}
and let $f_k: \lb 0,1 \rb \rightarrow \lb 0,1 \rb$, $k \in \N_0$ be the functions satisfying for all $k \in \N_0$, $n \in \{0,1,...,2^k-1\}$, $x \in \lb \frac{n}{2^k}, \frac{n+1}{2^k} \rp$ that $f_k(1)=1$ and:
and let $f_k: \lb 0,1 \rb \rightarrow \lb 0,1 \rb$, $k \in \N_0$ be the functions satisfying for all $k \in \N_0$, $n \in \{0,1,\hdots,2^k-1\}$, $x \in \lb \frac{n}{2^k}, \frac{n+1}{2^k} \rp$ that $f_k(1)=1$ and:
\begin{align}\label{(6.0.4.2)}
f_k(x) = \lb \frac{2n+1}{2^k} \rb x-\frac{n^2+n}{2^{2k}}
\end{align}
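For instance, for $k=1$ this gives $f_1 \lp x \rp = \frac{1}{2}x$ on $\lb 0,\frac{1}{2}\rp$ and $f_1\lp x\rp = \frac{3}{2}x - \frac{1}{2}$ on $\lb \frac{1}{2},1\rp$, i.e.\ the piecewise linear interpolant of $x^2$ at the nodes $0,\frac{1}{2},1$, and a direct computation yields:
\begin{align}
\sup_{x\in \lb 0,1\rb} \left| x^2 - f_1\lp x \rp\right| = \frac{1}{16} = 2^{-2\cdot 1 - 2}
\end{align}
in keeping with the error bound asserted in the lemma.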
@ -80,7 +107,7 @@ One of the most important operators we can
\end{bmatrix}= \rect \lp \begin{bmatrix}
x \\ x-\frac{1}{2} \\ x-1 \\ x
\end{bmatrix} \rp \\
r_{k+1} &= A_{k+1}r_k(x) \nonumber
r_{k+1} &= \rect \lp A_{k+1}r_k(x) +B \rp \nonumber
\end{align}
Note that, since for all $x \in \R$ it is the case that $\rect(x) = \max\{x,0\}$, (\ref{(6.0.3)}) and (\ref{(6.0.5)}) show that it holds for all $x \in \R$ that:
\begin{align}\label{6.0.6}
@ -106,7 +133,7 @@ One of the most important operators we can
\max\{x,0\} & : x \in \R \setminus \lb 0,1\rb
\end{cases} \rp
\end{align}
We prove (\ref{6.0.8}) and (\ref{6.0.9}) by induction. The base base of $k=1$ is proved by (\ref{6.0.6}) and (\ref{6.0.7}). For the induction step $\N \ni k \rightarrow k+1$ assume there does exist a $k \in \N$ such that for all $x \in \R$ it is the case that:
We prove (\ref{6.0.8}) and (\ref{6.0.9}) by induction. The base case of $k=1$ is proved by (\ref{6.0.6}) and (\ref{6.0.7}) respectively. For the induction step $\N \ni k \rightarrow k+1$, assume that there exists a $k \in \N$ such that for all $x \in \R$ it is the case that:
\begin{align}
2r_{1,k}(x) - 4r_{2,k}(x) + 2r_{3,k}(x) = g_k(x)
\end{align}
@ -117,7 +144,7 @@ One of the most important operators we can
\max\{x,0\} &: x \in \R \setminus \lb 0,1 \rb
\end{cases}
\end{align}
Note that then (\ref{(6.0.3)}),(\ref{(6.0.5)}), and (\ref{6.0.6}) then tells us that for all $x \in \R$ it is the case that:
Note that (\ref{(6.0.3)}), (\ref{(6.0.5)}), and (\ref{6.0.6}) then tell us that for all $x \in \R$ it is the case that:
\begin{align}\label{6.0.12}
g_{k+1}\lp x \rp &= g_1(g_k(x)) = g_1(2r_{1,k}(x)-4r_{2,k}(x) + 2r_{3,k}(x)) \nonumber \\
&= 2\rect \lp 2r_{1,k}(x) - 4r_{2,k}(x) +2r_{3,k}(x) \rp \nonumber \\
@ -160,7 +187,7 @@ One of the most important operators we can
\end{align}
This then implies for all $k\in \N$, $x \in \lb 0,1\rb$ that it holds that:
\begin{align}
\left\| x^2-\lp \real_{\rect} \lp \Phi_k \rp \rp \lp x \rp \right\| \les 2^{-2k-2}
\left| x^2-\lp \real_{\rect} \lp \Phi_k \rp \rp \lp x \rp \right| \les 2^{-2k-2}
\end{align}
This, in turn, establishes Item (iv).
@ -250,9 +277,9 @@ One of the most important operators we can
\end{remark}
Now that we have neural networks that perform the squaring operation inside $\lb 0,1\rb$, we may extend to all of $\R$. Note that this neural network representation differs somewhat from the ones in \cite{grohs2019spacetime}.
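Heuristically, the extension works by rescaling: with $\alpha$ and $\Psi$ as in the lemma below, for $\lv x \rv \les \alpha^{-1}$ one branch of the sum receives $\alpha x \in \lb 0,1\rb$ while the other is annihilated by $\rect$, so that:
\begin{align}
\lp \real_{\rect} \lp \Psi \rp \rp \lp x \rp \approx \frac{1}{\alpha^2}\lp \alpha x\rp^2 = x^2
\end{align}
at the cost of inflating the error of the inner squaring network by the factor $\alpha^{-2}$.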
\subsection{The $\sqr^{q,\ve}$ network}
\subsection{The $\sqr^{q,\ve}$ Neural Networks and Squares of Real Numbers}
\begin{lemma}\label{6.0.3}\label{lem:sqr_network}
Let $\delta,\epsilon \in (0,\infty)$, $\alpha \in (0,\infty)$, $q\in (2,\infty)$, $ \Phi \in \neu$ satisfy that $\delta = 2^{\frac{-2}{q-2}}\ve ^{\frac{q}{q-2}}$, $\alpha = \lp \frac{\ve}{2}\rp^{\frac{1}{q-2}}$, $\real{\rect}\lp\Phi\rp \in C\lp \R,\R\rp$, $\dep(\Phi) \les \max \left\{\frac{1}{2} \log_2(\delta^{-1})+1,2\right\}$, $\param(\Phi) \les \max\left\{10\log_2\lp \delta^{-1}\rp-7,13\right\}$, $\sup_{x \in \R \setminus [0,1]} | \lp \real_{\rect} \lp \Phi \rp -\rect(x) \right| =0$, and $\sup_{x\in \lb 0,1\rb} |x^2-\lp \real_{\rect} \lp \Phi \rp \rp \lp x\rp | \les \delta$, let $\Psi \in \neu$ be the neural network given by:
Let $\delta,\ve \in (0,\infty)$, $\alpha \in (0,\infty)$, $q\in (2,\infty)$, $\Phi \in \neu$ satisfy that $\delta = 2^{\frac{-2}{q-2}}\ve ^{\frac{q}{q-2}}$, $\alpha = \lp \frac{\ve}{2}\rp^{\frac{1}{q-2}}$, $\real_{\rect}\lp\Phi\rp \in C\lp \R,\R\rp$, $\dep(\Phi) \les \max \left\{\frac{1}{2} \log_2(\delta^{-1})+1,2\right\}$, $\param(\Phi) \les \max\left\{10\log_2\lp \delta^{-1}\rp \right.\\\left. -7,13\right\}$, $\sup_{x \in \R \setminus [0,1]} \left| \lp \real_{\rect} \lp \Phi \rp \rp \lp x \rp -\rect(x) \right| =0$, and $\sup_{x\in \lb 0,1\rb} |x^2-\lp \real_{\rect} \lp \Phi \rp \rp \lp x\rp | \les \delta$, let $\Psi \in \neu$ be the neural network given by:
\begin{align}
\Psi = \lp \aff_{\alpha^{-2},0} \bullet \Phi \bullet \aff_{\alpha,0} \rp \bigoplus\lp \aff_{\alpha^{-2},0} \bullet \Phi \bullet \aff_{-\alpha,0}\rp
\end{align}
@ -273,7 +300,7 @@ Now that we have neural networks that perform the squaring operation inside $\lb
&= \frac{1}{\alpha^2}\lp \real_{\rect}\lp \Phi \rp \rp \lp \alpha x\rp + \frac{1}{\alpha^2}\lp \real_{\rect} \lp \Phi \rp \rp \lp -\alpha x\rp \nonumber\\
&= \frac{1}{\lp \frac{\ve}{2}\rp^{\frac{2}{q-2}}}\lb \lp \real_{\rect}\lp \Phi \rp \rp \lp \lp \frac{\ve}{2}\rp ^{\frac{1}{q-2}}x \rp + \lp \real_{\rect}\lp \Phi \rp \rp \lp -\lp \frac{\ve}{2}\rp^{\frac{1}{q-2}}x\rp \rb
\end{align}
This and the assumption that $\Phi \in C\lp \R, \R \rp$ along with the assumption that $\sup_{x\in \R \setminus \lb 0,1\rb } | \lp \real_{\rect} \lp \Phi \rp \rp \lp x \rp -\rect\lp x\rp | =0$ tells us that for all $x\in \R$ it holds that:
This and the assumption that $\Phi \in C\lp \R, \R \rp$ along with the assumption that $\sup_{x\in \R \setminus \lb 0,1\rb } \left| \lp \real_{\rect} \lp \Phi \rp \rp \right.\\ \left.\lp x \rp -\rect\lp x\rp \right| =0$ tells us that for all $x\in \R$ it holds that:
\begin{align}
\lp \real_{\rect}\lp \Psi \rp \rp \lp 0 \rp &= \lp \frac{\ve}{2}\rp^{\frac{-2}{q-2}}\lb \lp \real_{\rect}\lp \Phi \rp \rp \lp 0 \rp +\lp \real_{\rect} \lp \Phi\rp \rp \lp 0 \rp \rb \nonumber \\
&=\lp \frac{\ve}{2}\rp ^{\frac{-2}{q-2}} \lb \rect (0)+\rect(0) \rb \nonumber \\
@ -401,7 +428,7 @@ Forward Difference & 0.01 & 1.6012 & 9.8655 & 44.9141 & 40.7102 & 1230
\end{table}
\subsection{The $\prd^{q,\ve}$ network}
\subsection{The $\prd^{q,\ve}$ Neural Networks and Products of Two Real Numbers}
We are finally ready to give neural network representations of arbitrary products of real numbers. However, this representation differs somewhat from those found in the literature, especially \cite{grohs2019spacetime}, where parallelization (stacking) is used instead of neural network sums. This will help us calculate $\wid_1$ and the width of the second to last layer.
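The guiding algebraic fact is the polarization identity:
\begin{align}
xy = \frac{1}{2} \lb \lp x + y \rp^2 - x^2 - y^2 \rb
\end{align}
and, given the matrices $A_1 = \lb 1 \quad 1\rb$, $A_2 = \lb 1 \quad 0 \rb$, and $A_3 = \lb 0 \quad 1\rb$ in the lemma below, this is presumably how three copies of the squaring network $\Psi$ are combined; the accuracy and parameter bounds that follow are best read with this identity in mind.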
\begin{lemma}\label{prd_network}
Let $\delta,\ve \in \lp 0,\infty \rp $, $q\in \lp 2,\infty \rp$, $A_1,A_2,A_3 \in \R^{1\times 2}$, $\Psi \in \neu$ satisfy for all $x\in \R$ that $\delta = \ve \lp 2^{q-1} +1\rp^{-1}$, $A_1 = \lb 1 \quad 1 \rb$, $A_2 = \lb 1 \quad 0 \rb$, $A_3 = \lb 0 \quad 1 \rb$, $\real_{\rect}\lp \Psi \rp \in C\lp \R, \R \rp$, $\lp \real_{\rect} \lp \Psi \rp \rp \lp 0\rp = 0$, $0\les \lp \real_{\rect} \lp \Psi \rp \rp \lp x \rp \les \delta+|x|^2$, $|x^2-\lp \real_{\rect}\lp \Psi \rp \rp \lp x \rp |\les \delta \max \{1,|x|^q\}$, $\dep\lp \Psi \rp \les \max\{ 1+\frac{1}{q-2}+\frac{q}{2(q-2)}\log_2 \lp \delta^{-1} \rp ,2\}$, and $\param \lp \Psi \rp \les \max\left\{\lb \frac{40q}{q-2} \rb \log_2\lp \delta^{-1} \rp +\frac{80}{q-2}-28,52\right\}$, then:
@ -1339,13 +1366,13 @@ Let $\mathfrak{p}_i$ for $i \in \{1,2,...\}$ be the set of functions defined for
% Text Node
\draw (525,162.4) node [anchor=north west][inner sep=0.75pt] {$\mathsf{Cpy}_{n+1,1}$};
% Text Node
\draw (471.33,198.4) node [anchor=north west][inner sep=0.75pt] {$\vdots $};
\draw (471.33,198.4) node [anchor=north west][inner sep=0.75pt] {$\vdots$};
% Text Node
\draw (83,163.73) node [anchor=north west][inner sep=0.75pt] {$\mathsf{Sum}_{n+1,1}$};
% Text Node
\draw (230.67,214.4) node [anchor=north west][inner sep=0.75pt] {$\vdots $};
\draw (230.67,217.7) node [anchor=north west][inner sep=0.75pt] {$\vdots$};
% Text Node
\draw (172,193.73) node [anchor=north west][inner sep=0.75pt] {$\vdots $};
\draw (172,198.4) node [anchor=north west][inner sep=0.75pt] {$\vdots$};
\end{tikzpicture}
@ -1453,7 +1480,7 @@ Let $\mathfrak{p}_i$ for $i \in \{1,2,...\}$ be the set of functions defined for
\end{align}
This completes the proof of the Lemma.
\end{proof}
\subsection{$\xpn_n^{q,\ve}$, $\csn_n^{q,\ve}$, $\sne_n^{q,\ve}$, and Artificial Neural Network Approximations of $e^x$, $\cos(x)$, and $\sin(x)$.}
\subsection{$\xpn_n^{q,\ve}$, $\csn_n^{q,\ve}$, $\sne_n^{q,\ve}$, and ANN Approximations of $e^x$, $\cos(x)$, and $\sin(x)$.}
Once we have neural network polynomials, we may take the next leap to transcendental functions. For approximating them we will use Taylor expansions, which will swiftly give us the desired approximations. Here, we will explore neural network approximations for three common transcendental functions: $e^x$, $\cos(x)$, and $\sin(x)$.
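For instance, for the exponential function the Lagrange form of the Taylor remainder gives, for every $n \in \N$ and $x \in \lb -1,1 \rb$:
\begin{align}
\left| e^x - \sum^n_{k=0} \frac{x^k}{k!}\right| \les \frac{e}{\lp n+1 \rp!}
\end{align}
so that, e.g., $n=7$ already guarantees an accuracy of roughly $6.7\times 10^{-5}$ before any neural network error is incurred; the networks below then only need to realize the corresponding Taylor polynomials.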
\begin{lemma}
@ -1698,7 +1725,9 @@ Once we have neural network polynomials, we may take the next leap to transcende
\end{proof}
\begin{remark}\label{rem:pyth_idt}
Note that under these neural network architectures the famous Pythagorean identity $\sin^2\lp x\rp + \cos^2 \lp x\rp = 1$, may be rendered approximately, for fixed $n,q,\ve$ as:\\ $\lb \sqr^{q,\ve}\bullet \csn^{q,\ve}_n \rb \oplus\lb \sqr^{q,\ve}\bullet \sne^{q,\ve}_n\rb$. A full discussion of the associated parameter, depth, and accuracy bounds are beyond the scope of this dissertation, and may be appropriate for future work.
Note that under these neural network architectures the famous Pythagorean identity $\sin^2\lp x\rp + \cos^2 \lp x\rp = 1$ may be rendered approximately, for appropriately fixed $n,q,\ve$, as $\lb \sqr^{q,\ve}\bullet \csn^{q,\ve}_n \rb \oplus\lb \sqr^{q,\ve}\bullet \sne^{q,\ve}_n\rb \approx 1$. On a similar note, it is the case, with appropriate $n,q,\ve$, that $\real_{\rect}\lp \xpn^{q,\ve}_n \triangleleft \:i \rp\lp \pi \rp \approx -1$.
A full discussion of the associated parameter, depth, and accuracy bounds are beyond the scope of this dissertation, and may be appropriate for future work.
\end{remark}

View File

@ -79,7 +79,7 @@ Items (ii)--(iii) together shows that for all $\theta \in \Theta$, $t \in [0,T]$
\end{align}
This proves Item (v) and hence the whole lemma.
\end{proof}
\section{The $\mathsf{E}^{N,h,q,\ve}_n$ Neural Network}
\section{The $\mathsf{E}^{N,h,q,\ve}_n$ Neural Networks}
\begin{lemma}[R\textemdash, 2023]\label{mathsfE}
Let $n, N\in \N$ and $h \in \lp 0,\infty\rp$. Let $\delta,\ve \in \lp 0,\infty \rp $, $q\in \lp 2,\infty \rp$, satisfy that $\delta = \ve \lp 2^{q-1} +1\rp^{-1}$. Let $a\in \lp -\infty,\infty \rp$, $b \in \lb a, \infty \rp$. Let $f:[a,b] \rightarrow \R$ be continuous and have second derivatives almost everywhere in $\lb a,b \rb$. Let $a=x_0 \les x_1\les \cdots \les x_{N-1} \les x_N=b$ be such that $h = \frac{b-a}{N}$ and such that for all $i \in \{0,1,\hdots,N\}$ it is the case that $x_i = x_0+i\cdot h$. Let $x = \lb x_0 \: x_1\: \cdots \: x_N \rb$ and accordingly let $f\lp\lb x \rb_{*,*} \rp = \lb f(x_0) \: f(x_1)\: \cdots \: f(x_N) \rb$. Let $\mathsf{E}^{N,h,q,\ve}_{n} \in \neu$ be the neural network given by:
\begin{align}
@ -340,7 +340,7 @@ This proves Item (v) and hence the whole lemma.
\end{center}
\caption{Diagram of $\mathsf{E}^{N,h,q,\ve}_n$.}
\end{figure}
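For reference, the quadrature being emulated on the uniform grid $x_0, x_1,\hdots, x_N$ above is the composite trapezoidal rule:
\begin{align}
\int^b_a f \lp x \rp dx \approx h \lb \frac{1}{2}f\lp x_0\rp + f\lp x_1 \rp + \cdots + f \lp x_{N-1}\rp + \frac{1}{2} f\lp x_N \rp \rb
\end{align}
whose error is of order $h^2$ when $f$ has a bounded second derivative; the $\mathsf{E}^{N,h,q,\ve}_n$ network may then be read, heuristically, as the $\xpn^{q,\ve}_n$ approximant applied to this sum.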
\section{The $\mathsf{UE}^{N,h,q,\ve}_{n,\mathsf{G}_d}$ Neural Network}
\section{The $\mathsf{UE}^{N,h,q,\ve}_{n,\mathsf{G}_d}$ Neural Networks}
\begin{lemma}[R\textemdash,2023]\label{UE-prop}
Let $n, N\in \N$ and $h \in \lp 0,\infty\rp$. Let $\delta,\ve \in \lp 0,\infty \rp $, $q\in \lp 2,\infty \rp$, satisfy that $\delta = \ve \lp 2^{q-1} +1\rp^{-1}$. Let $a\in \lp -\infty,\infty \rp$, $b \in \lb a, \infty \rp$. Let $f:[a,b] \rightarrow \R$ be continuous and have second derivatives almost everywhere in $\lb a,b \rb$. Let $a=x_0 \les x_1\les \cdots \les x_{N-1} \les x_N=b$ be such that $h = \frac{b-a}{N}$ and such that for all $i \in \{0,1,\hdots,N\}$ it is the case that $x_i = x_0+i\cdot h$. Let $x = \lb x_0 \: x_1\: \cdots \: x_N \rb$ and accordingly let $f\lp\lb x \rb_{*,*} \rp = \lb f(x_0) \: f(x_1)\: \cdots \: f(x_N) \rb$. Let $\mathsf{E}^{\exp}_{n,h,q,\ve} \in \neu$ be the neural network given by:
@ -503,7 +503,7 @@ Let $n, N,h\in \N$. Let $\delta,\ve \in \lp 0,\infty \rp $, $q\in \lp 2,\infty \
&= 3\ve +2\ve \left| \mathfrak{u}_d\lp x\rp\right|^q+2\ve \left| \exp \lp \int^b_afdx\rp\right|^q + \ve \left| \exp \lp \int^b_afdx\rp - \mathfrak{e}\right|^q -\mathfrak{e}\mathfrak{u}_d\lp x \rp \nonumber
\end{align}
\end{proof}
\section{The $\mathsf{UEX}^{N,h,q,\ve}_{n,\mathsf{G}_d,\omega_i}$ network}
\section{The $\mathsf{UEX}^{N,h,q,\ve}_{n,\mathsf{G}_d,\omega_i}$ Neural Networks}
\begin{lemma}[R\textemdash,2023]\label{UEX}
Let $n, N\in \N$ and $h \in \lp 0,\infty\rp$. Let $\delta,\ve \in \lp 0,\infty \rp $, $q\in \lp 2,\infty \rp$, satisfy that $\delta = \ve \lp 2^{q-1} +1\rp^{-1}$. Let $a\in \lp -\infty,\infty \rp$, $b \in \lb a, \infty \rp$. Let $f:[a,b] \rightarrow \R$ be continuous and have second derivatives almost everywhere in $\lb a,b \rb$. Let $a=x_0 \les x_1\les \cdots \les x_{N-1} \les x_N=b$ be such that $h = \frac{b-a}{N}$ and such that for all $i \in \{0,1,\hdots,N\}$ it is the case that $x_i = x_0+i\cdot h$. Let $x = \lb x_0 \: x_1\: \cdots \: x_N \rb$ and accordingly let $f\lp\lb x \rb_{*,*} \rp = \lb f(x_0) \: f(x_1)\: \cdots \: f(x_N) \rb$. Let $\mathsf{E}^{\exp}_{n,h,q,\ve} \in \neu$ be the neural network given by:
@ -695,7 +695,7 @@ Note that for a fixed $T \in \lp 0,\infty \rp$ it is the case that $u_d\lp t,x \
\end{tikzpicture}
\end{center}
\end{remark}
\section{The $\mathsf{UES}^{N,h,q,\ve}_{n,\mathsf{G}_d,\Omega,\fn}$ network}
\section{The $\mathsf{UES}^{N,h,q,\ve}_{n,\mathsf{G}_d,\Omega,\fn}$ Neural Networks}
\begin{lemma}\label{lem:sm_sum}
Let $\nu_1,\nu_2,\hdots, \nu_n \in \neu$ such that for all $i \in \{1,2,\hdots, n\}$ it is the case that $\out\lp \nu_i\rp = 1$, and it is also the case that $\dep \lp \nu_1 \rp = \dep \lp \nu_2 \rp = \cdots =\dep \lp \nu_n\rp$. Let $x_1 \in \R^{\inn\lp \nu_1\rp},x_2 \in \R^{\inn\lp \nu_2\rp},\hdots, x_n \in \R^{\inn\lp \nu_n\rp}$ and $\fx \in \R^{\sum_{i=1}^n \inn \lp \nu_i\rp}$. It is then the case that:
@ -836,7 +836,7 @@ Let $t \in \lp 0,\infty\rp$ and $T \in \lp t,\infty\rp$. Let $\lp \Omega, \mathc
&\les \param \lp \sm_{\mathfrak{n},1}\bullet\lb \boxminus_{i=1}^{\mathfrak{n}} \mathsf{UEX}^{N,h,q,\ve}_{n,\mathsf{G}_d,\omega_i}\rb\rp \nonumber\\
&\les \param \lp \boxminus_{i=1}^{\mathfrak{n}} \mathsf{UEX}^{N,h,q,\ve}_{n,\mathsf{G}_d,\omega_i} \rp \nonumber\\
&\les \mathfrak{n}^2\cdot \param \lp \mathsf{UEX}^{N,h,q,\ve}_{n,\mathsf{G}_d,\omega_i}\rp \nonumber \\
&\les \fn^2 \cdot \lb \frac{360q}{q-2} \lb \log_2 \lp \ve^{-1} \rp +q+1 \rb +324+ 48n\right. \nonumber\\ &\left. +24 \wid_{\hid\lp \mathsf{G}_d\rp}\lp \mathsf{G}_d\rp + 4\max \left\{\param \lp \mathsf{E}^{N,h,q,\ve}_{n}\rp, \param \lp \mathsf{G}_d\rp \right\} \rb
&\les \fn^2 \cdot \lb \frac{360q}{q-2} \lb \log_2 \lp \ve^{-1} \rp +q+1 \rb +324+ 48n\right. \nonumber\\ &\left. + 24 \wid_{\hid\lp \mathsf{G}_d\rp}\lp \mathsf{G}_d\rp + 4\max \left\{\param \lp \mathsf{E}^{N,h,q,\ve}_{n}\rp, \param \lp \mathsf{G}_d\rp \right\} \rb
\end{align}
Observe that the absolute homogeneity condition for norms, the fact that the Brownian motions are independent of each other, Lemma \ref{lem:sm_sum}, the fact that $\mathfrak{n}\in \N$, the fact that the upper limit of error remains bounded by the same bound for all $\omega_i \in \Omega$, and Lemma \ref{sum_of_errors_of_stacking} then yield:
\begin{align}
@ -844,7 +844,7 @@ Let $t \in \lp 0,\infty\rp$ and $T \in \lp t,\infty\rp$. Let $\lp \Omega, \mathc
&=\left| \frac{1}{\mathfrak{n}}\lb \sum^{\mathfrak{n}}_{i=1}\lb \exp \lp \int_t^T f\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp ds \cdot u_d^T\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp\rp\rb \rb - \real_{\rect}\lb \frac{1}{\mathfrak{n}} \triangleright\lp \sm_{\mathfrak{n},1}\bullet\lb \boxminus_{i=1}^{\mathfrak{n}} \mathsf{UEX}^{N,h,q,\ve}_{n,\mathsf{G}_d,\omega_i}\rb\rp\rb\right| \nonumber \\
&\les \left|\frac{1}{\mathfrak{n}}\lb \sum^{\mathfrak{n}}_{i=1} \exp \lp \int_t^T f\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp ds \cdot u_d^T\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp\rp \rb - \frac{1}{\mathfrak{n}}\lb \sum^{\mathfrak{n}}_{i=1}\lp \real_{\rect}\lb \mathsf{UEX}^{N,h,q,\ve}_{n,\mathsf{G}_d,\omega_i}\rb\rp \rb \right| \nonumber \\
&\les \cancel{\frac{1}{\mathfrak{n}} \sum^{\mathfrak{n}}_{i=1}}\left| \exp \lp \int^T_tf\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp ds \cdot u^T_d\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp\rp - \real_{\rect}\lp \mathsf{UEX}^{N,h,q,\ve}_{n,\mathsf{G}_d,\omega_i} \rp \right| \nonumber\\
&\les \left| \exp \lp \int^T_tf\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp ds \cdot u^T_d\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp\rp - \real_{\rect}\lp \mathsf{UEX}^{N,h,q,\ve}_{n,\mathsf{G}_d,\omega_i}\rp \right| \nonumber \\
&= \left| \exp \lp \int^T_tf\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp ds \cdot u^T_d\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp\rp - \real_{\rect}\lp \mathsf{UEX}^{N,h,q,\ve}_{n,\mathsf{G}_d,\omega_i}\rp \right| \nonumber \\
&\les 3\ve +2\ve \left| \mathfrak{u}_d^T\lp t,x\rp\right|^q+2\ve \left| \exp \lp \int^b_afdx\rp\right|^q + \ve \left| \exp \lp \int^b_afdx\rp - \mathfrak{e}\right|^q -\mathfrak{e}\mathfrak{u}_d^T\lp x \rp\nonumber
\end{align}
\end{proof}
@ -864,9 +864,10 @@ Let $t \in \lp 0,\infty\rp$ and $T \in \lp t,\infty\rp$. Let $\lp \Omega, \mathc
% &\les \left| \exp \lp \int^T_tf\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp ds \cdot u^T_d\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp\rp - \real_{\rect}\lp \mathsf{UEX}^{N,h,q,\ve}_{n,\mathsf{G}_d,\omega_i}\rp \right| \nonumber
% \end{align}
\begin{corollary}\label{cor_ues}
Let $N,n,\fn \in \N$, $h,\ve \in \lp 0,\infty\rp$, $q\in\lp 2,\infty\rp$, given $\mathsf{UES}^{N,h,q,\ve}_{n,\mathsf{G}_d, \Omega, \fn} \subsetneq \neu $, it is the case that:
Let $N,n,\fn \in \N$, $h,\ve \in \lp 0,\infty\rp$, $q\in\lp 2,\infty\rp$, given $\mathsf{UES}^{N,h,q,\ve}_{n,\mathsf{G}_d, \Omega, \fn} \subsetneq \neu $, it is then the case that:
\begin{align}
\E\left| \E \lb \exp\lp \int^T_t \alpha_d \circ \cX^{d,t,x}_{r,\Omega } ds\rp \cdot \fu_d^T\lp \cX^{d,t,x}_{r,\Omega}\rp\rb -\frac{1}{\mathfrak{n}}\lb \sum^{\mathfrak{n}}_{i=1}\lb \exp \lp \int_t^T \alpha_d \circ \mathcal{X}^{d,t,x}_{r,\omega_i} ds \rp \cdot \fu_d^T\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp\rb \rb \right|\nonumber
&\lp \E\lb \left| \E \lb \exp\lp \int^T_t \alpha_d \circ \cX^{d,t,x}_{r,\Omega } ds\rp \cdot \fu_d^T\lp \cX^{d,t,x}_{r,\Omega}\rp\rb \right.\right.\right.\nonumber\\ &\left. \left.\left.-\frac{1}{\mathfrak{n}}\lp \sum^{\mathfrak{n}}_{i=1}\lb \exp \lp \int_t^T \alpha_d \circ \mathcal{X}^{d,t,x}_{r,\omega_i} ds \rp \cdot \fu_d^T\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp\rb \rp \right|^2\rb\rp^{\frac{1}{2}} \nonumber \\
&\les \frac{\fk_p }{\mathfrak{n}^{\frac{1}{2}}} \cdot \fL \lp T+1\rp \exp \lp LT\rp \lb \sup_{s\in \lb 0,T\rb} \lp \E \lb \lp 1+\left\| x + \cW_s\right\|^p\rp^2\rb\rp^{\frac{1}{2}}\rb
\end{align}
\end{corollary}
@ -875,24 +876,36 @@ Let $t \in \lp 0,\infty\rp$ and $T \in \lp t,\infty\rp$. Let $\lp \Omega, \mathc
\begin{proof}
Note that $\E \lb \cX^{d,t,x}_{r,\Omega}\rb < \infty$, and $\fu^T$ being bounded yields that $\E \lb \fu^T\lp \cX^{d,t,x}_{r,\Omega}\rp\rb < \infty$, and also that $\E \lb \alpha_d \circ \cX^{d,t,x}_{r,\Omega}\rb < \infty$. Thus we also see that $\E \lb \int^T_t\alpha_d\circ \cX^{d,t,x}_{r,\Omega} ds\rb < \infty$, and thus $\E \lb \exp \lp \int^T_t \alpha_d \circ \cX^{d,t,x}_{r,\Omega}ds\rp\rb < \infty$. These two facts together, along with the fact that the two factors are independent, then assert that $\E \lb \exp \lp \int^T_t \alpha_d\circ \cX^{d,t,x}_{r,\Omega}\rp \cdot \fu^T\lp \cX^{d,t,x}_{r,\Omega}\rp\rb < \infty$.
Note that \cite[Corollary~3.8]{hutzenthaler_strong_2021} tells us that:
\begin{align}
&\E\left| \E \lb \exp\lp \int^T_t \alpha_d \circ \cX^{d,t,x}_{r,\Omega } ds\rp \cdot \fu_d^T\lp \cX^{d,t,x}_{r,\Omega}\rp\rb -\frac{1}{\mathfrak{n}}\lb \sum^{\mathfrak{n}}_{i=1}\lb \exp \lp \int_t^T \alpha_d \circ \mathcal{X}^{d,t,x}_{r,\omega_i}\rp ds \cdot \fu_d^T\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp\rb \rb \right|\nonumber \\
&\les \frac{\fK_p \sqrt{p-1}}{n^{\frac{1}{2}}} \lp \E \lb \left| \exp\lp \int^T_t \alpha_d \circ \cX^{d,t,x}_{r,\Omega } ds\rp \cdot \fu_d^T\lp\cX^{d,t,x}_{r,\Omega}\rp \right|\rb \rp
\begin{align}\label{kk_application}
&\lp \E\lb \left| \E \lb \exp\lp \int^T_t \alpha_d \circ \cX^{d,t,x}_{r,\Omega } ds\rp \cdot \fu_d^T\lp \cX^{d,t,x}_{r,\Omega}\rp\rb \right.\right.\right.\nonumber\\ &\left. \left.\left.-\frac{1}{\mathfrak{n}}\lp \sum^{\mathfrak{n}}_{i=1}\lb \exp \lp \int_t^T \alpha_d \circ \mathcal{X}^{d,t,x}_{r,\omega_i} ds \rp \cdot \fu_d^T\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp\rb \rp \right|^2\rb\rp^{\frac{1}{2}} \nonumber \\
&\les \frac{\fk_p }{\mathfrak{n}^{\frac{1}{2}}} \lp \E \lb \left| \exp\lp \int^T_t \alpha_d \circ \cX^{d,t,x}_{r,\Omega } ds\rp \cdot \fu_d^T\lp\cX^{d,t,x}_{r,\Omega}\rp \right|^2\rb \rp^{\frac{1}{2}}
\end{align}
For the purposes of this proof let it be the case that $\fF: [0,T] \rightarrow \R$ is the function represented for all $\ft \in \lb 0,T \rb$ as:
For the purposes of this proof let $\ff: [0,T] \rightarrow \R$ be the function defined for all $t \in \lb 0,T \rb$ as:
\begin{align}
\ff\lp t\rp = \int^T_{T-t} \alpha_d\circ \cX^{d,t,x}_{r,\Omega} ds
\end{align}
In which case we have that $\fF\lp 0\rp = 0$, and thus we may define $u\lp t,x\rp$ as the function given by:
In that case we have that $\ff\lp 0\rp = 0$, and thus, stipulating $g\lp x\rp = \fu^T\lp \cX^{d,t,x}_{r,\Omega}\rp$, we may define $u\lp t,x\rp$ as the function given by:
\begin{align}
u\lp t,x\rp &= \exp \lp \ff\lp t\rp\rp \cdot \fu^T\lp \cX^{d,t,x}_{r,\Omega}\rp \nonumber\\
&= \lb \exp\lp \fF\lp 0\rp\rp + \int_0^s \ff'\lp s\rp\cdot \exp \lp \ff\lp s\rp\rp ds\rb \cdot \fu^T\lp \cX^{d,t,x}_{r,\Omega}\rp\nonumber \\
&=\fu^T\lp \cX^{d,t,x}_{r,\Omega}\rp + \int_0^s \ff'\lp s\rp \cdot \exp\lp \ff\lp s\rp\rp \cdot \fu^T\lp \cX^{d,t,x}_{r,\Omega}\rp ds \nonumber\\
&= \fu^T\lp \cX^{d,t,x}_{r,\Omega}\rp + \int^s_0 \ff'\lp s\rp\cdot u\lp s,\cX^{d,t,x}_{r,\Omega}\rp ds \nonumber \\
&=\fu^T\lp \cX^{d,t,x}\rp + \int^s_0 \fF \lp s,u\lp s,x + \cW^d_r\rp\rp
u\lp t,x\rp &= \exp \lp \ff\lp t\rp\rp \cdot g\lp x\rp \nonumber\\
&= \lb \exp\lp \ff\lp 0\rp\rp + \int_0^t \ff'\lp s\rp\cdot \exp \lp \ff\lp s\rp\rp ds\rb \cdot g\lp x\rp\nonumber \\
&= g\lp x\rp + \int_0^t \ff'\lp s\rp \cdot \exp\lp \ff\lp s\rp\rp \cdot g\lp x\rp ds \nonumber\\
&= g\lp x\rp + \int^t_0 \ff'\lp s\rp\cdot u\lp s,x \rp ds \nonumber \\
&= g\lp x\rp+ \int^t_0 \fF \lp s,x, u\lp s,x \rp\rp ds
\end{align}
Then \cite[Lemma~2.3]{hutzenthaler_strong_2021} with $u \curvearrowleft u$,
Then \cite[Corollary~2.5]{hutzenthaler_strong_2021}, applied with $f \curvearrowleft \fF$, $u \curvearrowleft u$, $x+ \cW_{s-t} \curvearrowleft \cX^{d,t,x}_{r,\Omega}$, and $q \curvearrowleft 2$, tells us that:
\begin{align}
& \lp \E \lb \left| \exp\lp \int^T_t \alpha_d \circ \cX^{d,t,x}_{r,\Omega } ds\rp \cdot \fu_d^T\lp\cX^{d,t,x}_{r,\Omega}\rp \right|^2\rb \rp^{\frac{1}{2}} \nonumber\\
&\les \fL \lp T+1\rp \exp \lp LT\rp \lb \sup_{s\in \lb 0,T\rb} \lp \E \lb \lp 1+\left\| x + \cW_s\right\|^p\rp^2\rb\rp^{\frac{1}{2}}\rb
\end{align}
Together with (\ref{kk_application}) we then get that:
\begin{align}
&\lp \E\lb \left| \E \lb \exp\lp \int^T_t \alpha_d \circ \cX^{d,t,x}_{r,\Omega } ds\rp \cdot \fu_d^T\lp \cX^{d,t,x}_{r,\Omega}\rp\rb \right.\right.\right.\nonumber\\ &\left. \left.\left.-\frac{1}{\mathfrak{n}}\lp \sum^{\mathfrak{n}}_{i=1}\lb \exp \lp \int_t^T \alpha_d \circ \mathcal{X}^{d,t,x}_{r,\omega_i} ds \rp \cdot \fu_d^T\lp \mathcal{X}^{d,t,x}_{r,\omega_i}\rp\rb \rp \right|^2\rb\rp^{\frac{1}{2}} \nonumber \\
&\les \frac{\fk_p }{\mathfrak{n}^{\frac{1}{2}}} \cdot \fL \lp T+1\rp \exp \lp LT\rp \lb \sup_{s\in \lb 0,T\rb} \lp \E \lb \lp 1+\left\| x + \cW_s\right\|^p\rp^2\rb\rp^{\frac{1}{2}}\rb
\end{align}
\end{proof}
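Note that the $\mathfrak{n}^{-\frac{1}{2}}$ prefactor above is the usual Monte Carlo rate: quadrupling the number of sample paths $\mathfrak{n}$ halves the bound (e.g.\ $\mathfrak{n} = 10^4$ reduces the prefactor to $\frac{\fk_p}{100}$), and the rate itself does not deteriorate with the dimension $d$; the dimension enters only through the remaining factors.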
% Note that Taylor's theorem states that:
% \begin{align}
% \exp\lp \int^T_t \alpha_d \circ \cX^{d,t,x}_{r,\Omega}ds\rp = 1 + \int^T_t \alpha_d \circ \cX ^{d,t,x}_{r,\Omega}ds + \frac{1}{2}\lp \int^T_t \alpha_d \circ \cX^{d,t,x}_{r,\Omega }\rp^2 ds + \fR_3
@ -1055,7 +1068,7 @@ Let $t \in \lp 0,\infty\rp$ and $T \in \lp t,\infty\rp$. Let $\lp \Omega, \mathc
%\end{align}
%
%
\end{proof}
@ -1170,7 +1183,7 @@ Let $t \in \lp 0,\infty\rp$ and $T \in \lp t,\infty\rp$. Let $\lp \Omega, \mathc
\draw [shift={(17.22,237)}, rotate = 360] [color={rgb, 255:red, 0; green, 0; blue, 0 } ][line width=0.75] (10.93,-3.29) .. controls (6.95,-1.4) and (3.31,-0.3) .. (0,0) .. controls (3.31,0.3) and (6.95,1.4) .. (10.93,3.29) ;
% Text Node
\draw (428.28,22.2) node [anchor=north west][inner sep=0.75pt] {$\mathsf{E}_{N,n,h,q,\varepsilon }^{\exp ,f}$};
\draw (428.28,22.2) node [anchor=north west][inner sep=0.75pt] {$\mathsf{E}^{N,h,q,\ve}_{n}$};
% Text Node
\draw (444.46,108.6) node [anchor=north west][inner sep=0.75pt] {$\mathsf{G}_d$};
% Text Node
@ -1186,9 +1199,9 @@ Let $t \in \lp 0,\infty\rp$ and $T \in \lp t,\infty\rp$. Let $\lp \Omega, \mathc
% Text Node
\draw (535.1,28.4) node [anchor=north west][inner sep=0.75pt] {$\mathsf{Tun}_{1}^{N+1}$};
% Text Node
\draw (534.54,108.4) node [anchor=north west][inner sep=0.75pt] {$\mathsf{Aff}_{\mathbb{0}}{}_{_{d}{}_{,}{}_{d} ,\mathcal{X}}$};
\draw (534.54,108.4) node [anchor=north west][inner sep=0.75pt] {$\mathsf{Aff}_{\mymathbb{0}}{}_{_{d}{}_{,}{}_{d} ,\mathcal{X}}$};
% Text Node
\draw (426.15,340.4) node [anchor=north west][inner sep=0.75pt] {$\mathsf{E}_{N,n,h,q,\varepsilon }^{\exp ,f}$};
\draw (426.15,340.4) node [anchor=north west][inner sep=0.75pt] {$\mathsf{E}^{N,h,q,\ve}_{n}$};
% Text Node
\draw (442.34,426.8) node [anchor=north west][inner sep=0.75pt] {$\mathsf{G}_d$};
% Text Node
@ -1204,17 +1217,17 @@ Let $t \in \lp 0,\infty\rp$ and $T \in \lp t,\infty\rp$. Let $\lp \Omega, \mathc
% Text Node
\draw (532.97,346.6) node [anchor=north west][inner sep=0.75pt] {$\mathsf{Tun}_{1}^{N+1}$};
% Text Node
\draw (532.41,426.6) node [anchor=north west][inner sep=0.75pt] {$\mathsf{Aff}_{\mathbb{0}}{}_{_{d}{}_{,}{}_{d} ,\mathcal{X}}$};
\draw (532.41,426.6) node [anchor=north west][inner sep=0.75pt] {$\mathsf{Aff}_{\mymathbb{0}}{}_{_{d}{}_{,}{}_{d} ,\mathcal{X}}$};
% Text Node
\draw (444,215.4) node [anchor=north west][inner sep=0.75pt] [font=\Large] {$\vdots $};
% Text Node
\draw (553,216.4) node [anchor=north west][inner sep=0.75pt] [font=\Large] {$\vdots $};
\draw (553,215.4) node [anchor=north west][inner sep=0.75pt] [font=\Large] {$\vdots $};
% Text Node
\draw (336,215.4) node [anchor=north west][inner sep=0.75pt] [font=\Large] {$\vdots $};
% Text Node
\draw (619,214.4) node [anchor=north west][inner sep=0.75pt] [font=\Large] {$\vdots $};
\draw (619,215.4) node [anchor=north west][inner sep=0.75pt] [font=\Large] {$\vdots $};
% Text Node
\draw (234,217.4) node [anchor=north west][inner sep=0.75pt] [font=\Large] {$\vdots $};
\draw (234,215.4) node [anchor=north west][inner sep=0.75pt] [font=\Large] {$\vdots $};
% Text Node
\draw (132,221.4) node [anchor=north west][inner sep=0.75pt] {$\mathsf{Sum}$};
% Text Node
@ -1227,7 +1240,7 @@ Let $t \in \lp 0,\infty\rp$ and $T \in \lp t,\infty\rp$. Let $\lp \Omega, \mathc
\end{tikzpicture}
\end{center}
\caption{Neural network diagram for the $\mathsf{UES}$ network.}
\caption{Neural network diagram for the $\mathsf{UES}^{N,h,q,\ve}_{n,\mathsf{G}_d,\Omega,\fn}$ network.}
\end{figure}
\end{remark}

View File

@ -0,0 +1,3 @@
%\nocite{*}

View File

@ -1,12 +1,12 @@
\chapter{Conclusions and Further Research}
We will present three avenues of further research and related work on parameter estimates here.
We will present here three avenues of further research and related work on parameter estimates, given as a series of recommendations and conjectures to further extend this framework for understanding neural networks.
\section{Further operations and further kinds of neural networks}
\section{Further operations}
Note, for instance, that several classical operations performed on neural networks have yet to be accounted for in this framework, although they are discussed in the literature. We will discuss two of them, \textit{dropout} and \textit{merger}, and how they may be brought into this framework.
\subsection{Dropout}
Overfitting presents an important challenge for all machine learning models, including deep learning. There exist several techniques for mitigating it, and one of the most widely used is \textit{dropout}, in which neurons are randomly zeroed out during training.
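In terms of the Hadamard product defined next, a dropout layer is nothing more than an elementwise multiplication of the activations by a random zero\textendash one mask, rescaled so that expectations are preserved. A minimal sketch, in Python and with purely illustrative names, of this ``inverted dropout'' is:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def dropout(a, p):
    """Inverted dropout: Hadamard product with a random 0/1 mask, rescaled by 1/(1-p)."""
    mask = (rng.random(a.shape) >= p).astype(a.dtype)
    return a * mask / (1.0 - p)   # elementwise (Hadamard) multiplication

x = np.ones((4, 3))
print(dropout(x, p=0.5))          # roughly half the entries zeroed, survivors scaled to 2.0
\end{verbatim}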
\begin{definition}[Hadamard Product]
Let $m,n \in \N$. Let $A,B \in \R^{m \times n}$. For all $i \in \{ 1,2,\hdots,m\}$ and $j \in \{ 1,2,\hdots,n\}$ define the Hadamard product $\odot: \R^{m\times n} \times \R^{m \times n} \rightarrow \R^{m \times n}$ as:

View File

@ -33,7 +33,7 @@ This dissertation is approved for recommendation to the Graduate Council.
\begin{center}
\noindent\hspace*{0cm}\rule{7cm}{0.7pt} \\
Joshua Lee Padgett, Ph.D.\\
Dissertation Director
Dissertation Director, \textit{ex-officio}
\end{center}
\vspace{1cm}
@ -61,7 +61,7 @@ Tulin Kaman, Ph.D.\\
Committee Member
\end{center}
\vspace{1cm}
\end{singlespace}
\newpage
\begin{center}
\textbf{Abstract}
@ -83,8 +83,8 @@ Our appendix will contain code listings of these neural network operations, some
\begin{center}
\vspace*{\fill}
\copyright 2024 Shakil Ahmed Rafi \\
All rights reserved.
\copyright 2024 by Shakil Ahmed Rafi \\
All Rights Reserved.
\vspace*{\fill}
\end{center}
\newpage
@ -93,26 +93,28 @@ Our appendix will contain code listings of these neural network operations, some
\end{center}
I would like to acknowledge my advisor Dr. Joshua Padgett, who has been instrumental in my Ph.D. journey. I am incredibly thankful for him taking the time out of his busy schedule to meet with me over the weekends and helping me finish my dissertation. Without his help, guidance, and patience I would never be where I am today. You not only taught me mathematics, but also how to be a mathematician. Thank you.
\\~\\
I would also like to thank my department, and everyone there, including, but not limited to Dr. Andrew Raich, for his incredible patience and helpful guidance throughout the years. I would also like to thank Dr. Ukash Nakarmi for the excellent collaboartions I've had. I would also to Egan Meaux for all the little things he does to keep the department going.
I would also like to thank my department, and everyone there, including, but not limited to, Dr. Andrew Raich, for his incredible patience and helpful guidance throughout the years. I would also like to thank Dr. Ukash Nakarmi for the excellent collaborations I've had. I would also like to thank Egan Meaux for all the little things he does to keep the department going.
\\~\\
I would like to acknowledge Marufa Mumu for believing in me when I didn't. You really made the last few months of writing this dissertation less painful.
\\~\\
I would like to acknowledge my cat, a beautiful Turkish Angora, Tommy. He was pretty useless, but stroking his fur made me stress a little less.
\\~\\
I would like to acknowledge my office-mate Eric Walker, without whom I would never have realized that rage and spite are just as valid motivators as encouragement and praise.
\\~\\
Finally, I would like to thank Valetta Ventures, Inc. and their product Texifier. It is a marvel of software engineering and made the process of creating this dissertation much less painful than it otherwise would have been.
\newpage
\begin{center}
\vspace*{\fill}
Dedicated to my grandparents, \\
\textbf{Dedication}\\
To my grandparents, \\
M.A. Hye, M.A., \& Nilufar Hye\\
who would've loved to see this but can't; \\
to my parents, \\
Kamal Uddin Ahmed, M.A., \& Shahnaz Parveen, M.A.,\\
Kamal Uddin Ahmed, M.A. \& Shahnaz Parveen, M.A.,\\
who kept faith in me, always; \\
and finally to my brothers, \\
Wakil Ahmed Shabi, BBA \& Nabeel Ahmed Sami, B.Eng., \\
for whom I have been too imperfect a role model.\\
for whom I have been a somewhat imperfect role model.\\
@ -123,7 +125,7 @@ Finally, I would like to thank Valetta Ventures, Inc. and their product Texifier
\newpage
\begin{center}
\vspace*{\fill}
\textbf{Epigraph}\\~\\
\textit{Read, in the name of your Lord}\\
\textemdash Surah Al-Alaq:1\\~\\
\textit{The conquest of nature must be achieved with number and measure.} \\
@ -133,10 +135,19 @@ Finally, I would like to thank Valetta Ventures, Inc. and their product Texifier
\newpage
\tableofcontents
\listoffigures
\newpage
\textbf{List of Published Papers} \\~\\
Parts of Chapter \ref{chp:ann_prod} have been written up as \textit{An Algebraic Framework for Understanding Fully Connected Feedforward Artificial Neural Networks, and Their Associated Parameter, Depth, and Accuracy Properties} by Rafi, S., Padgett, J.L., and Nakarmi, U., and are currently under review for publication at ICML 2024 in Vienna, Austria.
\\~\\
Parts of the simulation codebase have been submitted for review as \textit{nnR: Neural Networks Made Algebraic} to the Journal of Open Source Software.
\end{singlespace}

View File

@ -55,18 +55,16 @@ journal={Zenkoku Shijo Sugaku Danwakai},
year="1942",
volume="244",
number="1077",
pages={1352-1400},
URL="https://cir.nii.ac.jp/crid/1573105975386021120"
pages={1352{\textendash}1400}
}
@article{Ito1946,
author={It\^o, K.},
title={On a stochastic integral equation},
journal={Proc. Imperial Acad. Tokyo},
year="1942",
volume="244",
number="1077",
pages="1352-1400",
URL="https://cir.nii.ac.jp/crid/1573105975386021120"
year={1946},
volume={22},
pages={32{\textendash}35}
}
@inbook{bass_2011, place={Cambridge}, series={Cambridge Series in Statistical and Probabilistic Mathematics}, title={Brownian Motion}, DOI={10.1017/CBO9780511997044.004}, booktitle={Stochastic Processes}, publisher={Cambridge University Press}, author={Bass, Richard F.}, year={2011}, pages={612}, collection={Cambridge Series in Statistical and Probabilistic Mathematics}}
@ -195,7 +193,7 @@ type: article},
author = {Crandall, Michael G. and Ishii, Hitoshi and Lions, Pierre-Louis},
year = {1992},
keywords = {comparison theorems, dynamic programming, elliptic equations, fully nonlinear equations, generalized solutions, Hamilton-Jacobi equations, maximum principles, nonlinear boundary value problems, parabolic equations, partial differential equations, Perrons method, Viscosity solutions},
pages = {1--67},
pages = {1{\textendash}67},
file = {Full Text PDF:files/129/Crandall et al. - 1992 - Users guide to viscosity solutions of second orde.pdf:application/pdf},
}
@ -217,7 +215,7 @@ place={Cambridge}, series={London Mathematical Society Lecture Note Series}, tit
month = mar,
year = {2009},
keywords = {60 F 05, 60 F 17, Martingale, Moment inequality, Projective criteria, Rosenthal inequality, Stationary sequences},
pages = {146--163},
pages = {146{\textendash}163},
}
@book{golub2013matrix,
title={Matrix Computations},
@ -231,13 +229,13 @@ place={Cambridge}, series={London Mathematical Society Lecture Note Series}, tit
}
@article{hjw2020,
author = {Martin Hutzenthaler and Arnulf Jentzen and von Wurstemberger Wurstemberger},
author = {Martin Hutzenthaler and Arnulf Jentzen and Philippe von Wurstemberger},
title = {{Overcoming the curse of dimensionality in the approximative pricing of financial derivatives with default risks}},
volume = {25},
journal = {Electronic Journal of Probability},
number = {none},
publisher = {Institute of Mathematical Statistics and Bernoulli Society},
pages = {1 -- 73},
pages = {1{\textendash}73},
keywords = {curse of dimensionality, high-dimensional PDEs, multilevel Picard method, semilinear KolmogorovPDEs, Semilinear PDEs},
year = {2020},
doi = {10.1214/20-EJP423},
@ -246,7 +244,7 @@ URL = {https://doi.org/10.1214/20-EJP423}
@article{bhj20,
author = {Beck, Christian and Hutzenthaler, Martin and Jentzen, Arnulf},
title = {On nonlinear FeynmanKac formulas for viscosity solutions of semilinear parabolic partial differential equations},
title = {On nonlinear {Feynman}{\textendash}{Kac} formulas for viscosity solutions of semilinear parabolic partial differential equations},
journal = {Stochastics and Dynamics},
volume = {21},
number = {08},
@ -320,7 +318,7 @@ Publisher: Nature Publishing Group},
note = {Number: 1
Publisher: Nature Publishing Group},
keywords = {Astronomy and planetary science, Computational science},
pages = {1--12},
pages = {1{\textendash}12},
file = {Full Text PDF:/Users/shakilrafi/Zotero/storage/JCCM78TZ/Zhao et al. - 2023 - Space-based gravitational wave signal detection an.pdf:application/pdf},
}
@misc{wu2022sustainable,
@ -424,7 +422,7 @@ title = {Xception: Deep Learning with Depthwise Separable Convolutions},
year = {2017},
volume = {},
issn = {1063-6919},
pages = {1800-1807},
pages = {1800{\textendash}1807},
abstract = {We present an interpretation of Inception modules in convolutional neural networks as being an intermediate step in-between regular convolution and the depthwise separable convolution operation (a depthwise convolution followed by a pointwise convolution). In this light, a depthwise separable convolution can be understood as an Inception module with a maximally large number of towers. This observation leads us to propose a novel deep convolutional neural network architecture inspired by Inception, where Inception modules have been replaced with depthwise separable convolutions. We show that this architecture, dubbed Xception, slightly outperforms Inception V3 on the ImageNet dataset (which Inception V3 was designed for), and significantly outperforms Inception V3 on a larger image classification dataset comprising 350 million images and 17,000 classes. Since the Xception architecture has the same number of parameters as Inception V3, the performance gains are not due to increased capacity but rather to a more efficient use of model parameters.},
keywords = {computer architecture;correlation;convolutional codes;google;biological neural networks},
doi = {10.1109/CVPR.2017.195},
@ -464,7 +462,7 @@ month = {jul}
year = {2018},
pmid = {30245431},
keywords = {Curse of dimension, Deep neural networks, Function approximation, Metric entropy, Neural Networks, Computer, Piecewise smooth functions, Sparse connectivity},
pages = {296--330},
pages = {296{\textendash}330},
file = {Submitted Version:/Users/shakilrafi/Zotero/storage/UL4GLF59/Petersen and Voigtlaender - 2018 - Optimal approximation of piecewise smooth function.pdf:application/pdf},
}
@ -692,8 +690,7 @@ year = {2021}
month = jun,
year = {1990},
keywords = {learnability theory, learning from examples, Machine learning, PAC learning, polynomial-time identification},
pages = {197--227},
file = {Full Text PDF:/Users/shakilrafi/Zotero/storage/B4J2KPSN/Schapire - 1990 - The strength of weak learnability.pdf:application/pdf},
pages = {197{\textendash}227}
}

Binary file not shown.

View File

@ -5,9 +5,9 @@
\include{front_matter}
\tableofcontents
\part{On Convergence of Brownian Motion Monte Carlo}
\setcounter{page}{1}
\include{Introduction}
\include{Brownian_motion_monte_carlo}
@ -32,12 +32,12 @@
\include{conclusions-further-research}
\chapter{Bibliography and Code Listings}
%\nocite{*}
\singlespacing
\bibliography{main.bib}
\bibliographystyle{apa}
\include{back_matter}
\include{appendices}

View File

@ -107,7 +107,7 @@ We seek here to introduce a unified framework for artificial neural networks. Th
\end{proof}
\section{Compositions of ANNs}
The first operation we want to be able to perform is the composition of neural networks. Note that composition is not done in the obvious way; for instance, the last layer of the network that is applied first is superimposed with the first layer of the network that is applied second.
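For instance, if $\lay \lp \nu_2 \rp = \lp 2,5,3 \rp$ and $\lay \lp \nu_1 \rp = \lp 3,4,1 \rp$, then in $\nu_1 \bullet \nu_2$ the affine map of the first layer of $\nu_1$ is multiplied into that of the last layer of $\nu_2$, and one obtains $\lay \lp \nu_1 \bullet \nu_2 \rp = \lp 2,5,4,1\rp$; the interior width $3$ at the junction disappears.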
\subsection{Composition}
\begin{definition}[Compositions of ANNs]\label{5.2.1}\label{def:comp}
We denote by $\lp \cdot \rp \bullet \lp \cdot \rp: \{ \lp \nu_1,\nu_2 \rp \in \neu \times \neu: \inn(\nu_1) = \out (\nu_2) \} \rightarrow \neu$ the function satisfying for all $L,M \in \N, l_0,l_1,...,l_L, m_0, m_1,...,m_M \in \N$, $\nu_1 = \lp \lp W_1, b_1 \rp, \lp W_2, b_2 \rp,...,\lp W_L,b_L \rp \rp \in \lp \bigtimes^L_{k=1} \lb \R^{l_k \times l_{k-1}} \times \R^{l_k}\rb \rp$, and $\nu_2 = \\ \lp \lp W'_1, b'_1 \rp, \lp W'_2, b'_2 \rp,..., \lp W'_M, b'_M \rp \rp \in \lp \bigtimes^M_{k=1} \lb \R^{m_k \times m_{k-1}} \times \R^{m_k}\rb \rp$ with $l_0 = \inn(\nu_1)= \out(\nu_2) = m_M$ and:
\begin{align}\label{5.2.1}

View File

@ -18,6 +18,7 @@
bottom=1in
}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
@ -118,7 +119,7 @@
\DeclareMathOperator{\lay}{\mathsf{L}}
\DeclareMathOperator{\dep}{\mathsf{D}}
\DeclareMathOperator{\we}{Weight}
\DeclareMathOperator{\bi}{Bias}
\DeclareMathOperator{\bi}{bias}
\DeclareMathOperator{\aff}{\mathsf{Aff}}
\DeclareMathOperator{\act}{\mathfrak{a}}
\DeclareMathOperator{\real}{\mathfrak{I}}

View File

@ -842,7 +842,7 @@ Since for all $n\in \N$, it is the case that $\mathcal{S} = \lp \supp(\mathfrak{
\end{proof}
\section{Solutions, Characterization, and Computational\\ Bounds to the Kolmogorov Backward Equations}
\section{Solutions, Characterization, and Computational \\ Bounds}
% \begin{proof}
% From Feynman-Kac, especially from \cite[(1.5)]{hutzenthaler_strong_2021} and setting $f=0$ in the notation of \cite[(1.5)]{hutzenthaler_strong_2021} we have that:
% \begin{align}

Binary file not shown.