Layers of weighted connections that learn to approximate functions from data. The computational substrate of modern artificial intelligence.
architecture
neuron: weighted sum of inputs passed through an activation function (ReLU, sigmoid, tanh)
layer: collection of neurons. Input, hidden, and output layers.
feedforward: signals flow in one direction; by the universal approximation theorem, even one hidden layer can approximate any continuous function on a compact domain
recurrent (RNN): connections loop back to model sequences; LSTM and GRU variants mitigate vanishing gradients
convolutional (CNN): local receptive fields, weight sharing, dominant in vision
transformer: self-attention mechanism, parallelizable across sequence positions, dominant in language (GPT, BERT)
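A minimal sketch of the first three building blocks in NumPy (all sizes and weights here are arbitrary illustrations, not a reference implementation): a neuron as a weighted sum passed through an activation, a layer as many neurons applied at once, and a feedforward pass as layers composed from input to output.

```python
import numpy as np

def relu(x):
    # activation: max(0, x) elementwise
    return np.maximum(0.0, x)

def neuron(x, w, b):
    # one neuron: weighted sum of inputs through an activation
    return relu(np.dot(w, x) + b)

def layer(x, W, b):
    # a layer applies many neurons at once: one row of W per neuron
    return relu(W @ x + b)

# feedforward: signals flow one way, input -> hidden -> output
rng = np.random.default_rng(0)
x = rng.normal(size=3)                         # input (3 features)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # hidden layer, 4 neurons
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)  # output layer, 2 neurons
y = layer(layer(x, W1, b1), W2, b2)
print(y.shape)  # (2,)
```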
learning
backpropagation: computing gradients of the loss function via chain rule, flowing error backward through layers
gradient descent: adjusting weights to minimize loss (SGD, Adam)
loss function: measures the gap between prediction and ground truth
overfitting and regularization: memorizing training data instead of generalizing; countered by dropout, weight decay, early stopping
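The learning entries above can be sketched end to end on a toy regression (the target function, widths, learning rate, and decay are all invented for illustration): a forward pass, an MSE loss, gradients flowed backward by the chain rule, and a gradient-descent update with weight decay.

```python
import numpy as np

# Toy data: learn y = 2x - 1 (hypothetical target, illustration only)
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(64, 1))
Y = 2.0 * X - 1.0

# One tanh hidden layer + linear output
W1, b1 = rng.normal(scale=0.5, size=(8, 1)), np.zeros((8, 1))
W2, b2 = rng.normal(scale=0.5, size=(1, 8)), np.zeros((1, 1))

lr = 0.1        # gradient-descent step size
decay = 1e-4    # weight decay (L2 regularization)
n = X.shape[0]

for step in range(2000):
    # forward pass
    h = np.tanh(W1 @ X.T + b1)   # hidden activations, (8, 64)
    pred = W2 @ h + b2           # predictions, (1, 64)
    err = pred - Y.T             # loss = mean(err**2)
    # backward pass: chain rule, error flowing backward through layers
    dpred = 2.0 * err / n
    dW2 = dpred @ h.T
    db2 = dpred.sum(axis=1, keepdims=True)
    dh = W2.T @ dpred
    dz = dh * (1.0 - h**2)       # tanh'(z) = 1 - tanh(z)^2
    dW1 = dz @ X
    db1 = dz.sum(axis=1, keepdims=True)
    # gradient descent update with weight decay
    W2 -= lr * (dW2 + decay * W2); b2 -= lr * db2
    W1 -= lr * (dW1 + decay * W1); b1 -= lr * db1

# final loss after training
h = np.tanh(W1 @ X.T + b1)
loss = float(((W2 @ h + b2 - Y.T) ** 2).mean())
print(f"final mse: {loss:.4f}")
```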
deep learning
Stacking many layers enables hierarchical feature extraction. Deeper networks learn increasingly abstract representations. Scale of data, compute, and parameters drives capability.
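Depth as composition can be sketched directly (the widths are hypothetical): each entry in the width list adds one ReLU layer, and the deep network is simply the layers applied in sequence.

```python
import numpy as np

def make_layer(rng, n_in, n_out):
    # He initialization keeps activations well-scaled under ReLU
    W = rng.normal(scale=np.sqrt(2.0 / n_in), size=(n_out, n_in))
    b = np.zeros(n_out)
    return lambda x: np.maximum(0.0, W @ x + b)  # one ReLU layer

rng = np.random.default_rng(2)
widths = [16, 32, 32, 32, 8]  # hypothetical; depth is just more entries
layers = [make_layer(rng, a, b) for a, b in zip(widths[:-1], widths[1:])]

x = rng.normal(size=widths[0])
for f in layers:              # deep network = composition of layers
    x = f(x)
print(x.shape)  # (8,)
```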
connections
algorithms underpin training optimization. data structures (tensors, computation graphs) organize network computation. type theory informs tensor shape checking. consensus algorithms from distributed systems enable decentralized training and inference.