#%% [markdown]
# # Neural networks with PyTorch
#
# Next I'll show you how to build a neural network with PyTorch.

#%%
# Import things like usual

get_ipython().run_line_magic('matplotlib', 'inline')
get_ipython().run_line_magic('config', "InlineBackend.figure_format = 'retina'")

import numpy as np
import torch

import helper

import matplotlib.pyplot as plt
from torchvision import datasets, transforms

#%% [markdown]
# First up, we need to get our dataset. This is provided through the `torchvision` package. The code below will download the MNIST dataset, then create training and test datasets for us. Don't worry too much about the details here; you'll learn more about this later.

#%%
# Define a transform to normalize the data
# (MNIST images have a single channel, so Normalize takes one mean and one standard deviation)
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,)),
                               ])
# Download and load the training data
trainset = datasets.MNIST('MNIST_data/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

# Download and load the test data
testset = datasets.MNIST('MNIST_data/', download=True, train=False, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=True)


#%%
dataiter = iter(trainloader)
images, labels = next(dataiter)

#%% [markdown]
# We have the training data loaded into `trainloader` and we make that an iterator with `iter(trainloader)`. We'd use this to loop through the dataset for training, but here I'm just grabbing the first batch so we can check out the data. We can see below that `images` is just a tensor with size (64, 1, 28, 28). So, 64 images per batch, 1 color channel, and 28x28 images.

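#%% [markdown]
# A quick look at the shapes confirms this:

#%%
print(images.shape)   # expect torch.Size([64, 1, 28, 28])
print(labels.shape)   # expect torch.Size([64])
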
#%%
plt.imshow(images[1].numpy().squeeze(), cmap='Greys_r');

#%% [markdown]
# ## Building networks with PyTorch
#
# Here I'll use PyTorch to build a simple feedforward network to classify the MNIST images. That is, the network will receive a digit image as input and predict the digit in the image.
#
# <img src="assets/mlp_mnist.png" width=600px>
#
# To build a neural network with PyTorch, you use the `torch.nn` module. The network itself is a class inheriting from `torch.nn.Module`. You define each of the operations separately, like `nn.Linear(784, 128)` for a fully connected linear layer with 784 inputs and 128 units.
#
# The class needs to include a `forward` method that implements the forward pass through the network. In this method, you pass some input tensor `x` through each of the operations you defined earlier. The `torch.nn` module also has functional equivalents for things like ReLUs in `torch.nn.functional`. This module is usually imported as `F`. Then to use a ReLU activation on some layer (which is just a tensor), you'd do `F.relu(x)`. Below are a few different commonly used activation functions.
#
# <img src="assets/activation.png" width=700px>
#
# So, for this network, I'll build it with three fully connected layers, then a softmax output for predicting classes. The softmax function is similar to the sigmoid in that it squashes inputs between 0 and 1, but it's also normalized so that all the values sum to one like a proper probability distribution.

#%%
from torch import nn
from torch import optim
import torch.nn.functional as F


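#%% [markdown]
# Before building the network, here's a quick check of the softmax behavior just described: applying `F.softmax` along `dim=1` turns each row of arbitrary scores into values between 0 and 1 that sum to one.

#%%
scores = torch.randn(2, 5)          # two rows of arbitrary scores
probs = F.softmax(scores, dim=1)    # softmax across each row
print(probs)
print(probs.sum(dim=1))             # each row sums to 1
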
#%%
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        # Hidden layers with 128 and 64 units
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 64)
        # Output layer, 10 units - one for each digit
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        ''' Forward pass through the network, returns the class probabilities '''

        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        x = F.relu(x)
        x = self.fc3(x)
        x = F.softmax(x, dim=1)

        return x

model = Network()
model

#%% [markdown]
# ### Initializing weights and biases
#
# The weights and such are automatically initialized for you, but it's possible to customize how they are initialized. The weights and biases are tensors attached to the layer you defined; you can get them with `model.fc1.weight` for instance.

#%%
print(model.fc1.weight)
print(model.fc1.bias)

#%% [markdown]
# For custom initialization, we want to modify these tensors in place. They are `nn.Parameter` objects (tensors tracked by autograd), so to change the values directly we work with the underlying tensors via `model.fc1.weight.data`. Once we have the tensors, we can fill them with zeros (for biases) or random normal values.

#%%
# Set biases to all zeros
model.fc1.bias.data.fill_(0)


#%%
# Sample from a random normal with standard dev = 0.01
model.fc1.weight.data.normal_(std=0.01)

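#%% [markdown]
# PyTorch also ships common initialization schemes in `torch.nn.init`; here's an optional sketch using Xavier initialization on the same layer. The simple `fill_`/`normal_` calls above are all we need for this notebook.

#%%
nn.init.xavier_uniform_(model.fc1.weight)   # Xavier/Glorot uniform initialization
nn.init.constant_(model.fc1.bias, 0)        # zero the biases again
print(model.fc1.weight)
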
#%% [markdown]
# ### Forward pass
#
# Now that we have a network, let's see what happens when we pass in an image. This is called the forward pass. We're going to reshape the image data into a 784-element vector, then pass it through the operations defined by the network architecture.

#%%
# Grab some data
dataiter = iter(trainloader)
images, labels = next(dataiter)

# Resize images into a 1D vector, new shape is (batch size, color channels, image pixels)
images.resize_(64, 1, 784)
# or images.resize_(images.shape[0], 1, 784) to automatically get the batch size

# Forward pass through the network
img_idx = 0
ps = model.forward(images[img_idx,:])

img = images[img_idx]
helper.view_classify(img.view(1, 28, 28), ps)

#%% [markdown]
# As you can see above, our network has basically no idea what this digit is. That's because we haven't trained it yet; all the weights are random!
#
# PyTorch provides a convenient way to build networks like this where a tensor is passed sequentially through operations, `nn.Sequential` ([documentation](https://pytorch.org/docs/master/nn.html#torch.nn.Sequential)). Using this to build the equivalent network:

#%%
# Hyperparameters for our network
input_size = 784
hidden_sizes = [128, 64]
output_size = 10

# Build a feed-forward network
model = nn.Sequential(nn.Linear(input_size, hidden_sizes[0]),
                      nn.ReLU(),
                      nn.Linear(hidden_sizes[0], hidden_sizes[1]),
                      nn.ReLU(),
                      nn.Linear(hidden_sizes[1], output_size),
                      nn.Softmax(dim=1))
print(model)

# Forward pass through the network and display output
images, labels = next(iter(trainloader))
images.resize_(images.shape[0], 1, 784)
ps = model.forward(images[0,:])
helper.view_classify(images[0].view(1, 28, 28), ps)

#%% [markdown]
# You can also pass in an `OrderedDict` to name the individual layers and operations. Note that dictionary keys must be unique, so _each operation must have a different name_.

#%%
from collections import OrderedDict
model = nn.Sequential(OrderedDict([
                      ('fc1', nn.Linear(input_size, hidden_sizes[0])),
                      ('relu1', nn.ReLU()),
                      ('fc2', nn.Linear(hidden_sizes[0], hidden_sizes[1])),
                      ('relu2', nn.ReLU()),
                      ('output', nn.Linear(hidden_sizes[1], output_size)),
                      ('softmax', nn.Softmax(dim=1))]))
model

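#%% [markdown]
# With named operations, you can also get at each layer directly as an attribute of the model, for example:

#%%
print(model.fc1)
print(model.fc1.weight.shape)
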
#%% [markdown]
# Now it's your turn to build a simple network; use any of the methods I've covered so far. In the next notebook, you'll learn how to train a network so it can make good predictions.
#
# >**Exercise:** Build a network to classify the MNIST images with _three_ hidden layers. Use 400 units in the first hidden layer, 200 units in the second layer, and 100 units in the third layer. Each hidden layer should have a ReLU activation function, and use softmax on the output layer.

#%%
## TODO: Your network here


#%%
## Run this cell with your model to make sure it works ##
# Forward pass through the network and display output
images, labels = next(iter(trainloader))
images.resize_(images.shape[0], 1, 784)
ps = model.forward(images[0,:])
helper.view_classify(images[0].view(1, 28, 28), ps)


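#%% [markdown]
# For reference, here's one possible solution using `nn.Sequential` with an `OrderedDict` (just a sketch; any of the approaches covered above works).

#%%
# One possible solution to the exercise: 784 -> 400 -> 200 -> 100 -> 10
model = nn.Sequential(OrderedDict([
                      ('fc1', nn.Linear(784, 400)),
                      ('relu1', nn.ReLU()),
                      ('fc2', nn.Linear(400, 200)),
                      ('relu2', nn.ReLU()),
                      ('fc3', nn.Linear(200, 100)),
                      ('relu3', nn.ReLU()),
                      ('output', nn.Linear(100, 10)),
                      ('softmax', nn.Softmax(dim=1))]))
model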