{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Activation functions\n",
    "\n",
    "\n",
    "https://en.wikipedia.org/wiki/Activation_function\n",
    "\n",
    "## Sigmoid\n",
    "\n",
    "\\begin{equation*}\n",
    "f(x) = \\frac{1}{1+e^{-x}}\n",
    "\\end{equation*}\n",
    "## Tanh\n",
    "\n",
    "\\begin{equation*}\n",
    "f(x) = tanh(x)\n",
    "\\end{equation*}\n",
    "\n",
    "## ReLu\n",
    "\n",
    "\\begin{equation*}\n",
    "f(x) = max(0,x)\n",
    "\\end{equation*}\n",
    "\n",
    "## SoftPlus\n",
    "\n",
    "\\begin{equation*}\n",
    "f(x) = ln(1+e^{-x})\n",
    "\\end{equation*}\n",
    "\n",
    "## MNIST adatbázis\n",
    "\n",
    "![MNIST adatbázis](http://neuralnetworksanddeeplearning.com/images/mnist_100_digits.png)\n",
    "The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.\n",
    "\n",
    "It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal efforts on preprocessing and formatting. \n",
    "\n",
    "More information: http://yann.lecun.com/exdb/mnist/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First let's import all packages and load the date"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "# Import MNIST data\n",
    "from tensorflow.examples.tutorials.mnist import input_data\n",
    "mnist = input_data.read_data_sets(\"/tmp/data/\", one_hot=False)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next some hyperparameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Parameters\n",
    "learning_rate = 0.1\n",
    "num_steps = 1000\n",
    "batch_size = 128\n",
    "display_step = 100\n",
    "\n",
    "# Network Parameters\n",
    "n_hidden_1 = 256 # 1st layer number of neurons\n",
    "n_hidden_2 = 256 # 2nd layer number of neurons\n",
    "num_input = 784 # MNIST data input (img shape: 28*28)\n",
    "num_classes = 10 # MNIST total classes (0-9 digits)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A custom activation function, if we use only tf functions, the we do not have to define its gradient"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def custom(x):\n",
    "    return 2*tf.abs(x)-1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The input function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define the input function for training\n",
    "input_fn = tf.estimator.inputs.numpy_input_fn(\n",
    "    x={'images': mnist.train.images}, y=mnist.train.labels,\n",
    "    batch_size=batch_size, num_epochs=None, shuffle=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Definition of the network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define the neural network\n",
    "def neural_net(x_dict, act_fn):\n",
    "    # TF Estimator input is a dict, in case of multiple inputs\n",
    "    x = x_dict['images']\n",
    "    layer_1 = tf.layers.dense(x, n_hidden_1, activation=act_fn)\n",
    "    # Hidden fully connected layer with 256 neurons\n",
    "    layer_2 = tf.layers.dense(layer_1, n_hidden_2, activation=act_fn)\n",
    "    # Output fully connected layer with a neuron for each class\n",
    "    out_layer = tf.layers.dense(layer_2, num_classes)\n",
    "    return out_layer"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For training we need to set:\n",
    "- handling of the output\n",
    "- loss function\n",
    "- error metric\n",
    "- ** activation function ** \n",
    "\n",
    "Let's try some activation functions:\n",
    "- None (linear)\n",
    "- tf.nn.sigmoid\n",
    "- tf.nn.tanh\n",
    "- tf.nn.relu\n",
    "- tf.nn.softplus\n",
    "- custom\n",
    "\n",
    "More options: https://www.tensorflow.org/api_docs/python/tf/nn"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define the model function (following TF Estimator Template)\n",
    "def model_fn(features, labels, mode):\n",
    "    \n",
    "    # Build the neural network\n",
    "    logits = neural_net(features, None) \n",
    "    \n",
    "    # Predictions\n",
    "    pred_classes = tf.argmax(logits, axis=1)\n",
    "    pred_probas = tf.nn.softmax(logits)\n",
    "    \n",
    "    # If prediction mode, early return\n",
    "    if mode == tf.estimator.ModeKeys.PREDICT:\n",
    "        return tf.estimator.EstimatorSpec(mode, predictions=pred_classes) \n",
    "        \n",
    "    # Define loss and optimizer\n",
    "    loss_op = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(\n",
    "        logits=logits, labels=tf.cast(labels, dtype=tf.int32)))\n",
    "    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\n",
    "    train_op = optimizer.minimize(loss_op, global_step=tf.train.get_global_step())\n",
    "    \n",
    "    # Evaluate the accuracy of the model\n",
    "    acc_op = tf.metrics.accuracy(labels=labels, predictions=pred_classes)\n",
    "    \n",
    "    # TF Estimators requires to return a EstimatorSpec, that specify\n",
    "    # the different ops for training, evaluating, ...\n",
    "    estim_specs = tf.estimator.EstimatorSpec(\n",
    "      mode=mode,\n",
    "      predictions=pred_classes,\n",
    "      loss=loss_op,\n",
    "      train_op=train_op,\n",
    "      eval_metric_ops={'accuracy': acc_op})\n",
    "\n",
    "    return estim_specs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Training\n",
    "Here we will evaluate the model after training on the training set (we should have used a validation set)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# Train the Model\n",
    "model = tf.estimator.Estimator(model_fn)\n",
    "model.train(input_fn, steps=num_steps)\n",
    "model.evaluate(input_fn, steps=100) #Validation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Testing\n",
    "\n",
    "After training we evaluate the model on the test set\n",
    "\n",
    "**Important: we evaluate only the best model on the test set, otherwise it's peeking.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Evaluate the Model\n",
    "# Define the input function for evaluating\n",
    "input_fn_test = tf.estimator.inputs.numpy_input_fn(\n",
    "    x={'images': mnist.test.images}, y=mnist.test.labels,\n",
    "    batch_size=batch_size, shuffle=False)\n",
    "# Use the Estimator 'evaluate' method\n",
    "model.evaluate(input_fn_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some examples visually"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Predict single images\n",
    "n_images = 4\n",
    "# Get images from test set\n",
    "test_images = mnist.test.images[:n_images]\n",
    "# Prepare the input data\n",
    "input_fn_test_few = tf.estimator.inputs.numpy_input_fn(\n",
    "    x={'images': test_images}, shuffle=False)\n",
    "# Use the model to predict the images class\n",
    "preds = list(model.predict(input_fn_test_few))\n",
    "\n",
    "# Display\n",
    "for i in range(n_images):\n",
    "    plt.imshow(np.reshape(test_images[i], [28, 28]), cmap='gray')\n",
    "    plt.show()\n",
    "    print(\"Model prediction:\", preds[i])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Tasks\n",
    "1. Let's create a separate validation set! (for example use 10% of the training data)\n",
    "2. Try to finetune the hyperparameters and the network structure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}