# Lab 5: Regularization methods

Regularization methods are used primarily to prevent overfitting.

## Weight decay

One of the simplest regularization methods: it penalizes weights that grow too large.
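
As a minimal sketch of the idea (plain Python, not the TensorFlow code used later in this lab): if an L2 penalty `(decay / 2) * sum(w ** 2)` is added to the loss, every gradient step also shrinks each weight in proportion to its size. The function name and the `lr` and `decay` values are illustrative assumptions.

```python
def sgd_step_with_weight_decay(weights, grads, lr=0.1, decay=0.01):
    """One SGD step on loss + (decay / 2) * sum(w ** 2).

    The gradient of the penalty term is decay * w, so large weights
    are pulled towards zero more strongly than small ones.
    """
    return [w - lr * (g + decay * w) for w, g in zip(weights, grads)]


# With a zero data gradient only the penalty acts: every weight is
# multiplied by the same factor (1 - lr * decay).
weights = [2.0, -0.5, 0.0]
updated = sgd_step_with_weight_decay(weights, grads=[0.0, 0.0, 0.0])
print(updated)
```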

## Dropout

During training, we randomly "drop" some of the neurons in the hidden layer.
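
The mechanism can be sketched in plain Python as follows. This is the common "inverted dropout" variant, in which the surviving activations are rescaled during training so that nothing needs to change at evaluation time. The helper name `apply_dropout` and the parameter `drop_prob` are assumptions made for this illustration.

```python
import random


def apply_dropout(activations, drop_prob, training, rng=random):
    """Inverted dropout on a list of activations.

    During training each activation is zeroed with probability drop_prob and
    the survivors are scaled by 1 / (1 - drop_prob), so the expected value is
    unchanged. At evaluation time the input is returned untouched.
    """
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden = [1.0] * 8
train_out = apply_dropout(hidden, drop_prob=0.5, training=True, rng=rng)
eval_out = apply_dropout(hidden, drop_prob=0.5, training=False)
print(train_out)
print(eval_out)
```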

## Batch Normalization

During training, the outputs of the layers are standardized (by adjusting their mean and standard deviation).
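
A minimal sketch of the standardization step for a single feature over one mini-batch, in plain Python. The names `standardize_batch`, `gamma`, `beta` and `eps` are assumptions for this illustration; in a real layer `gamma` and `beta` are learned parameters.

```python
import math


def standardize_batch(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Shift one feature to zero mean and unit variance over the batch,
    then apply a scale (gamma) and an offset (beta)."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


normalized = standardize_batch([1.0, 2.0, 3.0, 4.0])
print(normalized)
```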

## The MNIST database

![MNIST database](http://neuralnetworksanddeeplearning.com/images/mnist_100_digits.png)
The database contains images of handwritten digits. The training set holds 60000 examples and the test set 10000. The images were brought to a uniform size (28x28 pixels), and every pixel intensity was normalized to the range 0 to 1. Each image is stored as a 1-D numpy array of 784 values.

More details: http://yann.lecun.com/exdb/mnist/
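
Because each image arrives as a flat vector, it may help to see how the 28x28 layout can be recovered. The helper `to_rows` and the synthetic image below are illustrative assumptions; real data would come from `mnist.train.images` once the dataset is loaded.

```python
def to_rows(flat_image, width=28):
    """Split a flat, row-major pixel list into rows of `width` pixels."""
    if len(flat_image) % width != 0:
        raise ValueError("image length must be a multiple of the row width")
    return [flat_image[i:i + width] for i in range(0, len(flat_image), width)]


# Synthetic stand-in for one image: all black except a single bright pixel.
flat = [0.0] * 784
flat[3 * 28 + 5] = 1.0  # row 3, column 5 in row-major order

rows = to_rows(flat)
print(len(rows), len(rows[0]))  # 28 28
print(rows[3][5])               # 1.0
```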

First, we import the required packages and load the database from the /tmp/data directory.

In [1]:
from __future__ import print_function

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=False)

import tensorflow as tf
import numpy as np

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting /tmp/data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz


Next, we set the training parameters and the variables that describe the structure of the neural network.

In [2]:
# Parameters
learning_rate = 0.1
num_steps = 2000
batch_size = 128
display_step = 100

# Network Parameters
n_hidden = 100 # number of hidden neurons
num_input = 784 # MNIST data input (img shape: 28*28)
num_classes = 10 # MNIST total classes (0-9 digits)

To train the network we create an input function that produces so-called batches from the previously downloaded database.

In [3]:
# Define the input function for training
input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'images': mnist.train.images}, y=mnist.train.labels,
    batch_size=batch_size, num_epochs=None, shuffle=True)
input_fn_small = tf.estimator.inputs.numpy_input_fn(
    x={'images': mnist.train.images[:1000]}, y=mnist.train.labels[:1000],
    batch_size=batch_size, num_epochs=None, shuffle=True)
# Define the input function for evaluating
input_fn_test = tf.estimator.inputs.numpy_input_fn(
    x={'images': mnist.test.images}, y=mnist.test.labels,
    batch_size=batch_size, shuffle=False)
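
Conceptually, the input functions above hand the training loop one mini-batch at a time. The generator below is a rough plain-Python sketch of that behaviour, not of how `numpy_input_fn` is implemented internally; `batch_iterator` and its `seed` parameter are names assumed for this illustration.

```python
import random


def batch_iterator(features, labels, batch_size, num_epochs=None,
                   shuffle=True, seed=None):
    """Yield (features, labels) mini-batches.

    With num_epochs=None the data is repeated forever, which is why the
    training call limits the number of steps instead.
    """
    indices = list(range(len(features)))
    rng = random.Random(seed)
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        if shuffle:
            rng.shuffle(indices)  # new example order in every epoch
        for start in range(0, len(indices), batch_size):
            chunk = indices[start:start + batch_size]
            yield [features[i] for i in chunk], [labels[i] for i in chunk]
        epoch += 1


data = list(range(10))
labels = [x % 2 for x in data]
batches = batch_iterator(data, labels, batch_size=4, seed=0)
one_epoch = [next(batches) for _ in range(3)]  # 10 examples: batches of 4, 4, 2
print([len(x) for x, _ in one_epoch])
```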

The following code creates the neural network.

In [4]:
# Define the neural network
def neural_net(x_dict, act_fn):
    # TF Estimator input is a dict, in case of multiple inputs
    x = x_dict['images']
    # First hidden fully connected layer with n_hidden neurons
    layer_1 = tf.layers.dense(x, n_hidden, activation=act_fn)
    # Second hidden fully connected layer with n_hidden neurons
    layer_2 = tf.layers.dense(layer_1, n_hidden, activation=act_fn)
    # Output fully connected layer with a neuron for each class
    out_layer = tf.layers.dense(layer_2, num_classes)
    return out_layer

Building the conventional (unregularized) model:

In [5]:
# Define the model function (following TF Estimator Template)
def model_fn(features, labels, mode):
    
    # Build the neural network
    logits = neural_net(features, tf.nn.sigmoid) 
    
    # Predictions
    pred_classes = tf.argmax(logits, axis=1)
    pred_probas = tf.nn.softmax(logits)
    
    # If prediction mode, early return
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=pred_classes) 
        
    # Define loss and optimizer
    loss_op = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=tf.cast(labels, dtype=tf.int32)))
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
    train_op = optimizer.minimize(loss_op, global_step=tf.train.get_global_step())
    
    # Evaluate the accuracy of the model
    acc_op = tf.metrics.accuracy(labels=labels, predictions=pred_classes)
    
    # TF Estimators require returning an EstimatorSpec that specifies
    # the different ops for training, evaluating, ...
    estim_specs = tf.estimator.EstimatorSpec(
      mode=mode,
      predictions=pred_classes,
      loss=loss_op,
      train_op=train_op,
      eval_metric_ops={'accuracy': acc_op})

    return estim_specs
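
The loss used in the model function is the mean softmax cross-entropy between the logits and the integer class labels. The plain-Python sketch below shows what that quantity is; the helper names are assumptions for this illustration and it is not TensorFlow's implementation.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_softmax_cross_entropy(logits, label):
    """Negative log-probability assigned to the correct class index."""
    return -math.log(softmax(logits)[label])


def mean_loss(batch_logits, batch_labels):
    losses = [sparse_softmax_cross_entropy(z, y)
              for z, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)


# A network that gives all 10 classes the same score has loss ln(10).
uniform = mean_loss([[0.0] * 10], [3])
print(round(uniform, 4))  # 2.3026
```

A loss close to 2.3 therefore means that the network is doing no better than guessing among the ten digits.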

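### Training the neural network
No regularization is used here yet, so overfitting can be observed: compare the accuracy on the training subset with the accuracy on the test set in the evaluation output.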

In [6]:
# Train the Model
model = tf.estimator.Estimator(model_fn)
model.train(input_fn_small, steps=num_steps)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_task_type': 'worker', '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f013c0b4690>, '_save_checkpoints_steps': None, '_keep_checkpoint_every_n_hours': 10000, '_service': None, '_num_ps_replicas': 0, '_tf_random_seed': None, '_master': '', '_num_worker_replicas': 1, '_task_id': 0, '_log_step_count_steps': 100, '_model_dir': '/tmp/tmpGs_Nmr', '_save_summary_steps': 100}
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Saving checkpoints for 1 into /tmp/tmpGs_Nmr/model.ckpt.
INFO:tensorflow:loss = 2.8037, step = 1
INFO:tensorflow:global_step/sec: 178.89
INFO:tensorflow:loss = 2.20164, step = 101 (0.561 sec)
INFO:tensorflow:global_step/sec: 190.224
INFO:tensorflow:loss = 1.96941, step = 201 (0.528 sec)
INFO:tensorflow:global_step/sec: 175.358
INFO:tensorflow:loss = 1.61799, st

<tensorflow.python.estimator.estimator.Estimator at 0x7f01074606d0>

In [7]:
model.evaluate(input_fn_small, steps=10)
print("Teszt:")
model.evaluate(input_fn_test)

INFO:tensorflow:Starting evaluation at 2018-07-19-08:26:49
INFO:tensorflow:Restoring parameters from /tmp/tmpGs_Nmr/model.ckpt-2000
INFO:tensorflow:Evaluation [1/10]
INFO:tensorflow:Evaluation [2/10]
INFO:tensorflow:Evaluation [3/10]
INFO:tensorflow:Evaluation [4/10]
INFO:tensorflow:Evaluation [5/10]
INFO:tensorflow:Evaluation [6/10]
INFO:tensorflow:Evaluation [7/10]
INFO:tensorflow:Evaluation [8/10]
INFO:tensorflow:Evaluation [9/10]
INFO:tensorflow:Evaluation [10/10]
INFO:tensorflow:Finished evaluation at 2018-07-19-08:26:49
INFO:tensorflow:Saving dict for global step 2000: accuracy = 0.964844, global_step = 2000, loss = 0.174116
Teszt:
INFO:tensorflow:Starting evaluation at 2018-07-19-08:26:49
INFO:tensorflow:Restoring parameters from /tmp/tmpGs_Nmr/model.ckpt-2000
INFO:tensorflow:Finished evaluation at 2018-07-19-08:26:50
INFO:tensorflow:Saving dict for global step 2000: accuracy = 0.8765, global_step = 2000, loss = 0.41952


{'accuracy': 0.87650001, 'global_step': 2000, 'loss': 0.41952011}

## Weight decay

Here we use it only in the simplest way; strictly speaking, the penalty term should be added to the objective function that is being optimized.
A cleaner solution: http://www.ritchieng.com/machine-learning/deep-learning/tensorflow/regularization/


In [43]:
# L1 penalty with scale 0.5 on each kernel (classic weight decay uses an L2
# penalty instead, e.g. tf.contrib.layers.l2_regularizer)
regularizer = tf.contrib.layers.l1_regularizer(0.5)
# Define the neural network
def neural_net_wd(x_dict, act_fn):
    # TF Estimator input is a dict, in case of multiple inputs
    x = x_dict['images']
    # First hidden fully connected layer with n_hidden neurons
    layer_1 = tf.layers.dense(x, n_hidden, activation=act_fn, kernel_regularizer=regularizer)
    # Second hidden fully connected layer with n_hidden neurons
    layer_2 = tf.layers.dense(layer_1, n_hidden, activation=act_fn, kernel_regularizer=regularizer)
    # Output fully connected layer with a neuron for each class
    out_layer = tf.layers.dense(layer_2, num_classes, kernel_regularizer=regularizer)
    return out_layer
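
To make the "extended objective" concrete, the sketch below adds an L1 penalty to a data loss in plain Python. The helpers `l1_penalty` and `regularized_loss` and the toy kernels are assumptions for this illustration. In TensorFlow 1.x the penalties registered through `kernel_regularizer` can be retrieved with `tf.losses.get_regularization_loss()` and added to the loss in the same way.

```python
def l1_penalty(kernels, scale):
    """scale * sum of absolute values over every weight in every kernel."""
    return scale * sum(abs(w) for kernel in kernels for row in kernel for w in row)


def regularized_loss(data_loss, kernels, scale=0.5):
    """Objective to minimize: data loss plus the weight penalty."""
    return data_loss + l1_penalty(kernels, scale)


# Two toy weight matrices (lists of rows); the absolute values sum to 4.0.
kernels = [[[0.5, -1.0], [0.0, 2.0]], [[-0.25, 0.25]]]
total = regularized_loss(0.4, kernels)  # 0.4 + 0.5 * 4.0
print(total)
```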

## Dropout
https://www.tensorflow.org/api_docs/python/tf/nn/dropout

In [44]:
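dropout = 0.3  # NOTE: passed to tf.nn.dropout as keep_prob (TF 1.x), so 30% of the units are kept
# Define the neural network
def neural_net_do(x_dict, act_fn, training=True):
    # TF Estimator input is a dict, in case of multiple inputs
    x = x_dict['images']
    # First hidden fully connected layer with n_hidden neurons
    layer_1 = tf.layers.dense(x, n_hidden, activation=act_fn)
    # Dropout should only be active during training; pass
    # training=(mode == tf.estimator.ModeKeys.TRAIN) from the model function
    if training:
        layer_1 = tf.nn.dropout(layer_1, dropout)
    # Second hidden fully connected layer with n_hidden neurons
    layer_2 = tf.layers.dense(layer_1, n_hidden, activation=act_fn)
    if training:
        layer_2 = tf.nn.dropout(layer_2, dropout)
    # Output fully connected layer with a neuron for each class
    out_layer = tf.layers.dense(layer_2, num_classes)
    return out_layer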

## Batch normalization
https://www.tensorflow.org/api_docs/python/tf/contrib/layers/batch_norm

In [15]:
training = True
# Define the neural network
def neural_net_bn(x_dict, act_fn, mode):
    # TF Estimator input is a dict, in case of multiple inputs
    x = x_dict['images']
    # Batch statistics are used in TRAIN mode, moving averages otherwise
    training = (mode == tf.estimator.ModeKeys.TRAIN)
    print('training:', training)
    # First hidden fully connected layer with n_hidden neurons
    layer_1 = tf.layers.dense(x, n_hidden, activation=act_fn)
    layer_1 = tf.contrib.layers.batch_norm(layer_1, is_training=training)
    # Second hidden fully connected layer with n_hidden neurons
    # (note: no activation function is applied in this layer)
    layer_2 = tf.layers.dense(layer_1, n_hidden)
    layer_2 = tf.contrib.layers.batch_norm(layer_2, is_training=training)
    # Output fully connected layer with a neuron for each class
    out_layer = tf.layers.dense(layer_2, num_classes)
    return out_layer
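
The `is_training` flag matters because batch normalization behaves differently in the two modes: during training it normalizes with the statistics of the current batch and keeps running averages of them, while at evaluation time it normalizes with those running averages. The class below is a plain-Python sketch of this for a single feature; the class name and the `decay` value are assumptions for this illustration.

```python
import math


class BatchNorm1D:
    """Batch normalization of a single feature, with inference statistics."""

    def __init__(self, decay=0.9, eps=1e-5):
        self.decay = decay
        self.eps = eps
        self.moving_mean = 0.0
        self.moving_var = 1.0

    def __call__(self, batch, training):
        if training:
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # Update the running statistics used later at evaluation time.
            self.moving_mean = self.decay * self.moving_mean + (1 - self.decay) * mean
            self.moving_var = self.decay * self.moving_var + (1 - self.decay) * var
        else:
            mean, var = self.moving_mean, self.moving_var
        return [(x - mean) / math.sqrt(var + self.eps) for x in batch]


batch = [10.0, 12.0, 14.0, 16.0]  # mean 13, variance 5

bn = BatchNorm1D()
for _ in range(200):  # training steps move the running averages
    bn(batch, training=True)
trained_eval = bn(batch, training=False)

untrained_eval = BatchNorm1D()(batch, training=False)  # averages never updated
print(trained_eval)
print(untrained_eval)
```

If the running averages are never updated, evaluation uses the initial statistics and the outputs are far from standardized. In TensorFlow 1.x, `tf.contrib.layers.batch_norm` places these updates in `tf.GraphKeys.UPDATE_OPS` by default, so they have to be run together with the training op.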

Define the model function again, this time with the network variant we want to use:

In [16]:
# Define the model function (following TF Estimator Template)
def model_fn_reg(features, labels, mode):
    
    # Build the neural network
    logits = neural_net_wd(features, tf.nn.sigmoid) 
    #logits = neural_net_do(features, tf.nn.sigmoid) 
    #logits = neural_net_bn(features, tf.nn.sigmoid, mode) 
    # Predictions
    pred_classes = tf.argmax(logits, axis=1)
    pred_probas = tf.nn.softmax(logits)
    
    # If prediction mode, early return
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=pred_classes) 
        
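    # Define loss and optimizer
    # NOTE: the penalties created by kernel_regularizer in neural_net_wd are only
    # collected in tf.GraphKeys.REGULARIZATION_LOSSES; they influence training
    # only if they are added to the loss, e.g.
    # loss_op += tf.losses.get_regularization_loss()
    loss_op = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=tf.cast(labels, dtype=tf.int32)))
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
    # NOTE: tf.contrib.layers.batch_norm places its moving-average updates in
    # tf.GraphKeys.UPDATE_OPS; they only run if train_op is built inside
    # tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS))
    train_op = optimizer.minimize(loss_op, global_step=tf.train.get_global_step())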
    
    # Evaluate the accuracy of the model
    acc_op = tf.metrics.accuracy(labels=labels, predictions=pred_classes)
    
    # TF Estimators require returning an EstimatorSpec that specifies
    # the different ops for training, evaluating, ...
    estim_specs = tf.estimator.EstimatorSpec(
      mode=mode,
      predictions=pred_classes,
      loss=loss_op,
      train_op=train_op,
      eval_metric_ops={'accuracy': acc_op})

    return estim_specs

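Then train and test. The `training:` lines in the log below are printed only by `neural_net_bn`, so the recorded run appears to have used the batch-normalization network.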

In [17]:
# Train the Model
training = True
learning_rate = 0.1
model = tf.estimator.Estimator(model_fn_reg)
model.train(input_fn_small, steps=num_steps)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_task_type': 'worker', '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f00f6176490>, '_save_checkpoints_steps': None, '_keep_checkpoint_every_n_hours': 10000, '_service': None, '_num_ps_replicas': 0, '_tf_random_seed': None, '_master': '', '_num_worker_replicas': 1, '_task_id': 0, '_log_step_count_steps': 100, '_model_dir': '/tmp/tmpFX5N4p', '_save_summary_steps': 100}
training: True
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Saving checkpoints for 1 into /tmp/tmpFX5N4p/model.ckpt.
INFO:tensorflow:loss = 3.07624, step = 1
INFO:tensorflow:global_step/sec: 95.0544
INFO:tensorflow:loss = 0.0973526, step = 101 (1.054 sec)
INFO:tensorflow:global_step/sec: 95.217
INFO:tensorflow:loss = 0.0571967, step = 201 (1.050 sec)
INFO:tensorflow:global_step/sec: 98.1937
INFO:tensorflo

<tensorflow.python.estimator.estimator.Estimator at 0x7f00f2cba4d0>

In [18]:
training = False
model.evaluate(input_fn_small, steps=10)
print("Teszt:")
model.evaluate(input_fn_test)

training: False
INFO:tensorflow:Starting evaluation at 2018-07-19-08:30:30
INFO:tensorflow:Restoring parameters from /tmp/tmpFX5N4p/model.ckpt-2000
INFO:tensorflow:Evaluation [1/10]
INFO:tensorflow:Evaluation [2/10]
INFO:tensorflow:Evaluation [3/10]
INFO:tensorflow:Evaluation [4/10]
INFO:tensorflow:Evaluation [5/10]
INFO:tensorflow:Evaluation [6/10]
INFO:tensorflow:Evaluation [7/10]
INFO:tensorflow:Evaluation [8/10]
INFO:tensorflow:Evaluation [9/10]
INFO:tensorflow:Evaluation [10/10]
INFO:tensorflow:Finished evaluation at 2018-07-19-08:30:30
INFO:tensorflow:Saving dict for global step 2000: accuracy = 0.138281, global_step = 2000, loss = 2.38173
Teszt:
training: False
INFO:tensorflow:Starting evaluation at 2018-07-19-08:30:31
INFO:tensorflow:Restoring parameters from /tmp/tmpFX5N4p/model.ckpt-2000
INFO:tensorflow:Finished evaluation at 2018-07-19-08:30:32
INFO:tensorflow:Saving dict for global step 2000: accuracy = 0.1269, global_step = 2000, loss = 2.55234


{'accuracy': 0.1269, 'global_step': 2000, 'loss': 2.5523436}

### Exercises
1. Tune the learning rate and the other parameters!
2. Combine the different regularization methods!