## Multitask training

We train a convolutional network to perform several tasks at once: in our case, recognizing digits and separating even digits from odd ones.

![multitask](http://ruder.io/content/images/2017/05/mtl_images-001-2.png)

As for the structure, we will use separate task-specific hidden layers on top of a shared convolutional layer.

In [1]:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data
%matplotlib inline  
print ("PACKAGES LOADED")

PACKAGES LOADED


# Loading the dataset

The secondary (even/odd) labels are generated from the digit labels: column 0 marks even digits, column 1 marks odd digits.

In [2]:
mnist = input_data.read_data_sets('data/', one_hot=True)
trainimg   = mnist.train.images
trainlabel = mnist.train.labels
testimg    = mnist.test.images
testlabel  = mnist.test.labels

trainlabel_secondary = np.zeros([trainlabel.shape[0],2], dtype='float')
for i in range(trainlabel.shape[0]):
    trainlabel_secondary[i,(np.argmax(trainlabel[i])%2)] = 1
testlabel_secondary = np.zeros([testlabel.shape[0],2], dtype='float')
for i in range(testlabel.shape[0]):
    testlabel_secondary[i,(np.argmax(testlabel[i])%2)] = 1
    
print ("MNIST ready")
print(trainlabel[1:10])
print(trainlabel_secondary[1:10])

Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
MNIST ready
[[ 0.  0.  0.  1.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  1.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  1.  0.  0.  0.]
 [ 0.  1.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]
 [ 0.  1.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]]
[[ 0.  1.]
 [ 1.  0.]
 [ 1.  0.]
 [ 0.  1.]
 [ 1.  0.]
 [ 0.  1.]
 [ 1.  0.]
 [ 0.  1.]
 [ 1.  0.]]
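
The two loops in the cell above can also be written without an explicit Python loop. The following is a minimal sketch, assuming one-hot digit labels shaped like `trainlabel`; the helper name `parity_one_hot` is made up for illustration and is not part of the notebook.

```python
import numpy as np

def parity_one_hot(onehot_labels):
    """Map one-hot digit labels to one-hot parity labels (column 0: even, column 1: odd)."""
    digits = np.argmax(onehot_labels, axis=1)
    # Row k of the 2x2 identity matrix is the one-hot vector for class k
    return np.eye(2, dtype='float')[digits % 2]

# Small hand-made example: the digits 3, 4 and 6
demo = np.eye(10)[[3, 4, 6]]
print(parity_one_hot(demo))
```

The digits 3, 4 and 6 are the first three rows of the printed `trainlabel[1:10]`, so the result can be compared against the first three rows of the secondary labels shown above.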


# Selecting the GPU

In [3]:
device_type = "/gpu:0"
tf.set_random_seed(1024)
np.random.seed(1024)

# Defining the CNN

It is important to reshape the flat input vector into an image first.

Standard network structure:

1. Convolution, followed by ReLU
2. Pooling (max pooling)
3. A task-specific fully connected layer for each task
4. Output layers
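
The task-specific layers in this notebook take `14*14*64` inputs. The sketch below retraces where that number comes from, assuming `'SAME'` padding for both the convolution and the pooling step; the helper `same_padding_out` is a hypothetical name used only here.

```python
import math

def same_padding_out(size, stride):
    # With 'SAME' padding the output size is ceil(input size / stride)
    return math.ceil(size / stride)

side = 28                            # MNIST images are 28x28 pixels
side = same_padding_out(side, 1)     # 3x3 convolution with stride 1 keeps 28
side = same_padding_out(side, 2)     # 2x2 max pooling with stride 2 halves it
channels = 64                        # number of convolution filters
flat_size = side * side * channels   # length of the vectorized feature map
print(side, flat_size)
```

If a second pooling layer or a different stride were added, the first dimension of the task-specific weight matrices would have to change accordingly.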


In [4]:
with tf.device(device_type): # <= This is optional
    n_input  = 784
    n_task_neurons = 100
    n_output = 10
    n_output_secondary = 2
    weights  = {
        'wc1': tf.Variable(tf.random_normal([3, 3, 1, 64], stddev=0.1)),
        'wt1': tf.Variable(tf.random_normal([14*14*64, n_task_neurons], stddev=0.1)),
        'wt2': tf.Variable(tf.random_normal([14*14*64, n_task_neurons], stddev=0.1)),
        'wo1': tf.Variable(tf.random_normal([n_task_neurons, n_output], stddev=0.1)),
        'wo2': tf.Variable(tf.random_normal([n_task_neurons, n_output_secondary], stddev=0.1))
        
    }
    biases   = {
        'bc1': tf.Variable(tf.random_normal([64], stddev=0.1)),
        'bt1': tf.Variable(tf.random_normal([n_task_neurons], stddev=0.1)),
        'bt2': tf.Variable(tf.random_normal([n_task_neurons], stddev=0.1)),
        'bo1': tf.Variable(tf.random_normal([n_output], stddev=0.1)),
        'bo2': tf.Variable(tf.random_normal([n_output_secondary], stddev=0.1))
    }
    def conv_simple(_input, _w, _b):
        # Reshape input
        _input_r = tf.reshape(_input, shape=[-1, 28, 28, 1])
        # Convolution
        _conv1 = tf.nn.conv2d(_input_r, _w['wc1'], strides=[1, 1, 1, 1], padding='SAME')
        # Add-bias
        _conv2 = tf.nn.bias_add(_conv1, _b['bc1'])
        # Pass ReLu
        _conv3 = tf.nn.relu(_conv2)
        # Max-pooling
        _pool  = tf.nn.max_pool(_conv3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
        # Vectorize
        _dense = tf.reshape(_pool, [-1, _w['wt1'].get_shape().as_list()[0]])
        # Main branch
        _task1 = tf.nn.relu(tf.add(tf.matmul(_dense, _w['wt1']), _b['bt1']))
        _out1 = tf.add(tf.matmul(_task1, _w['wo1']), _b['bo1'])
        
        # Secondary branch
        _task2 = tf.nn.relu(tf.add(tf.matmul(_dense, _w['wt2']), _b['bt2']))
        _out2 = tf.add(tf.matmul(_task2, _w['wo2']), _b['bo2'])
        
        # Return the outputs
        out = {
            'out1': _out1, 'out2': _out2
        }
        return out
print ("CNN ready")

CNN ready


# Training parameters

In [7]:
# tf Graph input
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_output])
y2 = tf.placeholder(tf.float32, [None, n_output_secondary])
# Parameters
learning_rate   = 0.001
training_epochs = 10
batch_size      = 100
display_step    = 1
alpha           = 1
# Functions! 
with tf.device(device_type): # <= This is optional
    _pred = conv_simple(x, weights, biases)
    _pred1 = _pred['out1']
    _pred2 = _pred['out2']
    
    loss_main = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=_pred1, labels=y))
    optm_main = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss_main)
    _corr = tf.equal(tf.argmax(_pred1,1), tf.argmax(y,1)) # Count corrects
    accr = tf.reduce_mean(tf.cast(_corr, tf.float32)) # Accuracy
    
    loss_secondary= tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=_pred2, labels=y2))
    optm_secondary = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss_secondary)
    _corr_secondary = tf.equal(tf.argmax(_pred2,1), tf.argmax(y2,1)) # Count corrects
    accr_secondary = tf.reduce_mean(tf.cast(_corr_secondary, tf.float32)) # Accuracy
    
    loss_joint = loss_main+alpha*loss_secondary
    optm_joint = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss_joint)
    init = tf.global_variables_initializer()
tf.summary.scalar("loss_main", loss_main)
tf.summary.scalar("loss_secondary", loss_secondary)
# Create a summary to monitor accuracy tensor
tf.summary.scalar("accuracy_main", accr)
tf.summary.scalar("accuracy_secondary", accr_secondary)
# Merge all summaries into a single op
merged_summary_op = tf.summary.merge_all()
# Saver 
save_step = 1
savedir = "nets/"
saver = tf.train.Saver(max_to_keep=3) 
print ("Network Ready to Go!")

Network Ready to Go!
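
The cell above defines one loss per task and a joint loss that adds the secondary loss weighted by `alpha`. As a rough illustration of that arithmetic outside TensorFlow, here is a NumPy sketch; the function `softmax_cross_entropy` and all logits and labels in it are made up for the example and only mimic what the TensorFlow ops compute.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch, computed in a numerically stable way."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(labels * log_probs).sum(axis=1).mean())

# Made-up logits and one-hot labels for a batch of two samples (illustration only)
logits_main = np.array([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]])
labels_main = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
logits_secondary = np.array([[1.0, -1.0], [0.0, 2.0]])
labels_secondary = np.array([[1.0, 0.0], [0.0, 1.0]])

alpha = 1  # weight of the secondary task, same role as `alpha` in the notebook
loss_main = softmax_cross_entropy(logits_main, labels_main)
loss_secondary = softmax_cross_entropy(logits_secondary, labels_secondary)
loss_joint = loss_main + alpha * loss_secondary
print(loss_main, loss_secondary, loss_joint)
```

A larger `alpha` makes the secondary task count more in the joint objective; `alpha = 0` would reduce joint training to training the main task alone.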


# Training the network

## Option 1: train the tasks alternately
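
In this option each mini-batch updates the network on only one of the two tasks, chosen by a coin flip (`np.random.rand() < 0.5`). The sketch below simulates just that choice, assuming a nominal 550 mini-batches per epoch (55000 training images with a batch size of 100); it does no training and only counts how often each task is picked.

```python
import numpy as np

rng = np.random.RandomState(1024)   # seed value borrowed from the notebook
total_batch = 550                   # nominal number of mini-batches per epoch
counts = {'main': 0, 'secondary': 0}
for _ in range(total_batch):
    # Same coin flip as in the training loop: one task per mini-batch
    task = 'main' if rng.rand() < 0.5 else 'secondary'
    counts[task] += 1
print(counts)
```

Each task ends up with roughly half of the updates, which is worth keeping in mind when comparing the per-task costs of this option with the joint cost of the second option.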

In [8]:
do_train = 1
sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True))
sess.run(init)

In [9]:
summary_writer = tf.summary.FileWriter("tb_log/", graph=tf.get_default_graph())
total_batch = trainlabel.shape[0] / batch_size
if do_train == 1:
    for epoch in range(training_epochs):
        avg_cost1 = 0.
        avg_cost2 = 0.
        for start, end in zip(range(0, trainlabel.shape[0], batch_size), 
                              range(batch_size, trainlabel.shape[0], batch_size)):
            batch_X = trainimg[start:end]
            batch_Y1 = trainlabel[start:end]
            batch_Y2 = trainlabel_secondary[start:end]
            # Pick one task at random for this batch and take an optimizer step on its loss only
            if np.random.rand()<0.5:
                _, summary = sess.run([optm_main, merged_summary_op], feed_dict={x: batch_X, y: batch_Y1, y2: batch_Y2})
                summary_writer.add_summary(summary, epoch * total_batch + start/batch_size)
                # Accumulate the main loss, scaled by 1/(2*total_batch); since each task sees
                # about half of the batches, the sum is roughly a quarter of the mean batch loss
                avg_cost1 += sess.run(loss_main, feed_dict={x: batch_X, y: batch_Y1, y2: batch_Y2})/total_batch/2
            else:
                _, summary = sess.run([optm_secondary, merged_summary_op], feed_dict={x: batch_X, y: batch_Y1, y2: batch_Y2})
                summary_writer.add_summary(summary, epoch * total_batch + start/batch_size)
                # Accumulate the secondary loss with the same scaling
                avg_cost2 += sess.run(loss_secondary, feed_dict={x: batch_X, y: batch_Y1, y2: batch_Y2})/total_batch/2
        # Display logs per epoch step (the accuracies use only the last mini-batch of the epoch)
        if epoch % display_step == 0: 
            print ("Epoch: %03d/%03d costs: %.9f, %.9f" % (epoch, training_epochs, avg_cost1, avg_cost2))
            train_acc_main = sess.run(accr, feed_dict={x: batch_X, y: batch_Y1, y2: batch_Y2})
            print ("Main training accuracy: %.3f" % (train_acc_main))
            train_acc_secondary= sess.run(accr_secondary, feed_dict={x: batch_X, y: batch_Y1, y2: batch_Y2})
            print ("Secondary training accuracy: %.3f" % (train_acc_secondary))

        # Save Net
        if epoch % save_step == 0:
            saver.save(sess, "nets/cnn_mnist_simple.ckpt-" + str(epoch))
    print ("Optimization Finished.")

Epoch: 000/010 costs: 0.086590628, 0.051730368
Main training accuracy: 1.000
Secondary training accuracy: 1.000
Epoch: 001/010 costs: 0.030354142, 0.017160866
Main training accuracy: 1.000
Secondary training accuracy: 1.000
Epoch: 002/010 costs: 0.019866445, 0.012489429
Main training accuracy: 1.000
Secondary training accuracy: 1.000
Epoch: 003/010 costs: 0.015507267, 0.009713319
Main training accuracy: 1.000
Secondary training accuracy: 1.000
Epoch: 004/010 costs: 0.012827475, 0.007158817
Main training accuracy: 1.000
Secondary training accuracy: 1.000
Epoch: 005/010 costs: 0.009476832, 0.006330976
Main training accuracy: 1.000
Secondary training accuracy: 1.000
Epoch: 006/010 costs: 0.008574930, 0.005211284
Main training accuracy: 1.000
Secondary training accuracy: 1.000
Epoch: 007/010 costs: 0.006651645, 0.004511672
Main training accuracy: 1.000
Secondary training accuracy: 1.000
Epoch: 008/010 costs: 0.005502414, 0.003888990
Main training accuracy: 1.000
Secondary training accuracy
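
Both training loops build their mini-batch boundaries by zipping two `range` objects. Because the second range stops before `trainlabel.shape[0]`, the final batch is never visited. The sketch below reproduces that construction next to a variant that keeps the last full batch; the function names are hypothetical and the sample count of 55000 is assumed to match the MNIST training split used here.

```python
def batch_bounds(n_samples, batch_size):
    """Mini-batch boundaries as built in the training loops of this notebook."""
    return list(zip(range(0, n_samples, batch_size),
                    range(batch_size, n_samples, batch_size)))

def batch_bounds_full(n_samples, batch_size):
    """Variant whose end range includes n_samples, so the last full batch is kept."""
    return list(zip(range(0, n_samples, batch_size),
                    range(batch_size, n_samples + 1, batch_size)))

n_samples, batch_size = 55000, 100
original = batch_bounds(n_samples, batch_size)
full = batch_bounds_full(n_samples, batch_size)
print(len(original), original[-1])
print(len(full), full[-1])
```

With these numbers the original construction yields 549 batches and the variant 550. Neither version handles a trailing partial batch when the sample count is not a multiple of the batch size.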

## Option 2: train the whole network jointly

In [11]:
summary_writer = tf.summary.FileWriter("tb_log/", graph=tf.get_default_graph())
total_batch = trainlabel.shape[0] / batch_size
if do_train == 1:
    for epoch in range(training_epochs):
        avg_cost1 = 0.
        for start, end in zip(range(0, trainlabel.shape[0], batch_size), 
                              range(batch_size, trainlabel.shape[0], batch_size)):
            batch_X = trainimg[start:end]
            batch_Y1 = trainlabel[start:end]
            batch_Y2 = trainlabel_secondary[start:end]
            # Fit training using batch data
            _, summary = sess.run([optm_joint, merged_summary_op], feed_dict={x: batch_X, y: batch_Y1, y2: batch_Y2})
            summary_writer.add_summary(summary, epoch * total_batch + start/batch_size)
            # Accumulate half of the joint loss, i.e. the mean of the two task losses when alpha = 1
            avg_cost1 += sess.run(loss_joint, feed_dict={x: batch_X, y: batch_Y1, y2: batch_Y2})/total_batch/2

        # Display logs per epoch step (the accuracies use only the last mini-batch of the epoch)
        if epoch % display_step == 0: 
            print ("Epoch: %03d/%03d costs: %.9f" % (epoch, training_epochs, avg_cost1))
            train_acc_main = sess.run(accr, feed_dict={x: batch_X, y: batch_Y1, y2: batch_Y2})
            print ("Main training accuracy: %.3f" % (train_acc_main))
            train_acc_secondary= sess.run(accr_secondary, feed_dict={x: batch_X, y: batch_Y1, y2: batch_Y2})
            print ("Secondary training accuracy: %.3f" % (train_acc_secondary))

        # Save Net
        if epoch % save_step == 0:
            saver.save(sess, "nets/cnn_mnist_simple.ckpt-" + str(epoch))
    print ("Optimization Finished.")

Epoch: 000/010 costs: 0.008094425
 Main training accuracy: 1.000
 Secondary training accuracy: 1.000
Epoch: 001/010 costs: 0.005068405
 Main training accuracy: 1.000
 Secondary training accuracy: 1.000
Epoch: 002/010 costs: 0.003438783
 Main training accuracy: 1.000
 Secondary training accuracy: 1.000
Epoch: 003/010 costs: 0.002566073
 Main training accuracy: 1.000
 Secondary training accuracy: 1.000
Epoch: 004/010 costs: 0.002257223
 Main training accuracy: 1.000
 Secondary training accuracy: 1.000
Epoch: 005/010 costs: 0.001385671
 Main training accuracy: 1.000
 Secondary training accuracy: 1.000
Epoch: 006/010 costs: 0.001210879
 Main training accuracy: 1.000
 Secondary training accuracy: 1.000
Epoch: 007/010 costs: 0.001055906
 Main training accuracy: 1.000
 Secondary training accuracy: 1.000
Epoch: 008/010 costs: 0.000711913
 Main training accuracy: 1.000
 Secondary training accuracy: 1.000
Epoch: 009/010 costs: 0.000540464
 Main training accuracy: 1.000
 Secondary training accura

## Exercises
1. Test the network!
2. Create a new secondary task! (e.g. digits made of straight lines, of curved lines, or a mix of both)
3. Optimize the parameters.
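
As a possible starting point for the second exercise, the sketch below derives a three-class secondary label from the digit labels through a lookup table. The grouping of digits into straight, curved and mixed shapes is an assumption made purely for illustration, and `group_one_hot` is a hypothetical helper, not part of the notebook.

```python
import numpy as np

# Hypothetical grouping of digits by stroke shape; this assignment is an
# assumption made for illustration and is open to debate.
STRAIGHT, CURVED, MIXED = 0, 1, 2
digit_to_group = np.array([CURVED,    # 0
                           STRAIGHT,  # 1
                           MIXED,     # 2
                           CURVED,    # 3
                           STRAIGHT,  # 4
                           MIXED,     # 5
                           CURVED,    # 6
                           STRAIGHT,  # 7
                           CURVED,    # 8
                           MIXED])    # 9

def group_one_hot(onehot_labels, mapping, n_groups):
    """Turn one-hot digit labels into one-hot group labels via a digit -> group lookup."""
    digits = np.argmax(onehot_labels, axis=1)
    return np.eye(n_groups, dtype='float')[mapping[digits]]

# Hand-made example: the digits 1, 8 and 5
demo = np.eye(10)[[1, 8, 5]]
print(group_one_hot(demo, digit_to_group, 3))
```

With three groups, `n_output_secondary` would have to be set to 3 so that the secondary output layer and the `y2` placeholder match the new labels.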