Keras Programming

DAY 7 - 9

Keras

  • Keras Neural Networks:
    • If the hidden layers are removed, the model reduces to a pure classifier (binary, softmax, etc.)
    • If the pure model (no hidden layers) shows better accuracy => don't use hidden layers
  • Keras Tips:

    [activation] is set in the [Dense] layer (keras.layers), added via the [add] function
    [loss] is set in the [compile] function

    Task                      activation    loss
    Linear Regression         (none)        mse (mean squared error)
    Binary Classification     sigmoid       binary_crossentropy
    Softmax Classification    softmax       categorical_crossentropy
    (a sketch of all three patterns follows this list)
  • Disadvantage of Keras: no control over (modification of) layers (only add)
    • academia => PyTorch
    • industry => Keras (quick to implement, simple)

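A minimal sketch of where each setting lives, previewing the three full implementations later in these notes (layer sizes match those examples):

from keras.models import Sequential
from keras.layers import Dense

# Linear regression: no activation (linear output), mse loss
reg = Sequential()
reg.add(Dense(1, input_dim=1))
reg.compile(loss='mse', optimizer='adam')

# Binary classification: sigmoid activation, binary_crossentropy loss
binary = Sequential()
binary.add(Dense(1, activation='sigmoid', input_dim=2))
binary.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])

# Softmax classification: softmax activation, categorical_crossentropy loss
multi = Sequential()
multi.add(Dense(3, activation='softmax', input_dim=4))
multi.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
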
MNIST Data Set

  • Hand-written images and their labels
    • For training : 60,000
    • For Testing : 10,000
  • Image: 28-by-28 (pixels)

Code Explanation

  • Neural Network Characteristics: Compression Impact (see the sketch after this list):
    • ex) 784-size input -> hidden layer with 1 unit
    • => 784 units compressed into 1 unit => loss of feature information
    • => IF YOU DON'T HAVE MUCH INFORMATION ABOUT YOUR INPUT, PRESERVE AS MUCH AS POSSIBLE IN THE HIDDEN LAYERS

    • MNIST Data Set characteristics: compression is acceptable here because
      1. Simple
      2. Black-and-white
      3. Content is centered

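A minimal sketch (layer sizes borrowed from this section and the MNIST example below) contrasting a 1-unit bottleneck with a wider hidden layer that preserves more of a 784-feature input:

from keras.models import Sequential
from keras.layers import Dense

bottleneck = Sequential()
bottleneck.add(Dense(1, activation='relu', input_dim=784))  # 784 -> 1: most feature information is lost
bottleneck.add(Dense(10, activation='softmax'))             # the output layer only ever sees one number

wide = Sequential()
wide.add(Dense(256, activation='relu', input_dim=784))      # 784 -> 256: far more information preserved
wide.add(Dense(10, activation='softmax'))
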
Linear Regression Implementation

import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense

# Data
x_data = np.array([[1], [2], [3]])
y_data = np.array([[1], [2], [3]])

#Model, Cost, Train
model = Sequential()
model.add(Dense(1, input_dim=1))  # 1 output unit only
model.compile(loss = 'mse', optimizer = 'adam') # adam | sgd (up to you)
model.fit(x_data, y_data, epochs= 1000, verbose = 0)
model.summary()

# Inference
print(model.get_weights())
print(model.predict(np.array([[4]])))  # input shape (1, 1); expect output near 4

#Plot
plt.scatter(x_data, y_data)
plt.plot(x_data, y_data)
plt.grid(True)
plt.show()

  • Summary Output
    Model: "sequential"
    _________________________________________________________________
    Layer (type)                Output Shape              Param #   
    =================================================================
    dense (Dense)               (None, 1)                 2  [W = 1 + b = 1]       
                                                                      
    =================================================================
    Total params: 2
    Trainable params: 2
    Non-trainable params: 0
    _________________________________________________________________
    
  • Inference Output (verified by hand in the sketch below)
    [array([[1.1798639]], dtype=float32), array([-0.38375837], dtype=float32)]
    1/1 [==============================] - 0s 53ms/step
    [[4.335697]] => almost '4' => Correct
    
  • Plot Output: (figure omitted) scatter and line plot of the training data (x_data vs. y_data)

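The learned weights from the inference output can be plugged in by hand to confirm the prediction (numbers taken from the output above):

W, b = model.get_weights()    # W.shape == (1, 1), b.shape == (1,)
print(W[0, 0] * 4 + b[0])     # 1.1798639 * 4 - 0.38375837 ≈ 4.3357, matching predict()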

Binary Classification Implementation

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Data
x_data = np.array([[1, 2], [2, 3], [3, 1], [4,3], [5, 3], [6, 2]])
y_data = np.array([[0], [0], [0], [1], [1], [1]])

#Model, Cost, Train
model = Sequential()
model.add(Dense(1, input_dim=2, activation='sigmoid'))  # single output unit (class 0 or 1)
model.compile(loss = 'binary_crossentropy', optimizer = 'sgd', metrics=['accuracy'])
model.fit(x_data, y_data, epochs= 10000, verbose = 1)
model.summary()

# Inference
print(model.get_weights()) 
print(model.predict(x_data))
  • Model Output
    Model: "sequential"
    _________________________________________________________________
    Layer (type)                Output Shape              Param #   
    =================================================================
    dense (Dense)               (None, 1)                 3 [W = 2 + b = 1]        
                                                                     
    =================================================================
    Total params: 3
    Trainable params: 3
    Non-trainable params: 0
    _________________________________________________________________
    
  • Inference Output
    [array([[1.4749366 ],
         [0.31867325]], dtype=float32), array([-5.576844], dtype=float32)]
    1/1 [==============================] - 0s 70ms/step
    [[0.03033758] => 0
    [0.15829739]  => 0
    [0.30293486]  => 0
    [0.78226626]  => 1
    [0.94013095]  => 1
    [0.98035556]] => 1  => matches our sample labels (y_data); see the thresholding sketch below
    
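A small follow-up sketch: converting the sigmoid probabilities above into hard 0/1 labels with the usual 0.5 threshold:

probs = model.predict(x_data)        # sigmoid outputs in (0, 1)
labels = (probs > 0.5).astype(int)   # threshold at 0.5
print(labels)                        # expected: [[0], [0], [0], [1], [1], [1]]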

Softmax Classification Implementation

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Data
x_data = np.array([[1,2,1,1], [3,1,3,4], [4,1,5,5], [1,7,5,5], [1,2,5,6], [1,6,6,6], [1,7,7,7]])

y_data = np.array([[0,0,1], [0,0,1], [0,1,0], [0,1,0], [0,1,0], [1,0,0], [1,0,0]]) # one-hot labels; 3 possible classes

# Model, Cost

model = Sequential()
model.add(Dense(3, input_dim=4, activation='softmax')) # units=3 because there are 3 classes
model.compile(loss = 'categorical_crossentropy', optimizer = 'sgd', metrics=['accuracy'])

# Training
model.fit(x_data, y_data, epochs = 10000, verbose = 1)
model.summary()

# Inference
y_predict = model.predict(np.array([[1,11,7,9]]))
print(y_predict)
print('argmax : ', np.argmax(y_predict)) # returns index of max value 

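To decode predictions for a whole batch rather than a single sample, np.argmax can be applied row-wise; a short sketch reusing the training data above:

probs = model.predict(x_data)     # shape (7, 3); each row sums to 1
print(np.argmax(probs, axis=1))   # predicted class index per sample
print(np.argmax(y_data, axis=1))  # true class index, recovered from the one-hot labels
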
ANN Implementation

from keras.utils import to_categorical  # np_utils was removed from recent Keras versions
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense

#MNIST data

# train: (60000, 28, 28) images, (60000,) labels; test: (10000, 28, 28) images, (10000,) labels
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
#Check Shape
print(train_images.shape, train_labels.shape, test_images.shape, test_labels.shape) 

# Reshape 3D (60000, 28, 28) -> 2D (60000, 784)
# and convert to float, scaled to [0, 1]
train_images = train_images.reshape(train_images.shape[0], 784).astype('float32')/255.0
test_images = test_images.reshape(test_images.shape[0], 784).astype('float32')/255.0

# One-hot encode labels: (60000,) -> (60000, 10), one column per class
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

print(train_images.shape, train_labels.shape, test_images.shape, test_labels.shape)

# Model

model = Sequential()
model.add(Dense(256, activation='relu')) # Hidden Layer #1: units=256, activation = 'relu'
model.add(Dense(256, activation='relu')) # units=256, activation = 'relu'
model.add(Dense(256, activation='relu')) # units=256, activation = 'relu'
model.add(Dense(10, activation='softmax')) # Output Layer: Distinguish between numbers 0 ~ 9

# Loss: categorical (not binary) cross-entropy; optimizer: SGD (stochastic gradient descent)
model.compile(loss = 'categorical_crossentropy', optimizer = 'sgd', metrics=['accuracy'])

#Training                           
model.fit(train_images, 
          train_labels, 
          epochs = 5,    # 5 passes over the 60,000 training samples
          batch_size=32, # 60,000 / 32 = 1,875 batches per epoch
          verbose = 1)   # verbose = 1 : show intermediate results
                         # verbose = 0 : hide intermediate results
                                              
#Testing (the test images provided by Mnist)
cost, accuracy = model.evaluate(test_images, test_labels)

print('Accuracy: ', accuracy)
model.summary()
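
A quick arithmetic check (using only numbers already in the code) against the training log below:

print(60000 // 32)   # 1875 batches per epoch, matching the "1875/1875" steps in the log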
  • Testing Output (verbose, by Epoch)
    Epoch 1/5
    1875/1875 [==============================] - 11s 6ms/step - loss: 0.6092 - accuracy: 0.8393
    Epoch 2/5
    1875/1875 [==============================] - 11s 6ms/step - loss: 0.2518 - accuracy: 0.9267
    Epoch 3/5
    1875/1875 [==============================] - 11s 6ms/step - loss: 0.1945 - accuracy: 0.9441
    Epoch 4/5
    1875/1875 [==============================] - 10s 6ms/step - loss: 0.1592 - accuracy: 0.9543
    Epoch 5/5
    1875/1875 [==============================] - 11s 6ms/step - loss: 0.1347 - accuracy: 0.9602
    313/313 [==============================] - 1s 3ms/step - loss: 0.1313 - accuracy: 0.9608
    
  • Accuracy increases each epoch => final test accuracy ≈ 96%

  • Sometimes, when testing on real-world values, accuracy is lower than the training accuracy.
    • reason => the test data differs a lot from the training data
  • Summary Output

    Model: "sequential_7"
    _________________________________________________________________
    Layer (type)                Output Shape              Param #  [W + b] 
    =================================================================
    dense_25 (Dense)            (None, 256)               200960 [(784*256) + 256]   
                                                                      
    dense_26 (Dense)            (None, 256)               65792  [(256*256) + 256]   
                                                                      
    dense_27 (Dense)            (None, 256)               65792  [(256*256) + 256]   
                                                                      
    dense_28 (Dense)            (None, 10)                2570   [(256*10) + 10]   
                                                                      
    =================================================================
    Total params: 335,114 [sum over all layers]
    Trainable params: 335,114
    Non-trainable params: 0
    _________________________________________________________________
    
  • One-Hot Encoding (see the sketch below)
    • the position of the 1 determines the label
    • ex) [1, 0, 0, 0, 0, …] => 0
    • with softmax => number of output units = number of classes
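
A minimal sketch of one-hot encoding and decoding with to_categorical (the same helper used in the MNIST code above):

import numpy as np
from keras.utils import to_categorical

labels = np.array([0, 1, 2])
onehot = to_categorical(labels)    # [[1,0,0], [0,1,0], [0,0,1]]
print(onehot)
print(np.argmax(onehot, axis=1))   # decode back to class indices: [0 1 2]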