Mutinomial Softmax Classification

DAY 4

Introduction

RegressionClassification
Linear Regression : Exam Score PredictionBinary Classification : Pass/Fail
Multi-Level (Softmax) Classification : Letter Grades

Softmax Classification


linearModel

  • Where should 🌟 be classified to ? (A, B, or C?)
  • only can be answered through Binary (0, 1) BUT you can ask a lot of questions
    • Q1) A or not A; sig = 0.9 (> 0.5)
    • Q2) B or not B; sig = 0.4 (< 0.5)
    • Q3) C or not C; sig = 0.1 (<> 0.5)
    • => classified to ‘A’
  • but in case A: 0.9999, B: 0.50001, C: 0.0001, how should we classify? (both A and B passes the threshold)

  • Softmax Classification does not use Thresholds. Instead, only the MAX VALUE is recognized
    • in case there are two or more same max values, one is picked randomly


A,B,C

  • instead of doing it three times, put it in matrix and just calculate once
  • find maximum value from (A, B, C)


과정

  • One-Hot Encoding (argmax) : change into vector form with one ‘1.0’

Cost Function: CROSS-ENTROPY

\(S(Y) = \bar Y\) (model)LOG FUNCTION\(L=Y\) (real)
0.7
0.2
0.1
\(C(S, L) = - \sum L_i log(S_i)\)1.0
0.0
0.0

understanding cost function

LSCostexplained
 [1, 0, 0] (=>A)\(-1 \cdot log1 - 0 \cdot log0 - 0 \cdot log0 = 0\)since A is the correct answer, Cost = 0
[1, 0, 0][0, 1, 0] (=>B)\(-1 \cdot log0 - 0 \cdot log1 - 0 \cdot log0 = \infty\)mismatch between 0, 1 => infinity
 [0, 0, 1] (=>C)\(-1 \cdot log0 - 0 \cdot log0 - 0 \cdot log1 = \infty\)same as above

PyTorch implementation🔥


과정

import torch
import torch.nn.functional as F

# Training Data
x_train = torch.FloatTensor([[1,2, 1, 1], [2,1, 3, 2], [3,1,3,4], [4,1, 5, 5], [1,7,5,5], [1,2,5,6], [1,6,6,6], [1,7,7,7]])
y_train = torch.FloatTensor([[0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]] )

index_y = torch.argmax(y_train, 1) # takes index position of max value (ex: [0,0,1] => index_y = 2)

W = torch.randn([4,3], requires_grad=True) 
b = torch.randn([3], requires_grad = True)  # 3 biases required
optimizer = torch.optim.SGD([W, b], lr = 0.01) 

def model_SoftmaxClassification(x):
  return F.softmax(torch.matmul(x, W) + b)

for step in range(50000):
  prediction = model_SoftmaxClassification(x_train)
  cost = F.cross_entropy(prediction, index_y)

  optimizer.zero_grad() # 0까지 optimize
  cost.backward()       
  optimizer.step()

x_test = torch.FloatTensor([[1,8,8,8]]) 
model_test = model_SoftmaxClassification(x_test) # returns in form [_, _, _] 
index = torch.argmax(model_test.detach(), 1).item() # max value index 
labels = ['A', 'B', 'C']
print('Model with [1,8,8,8] is', labels[index])

‘Model with [1,8,8,8] is A’