Interpolation, PCA/LDA, Overfitting

DAY 10

Interpolation vs. Linear Regression


  • The curve does not need to be linear; it passes through every data point (this is interpolation)
  • Interpolation with polynomials: \(y = a_2 x^2 + a_1 x + a_0\)
  • Find the coefficients \(a_2, a_1, a_0\) by substituting the points \((1,10), (2,15), (3,40)\) (see the sketch below)

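Substituting the three points gives a 3 x 3 linear system for the coefficients. A minimal sketch solving it with NumPy (the library choice is an assumption; the notes only give the points):

```python
import numpy as np

# Each point (x, y) contributes one equation: a2*x^2 + a1*x + a0 = y
xs = np.array([1.0, 2.0, 3.0])
ys = np.array([10.0, 15.0, 40.0])

A = np.vander(xs, 3)                 # rows are [x^2, x, 1]
a2, a1, a0 = np.linalg.solve(A, ys)
print(a2, a1, a0)                    # 10.0 -25.0 25.0  =>  y = 10x^2 - 25x + 25
```

The resulting parabola passes exactly through all three points, which is what distinguishes interpolation from a least-squares regression line.
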
  • How is this related to neural network training?
  • Mathematically, a neural network computes a nested function of the same kind, where \(a\) is the activation function:
\[Y = a( a( a (X \cdot W_1 + b_1 ) \cdot W_2 + b_2) \cdot W_o + b_o)\]
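
A minimal sketch of this nested computation as a NumPy forward pass (the layer sizes, random weights, and the ReLU choice for \(a\) are assumptions for illustration):

```python
import numpy as np

def a(z):
    # Activation function; ReLU is an assumption, the notes only write "a"
    return np.maximum(0.0, z)

rng = np.random.default_rng(0)
X = rng.normal(size=(1, 100))                       # one 100-dim input
W1, b1 = rng.normal(size=(100, 64)), np.zeros(64)   # hidden layer 1
W2, b2 = rng.normal(size=(64, 32)), np.zeros(32)    # hidden layer 2
Wo, bo = rng.normal(size=(32, 10)), np.zeros(10)    # output layer

# Y = a( a( a(X·W1 + b1)·W2 + b2 )·Wo + bo )
Y = a(a(a(X @ W1 + b1) @ W2 + b2) @ Wo + bo)
print(Y.shape)                                      # (1, 10)
```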

MNIST

  • A dataset where a large amount of training data and labels is provided (see the loading sketch below)
  • The most well-known use of function approximation with neural networks: Deep Reinforcement Learning
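
A one-line loading sketch through Keras (the loader is an assumption; any MNIST source works):

```python
import tensorflow as tf

# 60,000 labeled training images and 10,000 labeled test images of digits
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
print(x_train.shape, y_train.shape)   # (60000, 28, 28) (60000,)
```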

PCA / LDA

  • Example: suppose you have a (1 x 100) feature vector and need to reduce it to 20 features
    • FEATURE SELECTION: choose the best 20 features according to some criterion
    • FEATURE EXTRACTION: no per-feature criterion; instead choose a matrix \(W\) of size (100 x 20) that maps the vector to a (1 x 20) vector
    • How to determine the matrix \(W\)? => PCA / LDA (see the sketch below)
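
A minimal sketch obtaining such a \(W\) with scikit-learn's PCA (the library and the random sample data are assumptions):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 500 samples with 100 features each
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))

pca = PCA(n_components=20).fit(X)
W = pca.components_.T                # the (100 x 20) matrix W
x = X[:1]                            # one (1 x 100) vector
x_reduced = (x - pca.mean_) @ W      # the extracted (1 x 20) vector
print(W.shape, x_reduced.shape)      # (100, 20) (1, 20)
```

Internally, PCA picks the 20 directions of largest variance, which is the criterion described in the PCA section below.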

PCA


  • Preserves the shape of the data distribution
  • Chooses directions by the shape of the distribution (spread the data as widely as possible; it does not matter if the classes overlap)
  • Example: face recognition (see the sketch after this list)
    • Color is not required -> reduce the dimensionality by a factor of 3 (drop the RGB channels)
    • Consider only the features of the face
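
A tiny sketch of the factor-of-3 reduction (the image size and the channel-averaging rule are assumptions):

```python
import numpy as np

# Hypothetical 64 x 64 RGB face image
img = np.random.rand(64, 64, 3)
gray = img.mean(axis=2)              # average the RGB channels away
print(img.size, gray.size)           # 12288 -> 4096: one third the dimensions
```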

LDA


  • Designed with classification in mind
  • Does not consider the shape of the distribution; only cares that the classes mix as little as possible
  • Fisher’s linear discriminant function

    \(J(w) = \frac{|\tilde\mu_1 - \tilde\mu_2|^2 \;\rightarrow\; \text{Maximize!}}{\tilde S_1^2 + \tilde S_2^2 \;\rightarrow\; \text{Minimize!}}\)

    • Maximizing the numerator (the distance between the class means) => better separation between the classes
    • Minimizing the denominator (the sum of the within-class variances) => \(J(w)\) gets bigger (see the sketch below)
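
A minimal sketch computing such a discriminant with scikit-learn (the two-class toy data is an assumption):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical two-class data: 50 samples per class, 5 features
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 5)),
               rng.normal(2.0, 1.0, size=(50, 5))])
y = np.array([0] * 50 + [1] * 50)

lda = LinearDiscriminantAnalysis(n_components=1)
z = lda.fit_transform(X, y)          # 1-D projection maximizing J(w)
print(z.shape)                       # (100, 1)
```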

Overfitting

(figure: noisy data fitted by several candidate lines, including a green and a black one)

  • Consider which of the three candidate lines (models) is best
  • \(Q.\): the green line or the black line?
    • \(A.\): it depends on how much data is provided
      • With the current data: GREEN
      • Later, once more data arrives, if the model is overfitting too much: smooth the model (move from the green line toward the black line); see the sketch below
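
A minimal sketch of the green-vs-black tradeoff using polynomial fits (the toy data and the polynomial degrees are assumptions):

```python
import numpy as np

# Noisy samples of a smooth underlying function
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 15)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)

green = np.polyfit(x, y, deg=9)      # wiggly fit that chases every point
black = np.polyfit(x, y, deg=3)      # smoother fit that ignores the noise

x_test = 0.05                        # a point between the training samples
print(np.polyval(green, x_test), np.polyval(black, x_test))
```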


  • Solutions to the overfitting problem:
    1. Autoencoding
    2. Dropout
    3. Regularization (omitted)

Autoencoding

  • Works like PCA (dimensionality reduction)


  • Makes the model smoother by reducing dimensions
  • In the encoding stage, outliers (unnecessary, overly sensitive data) are eliminated
  • In the decoding stage, the data is enlarged back to its original size (but smoother); see the sketch below
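
A minimal encode/decode sketch in Keras (the 784 -> 32 -> 784 sizes and the activations are assumptions, sized for MNIST inputs):

```python
import tensorflow as tf

# Encoder compresses 784 inputs down to 32 dims; decoder restores the size
autoencoder = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(784,)),   # encode
    tf.keras.layers.Dense(784, activation="sigmoid"),                   # decode
])
autoencoder.compile(optimizer="adam", loss="mse")
# The training target is the input itself:
# autoencoder.fit(x_train, x_train, epochs=10)
```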

Dropout

  • Easier to implement than autoencoding (popular among software developers)
  • Autoencoding is more widely used in statistics and theoretical work


tf.nn.dropout(layer, keep_prob=0.9)
  • 10% of the layer's output values are randomly dropped and 90% are kept (scaled by 1/keep_prob; TensorFlow 1.x API)
  • Avoids excessive fitting to the training data (a Keras equivalent is sketched below)
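
For reference, the same idea in the Keras API, where the argument is the drop rate rather than keep_prob (the layer sizes are assumptions):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dropout(0.1),    # rate=0.1 drops 10%, i.e. keep_prob=0.9
    tf.keras.layers.Dense(10),
])
# Dropout is active only during training and disabled at inference time.
```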