[Deep Learning Network Operations] Artificial Neurons and Dense Layers: Concepts and Code Implementation


1. Artificial Neurons

1) Parametric Functions

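Normal Function : a function that simply takes an input and produces an output according to a fixed rule.
Parametric Function : a function that holds its own parameters and uses them to compute the value z.
- In f(x ; $\theta$), x is the input and $\theta$ denotes the parameters the function holds.
- e.g.) y = ax + b → f(x ; a,b) : a is called the slope and b the intercept. When the parameter values change, the shape of the function changes, and so does the computation that produces the output.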
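To make the idea of f(x ; a, b) concrete, here is a minimal sketch in plain Python. The helper name `make_line` is hypothetical and chosen only for illustration: it fixes the parameters a and b and returns a function of x alone.

```python
def make_line(a, b):
    """Return f(x; a, b) = a*x + b with the parameters a and b fixed."""
    def f(x):
        return a * x + b
    return f

f1 = make_line(a=2.0, b=1.0)    # slope 2, intercept 1
f2 = make_line(a=-0.5, b=3.0)   # same functional form, different parameters

# The same input gives different outputs because the parameters differ.
print(f1(4.0))  # 2.0 * 4.0 + 1.0 = 9.0
print(f2(4.0))  # -0.5 * 4.0 + 3.0 = 1.0
```

Changing a or b yields a different function even though the rule "multiply, then add" stays the same.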

2) Hierarchy of Tensor Computations

(1) Zeroth-order Tensor Operations ; 0-dimensional tensor (scalar)

(2) First-order Tensor Operations ; 1-dimensional tensor (vector)

(3) Second-order Tensor Operations ; 2-dimensional tensor (matrix)

(4) Third-order Tensor Operations ; 3-dimensional tensor
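The four orders can be inspected directly from an array's number of axes. The sketch below uses NumPy rather than TensorFlow (an assumption made to keep the example lightweight), and the shapes are arbitrary illustrative values.

```python
import numpy as np

scalar = np.array(3.0)               # zeroth-order tensor, shape ()
vector = np.array([1.0, 2.0, 3.0])   # first-order tensor, shape (3,)
matrix = np.ones((2, 3))             # second-order tensor, shape (2, 3)
tensor3 = np.zeros((4, 2, 3))        # third-order tensor, shape (4, 2, 3)

# ndim is the order (number of axes) of each tensor.
for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("tensor3", tensor3)]:
    print(f"{name}: order={t.ndim}, shape={t.shape}")
```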

3) Affine Functions with N features

(1) Affine Functions with 1 feature (a single x value)

Here the input x is a scalar, and w and b are the parameters built into the function. The output z is also a scalar.

Affine Transformation : $z = xw + b$ ; multiply the input by the weight, then add the bias.
Weighted Sum : $z = xw$

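(2) Affine Functions with N Features (multiple x values)

Weighted Sum : $(\vec{x})^T\vec{w}$
- The weight attached to each x indicates how important that x is when computing z.
- If a weight is large, even a small change in its x changes z considerably, so that x has a strong influence on the output.
Affine Transformation : $(\vec{x})^T\vec{w} + b$
- Here $(\vec{x})^T$ and $\vec{w}$ have the same number of elements.
- (Naturally! Each element of w is multiplied with its corresponding x.)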
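As a small worked example of the weighted sum and the role of the weights, the sketch below uses NumPy with made-up values for x, w, and b (all assumptions for illustration). It also checks how much z moves when one feature is nudged by 0.1.

```python
import numpy as np

def affine(x, w, b):
    """Affine transformation z = x^T w + b for a single sample."""
    return x @ w + b

x = np.array([1.0, 2.0, 3.0])    # N = 3 features
w = np.array([0.5, -1.0, 2.0])   # one weight per feature
b = 0.1

weighted_sum = x @ w             # 0.5 - 2.0 + 6.0 = 4.5
z = affine(x, w, b)              # 4.5 + 0.1 = 4.6

# Nudge one feature at a time by 0.1; the change in z equals weight * 0.1.
dz_small_w = affine(x + np.array([0.1, 0.0, 0.0]), w, b) - z   # about 0.05
dz_large_w = affine(x + np.array([0.0, 0.0, 0.1]), w, b) - z   # about 0.20

print(weighted_sum, z, dz_small_w, dz_large_w)
```

The feature with the larger weight (2.0) shifts z four times as much as the one with weight 0.5 for the same change in x.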

4) Activation Functions

(1) Sigmoid

$g(x) = \sigma(x) = \frac{1}{1+e^{-x}}$

(2) Tanh

$g(x) = \tanh(x) = \frac{e^x - e^{-x}}{e^x+e^{-x}}$

(3) ReLU

The most commonly used activation function.
$g(x) = \mathrm{ReLU}(x) = \max(0,x)$

5) Artificial Neurons

Artificial Neuron = Affine Function + Activation Function
$f((\vec{x})^T ; \vec{w}, b)$ : Affine Function
$g(z)$ : Activation Function
$\Downarrow$
Combining the functions f and g into a single neuron $\nu$ gives $\nu(\vec{x}; \vec{w} , b) = g((\vec{x})^T\vec{w} + b)$
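The composition $\nu = g \circ f$ can be sketched in a few lines of NumPy. The function names `sigmoid`, `relu`, and `neuron` and the input values are assumptions chosen so the result is easy to check by hand.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(z, 0.0)

def neuron(x, w, b, g=sigmoid):
    """nu(x; w, b) = g(x^T w + b): affine function followed by activation."""
    z = x @ w + b      # affine function f
    return g(z)        # activation function g

x = np.array([1.0, -2.0])
w = np.array([2.0, 1.0])
b = 0.0

a_sigmoid = neuron(x, w, b)           # z = 2 - 2 + 0 = 0, sigmoid(0) = 0.5
a_relu = neuron(x, w, b, g=relu)      # relu(0) = 0.0
print(a_sigmoid, a_relu)
```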

6) Minibatch in Artificial Neurons

Minibatch : when training, the AI does not learn from the huge dataset all at once but splits it into smaller units and learns from them one unit at a time.
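A minimal sketch of this idea, assuming NumPy and a hypothetical helper `iterate_minibatches`: the dataset is cut into chunks of `batch_size` rows, and the affine part of a neuron is applied to a whole chunk at once with a single matrix-vector product.

```python
import numpy as np

def iterate_minibatches(X, batch_size):
    """Yield consecutive row slices of X with at most batch_size rows each."""
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size]

X = np.arange(20, dtype=float).reshape(10, 2)   # 10 samples, 2 features
w = np.array([1.0, -1.0])
b = 0.5

batch_shapes = []
for X_batch in iterate_minibatches(X, batch_size=4):
    z = X_batch @ w + b          # one z per sample in the minibatch
    batch_shapes.append(z.shape)

print(batch_shapes)   # the last minibatch is smaller: 10 = 4 + 4 + 2
```

With a minibatch the input becomes a matrix of shape (batch size, number of features), so the output z becomes a vector with one entry per sample.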

7) Implementing an Artificial Neuron in Code

(1) Affine Function with N features

import tensorflow as tf
from tensorflow.keras.layers import Dense

# input: 1 sample with 10 features
x = tf.random.uniform(shape=(1, 10), minval=0, maxval=10)
print(x.shape, '\n', x)

# a Dense layer with 1 unit and no activation acts as an affine function
dense = Dense(units=1)
y_tf = dense(x)  # forward propagation (TensorFlow); the first call creates W and B

W, B = dense.get_weights()
y_man = tf.linalg.matmul(x, W) + B  # forward propagation (manual)

# print input and parameters
print("===== Input / Weight / Bias ======")
print(f"x: {x.shape}\n {x.numpy()}\n")
print(f"w: {W.shape}\n {W}\n")
print(f"Bias: {B.shape}\n {B}\n")

# print outputs; the two results should match
print("===== Outputs =====")
print(f"y(Tensorflow): {y_tf.shape}\n {y_tf.numpy()}\n")
print(f"y(manual): {y_man.shape}\n {y_man.numpy()}")

(2) Activation Layers

import tensorflow as tf
from tensorflow.keras.layers import Activation

x = tf.random.normal(shape=(1, 5))  # input setting

# activation layers
sigmoid = Activation('sigmoid')
tanh = Activation('tanh')
relu = Activation('relu')

# forward propagation (TensorFlow)
y_sigmoid_tf = sigmoid(x)
y_tanh_tf = tanh(x)
y_relu_tf = relu(x)

# forward propagation (manual)
exp, maximum = tf.math.exp, tf.math.maximum
y_sigmoid_man = 1 / (1 + exp(-x))
y_tanh_man = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
y_relu_man = maximum(x, 0)

# print results; each TensorFlow/manual pair should match
print(f"x: {x.shape}\n {x.numpy()}\n")
print(f"Sigmoid(Tensorflow): {y_sigmoid_tf.shape}\n {y_sigmoid_tf.numpy()}")
print(f"Sigmoid(manual): {y_sigmoid_man.shape}\n {y_sigmoid_man.numpy()}\n")
print(f"Tanh(Tensorflow): {y_tanh_tf.shape}\n {y_tanh_tf.numpy()}")
print(f"Tanh(manual): {y_tanh_man.shape}\n {y_tanh_man.numpy()}\n")
print(f"ReLU(Tensorflow): {y_relu_tf.shape}\n {y_relu_tf.numpy()}")
print(f"ReLU(manual): {y_relu_man.shape}\n {y_relu_man.numpy()}\n")

2. Dense Layers

1) Neuron Vectors and Layers

Neuron : Affine Function + Activation Function
Layer : a bundle of filters (Filter Bank)
- If each neuron has its own weights and bias, the neurons in a layer can be seen as performing different filtering operations.
- Such a bundle of filters is called a filter bank.

Structure of deep learning → layers connected one after another (cascaded structure)

2) Dense Layer

Dense Layer : a layer in which every input is connected to every neuron.

[Figure: overall structure of a Dense Layer]

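Before taking the Dense Layer apart in detail, the overall structure is as in the figure above. When the inputs x, up to $X_n$, enter from the Input Layer as a vector, they are connected to a Dense Layer in which every x is fed into every neuron. Finally, an Activation Function produces the output value.

In practice a deep learning model stacks a great many dense layers, but to keep things simple, consider just a single layer, as in the following figure.

First, $l_I$ inputs enter every neuron of Dense Layer [1]. Let the number of neurons be $l_1$. Each neuron then performs an Affine Transformation. Because each weight is adjusted according to how important its x is, the number of weights w must equal the number of inputs x. Hence the part marked in green, $\vec{w}_{l_1}^{[1]}$, which denotes the weights of the $l_1$-th neuron, is a weight in the form of an $l_I \times 1$ matrix.

The weight and bias vector representations for this entire layer are as shown on the right.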
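One common way to write the parameters of the whole layer, assumed here and consistent with the kernel shape that Keras `Dense` reports, is to stack the per-neuron weight vectors as columns, giving $W^{[1]} \in \mathbb{R}^{l_I \times l_1}$ and $\vec{b}^{[1]} \in \mathbb{R}^{l_1}$. The NumPy sketch below uses hypothetical sizes $l_I = 4$ and $l_1 = 3$.

```python
import numpy as np

l_I, l_1 = 4, 3                     # hypothetical sizes: 4 inputs, 3 neurons
rng = np.random.default_rng(0)

# One (l_I x 1) weight vector and one scalar bias per neuron.
w_vectors = [rng.normal(size=(l_I, 1)) for _ in range(l_1)]
biases = [rng.normal() for _ in range(l_1)]

# Stack the weight vectors as columns and collect the biases in a vector.
W1 = np.hstack(w_vectors)           # shape (l_I, l_1)
b1 = np.array(biases)               # shape (l_1,)

print(W1.shape, b1.shape)
```

Column j of `W1` is exactly the weight vector of neuron j, so the layer stores $l_I \times l_1$ weights and $l_1$ biases in total.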

(1) FP of Dense Layer

Let us now look more closely at the computation performed by the first layer.

The output produced by the computation in the first dense layer above can be expressed as shown on the right.
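Under the column-stacking assumption $W^{[1]} \in \mathbb{R}^{l_I \times l_1}$, the forward pass of the first layer can be written as $\vec{a}^{[1]} = g(\vec{x}^T W^{[1]} + \vec{b}^{[1]})$. The NumPy sketch below (sizes and random values are illustrative assumptions) checks that this single matrix product agrees with evaluating each neuron separately.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
l_I, l_1 = 4, 3                          # hypothetical layer sizes
x = rng.normal(size=(1, l_I))            # one sample as a row vector
W1 = rng.normal(size=(l_I, l_1))         # column j holds neuron j's weights
b1 = rng.normal(size=(l_1,))

# Whole layer at once: affine transformation, then activation.
a1 = sigmoid(x @ W1 + b1)                # shape (1, l_1)

# Same result computed neuron by neuron.
a1_per_neuron = np.array(
    [[sigmoid(x[0] @ W1[:, j] + b1[j]) for j in range(l_1)]]
)

print(a1.shape, np.allclose(a1, a1_per_neuron))
```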

(2) Generalized Dense Layer

Once the first layer is understood, the generalized expression for passing through several dense layers in sequence can be written as below.
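One way to state the generalization, assuming the same conventions as for the first layer, is $A^{[i]} = g(A^{[i-1]} W^{[i]} + \vec{b}^{[i]})$ with $A^{[0]} = X$. The NumPy sketch below implements that recurrence; the helper names `dense_forward` and `cascade_forward` and the layer sizes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense_forward(A_prev, W, b, g=sigmoid):
    """One dense layer: affine transformation followed by activation."""
    return g(A_prev @ W + b)

def cascade_forward(X, params):
    """Feed X through the layers in order; params is a list of (W, b)."""
    A = X
    for W, b in params:
        A = dense_forward(A, W, b)
    return A

rng = np.random.default_rng(0)
sizes = [10, 3, 5]                       # l_I, l_1, l_2
params = [(rng.normal(size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

X = rng.normal(size=(4, 10))             # minibatch of 4 samples
Y = cascade_forward(X, params)
print(Y.shape)                           # (4, 5): one row per sample
```

Each layer's output width must equal the next layer's input width, which is why consecutive entries of `sizes` define the weight shapes.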

3) Cascade Dense Layer Code Implementation

import tensorflow as tf
from tensorflow.keras.layers import Dense

N, n_feature = 4, 10
X = tf.random.normal(shape=(N, n_feature))

n_neurons = [3, 5]  # number of neurons in each dense layer
dense1 = Dense(units=n_neurons[0], activation='sigmoid')
dense2 = Dense(units=n_neurons[1], activation='sigmoid')

# forward propagation
A1 = dense1(X)
Y = dense2(A1)

# get weight/bias
W1, B1 = dense1.get_weights()
W2, B2 = dense2.get_weights()

print(f"X: {X.shape}\n")
print(f"W1: {W1.shape}")
print(f"B1: {B1.shape}")
print(f"A1: {A1.shape}\n")
print(f"W2: {W2.shape}")
print(f"B2: {B2.shape}")
print(f"Y: {Y.shape}\n")

Using a Python class, the same model can be implemented in the Model-subclassing style as follows.

import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model

class TestModel(Model):  # inherits from the Model class imported above
    def __init__(self):
        super().__init__()  # model subclassing
        self.dense1 = Dense(units=10, activation='sigmoid')
        self.dense2 = Dense(units=20, activation='sigmoid')

    def call(self, x):
        x = self.dense1(x)
        x = self.dense2(x)
        return x

X = tf.random.normal(shape=(4, 10))  # same input shape as the previous example
model = TestModel()
Y = model(X)


Yoo's Data Log
