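a) SoftMax and Sigmoid are both activation functions commonly used in machine learning to turn raw scores into probabilities. The main difference is that SoftMax normalizes a whole vector of scores into a single probability distribution over mutually exclusive classes, whereas Sigmoid maps each score to a probability independently of the others.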
SoftMax function:
- The SoftMax function is used primarily in multi-class classification problems where the output needs to represent probabilities for each class.
- It takes an input vector and transforms it into a probability distribution over multiple classes, ensuring that the sum of all probabilities is equal to 1.
- The SoftMax function is defined as: SoftMax(x)ᵢ = exp(xᵢ) / Σⱼ exp(xⱼ), where xᵢ is the input for class i and the sum in the denominator runs over the inputs xⱼ of all classes.
- The SoftMax function applies an exponential operation to each element of the input vector, which amplifies larger values and suppresses smaller ones. It then normalizes the resulting values to ensure they sum up to 1.
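The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the function name `softmax` and the sample scores are assumptions made for the example. Subtracting the maximum score before exponentiating is a common numerical-stability step that is not part of the definition itself; it leaves the result unchanged because the common factor cancels in the ratio.

```python
import math

def softmax(scores):
    """Return the softmax probabilities for a list of real-valued scores."""
    # Shift by the maximum so the largest exponent is exp(0) = 1.
    # The shift cancels between numerator and denominator.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    # Normalize so the outputs sum to 1.
    return [e / total for e in exps]

# Hypothetical scores, chosen only for illustration.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(sum(probs))
```

The largest score receives the largest probability, and the outputs always sum to 1 up to floating-point rounding.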
Sigmoid function:
- The Sigmoid function is commonly used in binary classification problems where the output needs to represent a probability between 0 and 1.
- It takes an input, which can be positive or negative, and maps it to a value between 0 and 1 using a logistic function.
- Sigmoid function is defined as: Sigmoid(x) = 1 / (1 + exp(-x)).
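- Each Sigmoid output depends only on its own input, so several Sigmoid outputs need not sum to 1. For two classes the two functions agree: Sigmoid(x) equals the first component of SoftMax applied to the vector (x, 0).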
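As a rough sketch of the Sigmoid definition, the Python snippet below evaluates it and compares it with a two-class SoftMax over the scores (x, 0). The names `sigmoid` and `two_class_softmax` are illustrative assumptions, and the branch on the sign of x is only a guard against overflow for large negative inputs, not part of the formula.

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use the equivalent form exp(x) / (1 + exp(x))
    # so that exp never receives a large positive argument.
    e = math.exp(x)
    return e / (1.0 + e)

def two_class_softmax(x):
    """First component of softmax over the two scores (x, 0)."""
    shift = max(x, 0.0)
    a = math.exp(x - shift)
    b = math.exp(0.0 - shift)
    return a / (a + b)

# Hypothetical inputs, chosen only for illustration.
for x in (-3.0, 0.0, 2.5):
    print(x, sigmoid(x), two_class_softmax(x))
```

The two columns should match up to rounding, which is one way to see Sigmoid as the two-class special case of SoftMax.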
b) To convert the vector z=(5,6,0.7,-3,8) into probabilities using a SoftMax function:
1. Apply the exponential function to each element of the vector:
exp(z) = (exp(5), exp(6), exp(0.7), exp(-3), exp(8))
≈ (148.4132, 403.4288, 2.0138, 0.0498, 2980.9580)
2. Calculate the sum of the exponential values:
Σ(exp(z)) = 148.4132 + 403.4288 + 2.0138 + 0.0498 + 2980.9580
= 3534.8636
3. Calculate the softmax probabilities for each element of the vector:
SoftMax(z) = (148.4132/3534.8636, 403.4288/3534.8636, 2.0138/3534.8636, 0.0498/3534.8636, 2980.9580/3534.8636)
≈ (0.0420, 0.1141, 0.0006, 0.00001, 0.8433)
Therefore, the vector z=(5,6,0.7,-3,8) can be converted into probabilities using the SoftMax function as approximately (0.0420, 0.1141, 0.0006, 0.00001, 0.8433). The largest input, 8, receives about 84% of the probability mass.
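The three steps can be checked with a short Python script. This is only a sketch that mirrors the hand calculation using the standard `math` module; the variable names `exps`, `total`, and `probs` are chosen for the example.

```python
import math

z = [5, 6, 0.7, -3, 8]

# Step 1: exponentiate each element.
exps = [math.exp(v) for v in z]

# Step 2: sum the exponentials.
total = sum(exps)

# Step 3: divide each exponential by the sum.
probs = [e / total for e in exps]

for v, p in zip(z, probs):
    print(f"z = {v:>4}: p = {p:.5f}")
print(f"sum of exponentials = {total:.4f}")
print(f"sum of probabilities = {sum(probs):.5f}")
```

Running it is a quick way to catch arithmetic slips in the sum or in the individual ratios.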
6.) a) Discuss the difference between the SoftMax and Sigmoid functions.
b) Use a SoftMax function to convert the vector z=(5,6,0.7,-3,8) into probabilities.