
Neural network activation functions #858

Closed

Description

@Beliavsky

I think neural networks may be too broad a topic for stdlib (there are a number of Fortran projects in this area), but activation functions and their derivatives could be considered.

Activity

jalvesz (Contributor) commented on Aug 10, 2024

That's a good idea, maybe something like this could be a starting point:

stdlib_math_activations.fypp:
#:include "common.fypp"
module stdlib_math_activations
    use stdlib_kinds, only: int8, int16, int32, int64, sp, dp, xdp, qp
    implicit none
    private

    interface gaussian
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: gaussian_${k1}$
        #:endfor
    end interface
    public :: gaussian

    interface gaussian_grad
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: gaussian_grad_${k1}$
        #:endfor
    end interface
    public :: gaussian_grad

    interface elu
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: elu_${k1}$
        #:endfor
    end interface
    public :: elu

    interface elu_grad
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: elu_grad_${k1}$
        #:endfor
    end interface
    public :: elu_grad

    interface relu
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: relu_${k1}$
        #:endfor
    end interface
    public :: relu

    interface relu_grad
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: relu_grad_${k1}$
        #:endfor
    end interface
    public :: relu_grad

    interface gelu
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: gelu_${k1}$
        #:endfor
    end interface
    public :: gelu

    interface gelu_grad
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: gelu_grad_${k1}$
        #:endfor
    end interface
    public :: gelu_grad

    interface gelu_approx
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: gelu_approx_${k1}$
        #:endfor
    end interface
    public :: gelu_approx

    interface gelu_approx_grad
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: gelu_approx_grad_${k1}$
        #:endfor
    end interface
    public :: gelu_approx_grad

    interface sigmoid
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: sigmoid_${k1}$
        #:endfor
    end interface
    public :: sigmoid

    interface sigmoid_grad
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: sigmoid_grad_${k1}$
        #:endfor
    end interface
    public :: sigmoid_grad

    interface step
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: step_${k1}$
        #:endfor
    end interface
    public :: step

    interface step_grad
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: step_grad_${k1}$
        #:endfor
    end interface
    public :: step_grad

    interface Softmax
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: softmax_${k1}$
        #:endfor
    end interface
    public :: softmax

    interface Softmax_grad
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: Softmax_grad_${k1}$
        #:endfor
    end interface
    public :: Softmax_grad

    interface Softplus
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: Softplus_${k1}$
        #:endfor
    end interface
    public :: Softplus

    interface Softplus_grad
        #:for k1, t1 in REAL_KINDS_TYPES
        module procedure :: Softplus_grad_${k1}$
        #:endfor
    end interface
    public :: Softplus_grad
    
    #:for k1, t1 in REAL_KINDS_TYPES
    ${t1}$, parameter :: isqrt2_${k1}$ = 1._${k1}$ / sqrt(2._${k1}$)
    #:endfor

contains

!==================================================
! Gaussian
!==================================================
#:for k1, t1 in REAL_KINDS_TYPES
elemental ${t1}$ function gaussian_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    y = exp(-x**2)
end function

elemental ${t1}$ function gaussian_grad_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    y = -2._${k1}$ * x * exp(-x**2)
end function

#:endfor

!==================================================
! Exponential Linear Unit
!==================================================
#:for k1, t1 in REAL_KINDS_TYPES
elemental ${t1}$ function elu_${k1}$( x , a ) result ( y )
    ${t1}$, intent(in) :: x
    ${t1}$, intent(in) :: a
    !==================================================
    if(x >= 0._${k1}$)then
        y = x
    else
        y = a * (exp(x) - 1._${k1}$)
    end if
end function

elemental ${t1}$ function elu_grad_${k1}$( x , a ) result ( y )
    ${t1}$, intent(in) :: x
    ${t1}$, intent(in) :: a
    !==================================================
    if(x >= 0._${k1}$)then
        y = 1._${k1}$
    else
        y = a * exp(x)
    end if
end function

#:endfor

!==================================================
! Rectified Linear Unit
!==================================================
#:for k1, t1 in REAL_KINDS_TYPES
elemental ${t1}$ function relu_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    y = max(0._${k1}$, x)
end function

elemental ${t1}$ function relu_grad_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    if(x > 0._${k1}$)then
        y = 1._${k1}$
    else
        y = 0._${k1}$
    end if
end function

#:endfor

!==================================================
! GELU: Gaussian Error Linear Units function
!==================================================
#:for k1, t1 in REAL_KINDS_TYPES
elemental ${t1}$ function gelu_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    y = 0.5_${k1}$ * x * (1._${k1}$ + erf(x * isqrt2_${k1}$))
end function

elemental ${t1}$ function gelu_grad_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    ! d/dx [x*Phi(x)] = Phi(x) + x*phi(x), with phi the standard normal density
    y = 0.5_${k1}$ * (1._${k1}$ + erf(x * isqrt2_${k1}$))
    y = y + x * isqrt2_${k1}$ * exp(-0.5_${k1}$ * x**2) / sqrt(acos(-1._${k1}$))
end function

#:endfor

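!==================================================
! GELU approximation
!==================================================
! note: the bodies below currently mirror the exact erf-based GELU above;
! a cheaper tanh-based approximation could be substituted here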
#:for k1, t1 in REAL_KINDS_TYPES
elemental ${t1}$ function gelu_approx_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    y = 0.5_${k1}$ * x * (1._${k1}$ + erf(x * isqrt2_${k1}$))
end function

elemental ${t1}$ function gelu_approx_grad_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    y = 0.5_${k1}$ * (1._${k1}$ + erf(x * isqrt2_${k1}$))
    y = y + x * isqrt2_${k1}$ * exp(-0.5_${k1}$ * x**2) / sqrt(acos(-1._${k1}$))
end function

#:endfor

!==================================================
! Sigmoid
!==================================================
#:for k1, t1 in REAL_KINDS_TYPES
elemental ${t1}$ function sigmoid_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    y = 1._${k1}$ / (1._${k1}$ + exp(-x))
end function

elemental ${t1}$ function sigmoid_grad_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
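    ! equals sigmoid(x) * (1 - sigmoid(x))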
    y = exp(x) / (1._${k1}$ + exp(x))**2
end function

#:endfor

!==================================================
! Step
!==================================================
#:for k1, t1 in REAL_KINDS_TYPES
elemental ${t1}$ function Step_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    if(x > 0._${k1}$)then
        y = 1._${k1}$
    else
        y = 0._${k1}$
    end if
end function

elemental ${t1}$ function Step_grad_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    y = 0._${k1}$
end function

#:endfor

!==================================================
! tanh
!==================================================
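! note: these are not yet exposed through a generic interface like the other activations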
#:for k1, t1 in REAL_KINDS_TYPES
elemental ${t1}$ function tanh_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    y = tanh(x)
end function

elemental ${t1}$ function tanh_grad_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    y = 1._${k1}$ - tanh(x)**2
end function

#:endfor

!==================================================
! Softmax
!==================================================
#:for k1, t1 in REAL_KINDS_TYPES
pure function Softmax_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x(:)
    ${t1}$ :: y(size(x))
    !==================================================
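    ! subtract the maximum before exponentiation to avoid overflow (numerically stable softmax)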
    y(:) = exp(x(:) - maxval(x(:)) )
    y(:) = y(:) / sum(y(:))
end function

pure function Softmax_grad_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x(:)
    ${t1}$ :: y(size(x))
    !==================================================
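    ! elementwise (diagonal) derivative only: d softmax_i / d x_i = y_i * (1 - y_i)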
    y = softmax_${k1}$(x)
    y = y * (1._${k1}$ - y)
end function

#:endfor

!==================================================
! Softplus
!==================================================
#:for k1, t1 in REAL_KINDS_TYPES
elemental ${t1}$ function Softplus_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    y = log(exp(x) + 1._${k1}$)
end function

elemental ${t1}$ function Softplus_grad_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    !==================================================
    y = exp(x) / (exp(x) + 1._${k1}$)
end function

#:endfor

end module

Some of them would benefit from fast versions of the intrinsic functions (exp, tanh, erf); a companion stdlib_math_fast or stdlib_fast_math module could be included.
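
For illustration, a template like this could be exercised with a small driver (hypothetical sketch, assuming the fypp file above has been preprocessed and the generic names resolve for the sp kind):

program demo_activations
    use stdlib_kinds, only: sp
    use stdlib_math_activations, only: relu, sigmoid, softmax
    implicit none
    real(sp) :: x(4), p(4)

    x = [-2.0_sp, -0.5_sp, 0.5_sp, 2.0_sp]

    ! the elemental functions apply directly to arrays
    print *, 'relu    :', relu(x)
    print *, 'sigmoid :', sigmoid(x)

    ! softmax maps the vector to a probability distribution (sums to 1)
    p = softmax(x)
    print *, 'softmax :', p, ' sum =', sum(p)
end program demo_activations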

Beliavsky (Author) commented on Mar 7, 2025

Since the GELU function is included in the code of @jalvesz, the approximations to it described at https://www.johndcook.com/blog/2025/03/06/gelu/ could also be considered for addition. Other posts by John D. Cook discussing activation functions are https://www.johndcook.com/blog/2023/08/06/swish-swiss/ and https://www.johndcook.com/blog/2023/08/06/swish-mish-and-serf/.
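
One widely used approximation is the tanh-based form from Hendrycks & Gimpel; a sketch in the style of the draft above (the name gelu_tanh is only a placeholder, and 0.044715 and sqrt(2/pi) are the standard constants) could be:

#:for k1, t1 in REAL_KINDS_TYPES
elemental ${t1}$ function gelu_tanh_${k1}$( x ) result( y )
    ${t1}$, intent(in) :: x
    ! GELU(x) ~ 0.5*x*(1 + tanh( sqrt(2/pi)*(x + 0.044715*x**3) ))
    ${t1}$, parameter :: c = 0.044715_${k1}$
    ${t1}$, parameter :: s = 0.7978845608028654_${k1}$   ! sqrt(2/pi)
    y = 0.5_${k1}$ * x * (1._${k1}$ + tanh( s * (x + c * x**3) ))
end function
#:endfor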

jalvesz (Contributor) commented on Mar 7, 2025

Can his formula be adopted in stdlib without licensing issues?

It seems there should be no issue, since mathematical formulas (as opposed to code) are in principle not protected by copyright. If someone else could confirm, I can include this formula in the PR.

jalvesz (Contributor) commented on Mar 29, 2025

@Beliavsky would you mind reviewing PR #860?
