<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://groupkos.com/dev/index.php?action=history&amp;feed=atom&amp;title=Principles_of_Sinusoidal_Positioning</id>
	<title>Principles of Sinusoidal Positioning - Revision history</title>
	<link rel="self" type="application/atom+xml" href="http://groupkos.com/dev/index.php?action=history&amp;feed=atom&amp;title=Principles_of_Sinusoidal_Positioning"/>
	<link rel="alternate" type="text/html" href="http://groupkos.com/dev/index.php?title=Principles_of_Sinusoidal_Positioning&amp;action=history"/>
	<updated>2026-05-15T16:50:52Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.39.3</generator>
	<entry>
		<id>http://groupkos.com/dev/index.php?title=Principles_of_Sinusoidal_Positioning&amp;diff=3614&amp;oldid=prev</id>
		<title>XenoEngineer: Created page with &quot;{{menuLMPrimer}} &lt;div style=&quot;background-color:azure; border:1px outset azure; padding:0 20px; max-width:860px; margin:0 auto; &quot;&gt; ===Principles of Sinusoidal Positioning=== ====Periodicity, Frequency, Orthogonality====  In natural language processing (NLP) and deep learning, positional encoding is a technique used to incorporate information about the position of each token in a sequence into a vector representation. This allows the model to capture sequential relationship...&quot;</title>
		<link rel="alternate" type="text/html" href="http://groupkos.com/dev/index.php?title=Principles_of_Sinusoidal_Positioning&amp;diff=3614&amp;oldid=prev"/>
		<updated>2024-08-09T11:50:42Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;{{menuLMPrimer}} &amp;lt;div style=&amp;quot;background-color:azure; border:1px outset azure; padding:0 20px; max-width:860px; margin:0 auto; &amp;quot;&amp;gt; ===Principles of Sinusoidal Positioning=== ====Periodicity, Frequency, Orthogonality====  In natural language processing (NLP) and deep learning, positional encoding is a technique used to incorporate information about the position of each token in a sequence into a vector representation. This allows the model to capture sequential relationship...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;{{menuLMPrimer}}&lt;br /&gt;
&amp;lt;div style=&amp;quot;background-color:azure; border:1px outset azure; padding:0 20px; max-width:860px; margin:0 auto; &amp;quot;&amp;gt;&lt;br /&gt;
===Principles of Sinusoidal Positioning===&lt;br /&gt;
====Periodicity, Frequency, Orthogonality====&lt;br /&gt;
&lt;br /&gt;
In natural language processing (NLP) and deep learning, positional encoding is a technique used to incorporate information about the position of each token in a sequence into a vector representation. This lets the model capture the order of tokens, since the attention mechanism itself has no built-in notion of sequence order.&lt;br /&gt;
&lt;br /&gt;
One popular method for positional encoding is sinusoidal positional encoding (SPE), introduced by Vaswani et al. in their 2017 paper &amp;quot;Attention Is All You Need&amp;quot;. The idea is to use sine and cosine functions to create a vector representation that captures the position of each token.&lt;br /&gt;
&lt;br /&gt;
The SPE formula is as follows:&lt;br /&gt;
 &lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
PE(pos, 2i) = sin(pos / 10000^(2i / d_model))&lt;br /&gt;
PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Where:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;big&amp;gt;&amp;#039;&amp;#039;&amp;#039;pos&amp;#039;&amp;#039;&amp;#039;&amp;lt;/big&amp;gt; is the position of the token in the sequence&amp;lt;br/&amp;gt;&lt;br /&gt;
 &amp;lt;big&amp;gt;&amp;#039;&amp;#039;&amp;#039;i&amp;#039;&amp;#039;&amp;#039;&amp;lt;/big&amp;gt; is the index of the sine/cosine dimension pair within the encoding vector&amp;lt;br/&amp;gt;&lt;br /&gt;
 &amp;lt;big&amp;gt;&amp;#039;&amp;#039;&amp;#039;d_model&amp;#039;&amp;#039;&amp;#039;&amp;lt;/big&amp;gt; is the dimensionality of the encoding vector&lt;br /&gt;
&lt;br /&gt;
The resulting vector has a fixed dimension, typically set to 512. The sine and&lt;br /&gt;
cosine functions are evaluated at a different frequency for each dimension, so&lt;br /&gt;
every token position receives a unique vector representation.&lt;br /&gt;
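&lt;br /&gt;
A minimal sketch of this encoding in Python with NumPy (the function name and the seq_len / d_model argument names are illustrative, not from the paper):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import numpy as np&lt;br /&gt;
&lt;br /&gt;
def sinusoidal_positional_encoding(seq_len, d_model=512):&lt;br /&gt;
    # pos is the token position, i indexes the sine/cosine dimension pairs&lt;br /&gt;
    pos = np.arange(seq_len)[:, np.newaxis]           # shape (seq_len, 1)&lt;br /&gt;
    i = np.arange(d_model // 2)[np.newaxis, :]        # shape (1, d_model/2)&lt;br /&gt;
    angle = pos / np.power(10000.0, 2 * i / d_model)  # one frequency per dimension pair&lt;br /&gt;
    pe = np.zeros((seq_len, d_model))&lt;br /&gt;
    pe[:, 0::2] = np.sin(angle)   # even dimensions take the sine&lt;br /&gt;
    pe[:, 1::2] = np.cos(angle)   # odd dimensions take the cosine&lt;br /&gt;
    return pe&lt;br /&gt;
&lt;br /&gt;
pe = sinusoidal_positional_encoding(seq_len=50)&lt;br /&gt;
print(pe.shape)   # (50, 512)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;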
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;1. Periodicity:&amp;#039;&amp;#039;&amp;#039; SPE uses periodic functions (sine and cosine) to capture the position of tokens in a sequence.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;2. Frequency:&amp;#039;&amp;#039;&amp;#039; The sine and cosine functions are evaluated at a geometric progression of frequencies across the dimensions of the encoding, so each position receives a distinct combination of values and the model can distinguish between different positions.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;3. Orthogonality:&amp;#039;&amp;#039;&amp;#039; The vectors generated by SPE for different positions point in distinct directions and become progressively less aligned as the distance between positions grows, which keeps positions distinguishable and helps the model generalize (see the sketch after this section).&lt;br /&gt;
&lt;br /&gt;
By incorporating positional information into a vector representation, SPE enables models like transformers to learn complex relationships between tokens in a sequence.&lt;br /&gt;
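&lt;br /&gt;
As a rough illustration of the frequency and orthogonality points above, the sketch below (reusing the illustrative sinusoidal_positional_encoding function from the earlier example) compares the cosine similarity of encodings at different positions:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pe = sinusoidal_positional_encoding(seq_len=100)&lt;br /&gt;
unit = pe / np.linalg.norm(pe, axis=1, keepdims=True)&lt;br /&gt;
sims = unit @ unit.T                  # cosine similarity between every pair of positions&lt;br /&gt;
print(round(float(sims[10, 10]), 3))  # 1.0: a position is identical to itself&lt;br /&gt;
print(round(float(sims[10, 12]), 3))  # high: nearby positions get similar encodings&lt;br /&gt;
print(round(float(sims[10, 90]), 3))  # lower: similarity falls off as positions move apart&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;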
&amp;lt;/div&amp;gt;&lt;/div&gt;</summary>
		<author><name>XenoEngineer</name></author>
	</entry>
</feed>