Eino: Embedding guide
Basic Introduction
The Embedding component converts text into vector representations. It maps text content into a high-dimensional vector space in which semantically similar texts lie closer together. This component is useful in the following scenarios:
- Text similarity calculation
- Semantic search
- Text clustering analysis
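To make "closer in the vector space" concrete, the following minimal sketch computes cosine similarity, one common way to compare embedding vectors. The vectors are hand-written illustrative values rather than output from a real model, and `cosineSimilarity` is a hypothetical helper, not part of Eino.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between a and b.
// It returns 0 when the lengths differ or either vector has zero norm.
func cosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	// Hand-written toy vectors, not produced by a real embedding model.
	cat := []float64{0.9, 0.1, 0.0}
	kitten := []float64{0.8, 0.2, 0.1}
	invoice := []float64{0.0, 0.1, 0.9}

	fmt.Printf("cat vs kitten:  %.3f\n", cosineSimilarity(cat, kitten))
	fmt.Printf("cat vs invoice: %.3f\n", cosineSimilarity(cat, invoice))
}
```

A higher score means the two vectors point in a more similar direction, which is the property that semantic search and clustering rely on.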
Component Definition
Interface Definition
EmbedStrings Method
- Function: Convert a set of texts into vector representations
- Parameters:
- ctx: Context object that carries request-level information and is also used to pass the Callback Manager
- texts: List of texts to be converted
- opts: Conversion options, used to configure the conversion behavior
- Return values:
- [][]float64: List of vector representations corresponding to the texts; the dimension of each vector is determined by the specific implementation
- error: Error information during the conversion process
Common Option
The Embedding component uses EmbeddingOption to define optional parameters. Below are the abstract common options. Each specific implementation can define its specific options, which can be wrapped into a unified EmbeddingOption type through the WrapEmbeddingImplSpecificOptFn function.
Options can be set as follows:
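As a minimal sketch of the functional-option pattern described here, assume the common options consist of a single model name. The `Options`, `Option`, `WithModel`, and `GetCommonOptions` definitions below are local stand-ins written for this example; consult the framework source for the actual option set.

```go
package main

import "fmt"

// Options holds the common settings. The Model field is an assumption
// made for this sketch.
type Options struct {
	Model *string
}

// Option is a local stand-in for the unified option type.
type Option struct {
	apply func(*Options)
}

// WithModel returns an option that sets the model name.
func WithModel(model string) Option {
	return Option{apply: func(o *Options) { o.Model = &model }}
}

// GetCommonOptions applies opts on top of base and returns the result.
// A nil base is treated as an empty Options value.
func GetCommonOptions(base *Options, opts ...Option) *Options {
	if base == nil {
		base = &Options{}
	}
	for _, opt := range opts {
		if opt.apply != nil {
			opt.apply(base)
		}
	}
	return base
}

func main() {
	defaultModel := "default-model" // hypothetical model name
	o := GetCommonOptions(&Options{Model: &defaultModel}, WithModel("custom-model"))
	fmt.Println(*o.Model) // prints "custom-model"
}
```

Options passed per call override the defaults in `base`, so an implementation can keep its configured model while still allowing a caller to switch models for a single request.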
Usage
Standalone Usage
Code location: eino-ext/components/embedding/openai/examples/embedding
Usage in Orchestration
Code location: eino-ext/components/embedding/openai/examples/embedding
Option and Callback Usage
Option Usage Example
Callback Usage Example
Code location: eino-ext/components/embedding/openai/examples/embedding
Existing Implementations
- OpenAI Embedding: Generate vectors using OpenAI’s text embedding model. See: Embedding - OpenAI
- ARK Embedding: Generate vectors using the ARK platform’s model. See: Embedding - ARK
Custom Implementation Reference
When implementing a custom Embedding component, note the following points:
- Pay attention to handling common options
- Implement the callback mechanism properly
Option Mechanism
A custom Embedding component needs to implement its own Option mechanism:
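One way to carry implementation-specific options through a single unified option type is to store a typed setter function and apply only the setters that match the implementation's own options struct. The sketch below assumes Go 1.18+ generics; `wrapImplSpecificOptFn` and `getImplSpecificOptions` are local stand-ins for the wrapping mechanism mentioned earlier, and `myOptions`, `WithBatchSize`, and `WithNormalize` are hypothetical.

```go
package main

import "fmt"

// Option is a local stand-in for the unified option type. It carries an
// implementation-specific setter as an opaque value.
type Option struct {
	implSpecific any
}

// wrapImplSpecificOptFn wraps a setter for an implementation's own options
// struct T into the unified Option type.
func wrapImplSpecificOptFn[T any](fn func(*T)) Option {
	return Option{implSpecific: fn}
}

// getImplSpecificOptions applies every option whose setter targets T and
// skips options meant for other implementations.
func getImplSpecificOptions[T any](base *T, opts ...Option) *T {
	if base == nil {
		base = new(T)
	}
	for _, opt := range opts {
		if fn, ok := opt.implSpecific.(func(*T)); ok {
			fn(base)
		}
	}
	return base
}

// myOptions is the hypothetical options struct of a custom embedder.
type myOptions struct {
	BatchSize int
	Normalize bool
}

// WithBatchSize sets how many texts are sent per request (hypothetical).
func WithBatchSize(n int) Option {
	return wrapImplSpecificOptFn(func(o *myOptions) { o.BatchSize = n })
}

// WithNormalize toggles vector normalization (hypothetical).
func WithNormalize(on bool) Option {
	return wrapImplSpecificOptFn(func(o *myOptions) { o.Normalize = on })
}

func main() {
	o := getImplSpecificOptions(&myOptions{BatchSize: 16}, WithBatchSize(64), WithNormalize(true))
	fmt.Printf("%+v\n", *o)
}
```

Because unmatched setters are skipped, a caller can pass options for several implementations in one call and each implementation picks up only its own.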
Callback Handling
The Embedder implementation needs to trigger callbacks at appropriate times. The framework has defined standard callback input and output structs: