RFC: Orthogonal EmbeddingService Architecture for Flexible Embedding Provider Support #3863

nnunley · 2025-08-02T20:56:40Z

RFC: Orthogonal EmbeddingService Architecture for Flexible Embedding Provider Support

Summary

This RFC proposes refactoring Goose's embedding functionality into an orthogonal EmbeddingService trait, separate from the main Provider trait. This design enables mixing different embedding providers with chat providers (e.g., Ollama embeddings with OpenAI chat), improves local embedding model support, and resolves existing compatibility issues.

Motivation

Currently, embedding functionality is tightly coupled to the Provider trait, which creates several limitations:

- Fixed Provider Coupling: Users cannot mix embedding providers with chat providers (e.g., local Ollama embeddings cannot be combined with OpenAI chat).
- Local Model Issues: Vector tool selection fails with local embedding models (#3027, "Router Tool Selection Strategy Vector always fails with local text embedding models").
- Limited Flexibility: There is no way to use a specialized embedding model while keeping a preferred chat model.
- Dimension Mismatches: Embedding models produce vectors of different sizes (e.g., 1536 for OpenAI's text-embedding-3-small, 768 for nomic-embed-text), which causes compatibility issues wherever one size is assumed.

Related Issues This Solves

- Fixes #3027, "Router Tool Selection Strategy Vector always fails with local text embedding models"
  - The orthogonal design properly handles different embedding dimensions
  - Removes hardcoded assumptions about embedding sizes
- Addresses remote Ollama limitations (#844 "Allow remote Ollama", #846 "Allow custom Ollama")
  - New architecture supports custom endpoints for embedding services
  - Enables using remote Ollama instances for embeddings
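To make the "no hardcoded sizes" point concrete, here is a minimal sketch of a vector store that takes its dimension from whatever the embedding service reports and rejects vectors of any other length. `VectorIndex` and `IndexError` are hypothetical names used only for illustration; they are not part of Goose or of the implementation described in this RFC.

```rust
/// Hypothetical in-memory index. Its dimension is supplied by the caller
/// (for example from an embedding service's reported capabilities)
/// instead of being a hardcoded constant such as 1536.
#[derive(Debug)]
pub struct VectorIndex {
    dimensions: usize,
    rows: Vec<(String, Vec<f32>)>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum IndexError {
    DimensionMismatch { expected: usize, actual: usize },
}

impl VectorIndex {
    pub fn new(dimensions: usize) -> Self {
        Self { dimensions, rows: Vec::new() }
    }

    /// Stores a vector, failing loudly if its length does not match the
    /// dimension the index was created with.
    pub fn insert(&mut self, id: &str, vector: Vec<f32>) -> Result<(), IndexError> {
        if vector.len() != self.dimensions {
            return Err(IndexError::DimensionMismatch {
                expected: self.dimensions,
                actual: vector.len(),
            });
        }
        self.rows.push((id.to_string(), vector));
        Ok(())
    }

    pub fn len(&self) -> usize {
        self.rows.len()
    }

    pub fn is_empty(&self) -> bool {
        self.rows.is_empty()
    }
}
```

With this shape, an index built for a 768-dimensional model returns a `DimensionMismatch` error for a 1536-dimensional vector rather than failing later during similarity search.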

Proposed Solution

1. New `EmbeddingService` Trait

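use async_trait::async_trait;

/// Embedding backend that is independent of any chat provider.
#[async_trait]
pub trait EmbeddingService: Send + Sync + std::fmt::Debug {
    /// Name of the embedding model this service calls.
    fn model_name(&self) -> &str;
    /// Output size and input limits for the model.
    fn capabilities(&self) -> &EmbeddingCapabilities;
    /// Returns one embedding vector per input text.
    async fn embed(&self, texts: Vec<String>) -> Result<Vec<Vec<f32>>, ProviderError>;
}

/// Output size and input limits reported by an embedding model.
pub struct EmbeddingCapabilities {
    /// Length of each embedding vector.
    pub dimensions: usize,
    /// Maximum input length, in tokens.
    pub max_tokens: usize,
    /// Maximum number of texts per `embed` call.
    pub max_batch_size: usize,
}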
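As a rough illustration of how a caller might use `capabilities()`, the sketch below splits a list of texts into chunks no larger than `max_batch_size`. To stay free of external crates it uses a synchronous stand-in for the trait and a plain `String` error instead of `ProviderError`; `MockEmbeddingService` and `embed_in_batches` are hypothetical names, not part of the proposed API.

```rust
use std::fmt::Debug;

#[derive(Debug, Clone)]
pub struct EmbeddingCapabilities {
    pub dimensions: usize,
    pub max_tokens: usize,
    pub max_batch_size: usize,
}

/// Synchronous stand-in for the proposed trait (the RFC version is async
/// and returns ProviderError; both are simplified away here).
pub trait EmbeddingService: Send + Sync + Debug {
    fn model_name(&self) -> &str;
    fn capabilities(&self) -> &EmbeddingCapabilities;
    fn embed(&self, texts: Vec<String>) -> Result<Vec<Vec<f32>>, String>;
}

/// Hypothetical test double: rejects oversized batches and returns a
/// vector whose first component is the text length, zeros elsewhere.
#[derive(Debug)]
pub struct MockEmbeddingService {
    caps: EmbeddingCapabilities,
}

impl MockEmbeddingService {
    pub fn new(dimensions: usize, max_batch_size: usize) -> Self {
        Self {
            caps: EmbeddingCapabilities { dimensions, max_tokens: 512, max_batch_size },
        }
    }
}

impl EmbeddingService for MockEmbeddingService {
    fn model_name(&self) -> &str {
        "mock-embed"
    }

    fn capabilities(&self) -> &EmbeddingCapabilities {
        &self.caps
    }

    fn embed(&self, texts: Vec<String>) -> Result<Vec<Vec<f32>>, String> {
        if texts.len() > self.caps.max_batch_size {
            return Err(format!(
                "batch of {} exceeds limit of {}",
                texts.len(),
                self.caps.max_batch_size
            ));
        }
        Ok(texts
            .iter()
            .map(|t| {
                let mut v = vec![0.0_f32; self.caps.dimensions];
                if let Some(first) = v.first_mut() {
                    *first = t.len() as f32;
                }
                v
            })
            .collect())
    }
}

/// Embeds `texts` in chunks of at most `max_batch_size`, preserving order.
pub fn embed_in_batches(
    service: &dyn EmbeddingService,
    texts: &[String],
) -> Result<Vec<Vec<f32>>, String> {
    // Guard against a reported batch size of 0, which `chunks` would reject.
    let batch = service.capabilities().max_batch_size.max(1);
    let mut out = Vec::with_capacity(texts.len());
    for chunk in texts.chunks(batch) {
        out.extend(service.embed(chunk.to_vec())?);
    }
    Ok(out)
}
```

Because the helper takes `&dyn EmbeddingService`, the same batching code would work for any backend that reports its limits, which is the kind of reuse the separate trait is meant to allow.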

2. Provider Trait Extension

use std::sync::Arc;

#[async_trait]
pub trait Provider {
    // ... existing methods ...

    /// Returns an embedding service if this provider supports embeddings.
    /// The default is `None`, so existing providers compile unchanged.
    fn embedding_service(&self) -> Option<Arc<dyn EmbeddingService>> {
        None
    }

    // Deprecated: kept only for backward compatibility
    #[deprecated(note = "Use embedding_service() instead")]
    fn supports_embeddings(&self) -> bool { ... }

    #[deprecated(note = "Use embedding_service().embed() instead")]
    async fn create_embeddings(&self, texts: Vec<String>) -> Result<Vec<Vec<f32>>> { ... }
}
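One way the optional accessor could be consumed is sketched below: an explicitly configured embedding service wins, and otherwise the chat provider's own service (if any) is used. The traits are cut down to the minimum needed for the example, and `resolve_embedding_service`, `NamedEmbedder`, `ChatOnlyProvider`, and `ChatAndEmbedProvider` are hypothetical names; the precedence rule itself is an assumption, not something the RFC specifies.

```rust
use std::fmt::Debug;
use std::sync::Arc;

/// Cut-down version of the proposed trait, enough for this sketch.
pub trait EmbeddingService: Send + Sync + Debug {
    fn model_name(&self) -> &str;
}

/// Cut-down Provider with only the new accessor and its `None` default.
pub trait Provider {
    fn embedding_service(&self) -> Option<Arc<dyn EmbeddingService>> {
        None
    }
}

/// Hypothetical embedding service identified only by its model name.
#[derive(Debug)]
pub struct NamedEmbedder(pub &'static str);

impl EmbeddingService for NamedEmbedder {
    fn model_name(&self) -> &str {
        self.0
    }
}

/// Hypothetical provider without embeddings: relies on the default.
pub struct ChatOnlyProvider;

impl Provider for ChatOnlyProvider {}

/// Hypothetical provider that also ships an embedding service.
pub struct ChatAndEmbedProvider {
    pub embedder: Arc<dyn EmbeddingService>,
}

impl Provider for ChatAndEmbedProvider {
    fn embedding_service(&self) -> Option<Arc<dyn EmbeddingService>> {
        Some(Arc::clone(&self.embedder))
    }
}

/// Picks the explicitly configured service if present, otherwise falls
/// back to whatever the chat provider offers (possibly nothing).
pub fn resolve_embedding_service(
    explicit: Option<Arc<dyn EmbeddingService>>,
    chat_provider: &dyn Provider,
) -> Option<Arc<dyn EmbeddingService>> {
    explicit.or_else(|| chat_provider.embedding_service())
}
```

Returning `Option` rather than a boolean plus a separate call means callers cannot forget the capability check, which is presumably why `supports_embeddings()` can be deprecated.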

3. Concrete Implementations

- OllamaEmbeddingService: Supports both native /api/embeddings and OpenAI-compatible endpoints
- OpenAIEmbeddingService: Handles different OpenAI embedding models with proper dimensions
- Auto-detection of embedding dimensions based on model
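Dimension auto-detection could work roughly as follows: prefer an explicit setting, then a table of well-known models, and finally the length of a probe embedding. This is only a sketch of one possible strategy, not the implementation behind this RFC; `known_dimensions` and `detect_dimensions` are hypothetical helpers, and the table lists default output sizes for a handful of models.

```rust
/// Default output sizes for a few well-known embedding models.
/// An Ollama-style tag such as ":latest" is ignored.
pub fn known_dimensions(model: &str) -> Option<usize> {
    let base = model.split(':').next().unwrap_or(model);
    match base {
        "text-embedding-3-small" | "text-embedding-ada-002" => Some(1536),
        "text-embedding-3-large" => Some(3072),
        "nomic-embed-text" => Some(768),
        _ => None,
    }
}

/// Resolves the embedding dimension in order of precedence:
/// 1. an explicitly configured value,
/// 2. the table of known models,
/// 3. the length of a probe embedding returned by `probe`.
pub fn detect_dimensions<F>(
    model: &str,
    configured: Option<usize>,
    probe: F,
) -> Result<usize, String>
where
    F: FnOnce() -> Result<Vec<f32>, String>,
{
    if let Some(d) = configured {
        return Ok(d);
    }
    if let Some(d) = known_dimensions(model) {
        return Ok(d);
    }
    // Unknown model: embed a short test string and measure the result.
    let vector = probe()?;
    if vector.is_empty() {
        return Err(format!("model {model} returned an empty embedding"));
    }
    Ok(vector.len())
}
```

The probe is passed as a closure so that it only runs (and only costs a network round trip) when neither configuration nor the table can answer.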

Benefits

- Flexibility: Mix and match embedding providers with chat providers
  - Use cost-effective local embeddings with premium chat models
  - Use specialized embedding models for specific domains
- Better Local Support: Properly handles different embedding dimensions
  - Fixes vector tool selection with local models
  - Supports various embedding model architectures
- Backward Compatibility: Maintains existing API with deprecation warnings
  - Existing code continues to work
  - Clear migration path for providers
- Extensibility: Easy to add new embedding providers
  - Clean separation of concerns
  - Provider-specific optimizations possible

Implementation Status

I have a working implementation that includes:

- Core EmbeddingService trait and infrastructure
- OllamaEmbeddingService with dimension auto-detection
- OpenAIEmbeddingService with model-specific dimensions
- Updated VectorToolSelector to use EmbeddingService
- Comprehensive test suite
- Backward compatibility with deprecation warnings

Example Usage

// Use Ollama for embeddings with OpenAI for chat
let chat_provider = OpenAIProvider::new()?;
let embedding_service = OllamaEmbeddingService::new(Some("nomic-embed-text".to_string()))?;

// Or use OpenAI embeddings with Ollama chat
let chat_provider = OllamaProvider::new()?;
let embedding_service = OpenAIEmbeddingService::new(Some("text-embedding-3-small".to_string()))?;
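Once the embedding service is chosen independently of the chat provider, a vector-based selector only needs the vectors it produces. The sketch below ranks tools by cosine similarity against a query embedding and skips any stored vector whose dimension differs from the query's. `cosine_similarity` and `rank_tools` are hypothetical helpers written for this example; they do not describe how the actual VectorToolSelector works.

```rust
/// Cosine similarity of two vectors. Returns None when the lengths
/// differ, the vectors are empty, or either has zero magnitude.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Returns the names of the `top_k` tools most similar to `query`.
/// Tools embedded with a different dimension are skipped, not scored.
pub fn rank_tools<'a>(
    query: &[f32],
    tools: &'a [(String, Vec<f32>)],
    top_k: usize,
) -> Vec<&'a str> {
    let mut scored: Vec<(&str, f32)> = tools
        .iter()
        .filter_map(|(name, vector)| {
            cosine_similarity(query, vector).map(|score| (name.as_str(), score))
        })
        .collect();
    // Highest similarity first; total_cmp gives a total order on f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().take(top_k).map(|(name, _)| name).collect()
}
```

Skipping mismatched vectors is one possible policy; a stricter selector could instead return an error so that stale embeddings from a previously configured model get re-indexed.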

Questions for Discussion

- Should we auto-detect embedding dimensions or require explicit configuration?
- How should we handle the migration period for deprecated methods?
- Should embedding service configuration be separate from provider configuration in YAML?
- Are there other embedding providers the community would like to see supported?

Next Steps

If this approach is approved, I can submit a PR with the implementation. The changes are designed to be backward compatible while providing a clear path forward for more flexible embedding support.

DOsinga (Maintainer) · 2025-08-05T18:04:28Z

Converted it to a discussion. I like this a lot, for what it's worth.

taniashiba (Maintainer) · 2025-08-08T15:53:31Z

Thank you for making this a discussion! Interested to see what others have to say. @The-Best-Codes @iandouglas @michaelneale

The-Best-Codes · 2025-08-08T19:07:12Z

Don't forget other embedding providers, like Google embedding models for instance!

The plan seems solid to me. Once I see some code in a PR I might have more feedback :)

michaelneale (Maintainer) · 2025-08-11T05:00:25Z

I agree this should not necessarily be coupled to a provider; it is perfectly valid to use a local embedding model with a frontier chat model.

One other thing to note: if this is for router tool selection, we may not need it at all (see #3933). There may be other uses of embeddings, so it may still end up being valid.
