feat: Add Strands Agents Documentation Sync and Search Functionality #19

robbrad · 2025-07-22T21:07:45Z

Pull Request: Add Strands Agents Documentation Sync and Search Functionality

Overview

This PR adds comprehensive documentation sync and search capabilities to the Strands Agents MCP Server. It introduces scripts to automatically sync documentation from the Strands Agents docs repository and builds an intelligent search index to make documentation easily accessible through the MCP protocol.

Key Changes

Documentation Management

• Added sync_docs.py script to synchronize documentation from the Strands Agents docs repository
• Implemented indexer.py to build relationships and cross-references between markdown files
• Created a comprehensive document index (document_index.json) for efficient searching
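
For readers who want a feel for what the indexing step involves, here is a minimal sketch, not the actual indexer.py from this PR. It assumes the synced docs are plain markdown files under one root folder and that cross-references are ordinary relative markdown links. The function name `build_index` and the `title`/`links` fields are hypothetical and are not taken from document_index.json.

```python
import re
from pathlib import Path

# Inline markdown links to .md files, with an optional #anchor that is ignored.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s#]+\.md)(?:#[^)]*)?\)")
# First-level heading, used as the document title when present.
HEADING_RE = re.compile(r"^#\s+(.+)$", re.MULTILINE)


def build_index(docs_root: Path) -> dict:
    """Map each markdown file (path relative to docs_root) to its title and outgoing links."""
    root = docs_root.resolve()
    index = {}
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        title_match = HEADING_RE.search(text)
        links = set()
        for target in LINK_RE.findall(text):
            if "://" in target:
                continue  # external URL, not a cross-reference between local files
            resolved = (path.parent / target).resolve()
            try:
                links.add(resolved.relative_to(root).as_posix())
            except ValueError:
                continue  # link points outside the docs folder
        index[path.relative_to(root).as_posix()] = {
            "title": title_match.group(1).strip() if title_match else path.stem,
            "links": sorted(links),
        }
    return index
```

The resulting dictionary can be written out with `json.dumps(index, indent=2)`. A "related documents" lookup then amounts to following the `links` entries in either direction.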

MCP Server Enhancements

• Enhanced server implementation with new search and documentation retrieval capabilities
• Added fuzzy search, smart search, and concept exploration functionality
• Implemented learning path generation for guided documentation exploration
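
As an illustration of the fuzzy matching idea, the sketch below scores document titles against a possibly misspelled query using the standard library's `difflib.SequenceMatcher`. This is an assumption about one reasonable approach, not the matching logic shipped in this PR. The names `fuzzy_score` and `fuzzy_search` and the 0.7 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_score(query: str, text: str) -> float:
    """Best similarity between the query and any single word of the text (0.0 to 1.0)."""
    query = query.lower()
    words = text.lower().split()
    if not query or not words:
        return 0.0
    return max(SequenceMatcher(None, query, word).ratio() for word in words)


def fuzzy_search(query: str, titles: dict, threshold: float = 0.7) -> list:
    """Return (path, score) pairs at or above the threshold, best match first.

    `titles` maps a document path to its title.
    """
    scored = ((path, fuzzy_score(query, title)) for path, title in titles.items())
    hits = [(path, score) for path, score in scored if score >= threshold]
    # Sort by descending score, then by path so that ties are deterministic.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))
```

Scoring word by word keeps a short query from being penalized by a long title. A real implementation would probably also score headings and body text.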

Testing and CI/CD

• Added extensive test suite for documentation indexing and search functionality
• Implemented GitHub Actions workflow for automatic documentation synchronization
• Added comprehensive testing guides and workflows

Documentation

• Added detailed documentation on running a local MCP server
• Created guides for the documentation sync process
• Added testing guides and workflows
• Updated README with improved installation and usage instructions

Documentation Structure

The PR adds a well-organized documentation structure covering:
• API Reference
• Examples (Python, CDK, deployment options)
• User Guide (concepts, deployment, observability, security)

Testing

All new functionality is thoroughly tested with unit and integration tests:
• Document indexing and search capabilities
• MCP tool registration and execution
• GitHub Action workflows
• Complete end-to-end workflows

Impact

This PR significantly enhances the Strands Agents MCP Server by providing:

• Comprehensive documentation access through the MCP protocol
• Intelligent search capabilities for finding relevant documentation
• Concept exploration and learning path generation
• Automatic documentation synchronization from the main docs repository


Related Issues

Addresses the need for improved documentation access and search capabilities in the Strands Agents MCP Server.

README.md

Pidem · 2025-07-22T21:22:52Z

conftest.py

Might want to remove this if it's empty?

Needed for pytest.

pytest uses this to base the Python path at the root.

robbrad · 2025-07-22T21:26:43Z

supersedes #16

yonib05 · 2025-07-22T21:31:33Z

This is super cool. What are your thoughts about having the documentation be pulled down from the public repo dynamically? Either cloning the repo or downloading the zip from GitHub. This way it's always up to date.

Another option would be to use some sort of web crawler implementation to download the Strands site (probably less practical than downloading the repo).


dbschmigelski

Thanks again for raising this. Left some cleanliness comments but will likely come back after giving the indexer a thorough review.

I think the main question we have to answer is how the indexing should be done.

The approach now scans all markdown files. This seems to be useful especially for the related documents tool you added.

The alternative is to leverage https://strandsagents.com/latest/llms.txt.

https://llmstxt.org/ states the following, but this is not exactly applicable given that we know the content is markdown. So I am curious what the quality differences are between the crawling approach and this predefined set.

Large language models increasingly rely on website information, but face a critical limitation: context windows are too small to handle most websites in their entirety. Converting complex HTML pages with navigation, ads, and JavaScript into LLM-friendly plain text is both difficult and imprecise.

While websites serve both human readers and LLMs, the latter benefit from more concise, expert-level information gathered in a single, accessible location. This is particularly important for use cases like development environments, where LLMs need quick access to programming documentation and APIs.
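
To make the llms.txt alternative concrete, here is a minimal parsing sketch. It assumes the file follows the layout described at llmstxt.org: `## Section` headings followed by list items of the form `- [Title](url): optional notes`. It works on text already in memory and makes no claim about what the Strands file actually contains. `DocEntry` and `parse_llms_txt` are hypothetical names.

```python
import re
from dataclasses import dataclass

# One list item: "- [Title](url)" with optional ": notes" after the link.
ENTRY_RE = re.compile(
    r"^\s*[-*]\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<notes>.*))?\s*$"
)


@dataclass
class DocEntry:
    section: str
    title: str
    url: str
    notes: str


def parse_llms_txt(text: str) -> list:
    """Collect the linked documents of an llms.txt file, grouped by their H2 section."""
    entries = []
    section = ""
    for line in text.splitlines():
        if line.startswith("## "):
            section = line[3:].strip()
            continue
        match = ENTRY_RE.match(line)
        if match:
            notes = (match["notes"] or "").strip()
            entries.append(DocEntry(section, match["title"].strip(), match["url"], notes))
    return entries
```

With this approach the index is limited to whatever the file lists. That is the trade-off against scanning every markdown file, which also picks up cross-references for the related documents tool.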

dbschmigelski · 2025-07-28T14:23:39Z

scripts/test-aws-docs-mcp.py

@@ -0,0 +1,67 @@
+"""


looks like this can go?

dbschmigelski · 2025-07-28T14:28:09Z

scripts/simulate_tool_registration.py

@@ -0,0 +1,89 @@
+"""


I'm not sure we need this file. Seems like this can either be deleted or added as a test if we need.

dbschmigelski · 2025-07-28T14:29:03Z

scripts/run_tests.py

@@ -0,0 +1,232 @@
+#!/usr/bin/env python3


I think this can be deleted as well and just handled by the pyproject.toml

dbschmigelski · 2025-07-28T14:31:16Z

setup.py

@@ -0,0 +1,7 @@
+#!/usr/bin/env python


dbschmigelski · 2025-07-28T14:34:33Z

pytest.ini

@@ -0,0 +1,3 @@
+[pytest]
+markers =
+    asyncio: mark a test as an asyncio coroutine


should be able to be removed

dbschmigelski · 2025-07-28T14:46:26Z

tests/unit/test_complete_workflow.py

@@ -0,0 +1,13 @@
+"""


dbschmigelski · 2025-07-28T14:51:57Z

scripts/sync_docs.py

+)
+logger = logging.getLogger('docs-sync')
+
+def compare_files(file1, file2):


Checksum? Or do we even need this? What is the cost of writing the file even if it's different, since we are already spending the time reading it (and it's < 100 files)?
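
A sketch of the checksum option raised in this comment, for comparison. The body of `compare_files` is not shown in the hunk, so this is not a rewrite of it. `file_digest` and `sync_file` are hypothetical helpers that hash both files with SHA-256 and copy only when the digests differ.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sync_file(source: Path, target: Path) -> bool:
    """Copy source to target only when the content differs; return True if a copy happened."""
    if target.exists() and file_digest(source) == file_digest(target):
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return True
```

For fewer than 100 small files, calling `shutil.copyfile` unconditionally is simpler still. The checksum mostly helps when an unchanged target must keep its modification time.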

dbschmigelski · 2025-07-28T14:53:39Z

src/strands_mcp_server/server.py


 pkg_resources = resources.files("strands_mcp_server")

 mcp = FastMCP(
-    "strands-agents-mcp-server",
+    "strands-agents-mcp-server-fuzzy",


nit: don't know if we need to change this

dbschmigelski · 2025-07-28T14:55:15Z

src/strands_mcp_server/server.py

+    ## Available Tools:
+
+    1. **get_document** - Retrieve a specific document by file path
+    2. **fuzzy_search_documents** - Fuzzy search documents with intelligent matching


can we reduce the number of tools by just having smart_search? Were you seeing worse quality?

dbschmigelski · 2025-07-28T15:25:37Z

.github/workflows/sync-docs.yml

+    branches:
+      - main
+  # Run on schedule (daily at midnight UTC) - keeps documentation current even without code changes
+  schedule:


I don't think we need the cron job, would rather trigger from https://github.com/strands-agents/strandsagents.com/blob/main/.github/workflows/build-deploy.yml

dbschmigelski · 2025-09-12T14:55:24Z

Closing as #21 resolved this

robbrad requested a review from a team as a code owner July 22, 2025 21:07

robbrad changed the title ~~feat: Adding scripts to sync and index strands-agents docs and toolin…~~ feat: Add Strands Agents Documentation Sync and Search Functionality Jul 22, 2025

Pidem reviewed Jul 22, 2025


feat: Adding scripts to sync and index strands-agents docs and toolin…

dacc1e6

…g to the MCP to search fix: correct uvx command

robbrad force-pushed the feature-docs-sync-scripts branch from 0208d2e to dacc1e6 Compare July 23, 2025 05:49

dbschmigelski requested changes Jul 28, 2025


JackYPCOnline assigned JackYPCOnline and unassigned JackYPCOnline Aug 13, 2025

JackYPCOnline mentioned this pull request Aug 19, 2025

feat: Use llm.txt for doc reference. rewrite this pr to include searc… #21

Merged

dbschmigelski closed this Sep 12, 2025
