feat: Add support for MCP Bundles (MCPB) in registry #295

Draft: wants to merge 3 commits into base: main
303 changes: 303 additions & 0 deletions docs/file-hashes.md
@@ -0,0 +1,303 @@
# File Hashes Implementation Guide

## Overview

File hashes provide integrity verification for MCP server packages. The CLI tool generates SHA-256 hashes at publish time, and the registry validates these hashes to ensure package integrity.

## Implementation Strategy: CLI-Generated Hashes

### Flow

```
1. Developer runs: mcp-publisher publish
2. CLI tool fetches package files from URLs
3. CLI tool computes SHA-256 hashes
4. CLI tool includes hashes in publish request
5. Registry validates hashes match the files
6. Registry stores server.json with validated hashes
```

### Responsibilities

#### CLI Tool (Publisher)
- **Generates** hashes for package files
- **Includes** hashes in publish payload
- **Provides** an option to skip hash generation (`--no-hash` flag)

#### Registry
- **Validates** provided hashes against actual files
- **Stores** validated hashes in server.json
- **Accepts** submissions without hashes (optional field)
- **Rejects** submissions with invalid hashes

#### Consumers
- **Verify** downloaded files match provided hashes (optional)
- **Decide** trust policy when hashes are absent
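
A consumer-side check could look roughly like the sketch below. The function name `verify_download` and the `require_hash` switch are hypothetical and only illustrate the two consumer responsibilities listed above; the `sha256:<hex>` value format is taken from the Hash Format section of this guide.

```python
import hashlib


def verify_download(data: bytes, file_hashes: dict, identifier: str,
                    require_hash: bool = False) -> bool:
    """Return True if `data` is acceptable under the consumer's trust policy."""
    expected = file_hashes.get(identifier)
    if expected is None:
        # No hash was published for this file: the consumer's policy decides.
        return not require_hash
    actual = "sha256:" + hashlib.sha256(data).hexdigest()
    # lower() tolerates digests that were published in upper-case hex.
    return actual == expected.lower()
```

A strict client would pass `require_hash=True` and refuse files without a published hash; a lenient one keeps the default and only rejects actual mismatches.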

## Hash Format

### Structure
```json
{
  "file_hashes": {
    "<identifier>": "sha256:<hex-hash>"
  }
}
```
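
To make the value format concrete, here is a minimal parsing sketch. It assumes the `<algorithm>:<hex digest>` shape shown above and that `sha256` is the only algorithm currently accepted; `parse_hash` is a hypothetical helper, not an existing registry function.

```python
import re

# Assumed shape: "<algorithm>:<lower-case hex digest>".
_HASH_RE = re.compile(r"(?P<algo>[a-z0-9]+):(?P<digest>[0-9a-f]+)")
# Hex length per supported algorithm; only sha256 is described in this guide.
_DIGEST_HEX_LENGTH = {"sha256": 64}


def parse_hash(value: str) -> tuple[str, str]:
    """Split a hash string into (algorithm, hex digest), or raise ValueError."""
    match = _HASH_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"malformed hash string: {value!r}")
    algo, digest = match["algo"], match["digest"]
    expected_len = _DIGEST_HEX_LENGTH.get(algo)
    if expected_len is None:
        raise ValueError(f"unsupported algorithm: {algo}")
    if len(digest) != expected_len:
        raise ValueError(f"{algo} digest must be {expected_len} hex characters")
    return algo, digest
```

Keeping the algorithm table separate from the pattern means a future algorithm could be added with one more dictionary entry.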

### Identifiers by Package Type

#### NPM Packages
```json
{
  "file_hashes": {
    "npm:@modelcontextprotocol/server-postgres@0.6.2": "sha256:abc123..."
  }
}
```

#### Python Packages
```json
{
  "file_hashes": {
    "pypi:mcp-server-postgres==0.6.2": "sha256:def456..."
  }
}
```

#### GitHub Releases
```json
{
  "file_hashes": {
    "github:owner/repo/v1.0.0/server.tar.gz": "sha256:789abc..."
  }
}
```

#### Direct URLs
```json
{
  "file_hashes": {
    "https://example.com/packages/server-v1.0.0.tar.gz": "sha256:abc123..."
  }
}
```
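
The four identifier shapes above can be told apart with a small amount of string handling. The sketch below is illustrative only: `classify_identifier` and `split_name_version` are hypothetical names, and the parsing rules are inferred from the examples in this section rather than from a published grammar.

```python
def classify_identifier(identifier: str) -> tuple[str, str]:
    """Return (package_type, remainder) for a file_hashes key."""
    if identifier.startswith("https://"):
        return "url", identifier
    prefix, sep, rest = identifier.partition(":")
    if sep and prefix in ("npm", "pypi", "github") and rest:
        return prefix, rest
    raise ValueError(f"unrecognized identifier: {identifier!r}")


def split_name_version(package_type: str, remainder: str) -> tuple[str, str]:
    """Split the remainder of an npm or pypi identifier into (name, version)."""
    if package_type == "npm":
        # Split on the last "@" so scoped names such as "@scope/pkg" stay intact.
        name, _, version = remainder.rpartition("@")
    elif package_type == "pypi":
        name, _, version = remainder.partition("==")
    else:
        raise ValueError(f"no name/version form for type {package_type!r}")
    if not name or not version:
        raise ValueError(f"missing name or version in {remainder!r}")
    return name, version
```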

## CLI Tool Implementation

### Hash Generation Process

1. **Identify Package Files**
   - Parse package_location from server.json
   - Determine download URLs based on package type
   - Handle multiple files if needed (e.g., wheels for different platforms)

2. **Download Files**
   - Use temporary directory for downloads
   - Stream large files to avoid memory issues
   - Implement retry logic for network failures

3. **Compute Hashes**
   - Use SHA-256 algorithm
   - Process files in chunks for memory efficiency
   - Generate consistent identifiers

4. **Include in Publish**
   - Add file_hashes to server.json before submission
   - Validate JSON structure
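
Steps 3 and 4 can be sketched as follows. This is an illustration in Python rather than the publisher tool's actual code: `sha256_stream` and `add_file_hash` are hypothetical helpers, the 64 KiB chunk size is an arbitrary choice, and an in-memory stream stands in for a downloaded file.

```python
import hashlib
import io
from typing import BinaryIO


def sha256_stream(stream: BinaryIO, chunk_size: int = 64 * 1024) -> str:
    """Hash a binary stream in fixed-size chunks so large files never sit in memory."""
    hasher = hashlib.sha256()
    while chunk := stream.read(chunk_size):
        hasher.update(chunk)
    return hasher.hexdigest()


def add_file_hash(server_json: dict, identifier: str, stream: BinaryIO) -> dict:
    """Return a copy of server_json with one more file_hashes entry."""
    hashes = dict(server_json.get("file_hashes", {}))
    hashes[identifier] = "sha256:" + sha256_stream(stream)
    return {**server_json, "file_hashes": hashes}


# Example: an in-memory stand-in for a downloaded package file.
example = add_file_hash(
    {"name": "example"},
    "https://example.com/packages/server-v1.0.0.tar.gz",
    io.BytesIO(b"abc"),
)
```

Returning a copy instead of mutating the input keeps the original `server.json` contents available if a later step fails.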

### CLI Commands

```bash
# Standard publish with hash generation
mcp-publisher publish server.json

# Skip hash generation (for testing or special cases)
mcp-publisher publish server.json --no-hash

# Verify existing hashes without publishing
mcp-publisher verify server.json

# Generate hashes and output to stdout (dry run)
mcp-publisher hash-gen server.json
```

## Registry Validation

### Validation Process

```python
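import hashlib


class ValidationError(Exception):
    """Raised when a provided hash does not match the downloaded file."""


def download_file(identifier):
    """Placeholder: resolve an identifier to its URL and return the file bytes."""
    raise NotImplementedError


def validate_file_hashes(server_json):
    # Skip if no hashes provided (optional field)
    if 'file_hashes' not in server_json:
        return True

    for identifier, expected_hash in server_json['file_hashes'].items():
        # Download file from identifier
        file_content = download_file(identifier)

        # Compute actual hash
        actual_hash = hashlib.sha256(file_content).hexdigest()

        # Compare hashes (expected value carries the "sha256:" prefix)
        if f"sha256:{actual_hash}" != expected_hash:
            raise ValidationError(f"Hash mismatch for {identifier}")

    return True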
```

### Error Responses

```json
{
  "error": "Hash validation failed",
  "details": {
    "npm:@example/server@1.0.0": {
      "expected": "sha256:abc123...",
      "actual": "sha256:def456...",
      "status": "mismatch"
    }
  }
}
```
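
One way to assemble a response of this shape is sketched below. `hash_error_response` is a hypothetical helper, and the input mapping of identifier to `(expected, actual)` is an assumption about how a validator might collect its results.

```python
import json


def hash_error_response(mismatches: dict[str, tuple[str, str]]) -> str:
    """Render {identifier: (expected, actual)} as a JSON error body."""
    details = {
        identifier: {"expected": expected, "actual": actual, "status": "mismatch"}
        for identifier, (expected, actual) in mismatches.items()
    }
    return json.dumps({"error": "Hash validation failed", "details": details}, indent=2)
```

Because `details` is keyed by identifier, a validator that checks every file before responding could report all mismatches at once instead of stopping at the first.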

## Migration Path

### Phase 1: Deploy Optional Field (Week 1)
- Update registry schema to include optional file_hashes
- Deploy registry without validation
- Document field for early adopters

### Phase 2: Enable Validation (Week 2)
- Activate hash validation in registry
- Continue accepting entries without hashes
- Monitor validation failures

### Phase 3: CLI Tool Support (Week 3-4)
- Release publisher tool with hash generation
- Documentation and examples
- Community feedback incorporation

### Phase 4: Adoption Push (Month 2+)
- Encourage hash inclusion
- Consider making required for verified badges
- Never make fully mandatory (backward compatibility)

## Security Considerations

1. **Algorithm Choice**
   - SHA-256 is the current standard
   - Design allows future algorithm updates
   - Include algorithm in hash string (sha256:...)

2. **Network Security**
   - Always download over HTTPS
   - Validate SSL certificates
   - Implement download size limits

3. **Trust Boundaries**
   - Hashes verify integrity, not authenticity
   - Registry validation prevents tampered submissions
   - Consumers should verify independently
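
Two of the network-security points, HTTPS-only downloads and a size limit, can be sketched like this. The helper names and the 100 MiB ceiling are assumptions chosen for illustration; the guide does not specify a limit.

```python
from typing import BinaryIO
from urllib.parse import urlsplit

MAX_DOWNLOAD_BYTES = 100 * 1024 * 1024  # assumed ceiling, not from the guide


def require_https(url: str) -> str:
    """Return the URL unchanged, or raise if it is not an https:// URL."""
    if urlsplit(url).scheme != "https":
        raise ValueError(f"refusing non-HTTPS download: {url}")
    return url


def read_limited(stream: BinaryIO, limit: int = MAX_DOWNLOAD_BYTES,
                 chunk_size: int = 64 * 1024) -> bytes:
    """Read a stream fully, aborting once more than `limit` bytes have arrived."""
    buf = bytearray()
    while chunk := stream.read(chunk_size):
        buf += chunk
        if len(buf) > limit:
            raise ValueError(f"download exceeds {limit} bytes")
    return bytes(buf)
```

Counting bytes as they arrive, rather than trusting a `Content-Length` header, also covers servers that omit or misreport the header.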

## Example Implementation

### Publisher Tool (Go)

```go
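import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"net/http"
)

// generateFileHashes downloads each package file and returns identifier -> "sha256:<hex>".
func generateFileHashes(serverJSON *ServerJSON) (map[string]string, error) {
	hashes := make(map[string]string)

	switch serverJSON.PackageLocation.Type {
	case "npm":
		url := getNPMPackageURL(serverJSON.PackageLocation.PackageName)
		// The Hash Format section shows npm identifiers as npm:<name>@<version>;
		// append the version here if PackageName does not already carry it.
		identifier := fmt.Sprintf("npm:%s", serverJSON.PackageLocation.PackageName)
		hash, err := downloadAndHash(url)
		if err != nil {
			return nil, err
		}
		hashes[identifier] = fmt.Sprintf("sha256:%s", hash)

	case "pypi":
		// Similar for Python packages

	case "github":
		// Similar for GitHub releases

	default:
		return nil, fmt.Errorf("unsupported package type %q", serverJSON.PackageLocation.Type)
	}

	return hashes, nil
}

// downloadAndHash streams the response body through SHA-256 without buffering it.
func downloadAndHash(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	// Reject error responses; otherwise an error page would be hashed as the package.
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("download %s: unexpected status %s", url, resp.Status)
	}

	hasher := sha256.New()
	if _, err := io.Copy(hasher, resp.Body); err != nil {
		return "", err
	}

	return hex.EncodeToString(hasher.Sum(nil)), nil
}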
```

### Registry Validation (Go)

```go
func (s *RegistryService) validateFileHashes(entry *RegistryEntry) error {
	if entry.FileHashes == nil {
		return nil // Optional field
	}

	for identifier, expectedHash := range entry.FileHashes {
		// computeHashForIdentifier must return the prefixed form ("sha256:<hex>")
		// so the result can be compared with the stored value directly.
		actualHash, err := s.computeHashForIdentifier(identifier)
		if err != nil {
			return fmt.Errorf("failed to validate %s: %w", identifier, err)
		}

		if actualHash != expectedHash {
			return fmt.Errorf("hash mismatch for %s: expected %s, got %s",
				identifier, expectedHash, actualHash)
		}
	}

	return nil
}
```

## Testing Strategy

1. **Unit Tests**
   - Hash computation correctness
   - Identifier generation
   - Error handling

2. **Integration Tests**
   - End-to-end publish with hashes
   - Validation failure scenarios
   - Network failure handling

3. **Manual Testing**
   - Various package types
   - Large files
   - Concurrent validations
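
As an example of the "hash computation correctness" unit test, the sketch below checks a hashing helper against the standard SHA-256 test vectors for the empty input and for `"abc"`. The helper name `compute_sha256` mirrors the one used in the validation pseudocode; the test class itself is illustrative.

```python
import hashlib
import unittest


def compute_sha256(data: bytes) -> str:
    """Return the lower-case hex SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


class ComputeSha256Test(unittest.TestCase):
    def test_known_vectors(self):
        # Standard SHA-256 test vectors: empty input and "abc".
        self.assertEqual(
            compute_sha256(b""),
            "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
        )
        self.assertEqual(
            compute_sha256(b"abc"),
            "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
        )

    def test_single_byte_change_alters_hash(self):
        self.assertNotEqual(compute_sha256(b"abc"), compute_sha256(b"abd"))
```

The tests can be run with `python -m unittest` from the directory containing the file.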

## FAQ

**Q: What if package files are updated after publishing?**
A: The hash represents the file at publish time. Updated files require publishing a new version.

**Q: Can I update just the hashes?**
A: No, hashes are part of the version. New hashes require a new version.

**Q: What about private packages?**
A: The registry must be able to access files for validation. Private packages need accessible URLs during validation.

**Q: Are hashes required?**
A: No, file_hashes is optional to maintain backward compatibility.

**Q: What about multiple files per package?**
A: Each file gets its own hash entry in the file_hashes object.
58 changes: 50 additions & 8 deletions docs/server-json/examples.md
@@ -228,8 +228,10 @@ The `dnx` tool ships with the .NET 10 SDK, starting with Preview 6.
   },
   "packages": [
     {
-      "registry_name": "nuget",
-      "name": "Knapcode.SampleMcpServer",
+      "location": {
+        "registry_name": "nuget",
+        "name": "Knapcode.SampleMcpServer"
+      },
       "version": "0.5.0",
       "runtime_hint": "dnx",
       "environment_variables": [
@@ -261,8 +263,10 @@ The `dnx` tool ships with the .NET 10 SDK, starting with Preview 6.
   },
   "packages": [
     {
-      "registry_name": "docker",
-      "name": "mcp/database-manager",
+      "location": {
+        "registry_name": "docker",
+        "name": "mcp/database-manager"
+      },
       "version": "3.1.0",
       "runtime_arguments": [
         {
@@ -347,8 +351,10 @@ The `dnx` tool ships with the .NET 10 SDK, starting with Preview 6.
   },
   "packages": [
     {
-      "registry_name": "npm",
-      "name": "@example/hybrid-mcp-server",
+      "location": {
+        "registry_name": "npm",
+        "name": "@example/hybrid-mcp-server"
+      },
       "version": "1.5.0",
       "runtime_hint": "npx",
       "package_arguments": [
@@ -389,6 +395,40 @@ The `dnx` tool ships with the .NET 10 SDK, starting with Preview 6.
}
```

## MCP Bundle (MCPB) Package Example

```json
{
  "name": "io.modelcontextprotocol/text-editor",
  "description": "MCP Bundle server for advanced text editing capabilities",
  "repository": {
    "url": "https://github.com/modelcontextprotocol/text-editor-mcpb",
    "source": "github",
    "id": "mcpb-123ab-cdef4-56789-012ghi-jklmnopqrstu"
  },
  "version_detail": {
    "version": "1.0.2"
  },
  "packages": [
    {
      "location": {
        "type": "mcpb",
        "url": "https://github.com/modelcontextprotocol/text-editor-mcpb/releases/download/v1.0.2/text-editor.mcpb"
      },
      "version": "1.0.2",
      "file_hashes": {
        "sha-256": "fe333e598595000ae021bd27117db32ec69af6987f507ba7a63c90638ff633ce"
      }
Comment on lines +419 to +421
Member

Should we consider lifting this inside the PackageLocation object I am proposing above, because it is only a concern relevant to non-package-registry entries?

Member

@tadasant tadasant Aug 22, 2025

I guess there is the possibility we add some registry in the future that does not do its own checks. So I'm fine keeping it here as an optional field.

I assume we need to rely on the MCP Registry to generate this file hash at publish-time, right? Or is the idea it will get generated by the CLI tool en route to creating a server.json, and the registry just validates it's correct and not tampered with in the publish flow?

I think we should map out and document this feature in a little more detail, maybe a dedicated doc in docs/ explaining the flows, who's responsible for what. I worry a bit about how file_hashes is something we are expecting in the immutable server.json file, but also need to verify they're actually valid hashes. Seems doable but would like to see a step by step explanation somewhere.

Author

I'm not opinionated here and will defer to you all re: what makes sense for the workflow the registry committee would prefer to support; happy to add docs to reflect.

    }
  ]
}
```

This example shows an MCPB (MCP Bundle) package that:
- Is hosted on GitHub Releases (an allowlisted provider)
- Includes a SHA-256 hash for integrity verification
- Can be downloaded and executed directly by MCP clients that support MCPB
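
A client that supports MCPB might check a downloaded bundle against the `"sha-256"` entry in the example above along these lines. This is a sketch under assumptions: `verify_mcpb_package` is a hypothetical name, the digest is assumed to be bare hex (as in the example, without a `sha256:` prefix), and treating a missing hash as acceptable is a policy choice made here for illustration.

```python
import hashlib


def verify_mcpb_package(package: dict, bundle_bytes: bytes) -> bool:
    """Check bundle bytes against the package's "sha-256" entry, if present."""
    expected = package.get("file_hashes", {}).get("sha-256")
    if expected is None:
        # Assumption: a missing hash is left to the client's own trust policy.
        return True
    return hashlib.sha256(bundle_bytes).hexdigest() == expected.lower()
```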

## Deprecated Server Example

```json
@@ -406,8 +446,10 @@ The `dnx` tool ships with the .NET 10 SDK, starting with Preview 6.
   },
   "packages": [
     {
-      "registry_name": "npm",
-      "name": "@legacy/old-weather-server",
+      "location": {
+        "registry_name": "npm",
+        "name": "@legacy/old-weather-server"
+      },
       "version": "0.9.5",
       "environment_variables": [
         {