
feat(filesystem): add streaming get_file_hash tool for cryptographic digests (md5/sha1/sha256) #2516


Open · wants to merge 7 commits into `main`
16 changes: 16 additions & 0 deletions src/filesystem/README.md
@@ -9,6 +9,7 @@ Node.js server implementing Model Context Protocol (MCP) for filesystem operations
- Move files/directories
- Search files
- Get file metadata
- Get file digest (md5, sha1, sha256)
- Dynamic directory access control via [Roots](https://modelcontextprotocol.io/docs/learn/client-concepts#roots)

## Directory Access Control
@@ -150,6 +151,21 @@ The server's directory access control follows this flow:
- Type (file/directory)
- Permissions

- **get_file_hash**

- Compute the cryptographic hash of a regular file (md5, sha1, or sha256)
- Inputs:
- `path` (string): File to hash
- `algorithm` (`"md5" | "sha1" | "sha256"`, optional): Defaults to `"sha256"`
- `encoding` (`"hex" | "base64"`, optional): Digest encoding, defaults to `"hex"`
- Streams file contents for memory-efficient hashing
- Only operates within allowed directories
- Returns the digest as a string
- Notes:
- Fails if the path is not a regular file
- May error if the requested algorithm is unavailable in the current Node/OpenSSL build (e.g., FIPS mode)
- Digest encodings (`hex`, `base64`) map directly to Node's `crypto` `Hash#digest` encodings
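For example, a `tools/call` payload for this tool might look like the following (the file path is illustrative):

```json
{
  "name": "get_file_hash",
  "arguments": {
    "path": "/allowed/dir/evidence.img",
    "algorithm": "sha256",
    "encoding": "hex"
  }
}
```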

- **list_allowed_directories**
- List all directories the server is allowed to access
- No input required
135 changes: 135 additions & 0 deletions src/filesystem/__tests__/file-hash.test.ts
@@ -0,0 +1,135 @@
import fs from "fs/promises";
import path from "path";
import os from "os";
import { getFileHash } from "../file-hash.js";

describe("get_file_hash (complete coverage)", () => {
let tmpDir: string;
let textFile: string;
let binFile: string;
let dirPath: string;
let symlinkToDir: string;

// Test data
const TEXT = "ForensicShark";
// Expected digests for "ForensicShark" (without newline)
const TEXT_DIGESTS = {
md5_hex: "1422ac7778fd50963651bc74686158b7",
sha1_hex: "a74904ee14c16d949256e96110596bdffc48f481",
sha256_hex: "53746f49c75306a3066eb456dba05b99aab88f562d2c020582c9226d9c969987",
md5_b64: "FCKsd3j9UJY2Ubx0aGFYtw==",
sha1_b64: "p0kE7hTBbZSSVulhEFlr3/xI9IE=",
sha256_b64: "U3RvScdTBqMGbrRW26Bbmaq4j1YtLAIFgskibZyWmYc=",
} as const;

// Small binary snippet: 00 FF 10 20 42 7F
const BIN_SNIPPET = Buffer.from([0x00, 0xff, 0x10, 0x20, 0x42, 0x7f]);
const BIN_DIGESTS = {
md5_hex: "3bd2f5d961a05d8cb7edd3953adc069c",
sha1_hex: "28541834deba1f200e2fbde455bddb2e258afe36",
sha256_hex: "6048e89b6ff39be935d44c069a21f22ae7401177ee4c7d3156a4e3b48102d53f",
md5_b64: "O9L12WGgXYy37dOVOtwGnA==",
sha1_b64: "KFQYNN66HyAOL73kVb3bLiWK/jY=",
sha256_b64: "YEjom2/zm+k11EwGmiHyKudAEXfuTH0xVqTjtIEC1T8=",
} as const;

beforeAll(async () => {
tmpDir = await fs.mkdtemp(path.join(os.tmpdir(), "get-file-hash-"));
textFile = path.join(tmpDir, "text.txt");
binFile = path.join(tmpDir, "bin.dat");
dirPath = path.join(tmpDir, "a-directory");
symlinkToDir = path.join(tmpDir, "dir-link");

await fs.writeFile(textFile, TEXT, "utf-8");
await fs.writeFile(binFile, BIN_SNIPPET);
await fs.mkdir(dirPath);

// Symlink to directory (on Windows: "junction")
if (process.platform === "win32") {
await fs.symlink(dirPath, symlinkToDir, "junction");
} else {
await fs.symlink(dirPath, symlinkToDir);
}
});

afterAll(async () => {
// Cleanup
await fs.rm(tmpDir, { recursive: true, force: true });
});

//
// 1) Text 'ForensicShark' → md5/sha1/sha256 (hex)
//
test("hash of text 'ForensicShark' (md5/sha1/sha256, hex)", async () => {
await expect(getFileHash(textFile, "md5", "hex")).resolves.toBe(TEXT_DIGESTS.md5_hex);
await expect(getFileHash(textFile, "sha1", "hex")).resolves.toBe(TEXT_DIGESTS.sha1_hex);
await expect(getFileHash(textFile, "sha256", "hex")).resolves.toBe(TEXT_DIGESTS.sha256_hex);
});

//
// 2) Not a file: directory, symlink to directory, /dev/null (if present)
//
test("rejects directory as not a regular file", async () => {
await expect(getFileHash(dirPath, "sha256", "hex")).rejects.toThrow(/not a regular file/i);
});

test("rejects symlink to directory as not a regular file", async () => {
await expect(getFileHash(symlinkToDir, "sha256", "hex")).rejects.toThrow(/not a regular file/i);
});

test("rejects device file like /dev/null when present", async () => {
if (process.platform === "win32") {
// No /dev/null → skip test
return;
}
try {
const devNull = "/dev/null";
const st = await fs.lstat(devNull);
// If present & not a regular file → expected error
if (!st.isFile()) {
await expect(getFileHash(devNull, "sha256", "hex")).rejects.toThrow(/not a regular file|EISDIR|EPERM|EINVAL/i);
}
} catch {
// /dev/null does not exist → skip
return;
}
});

//
// 3) Binary snippet correct (all three algorithms, hex)
//
test("hash of small binary snippet (md5/sha1/sha256, hex)", async () => {
await expect(getFileHash(binFile, "md5", "hex")).resolves.toBe(BIN_DIGESTS.md5_hex);
await expect(getFileHash(binFile, "sha1", "hex")).resolves.toBe(BIN_DIGESTS.sha1_hex);
await expect(getFileHash(binFile, "sha256", "hex")).resolves.toBe(BIN_DIGESTS.sha256_hex);
});

//
// 4) Unknown algorithms → error (at least three)
// We intentionally use common but disallowed names (sha512)
// plus made-up/legacy names, so the test remains stable.
//
test("rejects unsupported algorithms", async () => {
const badAlgos = ["sha512", "crc32", "whirlpool", "shark512", "legacy-md5"];
for (const algo of badAlgos) {
// Cast to any to bypass the TS union; we are testing runtime errors
await expect(getFileHash(textFile, algo as any, "hex")).rejects.toThrow(/algorithm|unsupported|not available/i);
}
});

//
// 5) Encodings hex & base64 correct
//
test("encodings: hex and base64 (text case)", async () => {
// hex was already checked above; here base64 is checked explicitly
await expect(getFileHash(textFile, "md5", "base64")).resolves.toBe(TEXT_DIGESTS.md5_b64);
await expect(getFileHash(textFile, "sha1", "base64")).resolves.toBe(TEXT_DIGESTS.sha1_b64);
await expect(getFileHash(textFile, "sha256", "base64")).resolves.toBe(TEXT_DIGESTS.sha256_b64);
});

test("encodings: hex and base64 (binary case)", async () => {
await expect(getFileHash(binFile, "md5", "base64")).resolves.toBe(BIN_DIGESTS.md5_b64);
await expect(getFileHash(binFile, "sha1", "base64")).resolves.toBe(BIN_DIGESTS.sha1_b64);
await expect(getFileHash(binFile, "sha256", "base64")).resolves.toBe(BIN_DIGESTS.sha256_b64);
});
});
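The expected digest constants above can be regenerated with a small standalone script; a sketch using only Node's built-in `crypto`, with the fixture data mirrored from the tests:

```typescript
import { createHash } from "crypto";

// Fixtures mirrored from the tests above.
const fixtures: Record<string, Buffer> = {
  text: Buffer.from("ForensicShark", "utf-8"),
  bin: Buffer.from([0x00, 0xff, 0x10, 0x20, 0x42, 0x7f]),
};

for (const [name, data] of Object.entries(fixtures)) {
  for (const algo of ["md5", "sha1", "sha256"] as const) {
    // A fresh Hash object is needed per digest; digest() finalizes its state.
    const hex = createHash(algo).update(data).digest("hex");
    const b64 = createHash(algo).update(data).digest("base64");
    console.log(`${name} ${algo}: hex=${hex} b64=${b64}`);
  }
}
```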
46 changes: 46 additions & 0 deletions src/filesystem/file-hash.ts
@@ -0,0 +1,46 @@
import { createHash, getHashes } from "crypto";
import { createReadStream } from "fs";
import fs from "fs/promises";

// Hashing utility
type HashAlgorithm = "md5" | "sha1" | "sha256";
export async function getFileHash(
filePath: string,
algorithm: HashAlgorithm,
encoding: "hex" | "base64" = "hex"
): Promise<string> {
const algo = algorithm.toLowerCase() as HashAlgorithm;
// Policy gate: allow only md5|sha1|sha256 (for DFIR interoperability)
if (!["md5","sha1","sha256"].includes(algo)) {
throw new Error(`Unsupported hash algorithm: ${algorithm}`);
}

// Fail early if the algorithm is unavailable in this Node/OpenSSL build (FIPS or restricted builds)
const available = new Set(getHashes().map(h => h.toLowerCase()));
if (!available.has(algo)) {
throw new Error(
`Algorithm '${algo}' is not available in this Node/OpenSSL build (FIPS or policy may disable it).`
);
}

// Allow only regular files (throw a clear error if not)
const st = await fs.stat(filePath);
if (!st.isFile()) {
throw new Error(`Path is not a regular file: ${filePath}`);
}

const hash = createHash(algo);
const stream = createReadStream(filePath, { highWaterMark: 1024 * 1024 }); // 1 MiB

return await new Promise<string>((resolve, reject) => {
stream.on("data", (chunk) => hash.update(chunk));
stream.on("error", (err) => reject(err));
stream.on("end", () => {
try {
resolve(hash.digest(encoding));
} catch (e) {
reject(e);
}
});
});
}
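For comparison (not part of this PR), the manual `'data'`/`'end'` wiring above can be replaced with async iteration over the read stream, which propagates stream errors as thrown exceptions and destroys the stream automatically if the loop exits early; a minimal sketch with a hypothetical function name:

```typescript
import { createHash } from "crypto";
import { createReadStream } from "fs";

// Sketch of an alternative to getFileHash's Promise wiring:
// for-await-of over a Readable yields Buffers and cleans up
// the file descriptor on early exit or error.
export async function getFileHashIterated(
  filePath: string,
  algorithm: "md5" | "sha1" | "sha256",
  encoding: "hex" | "base64" = "hex"
): Promise<string> {
  const hash = createHash(algorithm);
  for await (const chunk of createReadStream(filePath, { highWaterMark: 1024 * 1024 })) {
    hash.update(chunk as Buffer);
  }
  return hash.digest(encoding);
}
```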
35 changes: 34 additions & 1 deletion src/filesystem/index.ts
@@ -13,13 +13,14 @@ import fs from "fs/promises";
import { createReadStream } from "fs";
import path from "path";
import os from 'os';
import { randomBytes, createHash, getHashes } from 'crypto';
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";
import { diffLines, createTwoFilesPatch } from 'diff';
import { minimatch } from 'minimatch';
import { isPathWithinAllowedDirectories } from './path-validation.js';
import { getValidRootDirectories } from './roots-utils.js';
import { getFileHash } from "./file-hash.js";

// Command line argument parsing
const args = process.argv.slice(2);
@@ -179,6 +180,12 @@ const GetFileInfoArgsSchema = z.object({
path: z.string(),
});

const GetFileHashArgsSchema = z.object({
path: z.string(),
algorithm: z.enum(['md5', 'sha1', 'sha256']).default('sha256').describe('Hash algorithm to use'),
encoding: z.enum(['hex', 'base64']).default('hex').describe('Digest encoding')
});

const ToolInputSchema = ToolSchema.shape.inputSchema;
type ToolInput = z.infer<typeof ToolInputSchema>;

@@ -623,6 +630,16 @@ server.setRequestHandler(ListToolsRequestSchema, async () => {
required: [],
},
},
{
name: "get_file_hash",
description:
"Compute the cryptographic hash of a file for integrity verification. " +
"Use only for regular files within allowed directories (not directories/devices). " +
"Inputs: { path: absolute path, algorithm: \"md5\"|\"sha1\"|\"sha256\", " +
"encoding: \"hex\"|\"base64\" (optional, default \"hex\") }. " +
"Returns only the digest string. Call when verifying file integrity or comparing files.",
inputSchema: zodToJsonSchema(GetFileHashArgsSchema) as ToolInput,
}
],
};
});
@@ -944,6 +961,22 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
};
}

case "get_file_hash": {
const parsed = GetFileHashArgsSchema.safeParse(args);
if (!parsed.success) {
throw new Error(`Invalid arguments for get_file_hash: ${parsed.error}`);
}
const validPath = await validatePath(parsed.data.path);
const encoding = parsed.data.encoding ?? "hex";
const hash = await getFileHash(validPath, parsed.data.algorithm, encoding);
return {
content: [{
type: "text",
text: `algorithm: ${parsed.data.algorithm}\nencoding: ${encoding}\npath: ${parsed.data.path}\ndigest: ${hash}`
}],
};
}

case "list_allowed_directories": {
return {
content: [{
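On the client side, the handler's `algorithm:`/`encoding:`/`path:`/`digest:` text payload is easy to parse back into fields; a sketch (the helper name is hypothetical, not part of this PR):

```typescript
// Parse the key/value text block returned by the get_file_hash handler.
function parseHashResult(text: string): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const line of text.split("\n")) {
    const sep = line.indexOf(": ");
    if (sep > 0) fields[line.slice(0, sep)] = line.slice(sep + 2);
  }
  return fields;
}

const sample =
  "algorithm: sha256\nencoding: hex\npath: /allowed/dir/file.txt\ndigest: abc123";
console.log(parseHashResult(sample).digest); // "abc123"
```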