Description
Summary
This report details the investigation and resolution of a critical bug that rendered the `mcp-server-memory` toolset non-functional, causing "Unexpected non-whitespace character after JSON" errors. The root cause was a combination of a race condition in the server's file-writing logic and an environment misconfiguration that prevented the execution of the corrected code. The issue was ultimately resolved after the user manually deleted the corrupted `memory.json` file, allowing the fixed server environment to take over.
Root Cause Analysis
The investigation revealed a multi-faceted root cause:

- Code Bug: The server's code lacked a file lock on the `memory.json` file. This created a race condition where concurrent writes would corrupt the file, as demonstrated by the provided `test_race_condition.py` script.
- Data Corruption: The corrupted `memory.json` file was the direct cause of the JSON parsing errors: the server would attempt to read the malformed file and fail.
- Environment Issue: The primary blocker was an environment misconfiguration. The system was using `npx` to run an old, buggy version of the server from the internet instead of the locally compiled and fixed version that included the race condition fix.
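To make the failure mode concrete, the snippet below (an illustration written for this report, not code from the server) shows how a file that ends up holding a complete JSON document followed by leftover bytes from an earlier, longer write produces exactly this class of parse error:

```python
import json

# Simulate a shorter payload landing over a longer stale one without proper
# truncation or locking: valid JSON ends up followed by trailing garbage.
old = json.dumps({"entries": [{"id": 1}, {"id": 2}]})   # earlier, longer write
new = json.dumps({"entries": [{"id": 3}]})              # later, shorter write
corrupted = new + old[len(new):]  # the old file's tail survives past the new JSON

try:
    json.loads(corrupted)
except json.JSONDecodeError as e:
    # Python labels this condition "Extra data"; Node.js reports the same
    # situation as "Unexpected non-whitespace character after JSON"
    print(f"Parse failed: {e.msg}")
```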
Evidence of the Bug (Race Condition)
The following Python script simulates the original buggy logic. It starts two processes that attempt to write to the same JSON file concurrently without a lock. This can result in a corrupted file, demonstrating the race condition.
Bug Simulation Code
```python
import json
import os
from multiprocessing import Process, Manager

# The file to be shared and potentially corrupted
TEST_FILE = 'test_memory.json'

# This function simulates the buggy load/save logic from the TypeScript code
def buggy_operation(process_id, new_entry, shared_dict):
    """Simulates a single process reading, updating, and writing the file."""
    try:
        # 1. Load the graph (read from file)
        try:
            with open(TEST_FILE, 'r') as f:
                data = f.read()
            # In a race condition, this data could be incomplete
            graph = json.loads(data) if data else {'entries': []}
        except (FileNotFoundError, json.JSONDecodeError):
            graph = {'entries': []}
        # 2. Add the new entry
        graph['entries'].append(new_entry)
        # 3. Save the graph (write back to file)
        with open(TEST_FILE, 'w') as f:
            # This is the critical section where the race condition happens.
            # One process can overwrite the other's changes.
            json.dump(graph, f)
        shared_dict[process_id] = 'Success'
    except Exception as e:
        # If a json.JSONDecodeError happens, it means we read a corrupted file
        shared_dict[process_id] = f'Error: {type(e).__name__} - {e}'

def main():
    """Run two buggy operations in parallel to try and trigger the race condition."""
    # Clean up previous test file if it exists
    if os.path.exists(TEST_FILE):
        os.remove(TEST_FILE)
    with Manager() as manager:
        shared_dict = manager.dict()
        # Create two processes that will run the buggy operation concurrently
        p1 = Process(target=buggy_operation, args=(1, {'id': 1, 'data': 'from_process_1'}, shared_dict))
        p2 = Process(target=buggy_operation, args=(2, {'id': 2, 'data': 'from_process_2'}, shared_dict))
        p1.start()
        p2.start()
        p1.join()
        p2.join()
        print("--- Process Results ---")
        for pid, result in shared_dict.items():
            print(f"Process {pid}: {result}")
    print("\n--- Final File Content ---")
    try:
        with open(TEST_FILE, 'r') as f:
            content = f.read()
        print(content)
        # Final check: is the file valid JSON?
        final_data = json.loads(content)
        print("\n--- Final Parsed Data ---")
        print(final_data)
        if len(final_data.get('entries', [])) != 2:
            print("\n*** BUG CONFIRMED: Data was lost. Expected 2 entries. ***")
        else:
            print("\n*** TEST PASSED (Race condition not triggered this time) ***")
    except (FileNotFoundError, json.JSONDecodeError) as e:
        print("*** BUG CONFIRMED: File is corrupted or does not exist. ***")
        print(f"Error: {e}")
    finally:
        # Clean up the test file
        if os.path.exists(TEST_FILE):
            os.remove(TEST_FILE)

if __name__ == '__main__':
    main()
```
The Fix (Locking Mechanism)
The corrected code uses a file-based lock to ensure that only one process can write to the `memory.json` file at a time, preventing the race condition. The following script demonstrates the fixed logic, using a `multiprocessing.Lock` as a stand-in for the server's file lock.
Fix Verification Code
```python
import os
import json
from multiprocessing import Process, Manager, Lock

class FixedKnowledgeGraphManager:
    def __init__(self, memory_file_path, lock):
        self.memory_file_path = memory_file_path
        self.lock = lock

    def _load_graph_safe(self):
        """Safely loads the graph from the JSON file."""
        try:
            with open(self.memory_file_path, 'r') as f:
                data = f.read()
            return json.loads(data) if data else {'entities': [], 'relations': []}
        except FileNotFoundError:
            return {'entities': [], 'relations': []}

    def _save_graph_safe(self, graph):
        """Safely saves the graph to the JSON file."""
        with open(self.memory_file_path, 'w') as f:
            json.dump(graph, f, indent=2)

    def _execute_with_lock(self, operation, *args, **kwargs):
        """Executes a given file operation with a lock."""
        with self.lock:
            graph = self._load_graph_safe()
            result, updated_graph = operation(graph, *args, **kwargs)
            self._save_graph_safe(updated_graph)
            return result

    def create_entities(self, graph, entities):
        new_entities = [e for e in entities if e['name'] not in {ex['name'] for ex in graph['entities']}]
        graph['entities'].extend(new_entities)
        return new_entities, graph

    def create_relations(self, graph, relations):
        # Simple add for testing; a real implementation would check for duplicates
        graph['relations'].extend(relations)
        return relations, graph

    # Wrapper methods that will be called by the test processes
    def safe_create_entities(self, entities):
        return self._execute_with_lock(self.create_entities, entities)

    def safe_create_relations(self, relations):
        return self._execute_with_lock(self.create_relations, relations)

TEST_FILE = 'test_memory_fixed.json'

def safe_operation(process_id, entities_to_add, relations_to_add, shared_dict, lock):
    """Performs concurrent operations using the fixed manager."""
    try:
        manager = FixedKnowledgeGraphManager(TEST_FILE, lock)
        manager.safe_create_entities(entities_to_add)
        manager.safe_create_relations(relations_to_add)
        shared_dict[process_id] = 'Success'
    except Exception as e:
        shared_dict[process_id] = f'Error: {type(e).__name__} - {e}'

def main():
    """Run two safe operations in parallel to verify the fix."""
    if os.path.exists(TEST_FILE):
        os.remove(TEST_FILE)
    with Manager() as process_manager:
        shared_dict = process_manager.dict()
        lock = Lock()
        # Define data for two processes
        data_p1 = {
            'entities': [{'name': 'Human', 'entityType': 'Actor'}],
            'relations': [{'from': 'Human', 'to': 'Start_Bank', 'relationType': 'is_at'}]
        }
        data_p2 = {
            'entities': [{'name': 'Goat', 'entityType': 'Actor'}],
            'relations': [{'from': 'Goat', 'to': 'Start_Bank', 'relationType': 'is_at'}]
        }
        p1 = Process(target=safe_operation, args=(1, data_p1['entities'], data_p1['relations'], shared_dict, lock))
        p2 = Process(target=safe_operation, args=(2, data_p2['entities'], data_p2['relations'], shared_dict, lock))
        p1.start()
        p2.start()
        p1.join()
        p2.join()
        print("--- Process Results ---")
        for pid, result in shared_dict.items():
            print(f"Process {pid}: {result}")
    print("\n--- Final File Content ---")
    # We need a final read outside the lock to verify the final state
    try:
        with open(TEST_FILE, 'r') as f:
            final_graph = json.load(f)
        print(json.dumps(final_graph, indent=2))
        num_entities = len(final_graph.get('entities', []))
        num_relations = len(final_graph.get('relations', []))
        print(f"\nFound {num_entities} entities and {num_relations} relations.")
        if num_entities == 2 and num_relations == 2:
            print("\n*** BUG FIXED: All data was written successfully. ***")
        else:
            print("\n*** TEST FAILED: Data was lost. Expected 2 entities and 2 relations. ***")
    except (FileNotFoundError, json.JSONDecodeError) as e:
        print(f"Error reading final file: {e}")
    finally:
        if os.path.exists(TEST_FILE):
            os.remove(TEST_FILE)

if __name__ == '__main__':
    main()
```
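Note that the verification script passes a `multiprocessing.Lock` created in a common parent process, which is sufficient for the test but cannot coordinate unrelated processes the way the real server must. A cross-process sketch for POSIX systems, using an advisory `fcntl.flock` on a sidecar lock file (the file names here are made up for illustration, not taken from the server):

```python
import fcntl
import json
import os
from multiprocessing import Process

DATA_FILE = 'demo_memory.json'
LOCK_FILE = DATA_FILE + '.lock'

def locked_append(entry):
    """Read-modify-write the JSON file under an exclusive OS-level file lock."""
    with open(LOCK_FILE, 'w') as lf:
        fcntl.flock(lf, fcntl.LOCK_EX)  # blocks until no other process holds the lock
        try:
            try:
                with open(DATA_FILE) as f:
                    graph = json.load(f)
            except (FileNotFoundError, json.JSONDecodeError):
                graph = {'entries': []}
            graph['entries'].append(entry)
            with open(DATA_FILE, 'w') as f:
                json.dump(graph, f)
        finally:
            fcntl.flock(lf, fcntl.LOCK_UN)

if __name__ == '__main__':
    procs = [Process(target=locked_append, args=({'id': i},)) for i in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    with open(DATA_FILE) as f:
        print(len(json.load(f)['entries']))  # all 4 entries survive; none lost
    os.remove(DATA_FILE)
    os.remove(LOCK_FILE)
```

Because `flock` is advisory and keyed to the file rather than to a shared parent, it serializes writers even when they were launched independently, which matches the multi-client scenario an MCP server actually faces.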
Relation to Previously Reported Bugs
This issue is consistent with previously reported bugs in the `modelcontextprotocol/servers` repository, suggesting a systemic problem that has now been addressed:

- Direct Confirmation: This bug is a direct parallel to issue #2412 ("Claude will return soon" after commands like "list_allowed_directories"), which reported the exact same "Unexpected non-whitespace character" error, although for the filesystem server. This confirms the problem was not isolated to a single server.
- Related Symptom: The server timeouts and unexpected exits reported in issue #719 ("Requested timeout") are also likely symptoms of the same underlying server instability caused by this bug.
Conclusion
The "Unexpected non-whitespace character" bug is resolved. The resolution required both a code fix (implementing a file lock to prevent the race condition) and an environment fix (ensuring the correct, locally built server is executed). The user's manual deletion of the corrupted `memory.json` file was the final step that enabled the corrected environment to function properly.