Non-English (Chinese) document processing, entity generation, and bilingual chat #1348

@kparchure

Does the application support uploading 80+ Chinese HTML/PDF documents of roughly 10-300 KB each? The chat interface does not seem to reference all of the documents for Q&A.

Are there any pointers on using Graph Enhancement/Entity Extraction and Additional Instructions to meaningfully extract entities for a particular domain, given that the instructions are in English and the documents are in Chinese? We tried adding a preprocessing instruction in English, but processing seems to get stuck (the file fails to process).

Without any preprocessing instruction or Entity Extraction settings, the documents process and we're able to chat in English, but the answers are not always accurate, and it seems that the entire set of documents/context is not being used.

We're running the LLM builder, a Neo4j DB, and both Qwen 2.5 and QwQ-32B-AWQ locally.
