Hi HN,
I’m sharing Semantica, an MIT-licensed open-source framework for building semantic layers and knowledge engineering systems for AI.
Many RAG and agent systems fail not because of model quality but because of a semantic gap: the underlying data is unstructured and inconsistent, with no explicit entities, rules, or relationships. Vector-only retrieval often returns context that looks similar but is wrong or incomplete, so the system hallucinates or fails silently on real-world data.
Semantica focuses on transforming messy data into structured semantic knowledge (explicit entities, relationships, and rules) that models and agents can reason over.
Core capabilities:
- Universal ingestion (PDF, DOCX, HTML, JSON, CSV, databases, APIs)
- Automated entity and relationship extraction
- Knowledge graph construction with entity resolution
- Automated ontology generation and validation
- GraphRAG (hybrid vector + graph retrieval, multi-hop reasoning)
- Persistent semantic memory for AI agents
- Conflict detection, deduplication, and provenance tracking
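
To make the GraphRAG bullet a bit more concrete, here is a toy sketch in plain Python of what hybrid vector + graph retrieval with multi-hop expansion can look like. This is not Semantica's API: `CHUNKS`, `GRAPH`, `hybrid_retrieve`, and the hand-written three-dimensional "embeddings" are all made up for illustration, and a real system would use a proper embedding model and graph store.

```python
import math
from collections import deque

# Hypothetical toy corpus: each chunk has a hand-made embedding
# and the set of entities it mentions.
CHUNKS = {
    "c1": {"text": "Acme acquired Bolt in 2021.",
           "vec": [1.0, 0.0, 0.0], "entities": {"Acme", "Bolt"}},
    "c2": {"text": "Battery packs use lithium cells.",
           "vec": [0.0, 1.0, 0.0], "entities": {"battery packs", "lithium"}},
    "c3": {"text": "Cobalt prices rose last year.",
           "vec": [0.0, 0.0, 1.0], "entities": {"cobalt"}},
}

# Hypothetical toy knowledge graph: entity -> neighbouring entities.
GRAPH = {
    "Acme": {"Bolt"},
    "Bolt": {"Acme", "battery packs"},
    "battery packs": {"Bolt", "lithium"},
    "lithium": {"battery packs"},
    "cobalt": set(),
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def vector_top_k(query_vec, k):
    """Vector step: ids of the k chunks most similar to the query."""
    ranked = sorted(CHUNKS,
                    key=lambda cid: cosine(query_vec, CHUNKS[cid]["vec"]),
                    reverse=True)
    return ranked[:k]


def expand_entities(seeds, max_hops):
    """Graph step: breadth-first walk up to max_hops edges from the seeds."""
    seen = set(seeds)
    frontier = deque((entity, 0) for entity in seeds)
    while frontier:
        entity, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for neighbour in GRAPH.get(entity, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return seen


def hybrid_retrieve(query_vec, k=1, max_hops=1):
    """Vector hits first, then chunks that mention graph-reachable entities."""
    seed_chunks = vector_top_k(query_vec, k)
    seed_entities = set().union(*(CHUNKS[c]["entities"] for c in seed_chunks))
    reachable = expand_entities(seed_entities, max_hops)
    extra = [cid for cid in sorted(CHUNKS)
             if cid not in seed_chunks and CHUNKS[cid]["entities"] & reachable]
    return seed_chunks + extra


if __name__ == "__main__":
    # A query "about Acme": vector search alone finds only c1.
    # One graph hop from Bolt reaches "battery packs", which pulls in c2.
    for cid in hybrid_retrieve([0.9, 0.1, 0.0], k=1, max_hops=1):
        print(cid, CHUNKS[cid]["text"])
```

With `max_hops=0` the same query returns only the vector hit, which is the vector-only failure mode in miniature: the chunk about battery packs is relevant through the Acme -> Bolt -> battery packs path but is not similar to the query in embedding space.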
Project links:
Docs: https://hawksight-ai.github.io/semantica/
GitHub: https://github.com/Hawksight-AI/semantica
I’d appreciate feedback from people working on knowledge graphs, GraphRAG, agent memory, or production RAG reliability.
Happy to discuss design trade-offs or answer technical questions.