CaSIL (Cascade of Semantically Integrated Layers) is a natural language processing system built around a four-layer semantic analysis architecture. It processes both user input and knowledge base content through progressive semantic layers, combining:
- Dynamic concept extraction and relationship mapping
- Adaptive semantic similarity analysis
- Context-aware knowledge graph integration
- Multi-layer processing pipeline
- Real-time learning and adaptation
- Python 3.10+
- An OpenAI-compatible LLM provider (e.g., Ollama)
- Clone the repository:

  ```bash
  git clone https://github.com/severian42/Cascade-of-Semantically-Integrated-Layers.git
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up environment variables:

  ```bash
  cp .env.example .env
  # Edit .env with your configuration
  ```
- Start your local LLM server (e.g., Ollama)

- Run the system:

  ```bash
  python main.py --debug  # shows full internal insight and metrics
  ```
```bash
# Start the system
python main.py --debug

# Available commands:
help       # Show available commands
graph      # Display knowledge graph statistics
concepts   # List extracted concepts
relations  # Show concept relationships
exit       # Exit the program
```
CaSIL processes input through four layers:

- Layer 1: Initial Understanding
  - Advanced concept extraction using TF-IDF
  - Named entity recognition
  - Custom stopword filtering
  - Semantic weight calculation
- Layer 2: Relationship Analysis
  - Dynamic similarity matrix computation
  - Graph-based relationship discovery
  - Concept clustering and community detection
  - Temporal relationship tracking
- Layer 3: Contextual Integration
  - Historical context weighting
  - Knowledge base integration
  - Dynamic context windows
  - Adaptive threshold adjustment
  - Multi-source information fusion
- Layer 4: Synthesis
  - Style-specific processing
  - Dynamic temperature adjustment
  - Context-aware response generation
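The Layer 1 extraction step can be sketched in plain Python. This is an illustrative stand-in, not CaSIL's implementation: the stopword list, weight threshold, and IDF smoothing are all assumptions.

```python
# Minimal TF-IDF-style concept extraction (illustrative sketch only).
import math
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "to", "in", "is"}  # assumed custom list

def extract_concepts(doc: str, corpus: list[str], threshold: float = 0.1) -> list[str]:
    """Score each term by TF-IDF and keep those above a weight threshold."""
    tokens = [t for t in doc.lower().split() if t not in STOPWORDS]
    tf = Counter(tokens)
    n_docs = len(corpus)
    concepts = []
    for term, count in tf.items():
        df = sum(1 for d in corpus if term in d.lower().split())
        idf = math.log((n_docs + 1) / (df + 1)) + 1  # smoothed IDF
        weight = (count / len(tokens)) * idf
        if weight >= threshold:
            concepts.append(term)
    return concepts

corpus = ["graphs store concept relationships", "semantic layers process text"]
print(extract_concepts("semantic graphs link concept nodes", corpus))
```

Terms that survive stopword filtering and clear the TF-IDF threshold become the concept candidates passed to Layer 2.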
CaSIL maintains two interconnected graph systems:

- Session Graph (temporary)
  - Tracks temporary concept relationships
  - Maintains conversation context
  - Updates in real time
  - Handles recency weighting
- Knowledge Graph (persistent)
  - Stores long-term concept relationships
  - Tracks concept evolution over time
  - Maintains relationship weights
  - Supports community detection
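Both graphs feed the community-detection step. As a toy stand-in for that step (real community detection, e.g. Louvain, is considerably more involved), concepts can be grouped by thresholding a similarity matrix into edges and taking connected components:

```python
# Toy clustering: threshold similarities into edges, then group concepts
# by connected components. Illustrative only, not CaSIL's algorithm.
def clusters(concepts: list[str], sim: dict[tuple[str, str], float],
             threshold: float = 0.5) -> list[set[str]]:
    adj = {c: set() for c in concepts}
    for (a, b), s in sim.items():
        if s >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    seen, out = set(), []
    for c in concepts:
        if c in seen:
            continue
        stack, comp = [c], set()
        while stack:                      # iterative DFS over the component
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        out.append(comp)
    return out
```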
- Multi-Stage Processing
  - Input text → Concept Extraction → Relationship Analysis → Context Integration → Response
  - Each stage maintains its own similarity thresholds and processing parameters
  - An adaptive feedback loop adjusts parameters based on processing results
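The per-stage thresholds and feedback loop can be sketched as follows. Stage names mirror the `.env` thresholds; the scoring function and the relax-by-0.05 adjustment rule are illustrative assumptions, not CaSIL's code.

```python
# Hypothetical four-stage pipeline with per-stage thresholds and a
# simple feedback rule that relaxes a threshold when a stage underperforms.
thresholds = {
    "initial_understanding": 0.7,
    "relationship_analysis": 0.7,
    "contextual_integration": 0.9,
    "synthesis": 0.8,
}

def run_pipeline(text: str, score_fn) -> dict[str, float]:
    """Run each stage; relax a stage's threshold when its score falls short."""
    scores = {}
    for stage in thresholds:
        score = score_fn(stage, text)
        if score < thresholds[stage]:
            # Adaptive feedback: lower this stage's bar slightly for next time
            thresholds[stage] = max(0.5, round(thresholds[stage] - 0.05, 2))
        scores[stage] = score
    return scores

# A trivial stand-in scorer: longer inputs score higher
scores = run_pipeline("a short prompt", lambda stage, t: min(1.0, len(t) / 40))
```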
- Semantic Analysis Engine

  ```
  # Example concept extraction flow
  text → TF-IDF Vectorization → Weight Calculation → Threshold Filtering → Concepts

  # Relationship discovery process
  concepts → Similarity Matrix → Graph Construction → Community Detection → Relationships
  ```
- Dynamic Temperature Control

  ```python
  temperature = base_temp * layer_modifier['base'] * (
      1 + (novelty_score * layer_modifier['novelty_weight'])
        * (1 + (complexity_factor * layer_modifier['complexity_weight']))
  )
  ```
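The temperature formula can be exercised directly. The modifier values below are illustrative placeholders, not CaSIL's configured per-layer defaults.

```python
# Direct transcription of the temperature formula, with assumed modifiers.
def layer_temperature(base_temp, modifier, novelty_score, complexity_factor):
    return base_temp * modifier["base"] * (
        1 + (novelty_score * modifier["novelty_weight"])
          * (1 + complexity_factor * modifier["complexity_weight"])
    )

# Hypothetical modifier set for the synthesis layer
synthesis_modifier = {"base": 1.2, "novelty_weight": 0.5, "complexity_weight": 0.3}
t = layer_temperature(0.7, synthesis_modifier, novelty_score=0.8, complexity_factor=0.4)
print(round(t, 3))
```

High novelty and complexity push the temperature up, encouraging more exploratory generation; routine inputs stay near the base temperature.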
- Dual-Graph System

  ```
  Session Graph (Temporary)      Knowledge Graph (Persistent)
  ├─ Short-term relationships    ├─ Long-term concept storage
  ├─ Recency weighting           ├─ Relationship evolution
  ├─ Context tracking            ├─ Community detection
  └─ Real-time updates           └─ Concept metadata
  ```
- Graph Update Process

  ```python
  # Simplified relationship update flow
  new_weight = (previous_weight + (similarity * time_weight)) / 2
  graph.add_edge(concept1, concept2, weight=new_weight)
  ```
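The update rule above can be made runnable with a plain dict as the graph; the exponential recency decay used for `time_weight` and the half-life value are illustrative assumptions.

```python
# Runnable sketch of the relationship-update rule, using a dict adjacency map.
import math

graph: dict[tuple[str, str], float] = {}

def update_edge(c1: str, c2: str, similarity: float, age_seconds: float,
                half_life: float = 3600.0) -> float:
    """Average the previous weight with a recency-weighted similarity."""
    key = tuple(sorted((c1, c2)))                 # undirected: normalize order
    time_weight = math.exp(-age_seconds / half_life)  # assumed decay form
    new_weight = (graph.get(key, 0.0) + similarity * time_weight) / 2
    graph[key] = new_weight
    return new_weight

w1 = update_edge("graph", "concept", similarity=0.9, age_seconds=0)
w2 = update_edge("graph", "concept", similarity=0.9, age_seconds=0)
```

Because each update averages in the previous weight, repeated co-occurrence strengthens an edge gradually rather than overwriting it.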
- Dynamic threshold adjustment based on:

  ```
  ├─ Input complexity
  ├─ Concept novelty
  ├─ Processing layer
  └─ Historical performance
  ```
- Multi-dimensional similarity calculation:

  ```
  Combined Similarity = (0.7 * cosine_similarity) + (0.3 * jaccard_similarity)
  ```
- Weighted by:
- Term frequency
- Position importance
- Historical context
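The combined similarity above can be computed over simple term-frequency vectors; the 0.7/0.3 weights come from the formula, while the tokenization here is a simplification.

```python
# Combined cosine + Jaccard similarity over term-frequency vectors (sketch).
import math
from collections import Counter

def combined_similarity(text_a: str, text_b: str) -> float:
    ta, tb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    # Cosine similarity over term-frequency vectors
    common = set(ta) & set(tb)
    dot = sum(ta[t] * tb[t] for t in common)
    norm = (math.sqrt(sum(v * v for v in ta.values()))
            * math.sqrt(sum(v * v for v in tb.values())))
    cosine = dot / norm if norm else 0.0
    # Jaccard similarity over term sets
    union = set(ta) | set(tb)
    jaccard = len(common) / len(union) if union else 0.0
    return 0.7 * cosine + 0.3 * jaccard
```

Cosine captures frequency-weighted overlap while Jaccard rewards shared vocabulary regardless of counts; blending the two tempers either measure's blind spots.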
Context Integration Flow:

```
Input → Session Context → Knowledge Graph → External Knowledge → Response
  ↑______________________________________________|
                  (Feedback Loop)
```
```
┌────────────────┐
│ Knowledge Base │
└───────┬────────┘
        │
        ▼
User Input → Concept Extraction → Similarity Analysis → Graph Integration
    ↑               │                      │                    │
    │               ▼                      ▼                    ▼
    │         Session Graph ──────► Relationship Analysis ◄─── Knowledge Graph
    │               │                      │                    │
    │               └──────────────────────┼────────────────────┘
    │                                      │
    └───────────────────── Response ◄──────┘
```
- LRU Cache for similarity calculations
- Concurrent processing with thread pooling
- Batched vectorizer updates
- Adaptive corpus management
- Progressive Semantic Analysis
  - Each layer builds upon previous insights
  - Maintains context continuity
  - Adapts processing parameters in real time

- Dynamic Knowledge Integration
  - Combines session-specific and persistent knowledge
  - Real-time graph updates
  - Community-based concept clustering

- Adaptive Response Generation
  - Context-aware temperature adjustment
  - Style-specific processing parameters
  - Multi-source information fusion
- CPU: Multi-core processor recommended
- RAM: 8GB minimum, 16GB recommended
- Storage: 1GB for base system
- Python: 3.10 or higher
- OS: Linux, macOS, or Windows
```env
DEBUG_MODE=true
USE_EXTERNAL_KNOWLEDGE=false
LLM_URL=http://0.0.0.0:11434/v1/chat/completions
LLM_MODEL=your_model_name
INITIAL_UNDERSTANDING_THRESHOLD=0.7
RELATIONSHIP_ANALYSIS_THRESHOLD=0.7
CONTEXTUAL_INTEGRATION_THRESHOLD=0.9
SYNTHESIS_THRESHOLD=0.8
```
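One way the threshold values above might be read at startup: the variable names match the sample configuration, but this loader itself is an illustrative sketch, not CaSIL's code.

```python
# Read each threshold from the environment, falling back to sane defaults.
import os

THRESHOLD_DEFAULTS = {
    "INITIAL_UNDERSTANDING_THRESHOLD": 0.7,
    "RELATIONSHIP_ANALYSIS_THRESHOLD": 0.7,
    "CONTEXTUAL_INTEGRATION_THRESHOLD": 0.9,
    "SYNTHESIS_THRESHOLD": 0.8,
}

def load_thresholds() -> dict[str, float]:
    """Return each threshold from the environment, or its default if unset."""
    return {name: float(os.getenv(name, default))
            for name, default in THRESHOLD_DEFAULTS.items()}
```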
```python
from SemanticCascadeProcessing import CascadeSemanticLayerProcessor

processor = CascadeSemanticLayerProcessor(config)
processor.knowledge_base.load_from_directory("path/to/knowledge")

# Get graph statistics
processor.knowledge.print_graph_summary()

# Analyze concept relationships
processor.analyze_knowledge_graph()
```
Contributions are welcome! Please read our Contributing Guidelines first.
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Thanks to the open-source NLP community
- Special thanks to LocalLLaMA for never letting anything slip by unnoticed, and for being an awesome community overall