So, RAGing is turning information into an AI model's data points. I downloaded a Wikipedia dump and did that with a 4GB Mistral model.
I now have to move it to a computer with more RAM to test it. On the 16GB one it is underperforming. After some more cleanup and compressing, it will have a home on a 32GB system.
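To make "RAGing" a bit more concrete, here is a toy sketch of the retrieval half. It is not the code from my script: the function names are made up for illustration, the three sample chunks are invented, and the word-count "embedding" is only a stand-in for a real embedding model. The idea is the same though: turn chunks and the question into vectors, pick the closest chunks, and paste them into the prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector.
    Assumption: a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=3):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, sources):
    """Put the retrieved chunks in front of the question for the model."""
    context = "\n\n".join(sources)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Invented sample chunks, only for the demo.
chunks = [
    "An ultralight aircraft is a very light, slow-flying aeroplane.",
    "William Shakespeare wrote Hamlet, Macbeth and King Lear.",
    "Aristophanes was a comic playwright of ancient Athens.",
]
question = "Which plays did Shakespeare write?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

The model never "learns" the chunks. It only sees whatever the retrieval step hands it, so the quality of the chunks matters a lot.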
This is what it has done so far:
(cyberdeck-env313) j4v@M920:~/Scripts$ python3 RAGing_system_wiki.py
🚀 Starting Mistral RAG System
💻 Running on M920 - Using Wiki drive for processing
🔮 Using mistral:latest model via Ollama
------------------------------------------------------------
🔍 Checking environment...
💾 Disk space on /media/j4v/512SSDUSB/Wiki/rag_system:
Total: 468.38 GB
Used: 205.02 GB
Free: 263.37 GB
🔍 Checking Ollama status...
✅ Ollama process is running
✅ Ollama port 11434 is listening
✅ Ollama API is responsive
✅ Mistral model is available
✅ Using XML file: /media/j4v/512SSDUSB/Wiki/enwiki-latest-pages-articles-multistream.xml
📖 Processing Wikipedia XML: /media/j4v/512SSDUSB/Wiki/enwiki-latest-pages-articles-multistream.xml
⚠️ Processing first 500 pages due to resource constraints
📥 Processing XML pages: 100%|███████████| 500/500 [00:01<00:00, 286.41pages/s, chunks=32351]
🎉 Processing complete! Processed 500 pages, 32412 chunks
🔧 Creating vector store...
📄 Loading processed chunks...
📖 Loading documents: 100%|████████████████████████| 32412/32412 [00:00<00:00, 121321.56it/s]
📊 Creating embeddings for 32412 documents...
/home/j4v/Scripts/RAGing_system_wiki.py:369: LangChainDeprecationWarning: The class `HuggingFaceEmbeddings` was deprecated in LangChain 0.2.2 and will be removed in 1.0. An updated version of the class exists in the :class:`~langchain-huggingface package and should be used instead. To use it run `pip install -U :class:`~langchain-huggingface` and import as `from :class:`~langchain_huggingface import HuggingFaceEmbeddings``.
self.embeddings = HuggingFaceEmbeddings(
🧩 Building vector store (this may take a while)...
🔨 Processing batches: 100%|███████████████████████████████| 325/325 [19:59<00:00, 3.69s/it]
✅ Vector store saved with 32412 documents
🎯 RAG System Ready!
💡 Type 'quit' to exit, 'status' for system info
------------------------------------------------------------
❓ Your question: What can you tell me about Ultra Light Aircrafts, please?
🔍 Searching for: 'What can you tell me about Ultra Light Aircrafts, please?'
📚 Found 3 relevant documents
🤖 Generating answer with Mistral...
📝 Answer (generated in 30.06s):
💬 ❌ Timeout: Mistral took too long to respond
📚 Sources used: 3 documents
1. Airship of 1910, also called the Cooley monoplane.{{Cite web|url=http://www.wright-brothers.org/History_Wing/Aviations_Attic/UFOs/UFOs.htm|title=Unbel...
2. {{Main|List of large aircraft}}The largest aircraft by dimensions and volume (as of 2016) is the {{cvt|302|ft|m}} long British [[Airlander 10]], a hyb...
3. == External links ==
{{Wiktionary|aircraft}}
{{Commons category}}
===History===
* [http://www.hq.nasa.gov/office/pao/History/SP-468/contents.htm The ...
❓ Your question: Tell me about Shakespears most important works.
🔍 Searching for: 'Tell me about Shakespears most important works.'
📚 Found 3 relevant documents
🤖 Generating answer with Mistral...
📝 Answer (generated in 30.13s):
💬 ❌ Timeout: Mistral took too long to respond
📚 Sources used: 3 documents
1. Listed below are some of the many works influenced (more or less) by Aristophanes.
2. The inspirations for some of Christie's titles include:
* [[William Shakespeare]]'s works: ''[[Sad Cypress]]'', ''[[By the Pricking of My Thumbs]]'', ...
3. In [[Poul Anderson]]'s ''[[Three Hearts and Three Lions]]'' in which the [[Matter of France]] is history and the [[fairy folk]] are real and powerful....
❓ Your question:
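The sources in the log above still carry raw wikitext ({{templates}}, [[links]], == headings ==), and one of them is nothing but an "External links" block. Part of the cleanup I have in mind could look roughly like the sketch below. The function names are hypothetical and not from RAGing_system_wiki.py, and I am assuming plain regexes are good enough for a first pass; a proper wikitext parser such as mwparserfromhell would likely be more robust.

```python
import re


def strip_wiki_markup(text):
    """Crude wikitext cleanup (a sketch, not a full parser)."""
    # Remove {{templates}}; repeat so nested ones are peeled inside-out.
    # Caveat: this also drops values carried by templates such as {{cvt|...}}.
    prev = None
    while prev != text:
        prev = text
        text = re.sub(r"\{\{[^{}]*\}\}", "", text)
    # [[target|label]] -> label, [[target]] -> target
    text = re.sub(r"\[\[(?:[^\[\]|]*\|)?([^\[\]|]*)\]\]", r"\1", text)
    # [http://url label] -> label
    text = re.sub(r"\[https?://[^\s\]]+\s+([^\]]*)\]", r"\1", text)
    # bare [http://url] -> removed
    text = re.sub(r"\[https?://[^\s\]]+\]", "", text)
    # ''italic'' and '''bold''' markers
    text = re.sub(r"'{2,}", "", text)
    # == Heading == lines -> keep only the heading text
    text = re.sub(r"^=+[ \t]*(.*?)[ \t]*=+[ \t]*$", r"\1", text, flags=re.MULTILINE)
    # Collapse runs of spaces and tabs left behind by the removals.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


def is_useful_chunk(text, min_words=8):
    """Skip chunks that are mostly markup (min_words is an arbitrary guess)."""
    return len(strip_wiki_markup(text).split()) >= min_words


# Snippets taken from the sources shown in the log.
link_block = "== External links ==\n{{Wiktionary|aircraft}}\n{{Commons category}}\n===History==="
print(strip_wiki_markup("the {{cvt|302|ft|m}} long British [[Airlander 10]]"))
print(is_useful_chunk(link_block))
```

Filtering like this before embedding should keep link-only chunks out of the vector store, so the 3 retrieved documents have a better chance of containing actual article text.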
#cyberpunkcoltoure