references:
  https://huggingface.co/learn/cookbook/en/rag_evaluation
  https://python.langchain.com/v0.1/docs/integrations/chat/google_vertex_ai_palm/
  https://cloud.google.com/vertex-ai/docs/tutorials/jupyter-notebooks#vertex-ai-workbench

start with:

  1. what is rag?
    • how is it different from fine-tuning?
    • ways of evaluating rag
      • humans
      • static scores
        • precision
        • recall
        • bleu, rouge: n-gram overlap against a reference answer (see sketch after this list)
      • use another llm as a judge (see sketch after this list)
        • works since llms have broad capabilities
        • and are trained on varied human preferences, so they can approximate human grading
  2. lots of knobs to tweak in the RAG pipeline, so we need a principled way to evaluate the impact of each change
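
A quick taste of the static scores above: a minimal ROUGE sketch using the `evaluate` library (the toy prediction/reference strings are made up):

```python
# pip install evaluate rouge_score
import evaluate

rouge = evaluate.load("rouge")

# Toy example: n-gram overlap between a generated answer and a reference.
scores = rouge.compute(
    predictions=["RAG retrieves documents and feeds them to an LLM."],
    references=["RAG retrieves relevant documents and passes them to an LLM as context."],
)
print(scores)  # rouge1 / rouge2 / rougeL F-measures in [0, 1]
```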
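
And a minimal sketch of the llm-as-judge idea; the `call_llm` stub, the prompt wording, and the 1-5 scale are all assumptions to be swapped for your own client and rubric:

```python
import re

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat-capable LLM call; plug in your client here."""
    raise NotImplementedError

JUDGE_PROMPT = """You are grading the answer of a RAG system.
Question: {question}
Reference answer: {reference}
Candidate answer: {candidate}
Rate the candidate from 1 (wrong) to 5 (equivalent to the reference).
Reply in the form "Rating: <1-5>" followed by a short justification."""

def judge(question: str, reference: str, candidate: str) -> int | None:
    """Return the judge's 1-5 rating, or None if it can't be parsed."""
    reply = call_llm(JUDGE_PROMPT.format(
        question=question, reference=reference, candidate=candidate))
    match = re.search(r"Rating:\s*([1-5])", reply)
    return int(match.group(1)) if match else None
```

Averaging `judge(...)` over a fixed test set gives a single number to track while tweaking the pipeline, which is exactly what point 2 calls for.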

  3. dataset used is huggingface documentation
    • text + source
  4. Build RAG
    • split documents using a recursive character splitter (split / embed / retrieve are sketched after this list)
    • embed documents
      • model used: thenlper/gte-small
    • retrieve relevant chunks - like a search engine
      • a FAISS index stores the embeddings for fast nearest-neighbor lookup of similar chunks
    • use retrieved contexts + query to formulate answer
      • rerank retrieved context (see sketch after this list)
        • retrieve 30 docs
        • rerank and output best 7
        • model used: colbert-ir/colbertv2.0
        • the reranker rescores and reorders the initially retrieved documents by their relevance to the query
      • answer using an llm (see sketch after this list)
        • reader llm: zephyr-7b-beta
        • a fine-tuned version of mistralai/Mistral-7B-v0.1 trained to act as a helpful assistant
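
A minimal end-to-end sketch of the split / embed / retrieve steps, assuming LangChain's FAISS and HuggingFace-embeddings integrations (the toy corpus, chunk sizes, and query are made up):

```python
# pip install langchain langchain-community faiss-cpu sentence-transformers
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document

# Toy stand-in for the documentation dump (text + source pairs).
corpus = [
    ("Use load_dataset() to load a dataset from the Hub.",
     "https://huggingface.co/docs/datasets"),
    ("Trainer handles the training loop for you.",
     "https://huggingface.co/docs/transformers"),
]
docs = [Document(page_content=text, metadata={"source": src}) for text, src in corpus]

# Recursive split: tries separators in order ("\n\n", "\n", " ", "")
# so chunks break at natural boundaries; the sizes here are assumptions.
splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# Embed every chunk with gte-small and index the vectors in FAISS.
index = FAISS.from_documents(
    chunks, HuggingFaceEmbeddings(model_name="thenlper/gte-small")
)

# Retrieval is nearest-neighbor search over the embedded chunks;
# the notes fetch 30 candidates, to be reranked down to 7 next.
query = "How do I load a dataset?"
retrieved = index.similarity_search(query, k=30)
```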
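
The reranking step, continuing from the sketch above and using the RAGatouille wrapper around ColBERTv2; treat the exact call signature and result-dict keys as assumptions to verify against RAGatouille's docs:

```python
# pip install ragatouille
from ragatouille import RAGPretrainedModel

# Load ColBERTv2 as a reranker via the RAGatouille wrapper.
reranker = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")

# Rescore the 30 FAISS candidates against the query and keep the best 7.
candidates = [doc.page_content for doc in retrieved]
reranked = reranker.rerank(query, candidates, k=7)
top_contexts = [hit["content"] for hit in reranked]
```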
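
Finally, a sketch of the answer-generation step with zephyr-7b-beta, assuming a recent transformers version whose text-generation pipeline accepts chat-format messages (the prompt wording and generation settings are assumptions):

```python
# pip install transformers accelerate
from transformers import pipeline

reader = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-7b-beta",
    torch_dtype="auto",
    device_map="auto",
)

# Stuff the 7 reranked chunks into the prompt as context.
context = "\n\n".join(top_contexts)
messages = [
    {"role": "system",
     "content": "Answer the question using only the provided context."},
    {"role": "user",
     "content": f"Context:\n{context}\n\nQuestion: How do I load a dataset?"},
]

out = reader(messages, max_new_tokens=256, do_sample=False)
# The pipeline returns the whole conversation; the last message is the answer.
print(out[0]["generated_text"][-1]["content"])
```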

rag variants to explore:
  • basic rag
  • sentence-window rag
  • auto-retrieval rag