🐧 Penguin AI — Multilingual RAG Chatbot with RLHF

🌍 Multilingual RAG — 50+ Languages

Ask questions in your language — the shared embedding space maps cross-lingual queries to documents natively. No translation step needed.

🇮🇳 Hindi 🇫🇷 French 🇩🇪 German 🇪🇸 Spanish 🇯🇵 Japanese 🇨🇳 Chinese 🇰🇷 Korean 🇸🇦 Arabic 🇧🇷 Portuguese 🇷🇺 Russian 🇮🇹 Italian 🇹🇷 Turkish 🇧🇩 Bengali + 37 more

🔤 Multilingual Embeddings

paraphrase-multilingual-MiniLM-L12-v2 — 471 MB, CPU-only. Maps all 50+ languages into a single shared vector space so Hindi queries match English docs seamlessly.

🎯 Multilingual Reranker

mmarco-mMiniLMv2-L12-H384-v1 — cross-encoder trained on MS MARCO in 13 languages. Scores query–passage relevance across language boundaries.

🔍 Auto Language Detection

langdetect identifies your query language instantly. A language badge appears in the UI and the LLM is instructed to respond in the same language.

🎙️ Whisper — 99 Languages

Voice input auto-detects spoken language — no language selector needed. Speak Hindi, French, Spanish or any of 99 supported languages and Whisper transcribes it perfectly.

🔬 Production RAG Pipeline

📥 Document Upload → ✂️ Semantic Chunking → 🔑 SHA-256 Dedup → 🌍 Multilingual Embed → 🔀 BM25 + FAISS Hybrid → ⚖️ RRF Fusion → 🎯 mMiniLM Rerank → 📌 Citations + LRU Cache → 🤖 LLM (same language)

Highlighted steps are multilingual-aware

🧠 RLHF — Reinforcement Learning from Human Feedback

PPO-style reward model adapts the chatbot's behaviour in real-time from your 👍/👎 feedback — no GPU, no retraining.

👍👎 Human Signal → ⚖️ Reward Model → 📊 Value Baseline → 📈 Advantage → 🔄 Policy Update ±0.20 → 🎯 Better Responses

✨ All Features

🌍

Multilingual RAG

Ask in Hindi, French, German, Spanish, Japanese or 46 more languages. Shared embedding space — no translation step. Auto language badge in the UI.

📄

Advanced RAG

Upload PDF / DOCX / TXT. 6-layer pipeline: semantic chunks → multilingual BM25+FAISS → RRF fusion → mMiniLM rerank → citations.

🕵️

Agentic RAG

ReAct loop with multi-hop retrieval (up to 3 hops). LLM reasons, picks tools (search / summarise / verify), self-corrects, and gives a grounded final answer.

🧠

RLHF

PPO-style reward model adapts temperature in real-time from 👍/👎 feedback. Export replay buffer as JSONL for offline fine-tuning with HuggingFace TRL.

🤖

General Chat

Conversational AI with 20-turn sliding memory. RLHF policy adapts the generation style to your preferences over the session.

🎙️

Voice Mic

Click 🎙️, speak, stop — auto-transcribed by Whisper Large v3 Turbo (Groq). Supports 99 languages with automatic language detection.

🧑‍⚕️

Domain Expert

6 specialised personas: Medical · Legal · Finance · Code · Fitness · Research. Domain-tuned system prompts for accurate expert-level answers.

🛡️

Safety Guard

Llama Guard 3 20B checks every response for unsafe content before it reaches you. Toggle on/off in the sidebar.

⚡ Supported Models

Model	Type	Context	Best For
Llama 4 Scout 17B NEW	Chat	128K	Latest Meta — fast & accurate
Qwen 3 32B NEW	Chat	32K	Multilingual tasks
Llama 3.3 70B	Chat	128K	Best overall quality (default)
Llama 3.1 8B	Chat	128K	Speed-critical tasks
Mixtral 8×7B	Chat	32K	Long documents
Whisper Large v3 Turbo	Audio	—	Speech → Text · 99 languages · auto-detect
Llama Guard 3 20B	Safety	—	Content safety filter
multilingual-MiniLM-L12-v2 NEW	Embeddings	—	50+ language RAG embeddings (local)
mmarco-mMiniLMv2-L12 NEW	Reranker	—	Cross-lingual reranking · 13 languages (local)

🚀 Quick Start

Clone

git clone https://github.com/rajneeshbabu/penguin-ai.git

cd penguin-ai

Install

pip install -r requirements.txt

Includes langdetect for auto language detection.

API Key

Free key at console.groq.com

echo "GROQ_API_KEY=gsk_..." > .env

Run

streamlit run app.py

Opens at localhost:8501

🗨️ See It In Action