<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>AI/ML &#8211; BKaushik Blog</title>
	<atom:link href="https://bkaushik.com/category/ai-ml/feed/" rel="self" type="application/rss+xml" />
	<link>https://bkaushik.com</link>
	<description>Code. Build. Learn. Share.</description>
	<lastBuildDate>Fri, 22 Aug 2025 19:12:52 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.7.2</generator>
	<item>
		<title>Unlocking the Power of RAG with Ollama, LangChain &#038; ChromaDB: A Step-by-Step Tutorial</title>
		<link>https://bkaushik.com/ai-ml/unlocking-the-power-of-rag-with-ollama-langchain-chromadb-a-step-by-step-tutorial/</link>
		
		<dc:creator><![CDATA[Bikash]]></dc:creator>
		<pubDate>Fri, 22 Aug 2025 19:12:10 +0000</pubDate>
				<category><![CDATA[AI/ML]]></category>
		<guid isPermaLink="false">https://bkaushik.com/?p=99</guid>

					<description><![CDATA[Introduction In the rapidly evolving landscape of AI, combining retrieval mechanisms with generative models—commonly known as Retrieval-Augmented Generation (RAG)—is becoming a foundational approach to create grounded, factual, and context-aware responses. Git Repo &#8211; https://github.com/bikashkaushik/ollama-rag-tutorial Why This Tutorial Matters Achieve factual accuracy by grounding large language model responses in real-world documents—a key challenge in generative AI....]]></description>
										<content:encoded><![CDATA[<h3 data-start="443" data-end="465"><strong data-start="447" data-end="463">Introduction</strong></h3>
<p data-start="466" data-end="913">In the rapidly evolving landscape of AI, combining retrieval mechanisms with generative models, commonly known as Retrieval-Augmented Generation (RAG), is becoming a foundational approach for producing grounded, factual, and context-aware responses.</p>
<p data-start="466" data-end="913">Git Repo &#8211; <a href="https://github.com/bikashkaushik/ollama-rag-tutorial">https://github.com/bikashkaushik/ollama-rag-tutorial</a></p>
<hr data-start="915" data-end="918" />
<h3 data-start="920" data-end="955"><strong data-start="924" data-end="953">Why This Tutorial Matters</strong></h3>
<ul data-start="956" data-end="1320">
<li data-start="956" data-end="1090">
<p data-start="958" data-end="1090"><strong data-start="958" data-end="986">Improve factual accuracy</strong> by grounding large language model responses in real-world documents, which addresses a key challenge in generative AI.</p>
</li>
<li data-start="1091" data-end="1210">
<p data-start="1093" data-end="1210"><strong data-start="1093" data-end="1136">Use open-source, locally run components</strong> to ensure full control over data privacy, cost-efficiency, and latency.</p>
</li>
<li data-start="1211" data-end="1320">
<p data-start="1213" data-end="1320"><strong data-start="1213" data-end="1250">Lean on familiar Python libraries</strong> like LangChain and ChromaDB to streamline development and deployment.</p>
</li>
</ul>
<hr data-start="1322" data-end="1325" />
<h3 data-start="1327" data-end="1361"><strong data-start="1331" data-end="1359">Overview of the Tutorial</strong></h3>
<p data-start="1362" data-end="1422">Here’s what the repository includes and how it all connects:</p>
<ol data-start="1424" data-end="3071">
<li data-start="1424" data-end="1911">
<p data-start="1427" data-end="1529"><strong data-start="1427" data-end="1448">Environment Setup</strong><br data-start="1448" data-end="1451" />Set up a clean Python environment and install the required packages:<br />
<span style="background-color: #e9ebec; color: #222222;">pip install langchain<br />
pip install langchain-community<br />
pip install langchain-ollama<br />
pip install langchain-chroma<br />
pip install chromadb<br />
pip install pypdf</span></p>
<p data-start="1733" data-end="1911">This ensures you have everything from embedding and RAG orchestration (LangChain) to a vector store (ChromaDB) and document parsing (pypdf).</p>
</li>
<li data-start="1913" data-end="2343">
<p data-start="1916" data-end="1953"><strong data-start="1916" data-end="1951">Installing &amp; Configuring Ollama</strong></p>
<ol data-start="1957" data-end="2203">
<li data-start="1957" data-end="2015">
<p data-start="1959" data-end="2015">Download and install <strong data-start="1980" data-end="1990">Ollama</strong> to run models locally.</p>
</li>
<li data-start="1957" data-end="2015">
<p data-start="1959" data-end="2015">Pull the embedding model:<br />
<span style="background-color: #e9ebec; color: #222222;">ollama pull nomic-embed-text</span></p>
</li>
<li data-start="2114" data-end="2203">
<p data-start="2116" data-end="2159">Pull and run the language generation model:<br />
<span style="background-color: #e9ebec; color: #222222;">ollama run mistral</span></p>
</li>
</ol>
<p data-start="2207" data-end="2343">This lets you run core inference and embedding services without relying on external API endpoints.</p>
</li>
<li data-start="2345" data-end="2574">
<p data-start="2348" data-end="2574"><strong data-start="2348" data-end="2370">Embedding Function</strong><br data-start="2370" data-end="2373" />The tutorial includes a <code data-start="2400" data-end="2427">get_embedding_function.py</code> script to wrap the Ollama embedding model. This function is essential for converting raw text into vector embeddings that drive similarity search.</p>
</li>
<li data-start="2576" data-end="2807">
<p data-start="2579" data-end="2807"><strong data-start="2579" data-end="2608">Building the Vector Store</strong><br data-start="2608" data-end="2611" />Through <code data-start="2622" data-end="2644">populate_database.py</code>, you’ll parse source documents, generate embeddings, and store them in ChromaDB. This setup forms the searchable knowledge base for RAG to tap into.</p>
<p class="mb-2 mt-4 text-base font-[475] first:mt-0 dark:font-[450]">Example Usage<br />
<span style="background-color: #e9ebec; color: #222222;"><em>python populate_database.py --reset # Clears and rebuilds the database from scratch.</em><br />
<em>python populate_database.py # Adds new or updated documents without wiping existing data.</em></span></p>
</li>
<li data-start="2809" data-end="3071">
<p data-start="2812" data-end="3071"><strong data-start="2812" data-end="2860">Querying with Retrieval-Augmented Generation</strong><br data-start="2860" data-end="2863" /><code data-start="2866" data-end="2881">query_data.py</code> orchestrates the full RAG workflow: retrieve relevant content from ChromaDB, feed it—along with your query—into the Mistral LLM via LangChain, and receive grounded, context-aware responses.</p>
<p class="mb-2 mt-4 text-base font-[475] first:mt-0 dark:font-[450]">Example Command Usage<br />
<em><span style="background-color: #e9ebec; color: #222222;">python query_data.py "What is the RAG technique in AI?"</span></em></p>
</li>
</ol>
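<p>To make the retrieve-then-generate flow concrete, here is a minimal sketch in plain Python. It is not the code from the repository: <code>embed</code>, <code>ToyVectorStore</code>, and <code>build_prompt</code> are hypothetical names, word counts stand in for the Ollama embedding model, and a dictionary stands in for ChromaDB. It only illustrates the idea of storing chunks by ID, ranking them by similarity to a query, and placing the best match into the prompt.</p>

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector store, keyed by chunk ID (hypothetical)."""

    def __init__(self):
        self.chunks = {}  # chunk_id -> (text, vector)

    def reset(self):
        """Drop every stored chunk, similar in spirit to rebuilding from scratch."""
        self.chunks.clear()

    def add(self, chunk_id, text):
        """Store a chunk unless its ID already exists; return True if it was added."""
        if chunk_id in self.chunks:
            return False
        self.chunks[chunk_id] = (text, embed(text))
        return True

    def search(self, query, k=2):
        """Return the k chunk texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.chunks.values(), key=lambda c: cosine(q, c[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Combine retrieved chunks and the question into a single prompt string."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


store = ToyVectorStore()
store.add("doc1:0", "RAG retrieves relevant documents and passes them to a language model.")
store.add("doc1:1", "ChromaDB stores vector embeddings for similarity search.")
store.add("doc2:0", "Bananas are rich in potassium.")

question = "How does RAG use documents?"
top_chunks = store.search(question, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

<p>In a real pipeline the prompt would then be sent to the LLM. Keying chunks by ID is one simple way to let a second run add only chunks that are not stored yet, but handling of changed documents is left out of this sketch.</p>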
<hr data-start="3073" data-end="3076" />
<h3 data-start="3078" data-end="3109"><strong data-start="3082" data-end="3107">Why You Should Try It</strong></h3>
<ul data-start="3110" data-end="3442">
<li data-start="3110" data-end="3185">
<p data-start="3112" data-end="3185"><strong data-start="3112" data-end="3129">Privacy-first</strong>: Run everything locally—no external API dependencies.</p>
</li>
<li data-start="3186" data-end="3310">
<p data-start="3188" data-end="3310"><strong data-start="3188" data-end="3199">Modular</strong>: Choose which components to swap, upgrade, or extend (e.g., swap the vector store, embedding model, or LLM).</p>
</li>
<li data-start="3311" data-end="3442">
<p data-start="3313" data-end="3442"><strong data-start="3313" data-end="3337">Practical foundation</strong>: Use this as a springboard to build internal helpdesks, document QA bots, research assistants, and more.</p>
</li>
</ul>
<hr data-start="3849" data-end="3852" />
<h3 data-start="3854" data-end="3885"><strong data-start="3858" data-end="3883">Call to the Community</strong></h3>
<p data-start="3886" data-end="4099">If you&#8217;re working on similar RAG implementations—or curious to explore the Ollama, LangChain, or ChromaDB ecosystems—let’s connect! I’d love to compare notes, share learnings, or even co-create something exciting.</p>
<hr data-start="4101" data-end="4104" />
<h3 data-start="4106" data-end="4126"><strong data-start="4110" data-end="4124">Conclusion</strong></h3>
<p data-start="4127" data-end="4445">This is more than a walkthrough: it&#8217;s an invitation to experiment with local-first AI techniques. By combining retrieval, embeddings, and generation in an open-source stack, you can bring grounded interactions to your applications while keeping your data and infrastructure under your own control.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Machine Learning with Python for Developers with No Prior Python Experience</title>
		<link>https://bkaushik.com/posts/machine-learning-using-python-having-no-prior-working-experience-in-python/</link>
		
		<dc:creator><![CDATA[Bikash]]></dc:creator>
		<pubDate>Fri, 23 May 2025 17:59:10 +0000</pubDate>
				<category><![CDATA[AI/ML]]></category>
		<category><![CDATA[Posts]]></category>
		<guid isPermaLink="false">https://bkaushik.com/?p=84</guid>

					<description><![CDATA[If you&#8217;re aiming to learn Machine Learning using Python and have no prior working experience in Python, here&#8217;s a structured and practical roadmap tailored for engineers or developers from other languages or domains. 🛠️ Phase 1: Learn Python Basics (1–2 weeks) Focus on what’s needed for ML, skip unnecessary details for now. 🔹 Topics to...]]></description>
										<content:encoded><![CDATA[<p>If you&#8217;re aiming to learn <strong data-start="26" data-end="59">Machine Learning using Python</strong> and have <strong data-start="69" data-end="110">no prior working experience in Python</strong>, here&#8217;s a structured and practical roadmap tailored for <strong data-start="167" data-end="226">engineers or developers from other languages or domains</strong>.</p>
<h3 data-start="234" data-end="281">🛠️ Phase 1: Learn Python Basics (1–2 weeks)</h3>
<blockquote data-start="283" data-end="349">
<p data-start="285" data-end="349">Focus on what’s needed for ML; skip unnecessary details for now.</p>
</blockquote>
<p data-start="351" data-end="374"><strong>🔹 Topics to Cover:</strong></p>
<ul>
<li style="list-style-type: none;">
<ul>
<li data-start="377" data-end="422">Variables, data types (int, float, str, bool)</li>
<li data-start="377" data-end="422">Lists, tuples, dictionaries, sets</li>
<li data-start="377" data-end="422">Control flow: <code data-start="475" data-end="479">if</code>, <code data-start="481" data-end="486">for</code>, <code data-start="488" data-end="495">while</code>, <code data-start="497" data-end="504">break</code>, <code data-start="506" data-end="516">continue</code></li>
<li data-start="377" data-end="422">Functions: <code data-start="530" data-end="535">def</code>, arguments, return values</li>
<li data-start="377" data-end="422">Modules and imports</li>
<li data-start="377" data-end="422">Exception handling: <code data-start="606" data-end="611">try</code>, <code data-start="613" data-end="621">except</code></li>
<li data-start="377" data-end="422">Basic file I/O</li>
<li data-start="377" data-end="422">Intro to Jupyter Notebooks</li>
</ul>
</li>
</ul>
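<p>As a rough illustration of how several of these topics fit together, the short script below uses a dictionary, a <code>for</code> loop, a function, and <code>try</code>/<code>except</code> in one place. The function name <code>summarize_scores</code> and the sample data are made up for this example.</p>

```python
def summarize_scores(scores):
    """Return the average of the numeric values, skipping entries that are not numbers."""
    total = 0.0
    count = 0
    for name, raw in scores.items():      # loop over the dictionary's key/value pairs
        try:
            total += float(raw)           # convert the string to a float
            count += 1
        except ValueError:                # raised when the string is not numeric
            print(f"Skipping {name}: {raw!r} is not a number")
    return total / count if count else None


# Sample data (made up): values arrive as strings, as they often do when read from a file
scores = {"alice": "91.5", "bob": "78", "carol": "n/a"}
average = summarize_scores(scores)
print(average)
```

<p>Typing small scripts like this into a notebook cell and changing the inputs is a quick way to check your understanding of each construct.</p>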
<p data-start="669" data-end="682"><strong>🧰 Tools:</strong></p>
<ul>
<li data-start="685" data-end="730">Install Python (via Anaconda or <code data-start="717" data-end="729">python.org</code>)</li>
<li data-start="685" data-end="730">Use <strong data-start="737" data-end="757">Jupyter Notebook</strong> or <strong data-start="761" data-end="777">Google Colab</strong> for practice</li>
</ul>
<p data-start="792" data-end="808"><strong>✅ Resources:</strong></p>
<ul>
<li data-start="811" data-end="906"><a class="" href="https://www.youtube.com/watch?v=LHBE6Q9XlzI" target="_new" rel="noopener" data-start="811" data-end="906">Python for Data Science – FreeCodeCamp (YouTube)</a></li>
<li data-start="811" data-end="906"><a class="cursor-pointer" target="_new" rel="noopener" data-start="909" data-end="983">Python Crash Course – Real Python</a></li>
</ul>
<h3 data-start="990" data-end="1037">🤖 Phase 2: Python for Data &amp; ML (2–3 weeks)</h3>
<blockquote data-start="1039" data-end="1099">
<p data-start="1041" data-end="1099">Learn the libraries that power machine learning in Python.</p>
</blockquote>
<p data-start="1101" data-end="1127"><strong>🔹 Libraries to Learn:</strong></p>
<ul>
<li style="list-style-type: none;">
<ul>
<li data-start="1130" data-end="1166"><strong data-start="1130" data-end="1139">NumPy</strong> – for numerical operations</li>
<li data-start="1130" data-end="1166"><strong data-start="1169" data-end="1179">Pandas</strong> – for data manipulation (DataFrames)</li>
<li data-start="1130" data-end="1166"><strong data-start="1219" data-end="1243">Matplotlib / Seaborn</strong> – for visualization</li>
<li data-start="1130" data-end="1166"><strong data-start="1266" data-end="1282">Scikit-learn</strong> – for classic ML models</li>
</ul>
</li>
</ul>
<p data-start="1308" data-end="1328"><strong>🔍 Key Concepts:</strong></p>
<ul>
<li data-start="1331" data-end="1355">Arrays, matrices (NumPy)</li>
<li data-start="1331" data-end="1355">DataFrames: loading CSV, filtering, grouping (Pandas)</li>
<li data-start="1331" data-end="1355">Plotting distributions and trends</li>
<li data-start="1331" data-end="1355">Using <code data-start="1456" data-end="1474">train_test_split</code>, <code data-start="1476" data-end="1483">fit()</code>, <code data-start="1485" data-end="1496">predict()</code> in scikit-learn</li>
</ul>
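<p>The split, <code>fit()</code>, <code>predict()</code> workflow can be sketched without installing anything. The code below is a pure-Python stand-in, not scikit-learn itself: <code>split_train_test</code> and <code>NearestNeighbor</code> are hypothetical names that only mimic the shape of the pattern, and the data is invented.</p>

```python
import random


def split_train_test(rows, test_size=0.25, seed=0):
    """Shuffle the rows and split them into (train, test) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_size))
    return rows[n_test:], rows[:n_test]


class NearestNeighbor:
    """1-nearest-neighbor classifier on a single numeric feature (illustrative only)."""

    def fit(self, X, y):
        # "Training" here just memorizes the labeled points
        self.points = list(zip(X, y))
        return self

    def predict(self, X):
        # Each prediction is the label of the closest memorized point
        return [min(self.points, key=lambda p: abs(p[0] - x))[1] for x in X]


# Invented data: one feature value and a label per row
data = [(1.0, "small"), (1.2, "small"), (0.8, "small"), (1.1, "small"),
        (9.0, "large"), (9.5, "large"), (8.7, "large"), (9.2, "large")]
train, test = split_train_test(data, test_size=0.25, seed=0)

model = NearestNeighbor().fit([x for x, _ in train], [y for _, y in train])
predictions = model.predict([x for x, _ in test])
accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
print(accuracy)
```

<p>The point of the pattern is that the model only sees the training rows during <code>fit()</code>, so the score on the held-out rows says something about unseen data.</p>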
<p data-start="1514" data-end="1530"><strong>✅ Resources:</strong></p>
<ul>
<li data-start="1533" data-end="1594"><a class="cursor-pointer" target="_new" rel="noopener" data-start="1533" data-end="1594">Kaggle’s Python Course</a></li>
<li data-start="1533" data-end="1594"><a class="cursor-pointer" target="_new" rel="noopener" data-start="1597" data-end="1674">Scikit-learn Tutorials</a></li>
</ul>
<h3 data-start="1681" data-end="1729">🤖 Phase 3: Core Machine Learning (3–4 weeks)</h3>
<blockquote data-start="1731" data-end="1785">
<p data-start="1733" data-end="1785">Apply Python to actual ML workflows using real data.</p>
</blockquote>
<p data-start="1787" data-end="1811"><strong>🔹 Topics to Master:</strong></p>
<ul>
<li data-start="1814" data-end="1834">Supervised Learning:
<ul>
<li data-start="1814" data-end="1834">Linear regression</li>
<li data-start="1814" data-end="1834">Logistic regression</li>
<li data-start="1814" data-end="1834">Decision trees, Random Forest</li>
<li data-start="1814" data-end="1834">K-Nearest Neighbors</li>
</ul>
</li>
<li data-start="1814" data-end="1834">Unsupervised Learning:
<ul>
<li data-start="1814" data-end="1834">Clustering (K-Means)</li>
<li data-start="1814" data-end="1834">Dimensionality Reduction (PCA)</li>
</ul>
</li>
<li data-start="1814" data-end="1834">Model evaluation:
<ul>
<li data-start="1814" data-end="1834">Accuracy, precision, recall, F1-score</li>
<li data-start="1814" data-end="1834">Confusion matrix</li>
</ul>
</li>
<li data-start="1814" data-end="1834">Cross-validation, overfitting, regularization</li>
</ul>
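<p>The evaluation metrics above all come from the four cells of a binary confusion matrix. The following sketch computes them by hand in plain Python so the formulas are visible; the function names and the sample labels are made up for illustration.</p>

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, and true negatives."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels: 1 is the positive class
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(y_true, y_pred)
print(counts)
print(metrics)
```

<p>Precision asks how many predicted positives were correct, while recall asks how many actual positives were found; F1 is their harmonic mean.</p>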
<p data-start="2156" data-end="2172"><strong>✅ Resources:</strong></p>
<ul>
<li data-start="2175" data-end="2276"><a class="cursor-pointer" target="_new" rel="noopener" data-start="2175" data-end="2276">Google’s Machine Learning Crash Course</a></li>
<li data-start="2175" data-end="2276"><a class="cursor-pointer" target="_new" rel="noopener" data-start="2279" data-end="2416">Hands-On ML with Scikit-Learn, Keras &amp; TensorFlow (book)</a></li>
</ul>
<h3 data-start="2423" data-end="2457">📦 Phase 4: Projects &amp; Practice</h3>
<blockquote data-start="2459" data-end="2521">
<p data-start="2461" data-end="2521">Reinforce your skills with real-world datasets and projects.</p>
</blockquote>
<p data-start="2523" data-end="2544"><strong>🔹 Project Ideas:</strong></p>
<ul>
<li style="list-style-type: none;">
<ul>
<li data-start="2547" data-end="2584">Predict house prices using regression</li>
<li data-start="2547" data-end="2584">Classify spam vs ham emails</li>
<li data-start="2547" data-end="2584">Titanic survival prediction</li>
<li data-start="2547" data-end="2584">Stock price trend classification</li>
<li data-start="2547" data-end="2584">Customer segmentation with K-Means</li>
</ul>
</li>
</ul>
<p data-start="2718" data-end="2733"><strong>✅ Datasets:</strong></p>
<ul>
<li data-start="2736" data-end="2769"><a class="" href="https://www.kaggle.com/" target="_new" rel="noopener" data-start="2736" data-end="2769">Kaggle</a></li>
<li data-start="2736" data-end="2769"><a class="" href="https://archive.ics.uci.edu/ml/index.php" target="_new" rel="noopener" data-start="2772" data-end="2833">UCI ML Repository</a></li>
<li data-start="2736" data-end="2769"><a class="cursor-pointer" target="_new" rel="noopener" data-start="2836" data-end="2902">Hugging Face Datasets (for NLP)</a></li>
</ul>
<h3 data-start="2909" data-end="2930">🧭 Summary Roadmap</h3>
<div class="_tableContainer_16hzy_1">
<div class="_tableWrapper_16hzy_14 group flex w-fit flex-col-reverse" tabindex="-1">
<table class="w-fit min-w-(--thread-content-width)" data-start="2932" data-end="3424">
<thead data-start="2932" data-end="3014">
<tr data-start="2932" data-end="3014">
<th data-start="2932" data-end="2960" data-col-size="sm">Phase</th>
<th data-start="2960" data-end="2972" data-col-size="sm">Duration</th>
<th data-start="2972" data-end="3014" data-col-size="sm">Outcome</th>
</tr>
</thead>
<tbody data-start="3097" data-end="3424">
<tr data-start="3097" data-end="3178">
<td data-start="3097" data-end="3124" data-col-size="sm">Python Basics</td>
<td data-col-size="sm" data-start="3124" data-end="3136">1–2 weeks</td>
<td data-col-size="sm" data-start="3136" data-end="3178">Comfortable writing basic Python</td>
</tr>
<tr data-start="3179" data-end="3260">
<td data-start="3179" data-end="3206" data-col-size="sm">Python for ML Libraries</td>
<td data-col-size="sm" data-start="3206" data-end="3218">2–3 weeks</td>
<td data-col-size="sm" data-start="3218" data-end="3260">Data loading, visualization, prep</td>
</tr>
<tr data-start="3261" data-end="3342">
<td data-start="3261" data-end="3288" data-col-size="sm">Core ML Concepts</td>
<td data-col-size="sm" data-start="3288" data-end="3300">3–4 weeks</td>
<td data-col-size="sm" data-start="3300" data-end="3342">Build ML models with scikit-learn</td>
</tr>
<tr data-start="3343" data-end="3424">
<td data-start="3343" data-end="3370" data-col-size="sm">Projects &amp; Portfolio</td>
<td data-col-size="sm" data-start="3370" data-end="3382">Ongoing</td>
<td data-col-size="sm" data-start="3382" data-end="3424">Real-world ML practice</td>
</tr>
</tbody>
</table>
</div>
</div>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
