<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Troubleshooting on Munish Thakur | DevOps Engineer</title><link>https://methakur.info/tags/troubleshooting/</link><description>Recent content in Troubleshooting on Munish Thakur | DevOps Engineer</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sun, 10 Nov 2024 00:00:00 +0000</lastBuildDate><atom:link href="https://methakur.info/tags/troubleshooting/index.xml" rel="self" type="application/rss+xml"/><item><title>Debugging OOMKilled Pods in Production: A Step-by-Step Guide</title><link>https://methakur.info/blog/debugging-oomkilled-pods/</link><pubDate>Sun, 10 Nov 2024 00:00:00 +0000</pubDate><guid>https://methakur.info/blog/debugging-oomkilled-pods/</guid><description>&lt;p>It&amp;rsquo;s Monday morning, and Slack is blowing up. The backend service is down. Again.&lt;/p>
&lt;p>I SSH into the node and run &lt;code>kubectl get pods&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;div style="color:#f8f8f2;background-color:#282a36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">
&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#282a36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-bash" data-lang="bash">&lt;span style="display:flex;">&lt;span>NAME                         READY   STATUS      RESTARTS   AGE
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>backend-django-7d8f9c-abcd   0/1     OOMKilled   &lt;span style="color:#bd93f9">5&lt;/span>          10m
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>
&lt;/div>
&lt;/div>&lt;p>If you&amp;rsquo;ve worked with Kubernetes, you&amp;rsquo;ve seen this. &lt;strong>OOMKilled&lt;/strong> means a container in your pod tried to use more memory than its memory limit allows, and the kernel&amp;rsquo;s OOM killer terminated it.&lt;/p>
&lt;p>Here&amp;rsquo;s how I debug it.&lt;/p></description></item></channel></rss>