<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[CloudCraftAI with Jintao]]></title><description><![CDATA[I love open source and love to share.
I'm the maintainer of #Kubernetes Ingress-NGINX | Microsoft MVP | CNCF Ambassador]]></description><link>https://blog.moelove.info</link><generator>RSS for Node</generator><lastBuildDate>Sat, 18 Apr 2026 13:36:59 GMT</lastBuildDate><atom:link href="https://blog.moelove.info/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Why I Failed to Build a Lego-Style Coding Agent]]></title><description><![CDATA[I wanted it simple. I made it simple. Then I discovered that making it actually useful meant adding feature after feature. What started as building blocks became an entire castle.

The Beginning: A Simple Idea
On November 30, 2025, I made my first co...]]></description><link>https://blog.moelove.info/why-i-failed-to-build-a-lego-style-coding-agent</link><guid isPermaLink="true">https://blog.moelove.info/why-i-failed-to-build-a-lego-style-coding-agent</guid><category><![CDATA[coding]]></category><category><![CDATA[agentic AI]]></category><category><![CDATA[llm]]></category><category><![CDATA[AWS]]></category><dc:creator><![CDATA[Jintao Zhang]]></dc:creator><pubDate>Tue, 13 Jan 2026 15:03:32 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1768316550433/d7548021-5d1a-4d0a-bdc2-29da7ad3b8a6.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>I wanted it simple. I made it simple. Then I discovered that making it actually <em>useful</em> meant adding feature after feature. What started as building blocks became an entire castle.</p>
</blockquote>
<h2 id="heading-the-beginning-a-simple-idea">The Beginning: A Simple Idea</h2>
<p>On November 30, 2025, I made my first commit: <code>amcp agent init</code>. In the README, I described it like this:</p>
<blockquote>
<p>A Lego-style coding agent CLI with built-in tools (grep, read files, bash execution) and MCP server integration for extended capabilities.</p>
</blockquote>
<p><strong>Lego-style</strong>—that was my north star. I envisioned a coding agent that worked like Lego bricks:</p>
<ul>
<li><p><strong>Minimal core</strong>: Just <code>grep</code>, <code>read_file</code>, and <code>bash</code>—the essentials</p>
</li>
<li><p><strong>Composable</strong>: Extend capabilities through the MCP protocol</p>
</li>
<li><p><strong>Lightweight</strong>: Only <strong>2,482 lines</strong> of Python</p>
</li>
<li><p><strong>Few dependencies</strong>: Just <code>typer</code>, <code>rich</code>, <code>pydantic</code>, <code>mcp</code>, and <code>openai</code></p>
</li>
</ul>
<p>And I succeeded. The initial AMCP was a clean, focused CLI tool:</p>
<pre><code class="lang-bash">src/amcp/
├── agent.py       <span class="hljs-comment"># 620 lines - Main agent loop</span>
├── tools.py       <span class="hljs-comment"># 511 lines - Tool definitions</span>
├── chat.py        <span class="hljs-comment"># 579 lines - Conversation handling</span>
├── cli.py         <span class="hljs-comment"># 265 lines - CLI entry point</span>
├── config.py      <span class="hljs-comment"># 169 lines - Configuration loading</span>
├── mcp_client.py  <span class="hljs-comment"># 102 lines - MCP integration</span>
└── readfile.py    <span class="hljs-comment">#  47 lines - File reading</span>
</code></pre>
<p>Simple. Beautiful. Complete. Or so I thought.</p>
<h2 id="heading-turning-point-1-the-context-window-explodes">Turning Point #1: The Context Window Explodes</h2>
<p>Two weeks after launch (December 14, 2025), I hit my first real problem: <strong>context window overflow</strong>. When an agent works on complex tasks, conversation history grows indefinitely. My initial solution was brutal—just keep the last 20 messages. But that meant the agent would "forget" critical context from earlier in the session. I had no choice but to add <code>compaction.py</code> (+155 lines):</p>
<pre><code class="lang-python"><span class="hljs-comment"># 467d72b: feat: add context compaction</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Compactor</span>:</span>
    <span class="hljs-string">"""Intelligently compress conversation history while preserving key information."""</span>
</code></pre>
<p>This was the first "mandatory brick." Without it, the agent couldn't complete complex, multi-step tasks.</p>
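<p>The stub above hides the interesting part. As a rough illustration of what "intelligently compress" can mean in practice (keep the system prompt and the most recent messages, collapse everything older into a single summary message), here is a minimal sketch. The names and signature are illustrative, not AMCP's actual API, and <code>summarize</code> stands in for an LLM call:</p>

```python
# Illustrative compaction sketch; not AMCP's actual implementation.
from typing import Callable

Message = dict  # e.g. {"role": "user", "content": "..."}

def compact(history: list[Message],
            summarize: Callable[[list[Message]], str],
            keep_recent: int = 20) -> list[Message]:
    """Collapse everything except the system prompt and the most recent
    messages into one summary message."""
    if len(history) <= keep_recent + 1:
        return history
    system, rest = history[0], history[1:]
    old, recent = rest[:-keep_recent], rest[-keep_recent:]
    summary = {"role": "system",
               "content": f"Summary of earlier conversation: {summarize(old)}"}
    return [system, summary, *recent]
```

<p>The real trade-off lives in <code>summarize</code>: a cheap heuristic loses detail, while an LLM-generated summary costs an extra model call on every compaction.</p>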
<h2 id="heading-turning-point-2-not-everyone-uses-openai">Turning Point #2: Not Everyone Uses OpenAI</h2>
<p>Two days later (December 16, 2025), reality knocked again: not everyone uses OpenAI.</p>
<pre><code class="lang-bash">a9455d5: feat: add support <span class="hljs-keyword">for</span> ACP
fb6b08c: add anthropic and open_response LLM format support
</code></pre>
<p>This added <strong>2,709 lines of code</strong>:</p>
<ul>
<li><p><code>acp_agent.py</code>: 752 lines (Agent Client Protocol support)</p>
</li>
<li><p>Anthropic Claude integration</p>
</li>
<li><p>OpenAI Responses API format support</p>
</li>
</ul>
<p>What started as a simple <code>openai.chat.completions.create()</code> call was now a multi-headed abstraction layer. Different APIs, different formats, different quirks—all needing adaptation.</p>
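<p>The shape of that adaptation layer can be sketched in a few lines. The field placement below follows the public OpenAI and Anthropic APIs (system prompt inside the message list vs. a top-level parameter, plus Anthropic's required <code>max_tokens</code>), but the payloads are heavily simplified:</p>

```python
# Why multi-LLM support needs an adapter layer: the same logical request
# must be reshaped per provider. Payloads here are heavily simplified.
def to_openai(system: str, messages: list[dict]) -> dict:
    # OpenAI's chat API puts the system prompt inside the messages list.
    return {"messages": [{"role": "system", "content": system}, *messages]}

def to_anthropic(system: str, messages: list[dict]) -> dict:
    # Anthropic's Messages API takes the system prompt as a top-level
    # parameter and requires max_tokens on every request.
    return {"system": system, "messages": messages, "max_tokens": 4096}
```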
<h2 id="heading-turning-point-3-one-agent-isnt-enough">Turning Point #3: One Agent Isn't Enough</h2>
<p>On Christmas Day 2025, I realized that complex tasks need multiple agents working together:</p>
<pre><code class="lang-bash">e61bba1: add multiple agents
5e58fc3: add TaskTool and EventBus
</code></pre>
<p>This added <strong>2,246 lines of code</strong>:</p>
<ul>
<li><p><code>multi_agent.py</code>: 375 lines</p>
</li>
<li><p><code>message_queue.py</code>: 531 lines</p>
</li>
<li><p><code>event_bus.py</code> (eventually grew to 635 lines)</p>
</li>
<li><p><code>task.py</code> (now 859 lines)</p>
</li>
</ul>
<p>The jump from single-agent to multi-agent was a qualitative shift. My Lego bricks were becoming a castle.</p>
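<p>The core of the EventBus idea fits in a dozen lines; the 635 lines come from everything around it (async delivery, filtering, error isolation). A minimal synchronous sketch, with illustrative names:</p>

```python
# Minimal publish/subscribe sketch of the EventBus idea.
from collections import defaultdict
from typing import Any, Callable

class EventBus:
    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> None:
        # Deliver to every handler registered for this topic.
        for handler in self._subscribers[topic]:
            handler(payload)
```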
<h2 id="heading-turning-point-4-users-need-extensibility">Turning Point #4: Users Need Extensibility</h2>
<p>Between December 28, 2025 and January 2, 2026, I added two extensibility systems:</p>
<pre><code class="lang-bash">d746a8c: feat: add Hooks system <span class="hljs-keyword">for</span> extensible agent behavior (v0.5.0)
17eeeb3: feat: add skills and commands system <span class="hljs-keyword">for</span> agent extensibility
</code></pre>
<p>The Hooks system (+886 lines) lets users inject custom logic before and after tool execution. The Skills and Commands system (+836 lines) enables dynamic capability loading. I initially thought these were "nice to have." But when I started actually using AMCP for real work:</p>
<ul>
<li><p>Dangerous operations needed validation → PreToolUse hooks</p>
</li>
<li><p>Everyone on the team had their own workflows → Commands</p>
</li>
<li><p>Different projects needed different expertise → Skills</p>
</li>
</ul>
<p>Features I thought were optional turned out to be essential.</p>
<h2 id="heading-turning-point-5-production-needs-a-server">Turning Point #5: Production Needs a Server</h2>
<p>January 7-9, 2026 brought the biggest refactor:</p>
<pre><code class="lang-bash">7f0ac35: feat: init C/S architecture
52c93c2: feat(server): complete Phase 2 - streaming &amp; events
af7ce69: feat: complete Phase 3 - CLI Client SDK
f923591: feat: protocol and sessions
</code></pre>
<p>This added <strong>3,266+ lines of code</strong>:</p>
<ul>
<li><p>HTTP/WebSocket server</p>
</li>
<li><p>Session management</p>
</li>
<li><p>Event broadcasting system</p>
</li>
<li><p>Multi-client support</p>
</li>
<li><p>Protocol adaptation layer</p>
</li>
</ul>
<p>Why? Because:</p>
<ul>
<li><p>IDE integration requires persistent connections</p>
</li>
<li><p>Multiple clients need shared sessions</p>
</li>
<li><p>Real-time streaming requires WebSocket</p>
</li>
<li><p>Enterprise deployment requires a service architecture</p>
</li>
</ul>
<h2 id="heading-the-numbers-tell-the-story">The Numbers Tell the Story</h2>
<p>Here's a before-and-after comparison:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<th>Metric</th><th>v0.1.0 (Initial)</th><th>v0.8.0 (Current)</th></tr>
</thead>
<tbody>
<tr>
<td>Lines of Code</td><td>2,482</td><td><strong>20,176</strong> (8x growth)</td></tr>
<tr>
<td>Python Files</td><td>8</td><td><strong>53</strong> (6.6x growth)</td></tr>
<tr>
<td>Dependencies</td><td>7</td><td><strong>14+</strong> (2x growth)</td></tr>
<tr>
<td>Directory Depth</td><td>1 level</td><td><strong>4 levels</strong> (added server/, client/, protocol/, prompts/)</td></tr>
<tr>
<td>Development Time</td><td>—</td><td>40 days</td></tr>
<tr>
<td>Commits</td><td>1</td><td><strong>58</strong></td></tr>
</tbody>
</table>
</div><h2 id="heading-growth-timeline">Growth Timeline</h2>
<pre><code class="lang-bash">2025-11-30  ████                           Initial version (2.5K lines)
2025-12-14  █████                          + Context Compaction
2025-12-16  ████████                       + ACP + Multi-LLM Support
2025-12-25  ████████████                   + Multi-Agent + EventBus
2025-12-28  ██████████████                 + Hooks System
2026-01-02  ████████████████               + Skills &amp; Commands
2026-01-07  ████████████████████           + C/S Architecture
2026-01-09  █████████████████████████      Current version (20K+ lines)
</code></pre>
<h2 id="heading-why-simple-doesnt-last">Why "Simple" Doesn't Last</h2>
<p>Looking back at 40 days of development, I've identified four reasons why simplicity was impossible to maintain:</p>
<h3 id="heading-1-reality-is-more-complex-than-your-imagination">1. Reality Is More Complex Than Your Imagination</h3>
<p>I initially thought <code>read_file + grep + bash</code> would be enough. Reality disagreed:</p>
<ul>
<li><p>Large files need chunked reading → smart readfile modes</p>
</li>
<li><p>Large-scale edits are error-prone → apply_patch tool</p>
</li>
<li><p>Complex refactoring needs planning → todo tool</p>
</li>
<li><p>Dangerous operations need confirmation → permission system</p>
</li>
</ul>
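<p>To make the first item concrete: "chunked reading" can be as simple as returning a window of lines plus a header telling the model where it is in the file. A minimal sketch with illustrative parameter names:</p>

```python
# Illustrative chunked-read sketch for large files.
def read_file_chunk(path: str, offset: int = 0, limit: int = 200) -> str:
    """Return at most `limit` lines starting at line `offset` (0-based),
    prefixed with a header locating the chunk within the file."""
    with open(path, encoding="utf-8", errors="replace") as f:
        lines = f.readlines()
    chunk = lines[offset:offset + limit]
    header = f"[lines {offset}-{offset + len(chunk) - 1} of {len(lines)}]\n"
    return header + "".join(chunk)
```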
<h3 id="heading-2-user-needs-are-incremental">2. User Needs Are Incremental</h3>
<p>At first, I was the only user. Then others started using it:</p>
<ul>
<li><p>"Can you support Claude?" → Multi-LLM support</p>
</li>
<li><p>"Can I use this in Zed?" → ACP protocol</p>
</li>
<li><p>"Can multiple agents collaborate?" → Multi-Agent architecture</p>
</li>
<li><p>"Can I customize workflows?" → Skills &amp; Commands</p>
</li>
<li><p>"Can I deploy this as a service?" → C/S architecture Every request was reasonable. Every feature became necessary.</p>
</li>
</ul>
<h3 id="heading-3-production-requires-robustness">3. Production Requires Robustness</h3>
<p>Toys can be simple. Production systems must:</p>
<ul>
<li><p>Handle edge cases</p>
</li>
<li><p>Manage resource lifecycles</p>
</li>
<li><p>Support concurrent access</p>
</li>
<li><p>Provide monitoring and debugging</p>
</li>
<li><p>Guarantee type safety</p>
</li>
</ul>
<p>These "non-functional requirements" often require more code than the features themselves.</p>
<h3 id="heading-4-composability-requires-infrastructure">4. Composability Requires Infrastructure</h3>
<p>Here's the irony: to achieve true "Lego-style" composability, you need:</p>
<ul>
<li><p>A unified tool interface → <code>BaseTool</code> abstraction</p>
</li>
<li><p>A message passing mechanism → <code>EventBus</code></p>
</li>
<li><p>Lifecycle hooks → <code>Hooks</code> system</p>
</li>
<li><p>Dynamic loading → <code>Skills</code> system</p>
</li>
<li><p>Configuration management → Complex config layer</p>
</li>
</ul>
<p><strong>The infrastructure for composability is itself a source of complexity.</strong></p>
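<p>The unified tool interface is what makes the rest composable: the agent loop only sees a name, a spec for the LLM, and an execute method, whether the tool is built-in or MCP-provided. A minimal sketch (class names are illustrative, not AMCP's actual classes):</p>

```python
# Illustrative unified tool interface sketch.
from abc import ABC, abstractmethod

class BaseTool(ABC):
    name: str
    description: str

    @abstractmethod
    def execute(self, **kwargs) -> str: ...

    def spec(self) -> dict:
        # Minimal function-calling spec the LLM sees.
        return {"name": self.name, "description": self.description}

class GrepTool(BaseTool):
    name = "grep"
    description = "Search lines matching a pattern."

    def execute(self, pattern: str, text: str) -> str:
        return "\n".join(l for l in text.splitlines() if pattern in l)
```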
<h2 id="heading-what-i-learned">What I Learned</h2>
<h3 id="heading-1-lego-style-is-a-philosophy-not-a-destination">1. "Lego-Style" Is a Philosophy, Not a Destination</h3>
<p>Lego bricks look simple. But Lego the company has thousands of different parts, strict quality standards, and a sophisticated design system behind those "simple" blocks. AMCP is still "Lego-style"—its design still prioritizes composability. But achieving composability requires significant complexity.</p>
<h3 id="heading-2-complexity-is-necessary-but-must-be-managed">2. Complexity Is Necessary, But Must Be Managed</h3>
<p>The codebase grew 8x, but if you look closely, complexity is distributed:</p>
<ul>
<li><p>Core agent logic remains relatively simple</p>
</li>
<li><p>Complexity is encapsulated in individual modules</p>
</li>
<li><p>Modules communicate through clean interfaces</p>
</li>
</ul>
<p><strong>Complexity is inevitable, but it can be isolated.</strong></p>
<h3 id="heading-3-incremental-evolution-is-the-right-path">3. Incremental Evolution Is the Right Path</h3>
<p>If I had tried to design a 20,000-line system from day one, I would have:</p>
<ul>
<li><p>Built features nobody actually needed</p>
</li>
<li><p>Optimized the wrong things too early</p>
</li>
<li><p>Lost the ability to iterate quickly</p>
</li>
</ul>
<p>By starting simple and solving only "the most painful problem right now," AMCP evolved into something actually useful.</p>
<h2 id="heading-conclusion-did-i-really-fail">Conclusion: Did I Really "Fail"?</h2>
<p>So, did I actually fail? If the goal was to stay at 2,500 lines of code—yes, I failed spectacularly. But if the goal was to build a coding agent that <strong>actually works in production</strong>—then I succeeded. The price of success was accepting that complexity is unavoidable. <strong>"Lego-style" isn't about simplicity. It's about the right abstractions, clear boundaries, and composable design.</strong> AMCP still has those.</p>
<p><em>Written on January 10, 2026. AMCP v0.8.0 | 58 commits | 20,176 lines of code</em></p>
]]></content:encoded></item><item><title><![CDATA[From Deprecated npm Classic Tokens to OIDC Trusted Publishing: A CI/CD Troubleshooting Journey]]></title><description><![CDATA[In January 2026, I encountered a series of cryptic authentication errors while publishing an npm package. This post documents the complete journey from problem discovery to final resolution—hopefully saving others from the same headaches.

Background...]]></description><link>https://blog.moelove.info/from-deprecated-npm-classic-tokens-to-oidc-trusted-publishing-a-cicd-troubleshooting-journey</link><guid isPermaLink="true">https://blog.moelove.info/from-deprecated-npm-classic-tokens-to-oidc-trusted-publishing-a-cicd-troubleshooting-journey</guid><category><![CDATA[npm]]></category><category><![CDATA[npm publish]]></category><category><![CDATA[GitHub]]></category><category><![CDATA[github-actions]]></category><dc:creator><![CDATA[Jintao Zhang]]></dc:creator><pubDate>Sun, 04 Jan 2026 01:06:30 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1767489154215/3f14334b-608e-4ac0-8070-0b1e6caa1fa9.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>In January 2026, I encountered a series of cryptic authentication errors while publishing an npm package. This post documents the complete journey from problem discovery to final resolution—hopefully saving others from the same headaches.</p>
</blockquote>
<h2 id="heading-background">Background</h2>
<p>I maintain an npm package called <a target="_blank" href="https://www.npmjs.com/package/amp-acp">amp-acp</a>, an adapter that bridges Amp Code to the Agent Client Protocol (ACP). The project uses GitHub Actions for automated releases: pushing a <code>v*</code> tag triggers automatic publishing to npm and creates a GitHub Release.</p>
<h2 id="heading-the-problem">The Problem</h2>
<p>Starting with v0.3.1, every publish attempt failed. The GitHub Actions logs showed:</p>
<pre><code class="lang-bash">npm error code ENEEDAUTH
npm error need auth This <span class="hljs-built_in">command</span> requires you to be logged <span class="hljs-keyword">in</span> to https://registry.npmjs.org/
npm error need auth You need to authorize this machine using `npm adduser`
</code></pre>
<p>Even more confusing was this warning:</p>
<pre><code class="lang-bash">npm notice Security Notice: Classic tokens have been revoked. 
Granular tokens are now limited to 90 days and require 2FA by default. 
Update your CI/CD workflows to avoid disruption. 
Learn more https://gh.io/all-npm-classic-tokens-revoked
</code></pre>
<h2 id="heading-root-cause-analysis">Root Cause Analysis</h2>
<h3 id="heading-the-end-of-npm-classic-tokens">The End of npm Classic Tokens</h3>
<p>After investigation, I discovered that <strong>npm permanently deprecated all Classic Tokens on December 9, 2025</strong>. According to the <a target="_blank" href="https://github.blog/changelog/2025-12-09-npm-classic-tokens-revoked-session-based-auth-and-cli-token-management-now-available/">GitHub official announcement</a>:</p>
<ul>
<li><p>All existing npm classic tokens have been permanently revoked</p>
</li>
<li><p>Classic tokens can no longer be created or restored</p>
</li>
<li><p>New Granular tokens have a maximum validity of 90 days and require 2FA by default</p>
</li>
</ul>
<p>This means <strong>the traditional approach of storing</strong> <code>NPM_TOKEN</code> in GitHub Secrets is no longer viable (at least not as convenient as before).</p>
<h3 id="heading-the-new-authentication-method-oidc-trusted-publishing">The New Authentication Method: OIDC Trusted Publishing</h3>
<p>npm's recommended solution is <strong>OIDC Trusted Publishing</strong>. This OpenID Connect-based authentication mechanism offers several advantages:</p>
<ol>
<li><p><strong>No token management</strong> – No need to create, store, or rotate tokens</p>
</li>
<li><p><strong>Enhanced security</strong> – Uses short-lived, cryptographically signed, workflow-specific credentials</p>
</li>
<li><p><strong>Automatic provenance</strong> – Automatically generates provenance statements, providing build-origin transparency</p>
</li>
<li><p><strong>Industry standard</strong> – Aligns with PyPI, RubyGems, <a target="_blank" href="http://crates.io">crates.io</a>, and other major package registries</p>
</li>
</ol>
<h2 id="heading-troubleshooting-log">Troubleshooting Log</h2>
<h3 id="heading-attempt-1-upgrading-npm-version">Attempt 1: Upgrading npm Version</h3>
<p>Initially, I assumed the issue was an outdated npm version, so I added this to the workflow:</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Update</span> <span class="hljs-string">npm</span> <span class="hljs-string">to</span> <span class="hljs-string">latest</span>
  <span class="hljs-attr">run:</span> <span class="hljs-string">npm</span> <span class="hljs-string">install</span> <span class="hljs-string">-g</span> <span class="hljs-string">npm@latest</span>
</code></pre>
<p><strong>Result: Failed</strong> ❌</p>
<h3 id="heading-attempt-2-removing-registry-url">Attempt 2: Removing registry-url</h3>
<p>Someone suggested removing the <code>registry-url</code> parameter from <code>actions/setup-node</code>:</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/setup-node@v4</span>
  <span class="hljs-attr">with:</span>
    <span class="hljs-attr">node-version:</span> <span class="hljs-string">'22'</span>
    <span class="hljs-comment"># Removed registry-url</span>
</code></pre>
<p><strong>Result: Failed</strong> ❌</p>
<h3 id="heading-attempt-3-setting-nodeauthtoken-to-empty-string">Attempt 3: Setting NODE_AUTH_TOKEN to Empty String</h3>
<p>Based on some outdated resources, I tried setting <code>NODE_AUTH_TOKEN</code> to an empty string:</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Publish</span> <span class="hljs-string">to</span> <span class="hljs-string">npm</span>
  <span class="hljs-attr">run:</span> <span class="hljs-string">npm</span> <span class="hljs-string">publish</span> <span class="hljs-string">--access</span> <span class="hljs-string">public</span>
  <span class="hljs-attr">env:</span>
    <span class="hljs-attr">NODE_AUTH_TOKEN:</span> <span class="hljs-string">''</span>
</code></pre>
<p><strong>Result: Failed</strong> ❌</p>
<p>Here's the critical misconception: setting an empty <code>NODE_AUTH_TOKEN</code> actually <strong>prevents</strong> OIDC from working, because npm attempts to use the empty token instead of OIDC.</p>
<h3 id="heading-attempt-4-completely-removing-nodeauthtoken">Attempt 4: Completely Removing NODE_AUTH_TOKEN</h3>
<p>I finally realized that for OIDC Trusted Publishing, <code>NODE_AUTH_TOKEN</code> should not be set at all:</p>
<pre><code class="lang-yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Publish</span> <span class="hljs-string">to</span> <span class="hljs-string">npm</span>
  <span class="hljs-attr">run:</span> <span class="hljs-string">npm</span> <span class="hljs-string">publish</span> <span class="hljs-string">--access</span> <span class="hljs-string">public</span>
  <span class="hljs-comment"># <span class="hljs-doctag">Note:</span> no env section</span>
</code></pre>
<p><strong>Result: Partial success</strong> ⚠️</p>
<p>This time OIDC authentication started working (logs showed <code>Signed provenance statement</code>), but a new error appeared:</p>
<pre><code class="lang-bash">npm error 422 Unprocessable Entity - PUT https://registry.npmjs.org/amp-acp - 
Error verifying sigstore provenance bundle: Failed to validate repository information: 
package.json: <span class="hljs-string">"repository.url"</span> is <span class="hljs-string">""</span>, expected to match 
<span class="hljs-string">"https://github.com/tao12345666333/amp-acp"</span> from provenance
</code></pre>
<h3 id="heading-attempt-5-final-success-adding-the-repository-field">Attempt 5 (Final Success): Adding the repository Field</h3>
<p>It turns out npm's Provenance validation requires <code>package.json</code> to include a <code>repository</code> field matching the GitHub repository:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"name"</span>: <span class="hljs-string">"amp-acp"</span>,
  <span class="hljs-attr">"version"</span>: <span class="hljs-string">"0.3.7"</span>,
  <span class="hljs-attr">"repository"</span>: {
    <span class="hljs-attr">"type"</span>: <span class="hljs-string">"git"</span>,
    <span class="hljs-attr">"url"</span>: <span class="hljs-string">"https://github.com/tao12345666333/amp-acp"</span>
  }
}
</code></pre>
<p><strong>Result: Success!</strong> ✅</p>
<h2 id="heading-the-correct-configuration">The Correct Configuration</h2>
<h3 id="heading-1-configure-trusted-publisher-on-npmjscomhttpnpmjscom">1. Configure Trusted Publisher on <a target="_blank" href="http://npmjs.com">npmjs.com</a></h3>
<p>First, configure Trusted Publisher on the npm website:</p>
<ol>
<li><p>Navigate to <a target="_blank" href="https://www.npmjs.com/package/YOUR_PACKAGE/settings"><code>https://www.npmjs.com/package/YOUR_PACKAGE/settings</code></a></p>
</li>
<li><p>Find the "Trusted Publisher" section</p>
</li>
<li><p>Select "GitHub Actions"</p>
</li>
<li><p>Fill in the following:</p>
<ul>
<li><p><strong>Organization/User</strong>: Your GitHub username or organization name</p>
</li>
<li><p><strong>Repository</strong>: Your repository name</p>
</li>
<li><p><strong>Workflow filename</strong>: The workflow file name (e.g., <code>release.yml</code>)</p>
</li>
<li><p><strong>Environment</strong>: (Optional) If using GitHub Environments</p>
</li>
</ul>
</li>
</ol>
<h3 id="heading-2-github-actions-workflow-configuration">2. GitHub Actions Workflow Configuration</h3>
<pre><code class="lang-yaml"><span class="hljs-attr">name:</span> <span class="hljs-string">Release</span>

<span class="hljs-attr">on:</span>
  <span class="hljs-attr">push:</span>
    <span class="hljs-attr">tags:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">'v*'</span>

<span class="hljs-attr">permissions:</span>
  <span class="hljs-attr">id-token:</span> <span class="hljs-string">write</span>   <span class="hljs-comment"># Required for OIDC authentication</span>
  <span class="hljs-attr">contents:</span> <span class="hljs-string">write</span>   <span class="hljs-comment"># Required for creating GitHub Release</span>

<span class="hljs-attr">jobs:</span>
  <span class="hljs-attr">release:</span>
    <span class="hljs-attr">runs-on:</span> <span class="hljs-string">ubuntu-latest</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Checkout</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/checkout@v4</span>

      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Setup</span> <span class="hljs-string">Node.js</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/setup-node@v4</span>
        <span class="hljs-attr">with:</span>
          <span class="hljs-attr">node-version:</span> <span class="hljs-string">'22'</span>
          <span class="hljs-attr">registry-url:</span> <span class="hljs-string">'https://registry.npmjs.org'</span>

      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Update</span> <span class="hljs-string">npm</span> <span class="hljs-string">to</span> <span class="hljs-string">latest</span>
        <span class="hljs-attr">run:</span> <span class="hljs-string">npm</span> <span class="hljs-string">install</span> <span class="hljs-string">-g</span> <span class="hljs-string">npm@latest</span>

      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Install</span> <span class="hljs-string">dependencies</span>
        <span class="hljs-attr">run:</span> <span class="hljs-string">npm</span> <span class="hljs-string">ci</span>

      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Publish</span> <span class="hljs-string">to</span> <span class="hljs-string">npm</span>
        <span class="hljs-attr">run:</span> <span class="hljs-string">npm</span> <span class="hljs-string">publish</span> <span class="hljs-string">--access</span> <span class="hljs-string">public</span>
        <span class="hljs-comment"># <span class="hljs-doctag">Note:</span> Do NOT set NODE_AUTH_TOKEN!</span>

      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Create</span> <span class="hljs-string">GitHub</span> <span class="hljs-string">Release</span>
        <span class="hljs-attr">uses:</span> <span class="hljs-string">softprops/action-gh-release@v2</span>
        <span class="hljs-attr">with:</span>
          <span class="hljs-attr">generate_release_notes:</span> <span class="hljs-literal">true</span>
</code></pre>
<h3 id="heading-3-required-packagejson-fields">3. Required package.json Fields</h3>
<pre><code class="lang-json">{
  <span class="hljs-attr">"name"</span>: <span class="hljs-string">"your-package-name"</span>,
  <span class="hljs-attr">"version"</span>: <span class="hljs-string">"x.y.z"</span>,
  <span class="hljs-attr">"repository"</span>: {
    <span class="hljs-attr">"type"</span>: <span class="hljs-string">"git"</span>,
    <span class="hljs-attr">"url"</span>: <span class="hljs-string">"https://github.com/YOUR_USERNAME/YOUR_REPO"</span>
  }
}
</code></pre>
<h2 id="heading-key-takeaways">Key Takeaways</h2>
<ol>
<li><p><strong>npm Classic Tokens are dead</strong> – As of December 9, 2025, all classic tokens are permanently invalidated</p>
</li>
<li><p><strong>OIDC Trusted Publishing is the new standard</strong> – No token management, enhanced security, built-in provenance</p>
</li>
<li><p><strong>Do not set NODE_AUTH_TOKEN</strong> – For OIDC, this environment variable should not be set at all</p>
</li>
<li><p><strong>Configure Trusted Publisher on</strong> <a target="_blank" href="http://npmjs.com"><strong>npmjs.com</strong></a> – This step is often overlooked</p>
</li>
<li><p><strong>package.json must include the repository field</strong> – Required for provenance validation</p>
</li>
<li><p><strong>Ensure id-token: write permission</strong> – Otherwise, OIDC token generation will fail</p>
</li>
<li><p><strong>npm CLI version requirement</strong> – Requires npm 11.5.1 or later</p>
</li>
</ol>
<h2 id="heading-faq">FAQ</h2>
<h3 id="heading-q-can-i-use-oidc-to-publish-the-first-version-of-a-new-package">Q: Can I use OIDC to publish the first version of a new package?</h3>
<p>A: No. The first version must be published manually or using a traditional token. Trusted Publisher can only be configured afterward.</p>
<h3 id="heading-q-can-i-use-oidc-with-self-hosted-runners">Q: Can I use OIDC with self-hosted runners?</h3>
<p>A: Currently, only GitHub/GitLab-hosted runners are supported. Self-hosted runners are not yet supported.</p>
<h3 id="heading-q-why-doesnt-setting-nodeauthtoken-to-an-empty-string-work">Q: Why doesn't setting NODE_AUTH_TOKEN to an empty string work?</h3>
<p>A: An empty string is still a value—npm will attempt to use it rather than falling back to OIDC. Only when this variable is completely unset will npm automatically use OIDC.</p>
<h3 id="heading-q-what-should-i-do-if-provenance-validation-fails">Q: What should I do if provenance validation fails?</h3>
<p>A: Verify that <code>repository.url</code> in <code>package.json</code> exactly matches the GitHub repository URL (including case sensitivity).</p>
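<p>Because this comparison is exact, it can be worth checking in CI before <code>npm publish</code> runs. A small preflight sketch: the expected repository would normally come from CI (for example GitHub Actions' <code>GITHUB_REPOSITORY</code> variable), but is shown here as a parameter; the normalization of <code>git+</code>/<code>.git</code> forms is an assumption about common <code>repository.url</code> spellings:</p>

```python
# Preflight sketch: does package.json's repository.url match the repo
# that the provenance statement will claim?
import json

def repository_matches(package_json: str, expected_repo: str) -> bool:
    """expected_repo is 'owner/name', e.g. 'tao12345666333/amp-acp'."""
    repo = json.loads(package_json).get("repository") or {}
    url = repo.get("url", "") if isinstance(repo, dict) else repo
    # Normalize common spellings: git+https prefix and .git suffix.
    url = url.removeprefix("git+").removesuffix(".git")
    return url == f"https://github.com/{expected_repo}"
```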
<h2 id="heading-references">References</h2>
<ul>
<li><p><a target="_blank" href="https://docs.npmjs.com/trusted-publishers">npm Trusted Publishing Documentation</a></p>
</li>
<li><p><a target="_blank" href="https://github.blog/changelog/2025-12-09-npm-classic-tokens-revoked-session-based-auth-and-cli-token-management-now-available/">GitHub Changelog: npm classic tokens revoked</a></p>
</li>
<li><p><a target="_blank" href="https://docs.npmjs.com/generating-provenance-statements">npm Provenance Introduction</a></p>
</li>
</ul>
<hr />
<p><em>Written on January 4, 2026, based on the publishing experience of amp-acp project from v0.3.1 to v0.3.7.</em></p>
]]></content:encoded></item><item><title><![CDATA[The Ultimate Guide to Kubernetes Networking and Multi-Tenant Gateways]]></title><description><![CDATA[Introducing of Kubernetes’s Network
Kubernetes, as we know, is a powerful platform for automating the deployment, scaling, and operations of application containers across clusters of hosts. However, for these containers, or rather 'Pods' as they are ...]]></description><link>https://blog.moelove.info/the-ultimate-guide-to-kubernetes-networking-and-multi-tenant-gateways</link><guid isPermaLink="true">https://blog.moelove.info/the-ultimate-guide-to-kubernetes-networking-and-multi-tenant-gateways</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[kong]]></category><category><![CDATA[gateway]]></category><category><![CDATA[Open Source]]></category><category><![CDATA[APIs]]></category><dc:creator><![CDATA[Jintao Zhang]]></dc:creator><pubDate>Fri, 19 Jul 2024 18:23:39 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1721644764112/9c634349-d42b-42ee-b140-4ee7438aea37.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introducing-of-kubernetess-network">Introducing Kubernetes Networking</h2>
<p>Kubernetes, as we know, is a powerful platform for automating the deployment, scaling, and operations of application containers across clusters of hosts. However, for these containers, or rather 'Pods' as they are known within the Kubernetes realm, to effectively communicate and serve their purpose, a robust networking model is essential. Let's delve into the components that constitute networking in Kubernetes.</p>
<p>Firstly, we have "Communication between Pods on the same Node." This is the most basic level of networking within Kubernetes. Pods running on the same physical or virtual machine need to communicate with each other. Kubernetes uses the networking namespace and other isolation mechanisms provided by the underlying operating system to ensure that these Pods can connect efficiently and securely.</p>
<p>Moving a step further, we encounter the scenario of "Communication between Pods on different Nodes." As our applications scale, they span across multiple Nodes, creating a need for cross-node communication. Kubernetes abstracts the complexity involved in this process, ensuring Pods can communicate seamlessly, irrespective of their physical location within the cluster.</p>
<p>Next, let's talk about "Service." A Service in Kubernetes is an abstraction that defines a logical set of Pods and a policy by which to access them. This abstraction enables the decoupling of backend Pod implementations from the frontend services that access them. Services allow for stable, reliable communication pathways between different components of an application.</p>
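<p>As a minimal sketch (the name <code>my-app</code> and the ports here are hypothetical, not from any specific deployment), a Service that selects Pods labeled <code>app: my-app</code> and exposes them on port 80 might look like this:</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app      # forwards traffic to Pods carrying this label
  ports:
  - port: 80         # port exposed by the Service
    targetPort: 8080 # port the Pods actually listen on
</code></pre>
<p>Clients inside the cluster can then reach the backing Pods through the Service's stable virtual IP, regardless of how the individual Pods come and go.</p>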
<p>"Ingress and Egress" mechanisms in Kubernetes facilitate the management of incoming and outgoing traffic to your cluster. Ingress controls how external traffic reaches your services within the cluster, allowing you to define accessible URLs, load balance traffic, terminate SSL, and offer name-based virtual hosting. Egress, on the other hand, controls the outbound traffic from your Pods to the external world, ensuring that only authorized connections are made, thus enhancing the security of your cluster. Although Kubernetes does not have native Egress resources, some other components provide this implementation, such as Calico.</p>
<p>Lastly, we have "NetworkPolicy." This is a specification of how groups of Pods are allowed to communicate with each other and other network endpoints. NetworkPolicies are crucial for enforcing security rules and ensuring that only the intended traffic can flow between Pods, Services, and the external world.</p>
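<p>For illustration, a NetworkPolicy that only allows Pods labeled <code>app: frontend</code> to reach Pods labeled <code>app: backend</code> on TCP port 8080 could be written as follows (all labels and ports are hypothetical):</p>
<pre><code class="lang-yaml">apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  podSelector:
    matchLabels:
      app: backend   # the Pods this policy protects
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
</code></pre>
<p>Note that such a policy only takes effect if the cluster's CNI plugin actually implements NetworkPolicy enforcement.</p>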
<p>These are the basic concepts that will be encountered when discussing Kubernetes networking. However, these concepts require some specific components to implement their functions.</p>
<h2 id="heading-kubernetes-networking-components">Kubernetes Networking Components</h2>
<p>The following are some of the main components.</p>
<p><strong>CNI Implementations:</strong></p>
<p>Let's start with the foundation of Kubernetes networking - the Container Network Interface, or CNI. The CNI is a specification and a set of tools that facilitate the configuration of network interfaces for containers. Kubernetes relies on CNI implementations to provide in-pod networking capabilities. These implementations, such as Calico, Flannel, and Cilium, to name a few, allow for a pluggable framework that supports a wide range of networking features including network segmentation, policy enforcement, and overlay networks. Choosing the right CNI implementation is critical as it influences the network performance, security, and scalability of your Kubernetes cluster.</p>
<p>For example, a few years ago Flannel, though widely used, did not provide network policy support. In clusters using that CNI plugin, NetworkPolicy resources could still be created, but they simply had no effect.</p>
<p>However, in addition to the native NetworkPolicy of Kubernetes, Calico and Cilium have also implemented their own enhanced versions of network policy.</p>
<p><strong>kube-proxy:</strong></p>
<p>Moving on, we encounter kube-proxy, a vital component that runs on each node in the Kubernetes cluster. kube-proxy is responsible for maintaining network rules on nodes. These rules allow network communication to your Pods from network sessions inside or outside of your cluster. Essentially, kube-proxy enables the forwarding of TCP and UDP packets to pods based on the IP and port number of the incoming request. Through services, kube-proxy abstracts the IP addresses of individual pods, ensuring that the internal network structure is hidden from the external clients.</p>
<p><strong>DNS:</strong></p>
<p>In the realm of Kubernetes, DNS plays a pivotal role in service discovery. It allows Pods to locate each other and external services through human-readable names. When a Kubernetes cluster is created, a DNS service is automatically deployed, assigning DNS names to other Kubernetes services. This simplifies communication within the cluster, as Pods can reach each other and external resources using these DNS names, enhancing the overall application architecture's flexibility and robustness.</p>
<p><strong>Controller:</strong></p>
<p>Lastly, let's delve into the controller, particularly focusing on IPAM, and the Endpoint/EndpointSlice/Service mechanisms.</p>
<ul>
<li><p><strong>IPAM (IP Address Management):</strong> This functionality is crucial for managing the assignment of IP addresses to Pods and Services. Efficient IPAM practices ensure that there is no conflict between the IPs assigned within the cluster, maintaining a smooth network operation.</p>
</li>
<li><p><strong>Endpoint/EndpointSlice/Service:</strong> These components work together to manage how external and internal communications reach the Pods. A Service defines a logical set of Pods and a policy by which to access them. The Endpoints and EndpointSlices keep track of the IP addresses of the Pods that are associated with each Service. This system ensures that network traffic is efficiently routed to the appropriate Pods, allowing for scalable and reliable application deployments.</p>
</li>
</ul>
<p>Although these components may appear complex, they all follow the same basic model.</p>
<h2 id="heading-kubernetes-networking-model">Kubernetes Networking Model</h2>
<p>The kubernetes networking model is ingeniously designed to ensure seamless communication within the cluster, which is pivotal for deploying scalable and resilient applications. Let's explore the core principles that define this networking model.</p>
<p><strong>Each Pod has its own IP address:</strong></p>
<p>The first principle to understand is that in Kubernetes, each Pod is assigned its own unique IP address. This is a significant departure from traditional deployments where containers or applications within a host might share the host's IP address. This unique IP per Pod simplifies application configuration and inter-service communication. It means that each Pod can be treated as if it's a physical host on the network, making network policies and communications more straightforward to manage.</p>
<p><strong>All Pods can communicate across Nodes without the need for NAT</strong></p>
<p><strong>Components on the Node (e.g., kubelet) can communicate with all Pods on the Node:</strong></p>
<p>These two rules build on the communication scenarios introduced earlier; together they define how Pods are reached within the cluster.</p>
<h2 id="heading-how-to-access-applications-deployed-in-kubernetes">How to Access Applications Deployed in Kubernetes</h2>
<p>Deploying applications is only part of the journey. An equally crucial aspect is how these applications, once deployed, are made accessible to the outside world.</p>
<p><strong>NodePort:</strong></p>
<p>Let's start with NodePort, a simple yet powerful way to access your applications. When you configure a service as NodePort, Kubernetes exposes that service on every Node’s IP at a static port (the NodePort). The service then becomes reachable at any Node's IP combined with the high port number assigned specifically for that service. This means that anyone who can reach a Node, either within the internal network or from the outside world, can access the service by hitting that Node's IP at the designated port.</p>
<p>This method is particularly useful for development environments or smaller setups where direct access to each node is manageable. However, it's worth noting that managing access through NodePort can become challenging in environments with a large number of Nodes, as it requires keeping track of multiple IP addresses and ports.</p>
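<p>A hypothetical NodePort Service might look like this (the <code>nodePort</code> value is optional; if omitted, Kubernetes picks one from the default 30000-32767 range):</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
  - port: 80         # in-cluster port of the Service
    targetPort: 8080 # port on the Pods
    nodePort: 30080  # static port opened on every Node
</code></pre>
<p>The service is then reachable at <code>http://&lt;any-node-ip&gt;:30080</code>.</p>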
<p><strong>LoadBalancer:</strong></p>
<p>Moving on to a more sophisticated and widely used approach – LoadBalancer. This method integrates Kubernetes services with existing cloud providers' load balancers. When you create a service of type LoadBalancer, Kubernetes provisions a cloud load balancer for your service, and directs external traffic to it. This external load balancer then automatically routes the traffic to your Kubernetes Pods, regardless of which Node they're running on.</p>
<p>The beauty of using LoadBalancer is its simplicity and scalability. It abstracts away the complexity of dealing with individual Node IPs and ports, providing a single entry point – the load balancer's IP – to access your service. This makes it an ideal choice for production environments where reliability and scalability are paramount.</p>
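<p>A hypothetical LoadBalancer Service is equally concise; only the <code>type</code> field distinguishes it from other Service types:</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer # the cloud provider provisions an external load balancer
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
</code></pre>
<p>Once provisioned, the external IP appears in the Service's <code>status.loadBalancer</code> field and becomes the single entry point for clients.</p>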
<p>Whether you opt for NodePort for its simplicity and direct access, or LoadBalancer for its robustness and scalability, Kubernetes offers flexible solutions to suit different application access needs.</p>
<p>But you may be curious, in most cases we talk about accessing applications inside the cluster through Ingress from outside the cluster. What exactly is Ingress?</p>
<h2 id="heading-what-is-ingress">What is Ingress</h2>
<p><strong>Ingress: A Specification</strong></p>
<p>At its core, Ingress is not merely a tool or a component; it is a powerful API object that provides a sophisticated method for handling external access to the services in a Kubernetes cluster. It allows you to define flexible, HTTP-based routing rules that direct traffic to different services based on the request's details. Let's delve into the key elements that make up an Ingress specification:</p>
<p><strong>Host:</strong></p>
<p>The 'host' field in an Ingress specification is what allows us to define domain names that will be routed to specific services within our cluster. This means that you can have different domain names or subdomains pointing to different parts of your application, all managed through a single Ingress resource. This is particularly useful for applications that require multiple entry points or for hosting multiple services under the same cluster.</p>
<p><strong>Paths:</strong></p>
<p>Next, we have 'paths.' This element works in conjunction with 'host' to provide even finer control over the routing of incoming requests. By specifying paths, you can direct traffic not just based on the domain name, but also based on the URL path. For instance, requests to <code>example.com/api</code> could be routed to one service, while requests to <code>example.com/blog</code> could be directed to another. This allows for a highly customizable and efficient distribution of traffic to the various components of your applications.</p>
<p><strong>Certificates:</strong></p>
<p>Lastly, an essential aspect of modern web applications is security, particularly the use of SSL/TLS certificates to encrypt traffic. Ingress facilitates this by allowing you to specify certificates for each host, ensuring that all communication is securely encrypted. This integration of SSL/TLS certificates with Ingress means that you can manage the security of your services at the same point where you're managing their accessibility, simplifying the overall configuration and maintenance of your applications.</p>
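<p>Putting the three elements together, a sketch of an Ingress implementing the <code>example.com/api</code> and <code>example.com/blog</code> routing described above might look like this (the Service names and the TLS Secret are hypothetical):</p>
<pre><code class="lang-yaml">apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example
spec:
  tls:
  - hosts:
    - example.com
    secretName: example-com-tls # Secret holding the TLS certificate and key
  rules:
  - host: example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /blog
        pathType: Prefix
        backend:
          service:
            name: blog-service
            port:
              number: 80
</code></pre>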
<p><strong>However, simply creating an Ingress resource is not enough for it to work.</strong></p>
<h2 id="heading-how-does-ingress-work">How does Ingress work?</h2>
<p><strong>The Ingress Controller: Translating Rules into Action</strong></p>
<p>At the heart of the Ingress mechanism lies the Ingress controller. This isn't just any controller; it's the maestro of network traffic, orchestrating how requests from outside the Kubernetes cluster are handled and directed. But how does it achieve this?</p>
<p>The Ingress controller continuously monitors the Kubernetes API for Ingress resources - the rules and paths that you define for routing external traffic. Upon detecting an Ingress resource, the controller springs into action, translating these high-level Ingress rules into specific, actionable dataplane rules. This translation process is where the abstract becomes concrete, turning our defined paths and domains into precise instructions on how traffic should flow.</p>
<p><strong>The Dataplane: Carrying Traffic to its Destination</strong></p>
<p>With the rules translated, we now turn to the dataplane - the physical and logical network paths that actual traffic flows through. The dataplane is where the rubber meets the road, or more aptly, where the packet meets the pod.</p>
<p>In the context of Ingress, the dataplane is responsible for carrying the external traffic, now governed by the rules set forth by the Ingress controller. This can involve a variety of operations, such as SSL termination, load balancing, and content-based routing, all performed with the goal of ensuring that incoming requests are delivered to the correct service within the cluster.</p>
<p>Usually, we create a service for the dataplane and then expose it to external access by using NodePort or LoadBalancer as mentioned earlier.</p>
<h2 id="heading-limitations-of-ingress">Limitations of Ingress</h2>
<p><strong>Limited Expressiveness</strong></p>
<p>One of the core limitations of Ingress lies in its limited expressiveness. Ingress rules are designed to be straightforward, focusing primarily on HTTP and HTTPS routing. This simplicity, while beneficial for ease of use and understanding, means that Ingress cannot natively handle more complex routing scenarios. For example, Ingress does not inherently support advanced load balancing algorithms, TCP or UDP traffic routing, or canary deployments out of the box.</p>
<p>This limitation can pose challenges for applications requiring more sophisticated traffic management and routing capabilities.</p>
<p><strong>Reliance on Annotations for Controller Implementation Extensions</strong></p>
<p>Another significant limitation is the way Ingress extends its functionality - through annotations. Each Ingress controller, such as Ingress-NGINX, Kong Ingress controller, or HAProxy, may implement additional features that are not part of the standard Ingress specification. These features are often enabled or configured via annotations in the Ingress resource.</p>
<p>While annotations provide a flexible way to extend Ingress capabilities, they also introduce variability and complexity. Different Ingress controllers might support different sets of annotations, leading to a lack of standardization across implementations. This can result in portability issues when moving applications between clusters using different Ingress controllers. Furthermore, relying heavily on annotations can make Ingress resources cluttered and harder to manage, especially as the number of annotations grows to accommodate more complex configurations.</p>
<h2 id="heading-advantages-of-gateway-api">Advantages of Gateway API</h2>
<p>Considering these limitations of Ingress, the Kubernetes community decided to make some changes and thus began the design of the Gateway API.</p>
<p>Gateway API offers several key advantages that make it an attractive choice for managing network traffic in Kubernetes:</p>
<p><strong>Role-oriented:</strong></p>
<p>Gateway API is designed with distinct roles for different types of users. This allows for a more fine-grained and role-specific way of controlling access to resources, improving security, and making the management of permissions more straightforward. For example, cluster operators can control infrastructure-related aspects of gateways and routes, while developers can manage application-level configurations.</p>
<p><strong>More Universal:</strong></p>
<p>Unlike Ingress, which primarily focuses on HTTP/HTTPS traffic, Gateway API is protocol-agnostic and can handle a broader range of traffic types, including TCP and UDP. This makes it a more universal solution for managing all types of network traffic within a Kubernetes cluster.</p>
<p><strong>Vendor-neutral:</strong></p>
<p>Gateway API is designed to be vendor-neutral, meaning it doesn't favor any specific networking solution. This ensures that it can be used consistently across different environments and networking solutions, enhancing portability and reducing vendor lock-in. This neutrality also fosters a more collaborative ecosystem, as improvements and extensions can benefit all users, regardless of their specific networking implementation.</p>
<h2 id="heading-role-oriented">Role-oriented</h2>
<p>In the context of the Kubernetes Gateway API, being "role-oriented" means that the API is designed with distinct roles for different types of users. This approach allows for a more fine-grained control over resources, improving security, and making the management of permissions more straightforward.</p>
<p>For example, the roles could be divided between cluster operators and application developers. Cluster operators control infrastructure-related aspects of gateways and routes, such as selecting the type of load balancer used, setting up SSL/TLS certificates, and managing access logs. On the other hand, application developers manage application-level configurations, such as specifying the routes for their applications, setting up path-based routing, and defining request and response headers.</p>
<p>This role-oriented design allows each user to focus on their area of expertise, without needing to worry about the details that are outside of their purview. It also ensures that only authorized users can make changes to specific aspects of the configuration, enhancing the overall security of the system.</p>
<h2 id="heading-why-the-managed-gateway-is-needed">Why the Managed Gateway is Needed</h2>
<p>Let's move on to the next section and discuss why a Managed Gateway is needed. As you can guess from its name, it simplifies our work.</p>
<h2 id="heading-why-multi-tenant-gateways-are-needed">Why Multi-Tenant Gateways Are Needed</h2>
<p>The Managed Gateway is essential for several reasons:</p>
<ul>
<li><p><strong>Simplified Management:</strong> Managed Gateways handle the underlying infrastructure, freeing up development teams to focus on application logic. This simplifies the management of the gateway, as the complexities of configuration, maintenance, and upgrades are handled by the provider.</p>
</li>
<li><p><strong>Scalability:</strong> Managed Gateways offer automatic scaling capabilities. They can scale up to handle high traffic loads and scale back down during quieter periods, ensuring optimal resource usage.</p>
</li>
<li><p><strong>Reliability:</strong> Managed Gateways provide high availability and fault tolerance. They are designed to maintain service continuity, even in the face of network disruptions or hardware failures.</p>
</li>
<li><p><strong>Security:</strong> Managed Gateways often come with built-in security features such as SSL encryption, and identity-based access control. This ensures that your applications are secure and compliant with industry standards.</p>
</li>
<li><p><strong>Isolation:</strong> Tenant traffic is isolated and protected, reducing the blast radius when issues occur.</p>
</li>
</ul>
<h2 id="heading-general-solution">General solution</h2>
<p>To handle this scenario, platform teams typically deploy multiple Ingress controllers in the same cluster. Here are two common solutions:</p>
<ul>
<li><p><strong>Differentiation through IngressClass:</strong> IngressClass is a Kubernetes resource that allows users to specify the type of Ingress controller they want to use in their applications. By defining different IngressClasses, users can associate their Ingress resources with specific Ingress controllers. This is beneficial as it allows for a clear separation and management of different application traffic within the cluster, each handled by its own Ingress controller.</p>
</li>
<li><p><strong>Differentiation through namespaces:</strong> Another method to differentiate between multiple Ingress controllers is by using namespaces. Kubernetes namespaces provide a scope for names and allow users to divide cluster resources between multiple users or teams. By deploying different Ingress controllers in different namespaces, users can ensure that each controller only manages the traffic for the applications within its own namespace. This approach enhances the security and organization of the cluster, as it provides a clear separation of concerns between different applications and their respective Ingress controllers.</p>
</li>
</ul>
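<p>As a sketch of the IngressClass approach (the class name and hostnames are illustrative; <code>k8s.io/ingress-nginx</code> is the controller name used by the Ingress-NGINX project), a team could bind its Ingress resources to a dedicated controller like this:</p>
<pre><code class="lang-yaml">apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: team-a
spec:
  controller: k8s.io/ingress-nginx # only this controller reconciles the class
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: team-a-app
spec:
  ingressClassName: team-a # ignored by controllers watching other classes
  rules:
  - host: a.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: team-a-svc
            port:
              number: 80
</code></pre>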
<h2 id="heading-pain-point">Pain point</h2>
<p>While deploying multiple Ingress controllers in the same cluster may seem like a viable solution, it introduces several pain points:</p>
<ul>
<li><p><strong>Increased Complexity:</strong> Managing multiple Ingress controllers increases the complexity of your Kubernetes cluster. Each controller has its own configuration and behavior, leading to potential inconsistencies and conflicts. This can make it hard to predict how the controllers will interact and complicate debugging efforts when issues arise.</p>
</li>
<li><p><strong>Resource Wastage:</strong> Each Ingress controller requires its own resources to run. Deploying multiple controllers may lead to resource wastage, especially if some controllers are not fully utilized. This inefficiency can lead to increased costs and reduced performance for your cluster.</p>
</li>
<li><p><strong>Increased Maintenance:</strong> With multiple Ingress controllers, you need to maintain each one separately. This includes monitoring, updating, troubleshooting, and securing each controller. This increased maintenance can take significant time and effort.</p>
</li>
<li><p><strong>Lack of Centralized Control:</strong> Having multiple Ingress controllers can make it difficult to achieve centralized control and visibility over your network traffic. Without a single point of control, it can be challenging to manage traffic routing consistently and apply uniform security policies across all controllers.</p>
</li>
<li><p><strong>Potential Security Risks:</strong> Each Ingress controller has its own security features and configurations. If not properly managed, having multiple controllers can introduce security risks, as each controller could potentially be a separate point of vulnerability in your cluster.</p>
</li>
</ul>
<h2 id="heading-kong-gateway-operator">Kong Gateway Operator</h2>
<p>Today, let me share an innovative open-source project created by Kong Inc., known as the <a target="_blank" href="https://konghq.com/products/kong-gateway-operator"><strong>Kong Gateway Operator</strong></a>. This project is not just another tool in the tech space; it's a game-changer designed to address some of the most pressing challenges we face in managing API gateways.</p>
<p><img src="https://prd-mktg-konghq-com.imgix.net/images/2024/03/65f9d38b-graphic-hero-kong-gateway-operator.png?auto=format&amp;fit=max&amp;w=2560" alt /></p>
<p>In our journey towards digital transformation, we often encounter bottlenecks that hinder our progress. The Kong Gateway Operator is here to eliminate those bottlenecks by offering:</p>
<ol>
<li><p><strong>Full Lifecycle Management</strong>: Managing the lifecycle of Kong Gateway has never been easier. From deployment to retirement, the Kong Gateway Operator ensures a smooth journey. Moreover, it supports a blue/green strategy for upgrading Kong Gateway, making transitions seamless and minimizing downtime.</p>
</li>
<li><p><strong>Elastic Scaling</strong>: Cost-efficiency is paramount in today's tech landscape. The Kong Gateway Operator leverages Horizontal Pod Autoscaling (HPA) and latency measurements to dynamically scale your resources. This not only ensures optimal performance but also significantly reduces operational costs.</p>
</li>
<li><p><strong>Automatic Certificate Rotation</strong>: Security is a top priority, and managing certificates can be a cumbersome task. Thanks to the integration with cert-manager, the Kong Gateway Operator automates certificate rotation, ensuring your applications are always secure without the manual hassle.</p>
</li>
<li><p><strong>AI Gateway Support</strong>: In an era where AI and LLM (Large Language Models) applications are becoming increasingly prevalent, the Kong Gateway Operator sets the stage by prioritizing AI gateway support. This feature is designed to streamline the development and deployment of AI-driven applications, making it easier for developers to integrate cutting-edge technologies into their solutions.</p>
</li>
</ol>
<h2 id="heading-how-to-deploy-a-managed-gateway-using-kgo-in-civo">How to Deploy a Managed Gateway using KGO in Civo</h2>
<h3 id="heading-create-kubernetes-cluster">Create Kubernetes cluster</h3>
<p>Clusters on Civo run <a target="_blank" href="https://k3s.io/">k3s</a>, a lightweight distribution that has passed Kubernetes conformance certification.</p>
<p>To create a cluster on Civo, you can use the Civo dashboard or <a target="_blank" href="https://www.civo.com/docs/overview/civo-cli">install Civo's CLI</a>. Creating clusters through the CLI is very convenient; here I have already installed and configured it.</p>
<pre><code class="lang-bash">➜  ~ civo k8s create --merge  --save  --wait
Merged with main kubernetes config: /home/tao/.kube/config

Access your cluster with:
kubectl config use-context red-nightingale-6cf01085
kubectl get node
The cluster red-nightingale-6cf01085 (9752f456-f316-4bdf-95fe-82c9b08db61b) has been created in 1 min 9 sec
</code></pre>
<p><strong>Deploying clusters on Civo is very fast, which is one of the reasons why I like Civo the most.</strong></p>
<h3 id="heading-install-kgo">Install KGO</h3>
<p>KGO can be installed through Helm, and the CRDs of Kubernetes Gateway API are already included in KGO's Helm chart, so there is no need to perform a separate installation step.</p>
<ul>
<li>Add kong Helm repo.</li>
</ul>
<pre><code class="lang-bash">(⎈|red-nightingale-6cf01085:default)➜  ~ helm repo add kong https://charts.konghq.com
<span class="hljs-string">"kong"</span> has been added to your repositories
(⎈|red-nightingale-6cf01085:default)➜  ~ helm repo update
Hang tight <span class="hljs-keyword">while</span> we grab the latest from your chart repositories...
...Successfully got an update from the <span class="hljs-string">"kong"</span> chart repository
Update Complete. ⎈Happy Helming!⎈
</code></pre>
<ul>
<li>Install KGO using Helm.</li>
</ul>
<pre><code class="lang-bash">(⎈|red-nightingale-6cf01085:default)➜  ~ helm upgrade --install kgo kong/gateway-operator -n kong-system --create-namespace --<span class="hljs-built_in">set</span> image.tag=1.3

Release <span class="hljs-string">"kgo"</span> does not exist. Installing it now.
NAME: kgo
LAST DEPLOYED: Sat Jul 20 01:40:28 2024
NAMESPACE: kong-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
kgo-gateway-operator has been installed. Check its status by running:

  kubectl --namespace kong-system  get pods

For more details, please refer to the following documents:

* https://docs.konghq.com/gateway-operator/latest/get-started/kic/create-gateway/
* https://docs.konghq.com/gateway-operator/latest/get-started/konnect/deploy-data-plane/
</code></pre>
<h3 id="heading-create-gateway">Create Gateway</h3>
<p>To use the Gateway API resources to configure your routes, you need to create a GatewayClass instance and create a Gateway resource that listens on the ports that you need.</p>
<pre><code class="lang-bash">(⎈|red-nightingale-6cf01085:default)➜  ~ <span class="hljs-built_in">echo</span> <span class="hljs-string">'
kind: GatewayConfiguration
apiVersion: gateway-operator.konghq.com/v1beta1
metadata:
 name: kong
 namespace: default
spec:
 dataPlaneOptions:
   deployment:
     podTemplateSpec:
       spec:
         containers:
         - name: proxy
           image: kong:3.7.1
           readinessProbe:
             initialDelaySeconds: 1
             periodSeconds: 1
 controlPlaneOptions:
   deployment:
     podTemplateSpec:
       spec:
         containers:
         - name: controller
           image: kong/kubernetes-ingress-controller:3.2.0
           env:
           - name: CONTROLLER_LOG_LEVEL
             value: debug
---
kind: GatewayClass
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
 name: kong
spec:
 controllerName: konghq.com/gateway-operator
 parametersRef:
   group: gateway-operator.konghq.com
   kind: GatewayConfiguration
   name: kong
   namespace: default
---
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
 name: kong
 namespace: default
spec:
 gatewayClassName: kong
 listeners:
 - name: http
   protocol: HTTP
   port: 80

'</span> | kubectl apply -f -

gatewayconfiguration.gateway-operator.konghq.com/kong created
gatewayclass.gateway.networking.k8s.io/kong created
gateway.gateway.networking.k8s.io/kong created
</code></pre>
<p>You can verify whether the Gateway has been successfully created by using the following command.</p>
<pre><code class="lang-bash">(⎈|red-nightingale-6cf01085:default)➜  ~ kubectl get Gateway -A
NAMESPACE   NAME   CLASS   ADDRESS         PROGRAMMED   AGE
default     kong   kong    74.220.29.143   True         3m35s

(⎈|red-nightingale-6cf01085:default)➜  ~ <span class="hljs-built_in">export</span> PROXY_IP=$(kubectl get gateway kong -n default -o jsonpath=<span class="hljs-string">'{.status.addresses[0].value}'</span>)
(⎈|red-nightingale-6cf01085:default)➜  ~ curl <span class="hljs-variable">$PROXY_IP</span>                                                                                   
{
  <span class="hljs-string">"message"</span>:<span class="hljs-string">"no Route matched with those values"</span>,
  <span class="hljs-string">"request_id"</span>:<span class="hljs-string">"382d5a15ac243b3183b5b239a5e3ae77"</span>
}
</code></pre>
<p>At the same time, you can also see that KGO has created deployments for CP and DP in the default namespace.</p>
<pre><code class="lang-bash">(⎈|red-nightingale-6cf01085:default)➜  ~ kubectl get deploy          
NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
dataplane-kong-tkbct-snkrk      1/1     1            1           9m5s
controlplane-kong-lv62b-tf6xl   1/1     1            1           9m5s
</code></pre>
<h3 id="heading-create-another-gateway">Create another Gateway</h3>
<p>To verify KGO's multi-tenant capability, let's create a new namespace and a Gateway resource within it.</p>
<pre><code class="lang-bash">(⎈|red-nightingale-6cf01085:default)➜  ~ kubectl create ns moe        
namespace/moe created

(⎈|red-nightingale-6cf01085:default)➜  ~ <span class="hljs-built_in">echo</span> <span class="hljs-string">'
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
 name: moe
 namespace: moe
spec:
 gatewayClassName: kong
 listeners:
 - name: http
   protocol: HTTP
   port: 80

'</span> | kubectl apply -f -
</code></pre>
<p>We can check whether the Gateway has been deployed correctly by viewing the status of the Gateway resources, or verify it by requesting the Gateway's IP as in the previous section.</p>
<pre><code class="lang-bash">(⎈|red-nightingale-6cf01085:default)➜  ~ kubectl get Gateway -A
NAMESPACE   NAME   CLASS   ADDRESS         PROGRAMMED   AGE
default     kong   kong    74.220.29.143   True         14m
moe         moe    kong    74.220.28.102   True         8m6s
</code></pre>
<p>Of course, a more intuitive way would be to check the status of kocp and kodp.</p>
<pre><code class="lang-bash">(⎈|red-nightingale-6cf01085:default)➜  ~ kubectl get kodp,kocp -A
NAMESPACE   NAME                                               READY
default     dataplane.gateway-operator.konghq.com/kong-tkbct   True
moe         dataplane.gateway-operator.konghq.com/moe-bfwnl    True

NAMESPACE   NAME                                                  READY   PROVISIONED
default     controlplane.gateway-operator.konghq.com/kong-lv62b   True    True
moe         controlplane.gateway-operator.konghq.com/moe-9dgjc    True    True
</code></pre>
<p>After the Gateway is ready, you can refer to the <a target="_blank" href="https://docs.konghq.com/gateway-operator/1.3.x/get-started/kic/create-route/">KGO official documentation</a> to use Kubernetes Gateway API to create routes.</p>
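<p>As a hedged sketch (the route name, backend Service name, and port below are hypothetical, not taken from the KGO docs), an HTTPRoute attached to the <code>moe</code> Gateway created above might look like this:</p>
<pre><code class="lang-bash">cat &lt;&lt;EOF | kubectl apply -f -
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: echo
  namespace: moe
spec:
  parentRefs:
  - name: moe            # the Gateway created above
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /echo
    backendRefs:
    - name: echo         # hypothetical backend Service
      port: 1027
EOF
</code></pre>
<p>Once a route like this is applied, requests to the Gateway's IP under <code>/echo</code> should reach the backend Service instead of returning the "no Route matched" message.</p>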
<h2 id="heading-summary">Summary</h2>
<p>In this article, I introduced why we need to use a multi-tenant Gateway in Kubernetes clusters, and also demonstrated how to deploy the Gateway through KGO in Civo's Kubernetes cluster.</p>
<p>KGO provides a wide range of capabilities. Please feel free to explore them! <a target="_blank" href="https://konghq.com/products/kong-gateway-operator">https://konghq.com/products/kong-gateway-operator</a></p>
]]></content:encoded></item><item><title><![CDATA[How to reduce the cost of GitHub Actions]]></title><description><![CDATA[I'll cover how to reduce the cost of GitHub Actions, and give some advice.

According to G2's statistical report, GitHub Actions is the easiest-to-use CI/CD tool, and more and more people like it.

Since GitHub Actions is GitHub's native CI/CD tool, ...]]></description><link>https://blog.moelove.info/how-to-reduce-the-cost-of-github-actions</link><guid isPermaLink="true">https://blog.moelove.info/how-to-reduce-the-cost-of-github-actions</guid><category><![CDATA[GitHub]]></category><category><![CDATA[GitHub Actions]]></category><category><![CDATA[Open Source]]></category><category><![CDATA[WeMakeDevs]]></category><category><![CDATA[finops]]></category><dc:creator><![CDATA[Jintao Zhang]]></dc:creator><pubDate>Thu, 26 Jan 2023 20:05:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1674763354973/d3fc5240-5e60-478b-9533-4ef6f1c2dc87.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>I'll cover how to reduce the cost of GitHub Actions, and give some advice.</p>
</blockquote>
<p>According to G2's <a target="_blank" href="https://www.g2.com/categories/continuous-integration?tab=easiest_to_use">statistical report</a>, GitHub Actions is the easiest-to-use CI/CD tool, and more and more people like it.</p>
<p><img src="https://s2.loli.net/2023/01/24/zG5gTWm1lHxBMK2.png" alt="2023-01-24 08-12-54屏幕截图.png" /></p>
<p>Since GitHub Actions is GitHub's native CI/CD tool, tens of thousands of Actions can be used directly in the marketplace, and it is free for public repositories. More and more projects are switching their CI tools to GitHub Actions.</p>
<p>I also really like GitHub Actions and use it for almost all my GitHub-hosted repositories.</p>
<p>But recently a project I was working on hit the GitHub Actions quota limit, which made me spend some time looking into its cost.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://twitter.com/zhangjintao9020/status/1616077513125691399?s=20&amp;t=we9o9FxhgSEOXzY39czbJg">https://twitter.com/zhangjintao9020/status/1616077513125691399?s=20&amp;t=we9o9FxhgSEOXzY39czbJg</a></div>
<p> </p>
<h2 id="heading-why-is-the-quota-exhausted">Why is the quota exhausted?</h2>
<p>Recently I found an interesting project: <a target="_blank" href="https://github.com/upptime/upptime">upptime/upptime: ⬆️ Free uptime monitor and status page powered by GitHub</a></p>
<p>I wanted to use it to monitor some of the services I have developed and build a status page. Since this involves some API configuration that I don't want to make public, I forked the project into a private repository. After a simple configuration, it worked fine.</p>
<p>Since I wanted more data, I tweaked the CI schedule configuration to make these tasks run more frequently.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">workflowSchedule:</span>
  <span class="hljs-attr">graphs:</span> <span class="hljs-string">"0 * * * *"</span>
  <span class="hljs-attr">responseTime:</span> <span class="hljs-string">"0 * * * *"</span>
  <span class="hljs-attr">staticSite:</span> <span class="hljs-string">"0 * * * *"</span>
  <span class="hljs-attr">summary:</span> <span class="hljs-string">"0 * * * *"</span>
  <span class="hljs-attr">updateTemplate:</span> <span class="hljs-string">"0 * * * *"</span>
  <span class="hljs-attr">updates:</span> <span class="hljs-string">"0 * * * *"</span>
  <span class="hljs-attr">uptime:</span> <span class="hljs-string">"*/5 * * * *"</span>
</code></pre>
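<p>As a quick sanity check on the schedule above (standard cron field order: minute, hour, day of month, month, day of week), the <code>*/5 * * * *</code> entry fires every 5 minutes:</p>
<pre><code class="lang-bash"># 60/5 = 12 runs per hour for the uptime workflow
echo $(( 60 / 5 * 24 ))   # 288 runs per day
</code></pre>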
<p>According to the <a target="_blank" href="https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions">billing documentation for GitHub Actions</a>, GitHub Actions for public repositories is Free, but there is a quota limit for private repositories.</p>
<blockquote>
<p>GitHub Actions usage is free for standard GitHub-hosted runners in public repositories, and for Self-hosted runners. For private repositories, each GitHub account receives a certain amount of free minutes and storage for use with GitHub-hosted runners, depending on the product used with the account. Any usage beyond the included amounts is controlled by spending limits.</p>
</blockquote>
<p>Soon I received a quota reminder email from GitHub, reminding me that the quota was about to be used up.</p>
<p>This got me thinking about how to solve it.</p>
<h2 id="heading-cost-of-using-github-actions">Cost of using GitHub Actions</h2>
<p>Making the repository public is the most straightforward way, but I explained above why it cannot be made public, so I had to find another solution.</p>
<p>Paying for GitHub Actions is also a very straightforward solution.</p>
<p>Before deciding to pay for it, I want to estimate the cost. GitHub provides a <a target="_blank" href="https://github.com/pricing/calculator">Pricing Calculator</a>, which can easily estimate costs.</p>
<p>Since I modified the CI's scheduling configuration, the most frequent task now runs every 5 minutes.</p>
<p>I used <a target="_blank" href="https://meercode.io/">Meercode</a> to collect the running data of GitHub Actions in this repository. It provides some dashboards by default:</p>
<p><img src="https://s2.loli.net/2023/01/25/m5S3ZAtdpNjQz9y.png" alt="2023-01-25 11-29-12 screenshot.png" /></p>
<p>It also allows users to create custom dashboards, so I created my own. If you are interested in Meercode, please let me know in the comments.</p>
<p><img src="https://s2.loli.net/2023/01/25/5bUsJnwmurC1fxl.png" alt="ci-dashboard.png" /></p>
<p>As can be seen from the figure above, each task takes no more than 0.5 minutes, and there are no more than 12 tasks per hour. Using the price calculator, the approximate <strong>cost is $35 per month</strong>.</p>
<p><img src="https://s2.loli.net/2023/01/25/DGQ3ie2zW6MTjC1.png" alt="2023-01-25 11-25-13 screenshot.png" /></p>
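<p>A back-of-envelope check of that estimate (the numbers below are my own rough math, not from the calculator, assuming the standard $0.008/min rate for a Linux GitHub-hosted runner and ignoring the included free minutes):</p>
<pre><code class="lang-bash"># 12 jobs/hour x 0.5 min/job = 6 billable minutes per hour,
# over roughly 730 hours in a month
minutes=$(( 12 * 730 / 2 ))
echo "$minutes"    # 4380 minutes/month
awk -v m="$minutes" 'BEGIN { printf "~$%.2f/month\n", m * 0.008 }'
</code></pre>
<p>That lands right around the calculator's $35/month figure.</p>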
<h2 id="heading-ways-to-save-costs">Ways to save costs</h2>
<p>Since my repository mainly runs uptime CI jobs, which consume few resources but run frequently, I wondered whether I could save costs with a self-hosted runner.</p>
<p>I compared the prices of 3 lower-priced cloud service providers:</p>
<ul>
<li><p><a target="_blank" href="https://www.civo.com/">Civo</a></p>
</li>
<li><p><a target="_blank" href="https://www.digitalocean.com/pricing/droplets">DigitalOcean</a></p>
</li>
<li><p><a target="_blank" href="https://www.vultr.com/pricing/">Vultr</a></p>
</li>
</ul>
<p>Among them, both Civo and Vultr provide 1C1G instances at $5/month, and DigitalOcean instances with the same specifications are priced at $6/month.</p>
<p>I finally chose <a target="_blank" href="https://www.civo.com/">Civo</a>, which is a <em>cloud-native service provider</em>, and there is an introduction on its homepage:</p>
<blockquote>
<p>Transparent pricing from just $5 a month</p>
</blockquote>
<p>Civo provides a variety of services, such as Kubernetes (based on k3s), or compute instances.</p>
<p>The <em>Extra Small</em> instance type is 1C1G with 1TB of traffic, and if you choose the Kubernetes service, you do not need to pay for the control plane (same as Azure AKS). Even the larger specs look cheap.</p>
<p><img src="https://s2.loli.net/2023/01/25/jzmTs6X7PR9QVMx.png" alt="2023-01-25 17-04-35 screenshot.png" /></p>
<p>I have tried both its Kubernetes service and its compute instances, and they both work fine.</p>
<p><img src="https://s2.loli.net/2023/01/25/2ur4YJ6AlvKgm8O.png" alt="2023-01-25 17-08-43 screenshot.png" /></p>
<h3 id="heading-using-compute-instances">Using compute instances</h3>
<p>Deploying the GitHub Actions runner on a Linux compute instance is simple: just add a new runner to the project at <code>https://github.com/&lt;Your name&gt;/&lt;Project name&gt;/settings/actions/runners/new</code>.</p>
<p>That page lists the complete deployment steps; just follow them.</p>
<p>My installation process is as follows:</p>
<pre><code class="lang-bash">civo@polished-bush-99d8-1926a1:~$ mkdir actions-runner &amp;&amp; <span class="hljs-built_in">cd</span> actions-runner
civo@polished-bush-99d8-1926a1:~/actions-runner$ curl -o actions-runner-linux-x64-2.301.1.tar.gz -L https://github.com/actions/runner/releases/download/v2.301.1/actions-runner-linux-x64-2.301.1.tar.gz
civo@polished-bush-99d8-1926a1:~/actions-runner$ <span class="hljs-built_in">echo</span> <span class="hljs-string">"3ee9c3b83de642f919912e0594ee2601835518827da785d034c1163f8efdf907  actions-runner-linux-x64-2.301.1.tar.gz"</span> | shasum -a 256 -c
actions-runner-linux-x64-2.301.1.tar.gz: OK                                                                     
civo@polished-bush-99d8-1926a1:~/actions-runner$ tar xzf ./actions-runner-linux-x64-2.301.1.tar.gz              
civo@polished-bush-99d8-1926a1:~/actions-runner$ ./config.sh --url https://github.com/MoeLove/monitoring --token <span class="hljs-variable">$TOKEN</span>
</code></pre>
<p>After the execution is complete, some files will be added to the current directory. Execute <code>./run.sh</code> to start the GitHub Actions runner.</p>
<pre><code class="lang-bash">civo@polished-bush-99d8-1926a1:~/actions-runner$ ls
_diag  _work  actions-runner-linux-x64-2.301.1.tar.gz  bin  config.sh  env.sh  externals  run-helper.cmd.template  run-helper.sh  run-helper.sh.template  run.sh  safe_sleep.sh  svc.sh
</code></pre>
<p>If you want the runner to run stably in the background, you can execute <code>./svc.sh install</code> to install it as a systemd service and manage its lifecycle through systemd.</p>
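<p>Based on the helper script bundled with the runner (exact subcommands may vary between runner versions), the sequence is roughly:</p>
<pre><code class="lang-bash">sudo ./svc.sh install    # creates a systemd unit for the runner
sudo ./svc.sh start
sudo ./svc.sh status     # verify the service is running
</code></pre>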
<h3 id="heading-using-kubernetes">Using Kubernetes</h3>
<p>Civo does not charge for the Kubernetes control plane, only for worker nodes. The advantage of using Kubernetes is that runners can automatically scale up and down in the cluster, and I can easily create multiple runners for different projects.</p>
<p>Since GitHub does not officially provide a way to deploy a self-hosted runner on Kubernetes, I used the <a target="_blank" href="https://github.com/actions/actions-runner-controller">Actions Runner Controller (ARC)</a> project, which allows rapid deployment of self-hosted runners through <code>Runner</code> custom resources.</p>
<p>The deployment process is clearly described in the <a target="_blank" href="https://github.com/actions/actions-runner-controller/blob/master/docs/quickstart.md">documentation</a>. The following is my deployment process.</p>
<pre><code class="lang-sh"><span class="hljs-comment"># deploy cert-manager</span>
(MoeLove) ➜ kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.11.0/cert-manager.yaml

<span class="hljs-comment"># deploy ARC</span>
(MoeLove) ➜ helm repo add actions-runner-controller https://actions-runner-controller.github.io/actions-runner-controller
(MoeLove) ➜ helm upgrade --install --namespace actions-runner-system --create-namespace \
  --<span class="hljs-built_in">set</span>=authSecret.create=<span class="hljs-literal">true</span> \
  --<span class="hljs-built_in">set</span>=authSecret.github_token=<span class="hljs-string">"REPLACE_YOUR_TOKEN_HERE"</span> \
  --<span class="hljs-built_in">wait</span> actions-runner-controller actions-runner-controller/actions-runner-controller

<span class="hljs-comment"># create runner</span>
(MoeLove) ➜ cat &lt;&lt;EOF | kubectl apply -f -
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: moelove-runner
spec:
  replicas: 1
  template:
    spec:
      repository: MoeLove/monitoring
EOF
</code></pre>
<p>After installation, the following results are achieved:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://twitter.com/zhangjintao9020/status/1616251840429002755?s=20&amp;t=SM0rFgSfq8b03CD11dhdNw">https://twitter.com/zhangjintao9020/status/1616251840429002755?s=20&amp;t=SM0rFgSfq8b03CD11dhdNw</a></div>
<p> </p>
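<p>ARC can also scale the runner count automatically. As a hedged sketch (the thresholds here are illustrative; check the ARC documentation for your version), a <code>HorizontalRunnerAutoscaler</code> targeting the <code>RunnerDeployment</code> above might look like this:</p>
<pre><code class="lang-bash">cat &lt;&lt;EOF | kubectl apply -f -
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: moelove-runner-autoscaler
spec:
  scaleTargetRef:
    name: moelove-runner   # the RunnerDeployment created above
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: PercentageRunnersBusy
    scaleUpThreshold: "0.75"
    scaleDownThreshold: "0.25"
EOF
</code></pre>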
<h2 id="heading-self-hosted-vs-github-managed">Self-hosted vs GitHub-managed</h2>
<p>Above, I introduced how I used Meercode to measure key CI metrics and estimate the cost of GitHub Actions. Given my scenario of low resource consumption but high run frequency, I chose a self-hosted runner.</p>
<p>So when is it more appropriate to choose a GitHub-managed runner? What are the benefits of GitHub-managed?</p>
<p>The GitHub-managed runner has the following advantages:</p>
<ul>
<li><p><strong>Support for multiple operating systems</strong>: In addition to providing Linux systems, GitHub-managed runner also supports macOS and Windows, but most cloud providers do not provide macOS environments. (I used to put some Mac minis as servers in the data center for specific scenarios)</p>
</li>
<li><p><strong>VM-level isolation</strong>: According to the GitHub Actions documentation, a GitHub-hosted runner creates a fresh VM for each job, which provides strong security and isolation guarantees. With a self-hosted runner, jobs run through the binary share the host environment, and jobs run through ARC are isolated only at the Pod level, which can raise certain security issues.</p>
</li>
<li><p><strong>Low Maintenance Costs</strong>: In any large system, maintenance is expensive. If only personal projects or a few repositories use a self-hosted runner, the maintenance cost is relatively controllable; once it grows, it introduces a lot of complexity. The GitHub-managed runner is maintained by GitHub.</p>
</li>
</ul>
<p>There are also two products that offer self-hosted runner services:</p>
<ul>
<li><p><a target="_blank" href="https://actuated.dev?umt_source=blog.moelove.info">Actuated</a></p>
</li>
<li><p><a target="_blank" href="https://cirun.io?umt_source=blog.moelove.info">cirun</a></p>
</li>
</ul>
<p>They reduce the cost of runner maintenance and management and provide more secure isolation and support for Arm-based environments. cirun also provides GPU runner support.</p>
<p>If you have the above requirements, you may also wish to consider these services.</p>
<h2 id="heading-summarize">Summary</h2>
<p>In general, reducing the cost of GitHub Actions takes the following steps.</p>
<ul>
<li><p>Visualization/Observability: Estimate costs using actual data.</p>
</li>
<li><p>Compare multiple vendors/solutions: Different vendors offer different pricing for different scenarios or products, and you can choose according to your actual situation.</p>
</li>
<li><p>Security and maintenance costs also need to be considered.</p>
</li>
</ul>
<p>If you are interested in my articles, please subscribe to my Newsletter!</p>
]]></content:encoded></item><item><title><![CDATA[My Rust journey and how to learn Rust]]></title><description><![CDATA[I'll share my Rust journey, how I learned Rust and some free Rust learning resources.

Rust has become more and more popular. Through the StackOverflow 2022 Developer Survey, we can see that many people are interested in Rust.

Rust is on its seventh...]]></description><link>https://blog.moelove.info/my-rust-journey-and-how-to-learn-rust</link><guid isPermaLink="true">https://blog.moelove.info/my-rust-journey-and-how-to-learn-rust</guid><category><![CDATA[Rust]]></category><category><![CDATA[#prometheus]]></category><category><![CDATA[cloud native]]></category><category><![CDATA[WeMakeDevs]]></category><category><![CDATA[WebAssembly]]></category><dc:creator><![CDATA[Jintao Zhang]]></dc:creator><pubDate>Tue, 17 Jan 2023 11:27:40 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1673954421780/58ec55f9-907a-4ff4-b9c7-7b65713680cf.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>I'll share my Rust journey, how I learned Rust and some free Rust learning resources.</p>
</blockquote>
<p>Rust has become more and more popular. Through the <a target="_blank" href="https://survey.stackoverflow.co/2022/#most-loved-dreaded-and-wanted-language-want">StackOverflow 2022 Developer Survey</a>, we can see that many people are interested in Rust.</p>
<blockquote>
<p>Rust is on its seventh year as the most loved language with 87% of developers saying they want to continue using it.</p>
<p>Rust also ties with Python as the most wanted technology with TypeScript running a close second</p>
</blockquote>
<ul>
<li><p>Most Wanted</p>
<p>  <img src="https://s2.loli.net/2023/01/14/mW5X6EgYLSfB3Qk.png" alt="2023-01-14 01-16-33屏幕截图.png" /></p>
</li>
<li><p>Most Loved vs. Dreaded</p>
<p>  <img src="https://s2.loli.net/2023/01/14/69Wgk1zcHN8TQeb.png" alt="2023-01-14 01-16-07屏幕截图.png" /></p>
</li>
</ul>
<p>But Rust has quite a steep learning curve.</p>
<p><img src="https://camo.githubusercontent.com/1d24e64022fd725f1896890b3ce14c560f075dc1f80f0b0baae3ece8981c882a/68747470733a2f2f70617065722d6174746163686d656e74732e64726f70626f782e636f6d2f735f353445314239364546464546443239343536323930324443354239393731443335434436423635304243383744313230303341333041343635313737363230315f313538363531343237353631385f696d6167652e706e67" alt="pic from Rust User Team Samara - &amp;Meetup1" /></p>
<p>This made me want to share my Rust journey, why I chose Rust, and how to learn Rust.</p>
<h2 id="heading-getting-connected-with-rust">Getting connected with Rust</h2>
<p>I had heard about Rust when it was first released, and my impression was that it was a system programming language that could replace C/C++ and was safe enough. But I didn't learn and use it. (I've only used it to write Hello World!)</p>
<p><mark>Back in time to 5 years ago, I was leading the transformation of the company's infrastructure into a cloud-native stack.</mark></p>
<p>I needed to build a monitoring stack based entirely on Prometheus to replace the company's self-developed monitoring software, which had more than 10 years of history, as well as several other monitoring tools such as Nagios, Zabbix, and Graphite.</p>
<p>Yes, you read that right, we were using a lot of monitoring software. There are a few reasons for this:</p>
<ul>
<li><p>A single software cannot meet all needs</p>
</li>
<li><p>The team was scattered, and most of the time new software was introduced just to meet a specific need rather than to solve the underlying problem</p>
</li>
</ul>
<p>Anyway, these are historical reasons.</p>
<p>As I mentioned above, we had a set of self-developed monitoring software with more than 10 years of history; as you can see, our infrastructure was slow to iterate.</p>
<p>And because we run our own physical data centers, many of our servers are old machines that have not been updated. (This is one of the reasons why I used Rust later)</p>
<p>I first replaced the monitoring stack in a newly launched small data center with about 400 machines, and the results were good: Prometheus monitored all the servers in this small data center and the various services running on them, with dashboards created in Grafana and alert notifications set up through Alertmanager.</p>
<p>Later, I rolled out these changes in two more data centers, and overall it went relatively smoothly; Kubernetes monitoring was also completed during this process.</p>
<p>But when it was implemented in the last data center, I faced the biggest challenge.</p>
<p><a target="_blank" href="https://github.com/prometheus/node_exporter/"><strong>node_exporter</strong></a> <strong>failed to start on some machines, and some machines crashed automatically after running for some time.</strong></p>
<p>I started to investigate this issue. For the automatic crash issue, I temporarily fixed it by adding a restart script.</p>
<p>I'm mainly concerned with why node_exporter won't start. <mark>I found that the operating system of this part of the machine is CentOS 5, and the kernel is 2.6.18.</mark></p>
<p>I found that there are already similar issues in the community: <a target="_blank" href="https://github.com/prometheus/node_exporter/issues/691">https://github.com/prometheus/node_exporter/issues/691</a></p>
<p><strong>At the same time, I also noticed that the Go documentation clearly stated that CentOS 5 is not supported, and a kernel of at least version 2.6.32 or above is required.</strong></p>
<p>(I forgot the minimum dependencies when I checked, but through the <a target="_blank" href="https://web.archive.org/web/20170916192117/https://github.com/golang/go/wiki/MinimumRequirements">web archive</a>, I see that the minimum kernel version required in 2017 is 2.6.23)</p>
<p>After some searching, I also saw something like <a target="_blank" href="https://dave.cheney.net/2013/06/18/how-to-install-go-1-1-on-centos-5">How to install Go 1.1 on CentOS 5.9</a>, but at the same time, some known issues are mentioned in the article.</p>
<p>So I'm not going to keep fighting it.</p>
<p><strong><mark>I want to re-implement one by myself</mark></strong><mark>, which can also solve the above automatic crash problem.</mark></p>
<p><strong>In the end, I used Rust to implement a tool similar to node_exporter and completed the upgrade and transformation of the monitoring system.</strong></p>
<p><strong>This is where my journey started with Rust in production.</strong></p>
<p>Next, let me introduce why I chose Rust.</p>
<h2 id="heading-why-choose-rust">Why choose Rust</h2>
<p>I have introduced some background above. At that time, the easiest choice would have been Python, which is simple enough and has a rich ecosystem. I also have many years of Python development experience, so I could quickly build the tools I needed.</p>
<p>The reasons for not choosing Python are:</p>
<ul>
<li><p>Not all of these machines had a Python environment, and the Python versions varied. I was also asked to avoid modifying the environment on these machines as much as possible;</p>
</li>
<li><p>Since I might make modifications later, distributing updates would not be convenient;</p>
</li>
</ul>
<p>Then I rethought my goal:</p>
<ul>
<li>It should compile into a binary executable for easy distribution and deployment (I used Ansible for unified deployment).</li>
</ul>
<p>So a more suitable option is C/C++/Rust.</p>
<p>I have more experience in C development and a little experience in C++, and all three languages can easily meet my first requirement.</p>
<p>When most people compare Rust and C/C++, they are comparing their performance and safety.</p>
<p>In my use case at the time, I don't think the results in the other two languages would have been worse than in Rust, although performance and safety were still considerations. In fact, since I was just starting to learn Rust, my Rust implementation might well have been worse than a C one.</p>
<p><strong>But I wanted more of a challenge and to try something new, and in terms of Prometheus monitoring, the C/C++ ecosystem is not very active. Also, I believed Rust would see great development in the future.</strong></p>
<p>So in the end I chose Rust.</p>
<h2 id="heading-how-i-learned-rust">How I learned Rust</h2>
<p>Rust is not simple, and it's not quite the same as other languages, so some practices that work in other languages may not work in Rust.</p>
<p><mark>Since I have a specific problem that needs to be solved, I need to implement a </mark> <a target="_blank" href="https://github.com/prometheus/node_exporter/"><mark>node_exporter</mark></a> <mark> to complete the transformation of the monitoring stack. So I learned Rust through the learning-by-doing mode.</mark></p>
<p>I first took a quick look at the following:</p>
<ul>
<li><p><a target="_blank" href="https://doc.rust-lang.org/stable/book/">The Rust Programming Language</a>: This book is very comprehensive; I didn't read it cover to cover at first. Instead, I used it to understand the main concepts and some common usages in Rust.</p>
</li>
<li><p><a target="_blank" href="https://doc.rust-lang.org/rust-by-example/">Rust By Example</a>: There are many examples here, and you can also increase your familiarity with Rust by practicing these examples;</p>
</li>
<li><p><a target="_blank" href="https://doc.rust-lang.org/std/index.html">Rust std lib docs</a>: Documentation of the standard library; skim it to get familiar with some keywords, modules, etc. It is not necessary to read it in its entirety at first.</p>
</li>
</ul>
<p>This way I quickly implemented a basic node_exporter alternative, then continued to iterate, applied it to the production environment, and completed the Prometheus monitoring stack.</p>
<p>Later, I continued to implement some small tools in Rust, learned its best practices, and studied some open-source projects implemented in Rust to deepen my Rust experience.</p>
<h2 id="heading-recommend-some-rust-learning-resources">Recommend some Rust learning resources</h2>
<p>There are many learning resources for Rust now. In addition to the ones I listed above, I recommend the following free content:</p>
<ul>
<li><p><a target="_blank" href="https://learn.microsoft.com/en-us/training/paths/rust-first-steps/">Take your first steps with Rust - Training | Microsoft Learn</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/rust-lang/rustlings">rust-lang/rustlings: Small exercises to get you used to reading and writing Rust code!</a></p>
</li>
</ul>
<p>videos:</p>
<ul>
<li><p><a target="_blank" href="https://www.youtube.com/watch?v=zF34dRivLOw&amp;utm_source=blog.moelove.info&amp;utm_medium=content">Rust Crash Course | Rustlang - YouTube</a></p>
</li>
<li><p><a target="_blank" href="https://www.youtube.com/watch?v=T_KrYLW4jw8&amp;list=PLzMcBGfZo4-nyLTlSRBvo0zjSnCnqjHYQ&amp;utm_source=blog.moelove.info&amp;utm_medium=content">Rust Tutorial - YouTube</a></p>
</li>
<li><p><a target="_blank" href="https://www.youtube.com/playlist?list=PLlrxD0HtieHjbTjrchBwOVks_sr8EVW1x&amp;utm_source=blog.moelove.info&amp;utm_medium=content">Rust for Beginners - YouTube</a></p>
</li>
</ul>
<h2 id="heading-summarize">Summary</h2>
<p>This is how my Rust journey started, and it continues.</p>
<p>Although I focus on Cloud Native and Kubernetes-related technologies and now mostly write Go, I still write some tools in Rust and use Rust for WebAssembly.</p>
<p>In the future, I will also share relevant content. If you are interested in my articles, welcome to subscribe to my Newsletter!</p>
]]></content:encoded></item><item><title><![CDATA[Opportunities and Challenges of Technological Evolution in Cloud Native]]></title><description><![CDATA[Nowadays, Cloud Native is becoming increasingly popular, and the CNCF defines Cloud Native as:

Based on a modern and dynamic environment, aka cloud environment.

With containerization as the fundamental technology, including Service Mesh, immutable ...]]></description><link>https://blog.moelove.info/opportunities-and-challenges-of-technological-evolution-in-cloud-native</link><guid isPermaLink="true">https://blog.moelove.info/opportunities-and-challenges-of-technological-evolution-in-cloud-native</guid><category><![CDATA[Cloud]]></category><category><![CDATA[cloud native]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[#ServiceMesh]]></category><category><![CDATA[Apache APISIX]]></category><dc:creator><![CDATA[Jintao Zhang]]></dc:creator><pubDate>Thu, 15 Dec 2022 17:22:59 GMT</pubDate><content:encoded><![CDATA[<p>Nowadays, Cloud Native is becoming increasingly popular, and the CNCF defines Cloud Native as:</p>
<ul>
<li><p>Based on a modern and dynamic environment, aka cloud environment.</p>
</li>
<li><p>With containerization as the fundamental technology, including Service Mesh, immutable infrastructure, declarative API, etc.</p>
</li>
<li><p>Key features include autoscaling, manageability, observability, automation, frequent change, etc.</p>
</li>
</ul>
<p>According to the CNCF 2021 survey, there is a very significant number (over 62,000) of contributors in the Kubernetes community. With the current technology trend, more and more companies are investing in Cloud Native and moving to the cloud early. Why are companies embracing Cloud Native as they grow, and what does Cloud Native mean for them?</p>
<h2 id="heading-technical-advantages-of-cloud-native">Technical Advantages of Cloud Native</h2>
<p>The popularity of Cloud Native comes from its advantages at the technical level. Cloud Native technology has two main aspects: containerization, led by Docker, and container orchestration, led by Kubernetes.</p>
<p>Docker introduced container images to the technology world, making the container image a standardized delivery unit. In fact, containerization technology existed before Docker; take LXC (<a target="_blank" href="https://linuxcontainers.org/">Linux Containers</a>, 2008) as a relatively recent example. Compared to Docker, LXC is less popular because Docker provides container images, which are more standardized and easier to migrate. Docker also created DockerHub, a public service that has become the world's largest container image repository. In addition, containerization can achieve a certain degree of resource isolation, covering not only CPU and memory but also the network stack, which makes it easier to deploy multiple copies of an application on the same machine.</p>
<p>Kubernetes became popular due to the booming of Docker. The container orchestration technology, led by Kubernetes, provides several important capabilities, such as fault self-healing, resource scheduling, and service orchestration. Kubernetes has a built-in DNS-based service discovery mechanism, and thanks to its scheduling architecture, it can be scaled very quickly to achieve service orchestration.</p>
<p>Now more and more companies are actively embracing Kubernetes and transforming their applications to run on it. The Cloud Native we are talking about is actually premised on Kubernetes, the cornerstone of Cloud Native technology.</p>
<p><img src="https://static.apiseven.com/2022/10/01/63384eaab1218.png" alt="img1.PNG" /></p>
<h3 id="heading-containerization-advantages">Containerization Advantages</h3>
<ol>
<li>Standardized Delivery</li>
</ol>
<p>Container images have become a standardized delivery unit. With containerization technology, users can complete delivery through a container image instead of shipping binaries or source code. Thanks to the packaging mechanism of the container image, the same image starts the service and produces the same behavior in any container runtime.</p>
<ol start="2">
<li>Portable, Lightweight, and Cost-saving</li>
</ol>
<p>Containerization achieves a degree of isolation through the Linux kernel's capabilities, which in turn makes applications easier to migrate. Moreover, containers run applications directly, which is technically lighter than virtualization: there is no need for a guest OS in each virtual machine. All applications share the host kernel, which saves cost, and the larger the deployment, the greater the savings.</p>
<ol start="3">
<li>Convenient Resource Management</li>
</ol>
<p>When starting a container, you can set the CPU, memory, or disk I/O resources the container service may use, which allows for better planning and deployment of resources when starting application instances through containers.</p>
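<p>To make the planning aspect concrete, here is a minimal, purely illustrative sketch (not how kube-scheduler or any real runtime works) of checking whether a set of container resource requests fits within a node's capacity. All names and numbers are made up.</p>

```python
# Illustrative only: check whether a set of container resource requests fits
# a node's allocatable capacity, the kind of planning that resource limits
# enable. (Real schedulers such as kube-scheduler do far more than this.)
def fits(node_allocatable: dict, requests: list) -> bool:
    for resource, capacity in node_allocatable.items():
        if sum(r.get(resource, 0) for r in requests) > capacity:
            return False
    return True

node = {"cpu_millicores": 4000, "memory_mib": 8192}
pods = [{"cpu_millicores": 1500, "memory_mib": 2048},
        {"cpu_millicores": 1000, "memory_mib": 4096}]
print(fits(node, pods))  # True: 2500m CPU and 6144 MiB fit within the node
```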
<h3 id="heading-container-orchestration-advantages">Container Orchestration Advantages</h3>
<ol>
<li>Simplify the Workflow</li>
</ol>
<p>In Kubernetes, application deployment is easier to manage than in Docker because Kubernetes uses declarative configuration. For example, a user simply declares in a configuration file which container image the application uses and which service ports are exposed, and Kubernetes carries out the corresponding operations without further manual management. This greatly simplifies the workflow.</p>
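<p>As an illustration of declarative configuration, the sketch below expresses a minimal Deployment manifest as plain data: you only declare the desired image, ports, and replica count, and Kubernetes works out the operations needed to reach that state. The names and image here are hypothetical; <code>kubectl apply</code> accepts JSON as well as YAML.</p>

```python
import json

# A minimal Deployment manifest expressed as data: you declare the image and
# desired replicas; Kubernetes performs the operations to reach that state.
# (Names and image are hypothetical; kubectl accepts JSON as well as YAML.)
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 3,
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {"containers": [{
                "name": "web",
                "image": "nginx:1.25",
                "ports": [{"containerPort": 80}],
            }]},
        },
    },
}
print(json.dumps(deployment, indent=2))  # feed to `kubectl apply -f -`
```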
<ol start="2">
<li>Improve Efficiency and Save Costs</li>
</ol>
<p>Another advantageous feature of Kubernetes is failover. When a node in a Kubernetes cluster crashes, Kubernetes automatically reschedules the applications on it to other healthy nodes and gets them up and running again. The entire recovery process requires no human intervention, so it not only improves operation and maintenance efficiency but also saves time and cost.</p>
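<p>This failover behavior boils down to a control loop that compares the desired state with the observed state. The toy sketch below is far simpler than any real Kubernetes controller and all names are hypothetical, but it shows the idea: pods on failed nodes are dropped and replacements are placed on healthy nodes until the desired replica count is restored.</p>

```python
# A toy sketch of the failover idea: a control loop compares desired vs.
# actually-running replicas and reschedules pods from failed nodes onto
# healthy ones. Real Kubernetes controllers are far more involved.
def reconcile(desired: int, pods: list, healthy_nodes: list) -> list:
    """Return a pod list with pods on failed nodes replaced on healthy nodes."""
    surviving = [p for p in pods if p["node"] in healthy_nodes]
    while len(surviving) < desired and healthy_nodes:
        # Place each replacement pod on the least-loaded healthy node.
        node = min(healthy_nodes, key=lambda n: sum(p["node"] == n for p in surviving))
        surviving.append({"name": f"web-{len(surviving)}", "node": node})
    return surviving

pods = [{"name": "web-0", "node": "node-a"}, {"name": "web-1", "node": "node-b"}]
print(reconcile(2, pods, ["node-a", "node-c"]))  # web-1 is recreated on node-c
```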
<p>With the rise of Docker and Kubernetes, you will see that their emergence has brought great innovation and opportunity to application delivery. Container images, as standard delivery units, shorten the delivery process and make it easier to integrate with CI/CD systems.</p>
<p>Now that application delivery is becoming faster, how is application architecture following the Cloud Native trend?</p>
<h2 id="heading-application-architecture-evolution-from-monoliths-microservice-to-service-mesh">Application Architecture Evolution: from Monoliths and Microservices to Service Mesh</h2>
<p>The evolution of application architecture starts with the monolithic architecture. As application size and requirements grew, the monolithic architecture no longer met the needs of collaborative team development, so distributed architectures were gradually introduced.</p>
<p>Among distributed architectures, the most popular is the microservice architecture. It splits a service into multiple modules that communicate with each other, handle service registration and discovery, and provide common capabilities such as rate limiting and circuit breaking.</p>
<p>In addition, a microservice architecture comes with various patterns. For example, the database-per-service pattern gives each microservice its own database; it prevents database-level problems from spreading across services, but may introduce more database instances to manage.</p>
<p>Another is the API Gateway pattern, in which a gateway receives the ingress traffic of the cluster or the whole microservice architecture and distributes it to the right APIs. This is one of the most widely used patterns, and gateway products like Spring Cloud Gateway or Apache APISIX can be applied.</p>
<p>These popular architectures are gradually extending into Cloud Native architectures. Can a microservice architecture become Cloud Native simply by building each microservice as a container image and migrating it directly to Kubernetes?</p>
<p>In theory it seems possible, but in practice there are challenges. In a Cloud Native microservice architecture, the components not only need to run in containers, but must also cover other aspects such as service registration, discovery, and configuration.</p>
<p>The migration also involves business-level transformation and adaptation, since common logic such as authentication, authorization, and observability-related capabilities (logging, monitoring, etc.) must be migrated to K8s. Therefore, moving from the original physical-machine deployment to the K8s platform is much more complex than it appears.</p>
<p>In this case, we can use the Sidecar model to abstract and simplify the scenario above.</p>
<p>Typically, the Sidecar model takes the form of a Sidecar Proxy. The architecture evolves from the left side of the diagram below to the right side by sinking generic capabilities (such as authentication, authorization, and security) into the Sidecar. As the diagram shows, this model reduces what each team must maintain from multiple components to just two things: the application and the Sidecar. At the same time, since the Sidecar itself contains the common components, the business side does not need to maintain them, which neatly solves the microservice communication problem.</p>
<p><img src="https://static.apiseven.com/2022/10/01/63384eaa17798.png" alt="img2.PNG" /></p>
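<p>The injection step itself can be pictured as a simple transformation of the pod spec. The sketch below is purely illustrative: real meshes such as Istio inject the proxy via a mutating admission webhook, and the container names and images here are hypothetical. The control plane appends a proxy container next to the business container.</p>

```python
import copy

# A toy illustration of Sidecar injection: the control plane takes a pod spec
# and appends a proxy container that handles generic concerns (auth, mTLS,
# observability) so the business container does not have to. Names/images are
# hypothetical; real injection (e.g. Istio's) uses mutating admission webhooks.
def inject_sidecar(pod_spec: dict, proxy_image: str = "proxy:latest") -> dict:
    injected = copy.deepcopy(pod_spec)  # leave the original spec untouched
    injected["containers"].append({"name": "sidecar-proxy", "image": proxy_image})
    return injected

pod = {"containers": [{"name": "app", "image": "shop-api:1.0"}]}
print([c["name"] for c in inject_sidecar(pod)["containers"]])  # ['app', 'sidecar-proxy']
```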
<p>To avoid configuring each Sidecar separately and reinventing the wheel for every microservice, the process can be implemented through a control plane, or control-plane injection, which gradually evolved into today's Service Mesh.</p>
<p>A Service Mesh usually requires two components: a control plane and a data plane. The control plane distributes configuration and executes the related logic; Istio is currently the most popular choice. On the data plane, you can choose an API gateway like Apache APISIX for traffic forwarding and service communication. Thanks to APISIX's high performance and scalability, custom requirements and custom logic are also possible. The following shows the architecture of the Service Mesh solution with Istio + APISIX.</p>
<p><img src="https://static.apiseven.com/2022/10/01/63384ea7257a2.png" alt="img3.PNG" /></p>
<p>The advantage of this solution is that when you migrate from a previous microservice architecture to a Cloud Native architecture, adopting a Service Mesh directly lets you avoid massive changes on the business side.</p>
<h2 id="heading-technical-challenges-of-cloud-native">Technical Challenges of Cloud Native</h2>
<p>The previous sections covered some of the advantages of the current Cloud Native trend at the technical level. However, every coin has two sides: while these technologies bring fresh opportunities, they also introduce new challenges.</p>
<h3 id="heading-problems-caused-by-containerization-and-k8s">Problems Caused by Containerization and K8s</h3>
<p>At the beginning of this article, we mentioned that containerization uses a shared kernel. The shared kernel makes containers lightweight but weakens isolation: if a container escape occurs, the host may be attacked. To meet these security challenges, technologies such as secure containers have been introduced.</p>
<p>In addition, although container images provide a standardized delivery method, they are vulnerable to attacks such as supply chain attacks.</p>
<p>Similarly, introducing K8s brings its own component-security challenges. More components mean a larger attack surface, as well as additional vulnerabilities in the underlying components and dependencies. At the infrastructure level, migrating from traditional physical or virtual machines to K8s involves transformation costs and extra labor for cluster data backups, periodic upgrades, and certificate renewals.</p>
<p>Also, in the Kubernetes architecture, the apiserver is the core component of the cluster and handles all internal and external traffic. To avoid perimeter security issues, protecting the apiserver becomes a key question; for example, we can use Apache APISIX to protect it.</p>
<h3 id="heading-security">Security</h3>
<p>The use of new technologies requires additional attention at the security level:</p>
<ul>
<li><p><strong>At the network security level</strong>, fine-grained traffic control can be implemented with Network Policy, or connections can be encrypted (e.g., with mTLS) to form a zero-trust network.</p>
</li>
<li><p><strong>At the data security level</strong>, K8s provides the Secret resource for handling confidential data, but it is not actually secure: the contents of a Secret are merely Base64-encoded, not encrypted, so anyone who can read the resource can decode them. This applies especially to etcd, where Secrets can be read directly by anyone with access to etcd.</p>
</li>
<li><p><strong>At the permission security level</strong>, unreasonable RBAC settings can let an attacker use an associated token to communicate with the apiserver and carry out an attack. Such permission misconfigurations are mostly seen in controller and operator scenarios.</p>
</li>
</ul>
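<p>The point about Secrets is easy to demonstrate: Base64 is an encoding, not encryption, so recovering the plaintext takes a single decode call. The value below is made up.</p>

```python
import base64

# Demonstrates why a Secret's Base64 encoding is not encryption: anyone who
# can read the resource can recover the plaintext with a single decode call.
encoded = base64.b64encode(b"s3cr3t-password").decode()  # as stored in the Secret
print(encoded)                                           # czNjcjN0LXBhc3N3b3Jk
print(base64.b64decode(encoded))                         # b's3cr3t-password'
```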
<p><img src="https://static.apiseven.com/2022/10/01/63384e9f5ca7b.png" alt="img4.png" /></p>
<h3 id="heading-observability">Observability</h3>
<p>Most Cloud Native scenarios involve observability-related operations such as logging and monitoring.</p>
<p>In K8s, the common way to collect logs is to aggregate them directly on each K8s node. For logs to be collected this way, applications need to write them to standard output or standard error.</p>
<p>However, if the business makes no changes and keeps writing all application logs to files inside the container, then each instance needs a Sidecar for log collection, which makes the deployment architecture extremely complex.</p>
<p>Back at the architecture governance level, selecting a monitoring solution in a Cloud Native environment also poses challenges. A wrong selection makes the subsequent cost of use very high, and the losses can be huge if the direction is wrong.</p>
<p>There are also capacity issues at the monitoring level. When deploying an application in K8s, you can simply configure resource limits to cap the resources the application can use. However, in a K8s environment it is still rather easy to over-commit resources, over-utilize them, and run out of memory under these conditions.</p>
<p>In addition, when the entire cluster or a node runs out of resources, K8s evicts workloads, meaning pods already running on a node are evicted and rescheduled to other nodes. If a cluster's resources are tight, such a node storm can easily cause the entire cluster to crash.</p>
<h3 id="heading-application-evolution-and-multi-cluster-pattern">Application Evolution and Multi-cluster Pattern</h3>
<p>At the application architecture evolution level, the core issue is service discovery.</p>
<p>K8s provides a DNS-based service discovery mechanism by default, but if the business mixes cloud workloads with existing legacy workloads, handling that situation with a DNS-based service discovery mechanism becomes more complicated.</p>
<p>Meanwhile, for enterprises that choose Cloud Native technology, business growth gradually pushes them toward multi-node and then multi-cluster deployments, which raises multi-cluster issues.</p>
<p>For example, when we want to offer customers higher availability through multiple clusters, we face the orchestration of services across clusters, multi-cluster load distribution and configuration synchronization, and deployment strategies for clusters in multi-cloud and hybrid cloud scenarios. These are some of the challenges ahead.</p>
<p>In general, the evolution of architecture and technology in the Cloud Native era brings us both opportunities and challenges.</p>
]]></content:encoded></item></channel></rss>