<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Prosple Engineering Blog]]></title><description><![CDATA[Helping every student get the best possible start to their career.]]></description><link>https://engineering.prosple.com</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1741263772743/b91ab741-5d32-43e2-86c9-3cb765b539d9.png</url><title>Prosple Engineering Blog</title><link>https://engineering.prosple.com</link></image><generator>RSS for Node</generator><lastBuildDate>Mon, 18 May 2026 06:30:43 GMT</lastBuildDate><atom:link href="https://engineering.prosple.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[How I Wasted My Day Debugging Redirects So You Don't Have To]]></title><description><![CDATA[If you’re not a fan of stories and are just looking for a quick way to help you on your redirects debugging, then you could skip to the part about the blog post. Otherwise, continue reading.
Introduction
There I was, finally almost wrapping up a task...]]></description><link>https://engineering.prosple.com/how-i-wasted-my-day-debugging-redirects-so-you-dont-have-to</link><guid isPermaLink="true">https://engineering.prosple.com/how-i-wasted-my-day-debugging-redirects-so-you-dont-have-to</guid><category><![CDATA[webdev]]></category><category><![CDATA[debugging]]></category><category><![CDATA[redirects]]></category><category><![CDATA[frontend]]></category><category><![CDATA[Next.js]]></category><category><![CDATA[caching]]></category><category><![CDATA[301]]></category><dc:creator><![CDATA[Sian Dela Cruz]]></dc:creator><pubDate>Wed, 30 Apr 2025 09:01:12 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/f2JMVDnarks/upload/7b0ebf0d4a63847dcbd1697fa262a800.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you’re not a fan of stories and are just looking for a quick way to help you on your redirects debugging, then you could skip to the part about the blog post. Otherwise, continue reading.</p>
<h2 id="heading-introduction">Introduction</h2>
<p>There I was, finally wrapping up a task I’d been working on for almost a month. I was tasked with upgrading our frontend codebase from Next 10 to Next 15 and React 16 to React 19. I tested everything and found no bugs in sight. After weeks of headaches and a few new white hairs, I was finally confident in its stability. And so, I passed my work off to QA.</p>
<p>QA did their thing, and they found one bug. I looked at the bug and it was a single redirecting issue, shouldn’t be so complicated. I let out a huge sigh of relief. With all that work and possible uncertainties, only one bug was found that seemed to be simple enough. Little did I know what I was about to sign up for…</p>
<h2 id="heading-the-rabbit-hole-when-everything-seemed-right-but-wasnt">The Rabbit Hole: When Everything Seemed Right but Wasn’t</h2>
<p>I got started right away. We had an existing legacy router, so I figured I’d look there first. After some trial and error and a few well-placed console logs in suspect code blocks, I found nothing, so I marked the legacy router itself “Clear”. Next, a file called <code>redirect.js</code> - a prime suspect. It had code that explicitly uses a <code>301</code> redirect, jackpot! Not even an hour in, it seemed I had already hit the gold mine. No other code in the codebase even mentioned a <code>301</code> redirect. Where else could it be, right? WRONG! I tested the redirect again and again, still the same thing. This was where things started to become blurry.</p>
<p>If not the part of the code that explicitly has <code>301</code> in it, and is called <code>redirect.js</code>, where should I even look? I honestly had no idea. I looked into our custom <code>server.js</code> entrypoint for our Next.js app, no luck with that. I tried moving the page away from our legacy router into Next’s default file-based router, nope. I tried upgrading possibly related dependencies, nada! At this point, I was deep in the weeds - losing time, losing patience, and possibly losing my mind. I was gaslighting myself into believing that obviously unrelated things could be the culprits. I mean, where else would I look? I even tried debugging our GraphQL infrastructure, and tried modifying it. In the end, I stepped back through the commits in batches, testing the redirect issue in each version until I reached the version of our live production code… huh? It’s still not working? I tried it in production and even asked my teammates to try locally, and it worked for them. What was happening? It all made sense earlier, nothing made sense now.</p>
<h2 id="heading-the-blog-post-that-saved-me-and-might-save-you-too">The Blog Post That Saved Me (and Might Save You Too)</h2>
<p>Just as I was about to give up, I found this article named <a target="_blank" href="https://dodov.dev/blog/how-to-debug-browser-redirects">“How to Debug Browser Redirects”</a> by Hristiyan Dodov. He quoted:</p>
<blockquote>
<p>“In the absence of cache control directives that specify otherwise, a 301 redirect defaults to being cached without any expiry date.”</p>
</blockquote>
<p>Suddenly, everything made sense. I wasn’t crazy. My code wasn’t broken. My tools (especially the browser) weren’t lying — they were just <strong>too helpful, in a misleading way</strong>.</p>
<p>It was the browser cache all along that was keeping me away from getting answers.</p>
<h2 id="heading-how-to-actually-debug-redirects">How to Actually Debug Redirects</h2>
<p>The best solution for my case that Dodov mentioned is using <a target="_blank" href="https://curl.se/">cURL</a> in the terminal.</p>
<p>At first glance, it didn’t seem like anything special — just another terminal command I’d probably forget to use. But in reality, <a target="_blank" href="https://curl.se/">cURL</a> ended up being the most <strong>objective</strong>, <strong>unfiltered</strong>, and <strong>browser-free</strong> way to know what was actually happening between my app and the client.</p>
<h3 id="heading-why-curl-helped-me"><strong>✅</strong> Why cURL Helped Me</h3>
<p>After hours of chasing false leads, I started revisiting the “obvious” fixes I had tried earlier - changing code, tweaking routes, updating configurations. This time, instead of testing them in the browser (which had misled me before by aggressively caching the 301 redirect), I tested using <a target="_blank" href="https://curl.se/">cURL</a>. That confirmed the issue was in <code>redirect.js</code>, as I had suspected; the browser cache had simply kept me from seeing my fixes work:</p>
<pre><code class="lang-bash"># Command
curl -I http://localhost/profiles

# Response
HTTP/1.1 301 Moved Permanently
X-Powered-By: Express
Path=/; Max-Age=316224000
location: /profiles?default=1
Date: Wed, 23 Apr 2025 17:19:44 GMT
Connection: keep-alive
Keep-Alive: timeout=5
</code></pre>
<p>This command showed me valuable information such as Max-Age (the duration of the 301 cache in seconds). It let me confirm with confidence whether my fixes <em>were</em> working. The issue wasn’t my fix… it was that the browser still held onto an outdated redirect in its cache. I had solved the bug hours earlier — I just didn’t know it.</p>
<h3 id="heading-other-handy-curl-flags-you-might-want-to-try"><strong>🔍</strong> Other Handy cURL Flags You Might Want To Try</h3>
<ul>
<li><p>See the full redirect chain:</p>
<pre><code class="lang-bash">  curl -IL https://yourdomain.com/path
</code></pre>
</li>
<li><p>Mimic a specific user agent:</p>
<pre><code class="lang-bash">  curl -A <span class="hljs-string">"Mozilla/5.0"</span> -I https://yourdomain.com/path
</code></pre>
</li>
<li><p>View detailed request/response activity:</p>
<pre><code class="lang-bash">  curl -v https://yourdomain.com/path
</code></pre>
</li>
</ul>
<h2 id="heading-conclusion">Conclusion</h2>
<p>What started as a simple bug ended up being a full day's journey into the quirks of 301 redirects and browser caching. The lesson? When debugging redirects, don't trust your browser alone—it might be secretly caching responses that make you question your sanity.</p>
<p>Remember these key takeaways:</p>
<ul>
<li><p>Browser caching of 301 redirects can be extremely persistent (potentially cached indefinitely!). If possible, code your redirects to include a Max-Age.</p>
</li>
<li><p>Use command-line tools like <a target="_blank" href="https://curl.se/">cURL</a> to see the actual HTTP responses without browser interference.</p>
</li>
<li><p>Search for information about similar issues early on, so you don’t fall into debugging hell like I did.</p>
</li>
</ul>
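<p>To make the first takeaway concrete, here is a minimal TypeScript sketch of a 301 response that carries an explicit cache lifetime. The <code>permanentRedirect</code> helper and the one-hour value are assumptions for illustration, not code from our app; in an Express handler, the same headers could be applied with <code>res.set()</code> before calling <code>res.redirect()</code> with a 301 status.</p>
<pre><code class="lang-typescript">// Hypothetical helper (not from the original codebase): describes a permanent
// redirect whose cache lifetime is stated explicitly.
type RedirectResponse = {
  status: number;
  headers: { [name: string]: string };
};

function permanentRedirect(location: string, maxAgeSeconds: number): RedirectResponse {
  return {
    status: 301,
    headers: {
      Location: location,
      // The browser may reuse this redirect from its cache for maxAgeSeconds only.
      "Cache-Control": "max-age=" + maxAgeSeconds,
    },
  };
}

// One hour is an arbitrary example value.
const redirect = permanentRedirect("/profiles?default=1", 3600);
console.log(redirect.status, redirect.headers);
</code></pre>
<p>With a bounded <code>max-age</code>, a stale redirect eventually expires from the browser cache on its own instead of lingering indefinitely.</p>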
<p>If you’d like to read more information about Dodov’s approach to debugging redirects (HINT: He has other tools he uses there as well), you can check out his article <a target="_blank" href="https://dodov.dev/blog/how-to-debug-browser-redirects">here</a>.</p>
<p>Happy coding, and may your redirects always take you where you expect to go!</p>
]]></content:encoded></item><item><title><![CDATA[Dealing with Dates and Timezones]]></title><description><![CDATA[Since the dawn of time (pun intended), timezones have always been a confusing concept to wrap around. It's been a recurring issue at Prosple especially when new developers join our ranks.
So we've finally decided to address this once and for all.
Her...]]></description><link>https://engineering.prosple.com/dealing-with-dates-and-timezones</link><guid isPermaLink="true">https://engineering.prosple.com/dealing-with-dates-and-timezones</guid><category><![CDATA[timezone]]></category><category><![CDATA[Web Development]]></category><category><![CDATA[Software Engineering]]></category><category><![CDATA[Computer Science]]></category><category><![CDATA[Frontend Development]]></category><dc:creator><![CDATA[Kuya Dev (Rem Lampa)]]></dc:creator><pubDate>Thu, 04 Jul 2024 16:13:04 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1720109445456/b0c5971d-8237-4b72-a6a2-48e0fe15d1d1.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Since the dawn of time (pun intended), timezones have always been a confusing concept to wrap around. It's been a recurring issue at <a target="_blank" href="https://prosple.com">Prosple</a> especially when new developers join our ranks.</p>
<p>So we've finally decided to address this once and for all.</p>
<p>Here are rules of thumb that serve as guidance for our engineers in determining how to handle timezones in their respective applications. I'm sharing this publicly as I feel that this can probably help in solving most of the timezone issues plaguing developers the world over.</p>
<h2 id="heading-backend"><strong>Backend</strong></h2>
<h3 id="heading-apis-and-storage">APIs and Storage</h3>
<p>For simplicity, backend APIs should always live and breathe in UTC: they receive dates in UTC, store dates in UTC, and respond with dates in UTC. The backend simply stores any timezone data associated with the date alongside it.</p>
<p>No conversion logic runs in backend API requests, in responses, or during storage.</p>
<h3 id="heading-time-sensitive-processes">Time-Sensitive Processes</h3>
<p>For time-sensitive backend processes, like schedules and cron jobs, the raw UTC date is treated as if it's in its associated timezone, then adjusted according to the server's local timezone.</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"datetime"</span>: <span class="hljs-string">"2024-07-04T11:41:25.793Z"</span>,
    <span class="hljs-attr">"timezone"</span>: <span class="hljs-string">"Asia/Manila"</span>
}
</code></pre>
<p>In this example, the date stored in the database is in UTC, but the stored timezone is in Manila (UTC+8). This should be treated as <code>Thu Jul 04 2024 11:41:25 UTC+0800 (Philippine Standard Time)</code> before converting to the server's local timezone.</p>
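<p>The "treat the raw UTC date as if it's in its associated timezone" step can be sketched in TypeScript with nothing but the built-in <code>Intl</code> API. The helper names <code>zoneOffsetMs</code> and <code>treatAsZonedTime</code> are hypothetical, and the one-step offset lookup is an approximation that can be off by an hour right around a daylight saving transition (not a concern for Manila, which has none).</p>
<pre><code class="lang-typescript">// Offset of an IANA timezone from UTC, in milliseconds, at a given instant.
function zoneOffsetMs(instant: Date, timeZone: string): number {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: timeZone,
    hourCycle: "h23",
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
    second: "2-digit",
  }).formatToParts(instant);

  const fields: { [type: string]: number } = {};
  for (const part of parts) {
    if (part.type !== "literal") {
      fields[part.type] = Number(part.value);
    }
  }

  // Read the zone's wall clock as if it were UTC, then compare with the real instant.
  const wallClockAsUtc = Date.UTC(
    fields["year"], fields["month"] - 1, fields["day"],
    fields["hour"] % 24, fields["minute"], fields["second"]
  );
  const wholeSeconds = Math.floor(instant.getTime() / 1000) * 1000;
  return wallClockAsUtc - wholeSeconds;
}

// Reinterpret the wall-clock digits of a UTC string as local time in timeZone.
function treatAsZonedTime(rawUtc: string, timeZone: string): Date {
  const naive = new Date(rawUtc);
  // Approximation: uses the offset in effect at the naive instant.
  return new Date(naive.getTime() - zoneOffsetMs(naive, timeZone));
}

const stored = { datetime: "2024-07-04T11:41:25.793Z", timezone: "Asia/Manila" };
const actualInstant = treatAsZonedTime(stored.datetime, stored.timezone);
console.log(actualInstant.toISOString()); // 2024-07-04T03:41:25.793Z
</code></pre>
<p>The result is the real instant (11:41:25 in Manila is 03:41:25 UTC), which the server can then convert to its own local timezone.</p>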
<h2 id="heading-frontend-applications"><strong>Frontend Applications</strong></h2>
<p>This will come down to whether we allow users to personalize their timezones or not.</p>
<h3 id="heading-if-users-are-not-allowed-to-personalize-their-timezones">If Users Are NOT Allowed to Personalize Their Timezones</h3>
<p>These Apps should display dates in the browser's local timezone.</p>
<p>That said, Frontend Apps should send and receive dates in UTC to and from Backend APIs, regardless of the browser timezone.</p>
<p>Hence, the conversion between UTC and browser local timezone is executed in the frontend code.</p>
<p>If the date is received with an associated timezone, the raw date is treated as if it's in the provided timezone, then adjusted to the browser timezone accordingly.</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"datetime"</span>: <span class="hljs-string">"2024-07-04T11:41:25.793Z"</span>,
    <span class="hljs-attr">"timezone"</span>: <span class="hljs-string">"Asia/Manila"</span>
}
</code></pre>
<p>In this example, the date received from the API is in UTC, but the timezone is in Manila (UTC+8). This should be treated as <code>Thu Jul 04 2024 11:41:25 UTC+0800 (Philippine Standard Time)</code> before converting to browser local timezone.</p>
<h3 id="heading-if-users-are-allowed-to-personalize-their-timezones">If Users Are Allowed to Personalize Their Timezones</h3>
<p>These Apps should display dates in the user's preferred timezone.</p>
<p>That said, Frontend Apps should send and receive dates in UTC to and from Backend APIs, regardless of the selected timezone.</p>
<p>Hence, the conversion between UTC and user's personalized timezone is STILL executed in the frontend code.</p>
<p>If the date is received with an associated timezone, the raw date is treated as if it's in the provided timezone, then adjusted to the user's preferred timezone accordingly.</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"datetime"</span>: <span class="hljs-string">"2024-07-04T11:41:25.793Z"</span>,
    <span class="hljs-attr">"timezone"</span>: <span class="hljs-string">"Asia/Manila"</span>
}
</code></pre>
<p>In this example, the date received from the API is in UTC, but the timezone is in Manila. This should be treated as <code>Thu Jul 04 2024 11:41:25 UTC+0800 (Philippine Standard Time)</code>. If the user's preferred timezone is <code>Asia/Ho_Chi_Minh</code> (UTC+7, covering Hanoi), the time is displayed as <code>Thu Jul 04 2024 10:41:25</code>, regardless of the browser timezone.</p>
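<p>As a rough TypeScript sketch of the display step, the snippet below formats an instant in an explicitly chosen timezone instead of the browser's. The <code>formatInZone</code> helper is hypothetical, and it uses <code>Asia/Ho_Chi_Minh</code> as the IANA identifier for the UTC+7 zone that covers Hanoi.</p>
<pre><code class="lang-typescript">// Formats an instant as "YYYY-MM-DD HH:mm:ss" in the given IANA timezone.
function formatInZone(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: timeZone, // the user's preferred zone, not the browser's
    hourCycle: "h23",
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
    second: "2-digit",
  }).formatToParts(instant);

  const f: { [type: string]: string } = {};
  for (const part of parts) {
    f[part.type] = part.value;
  }
  return `${f["year"]}-${f["month"]}-${f["day"]} ${f["hour"]}:${f["minute"]}:${f["second"]}`;
}

// 11:41:25 in Manila (UTC+8), written with its real offset.
const manilaInstant = new Date("2024-07-04T11:41:25.793+08:00");
console.log(formatInZone(manilaInstant, "Asia/Ho_Chi_Minh")); // 2024-07-04 10:41:25
</code></pre>
<p>Because the timezone is passed explicitly, the output stays the same no matter which timezone the browser itself is set to.</p>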
<h2 id="heading-forms-that-have-date-and-timezone-fields"><strong>Forms that Have Date and Timezone Fields</strong></h2>
<p>Forms and UI with timezone-tagged dates should always display the date exactly as stored in UTC, and never without the associated timezone data. Otherwise, the user loses context.</p>
<p>For example, if we have an API response of:</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"datetime"</span>: <span class="hljs-string">"2024-07-04T11:41:25.793Z"</span>,
    <span class="hljs-attr">"timezone"</span>: <span class="hljs-string">"Asia/Manila"</span>
}
</code></pre>
<p>The date is in UTC, but the timezone is in Manila. This should be displayed in the frontend as if the raw date is in the Manila Timezone:</p>
<p><code>2024-07-04T11:41:25.793Z</code> -&gt; <code>July 4, 2024 11:41AM Manila Time</code></p>
<p>Form dates are also submitted in UTC, but should always be accompanied by a separate timezone value.</p>
<p>Be careful not to let the browser convert the dates into local time, or the form will end up in timezone conversion hell. For example, say the user lives in Manila.</p>
<ul>
<li><p>Browser receives UTC: <code>2024-07-04T11:41:25.793Z</code></p>
</li>
<li><p>Browser converts it to local time (adds 8 hours): <code>2024-07-04T19:41:25.793Z</code></p>
</li>
<li><p>Form is submitted.</p>
</li>
<li><p>Server saves it in UTC: <code>2024-07-04T19:41:25.793Z</code></p>
</li>
<li><p>User refreshes page.</p>
</li>
<li><p>Browser receives UTC: <code>2024-07-04T19:41:25.793Z</code></p>
</li>
<li><p>Browser converts it to local time (adds another 8 hours): <code>2024-07-05T03:41:25.793Z</code></p>
</li>
</ul>
<p>Talk about time-traveling. 🤣 This may or may not be based on experience.</p>
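<p>One way to avoid that loop, sketched in TypeScript under the assumption that the form only needs a display string, is to pin the formatter to UTC so the browser's own timezone is never applied. The <code>formatTaggedDate</code> helper and the <code>"Manila Time"</code> label are hypothetical.</p>
<pre><code class="lang-typescript">// Renders the raw UTC digits untouched and appends the timezone label.
function formatTaggedDate(rawUtc: string, zoneLabel: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: "UTC", // pin to UTC so no local-time shift ever happens
    month: "long",
    day: "numeric",
    year: "numeric",
    hour: "numeric",
    minute: "2-digit",
    hour12: true,
  }).formatToParts(new Date(rawUtc));

  const f: { [type: string]: string } = {};
  for (const part of parts) {
    f[part.type] = part.value;
  }
  return `${f["month"]} ${f["day"]}, ${f["year"]} ${f["hour"]}:${f["minute"]}${f["dayPeriod"]} ${zoneLabel}`;
}

const label = formatTaggedDate("2024-07-04T11:41:25.793Z", "Manila Time");
console.log(label); // July 4, 2024 11:41AM Manila Time
</code></pre>
<p>Since the stored value is only read and never shifted, submitting the form and reloading the page always yields the same date.</p>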
<h2 id="heading-but-how-about-isomorphic-applications"><strong>But How About Isomorphic Applications?</strong></h2>
<p>Isomorphic apps are frontend apps that may render on the server, on the client, or both.</p>
<p>The same rules still hold. Server-Side Rendering happens on the server, so things there should always be in UTC.</p>
<p>Timezone adjustments should always happen in the browser, so in the case of Next.js, for example, this should happen after hydration. One solution is to perform the adjustment inside <code>useEffect</code>, making sure it only executes when the browser's <code>window</code> object is available.</p>
<h2 id="heading-mind-your-use-cases">Mind Your Use Cases</h2>
<p>As with all things in software engineering, these rules will not cover all use cases. So adopt them with care.</p>
<h2 id="heading-summary">Summary</h2>
<p>Whew! That was nasty. But such is the joy of working with timezones.</p>
<p>In conclusion, by adhering to these guidelines for handling timezones, we can ensure consistency and accuracy across our web applications. Whether dealing with backend services, user-facing frontends, or isomorphic applications, maintaining a clear strategy for timezone management will help avoid common pitfalls and improve the overall user experience. Remember, while these rules are robust, always consider the specific needs of your project to achieve the best results.</p>
]]></content:encoded></item><item><title><![CDATA[The Curious Case of a Healthy Service Timing Out]]></title><description><![CDATA[For several months, we at Prosple Engineering have been plagued by an intermittent stream of 502 and 504 errors in our API Gateway. It seemed that one of our downstream microservices was causing it. Curiously, that service's resource consumption was ...]]></description><link>https://engineering.prosple.com/the-curious-case-of-a-healthy-service-timing-out</link><guid isPermaLink="true">https://engineering.prosple.com/the-curious-case-of-a-healthy-service-timing-out</guid><category><![CDATA[Load Testing]]></category><category><![CDATA[troubleshooting]]></category><category><![CDATA[debugging]]></category><category><![CDATA[Software Engineering]]></category><dc:creator><![CDATA[Kuya Dev (Rem Lampa)]]></dc:creator><pubDate>Mon, 27 May 2024 09:24:30 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/CHAFV-0U7b8/upload/3d314b412e84d2b3f2b6333de96135e8.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>For several months, we at <a target="_blank" href="https://engineering.prosple.com/">Prosple Engineering</a> have been plagued by an intermittent stream of 502 and 504 errors in our API Gateway. It seemed that one of our downstream microservices was causing it. Curiously, that service's resource consumption was way below the threshold. And our development and staging environments aren't experiencing the issues. It was a mystery.</p>
<p>Luckily, one of our team members stumbled upon this <a target="_blank" href="https://www.nginx.com/blog/avoiding-top-10-nginx-configuration-mistakes/">article</a>, and we theorized that there's an issue with that service's Nginx configuration. Long story short, we implemented the fix and it worked.</p>
<p>But I won't be talking about the solution. What I'll discuss here is <strong>how we tested both the hypothesis and the fix</strong>. We could've just accepted the solution presented in the article blindly, deployed it, and seen whether it worked. But as engineers, we should always exercise healthy skepticism, and test things as early in development (or in this case, as early in troubleshooting) as possible to minimize the cost in terms of time, effort, and money.</p>
<h2 id="heading-the-challenge">The Challenge</h2>
<p>So we had a hypothesis: the service's Nginx might be misconfigured. The first step was to test that hypothesis. How did we do it? By replicating the errors on our local machine!</p>
<p>That sounded simple. But before we even came up with this hypothesis, we had spent months trying to replicate the issue <strong>SO THAT WE CAN COME UP WITH A HYPOTHESIS</strong>. No one had any idea. Remember, the service's health and resource consumption were more than fine; the only clues we had were the intermittent Bad Gateway and Gateway Timeout alerts that were showing up in our APM (Application Performance Monitoring).</p>
<p>But once we finally had a reasonable hypothesis, we were now able to narrow down our investigation. All efforts shifted to the service's Nginx instance.</p>
<p>The next question was how exactly to test the hypothesis on our local machine. As already mentioned, the errors were intermittent, which led us to another hypothesis: if we repeatedly execute the exact same HTTP request against the service, most will succeed, and only a few will time out. Fortunately, there is a particular technique for testing this new hypothesis: load testing!</p>
<h2 id="heading-load-testing-to-the-rescue">Load Testing to the Rescue</h2>
<p>Simply put, load testing is the act of bombarding an application with a huge number of requests. For this purpose, we chose <a target="_blank" href="https://www.npmjs.com/package/autocannon">autocannon</a>, a free and open source load testing tool.</p>
<p>We had two hypotheses, and we tested them separately:</p>
<ol>
<li><p>We tested the intermittency hypothesis by load testing without changing any of the app's configurations. This was straightforward. Once we increased the load enough, we started getting the timeout errors for some of the requests.</p>
</li>
<li><p>We tested the hypothesis on Nginx configuration by implementing the suggested fix and executing the load test again. It sounds counterintuitive, but this is loosely similar to what mathematical logic calls <a target="_blank" href="https://www2.edc.org/makingmath/mathtools/contradiction/contradiction.asp">"proof by contradiction"</a>. This wasn't as easy as testing the intermittency hypothesis though, as we discuss in the next section.</p>
</li>
</ol>
<h2 id="heading-the-pitfalls-of-local-machine-load-testing">The Pitfalls of Local Machine Load Testing</h2>
<p>There are many factors that can affect load testing, especially on one's local machine, including RAM, processing power, open applications, among others.</p>
<p>This was immediately apparent when we started the tests. We were either reaching the limits of the Docker containers or the host machine (the developer's laptop). There was no other choice but to tweak the settings to allow for consistent results, both before and after applying the fix.</p>
<p>We eventually ended up with these settings for <code>autocannon</code>:</p>
<ul>
<li><p><code>connections</code>: 50</p>
</li>
<li><p><code>connectionRate</code>: 100</p>
</li>
<li><p><code>duration</code>: 10</p>
</li>
<li><p><code>pipelining</code>: 1</p>
</li>
<li><p><code>timeout</code>: 30</p>
</li>
<li><p><code>renderStatusCodes</code>: true</p>
</li>
<li><p><code>debug</code>: true</p>
</li>
</ul>
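<p>For reference, here is a small TypeScript sketch that turns those settings into an <code>autocannon</code> command line. It assumes each setting maps to a CLI flag of the same name (worth double-checking against <code>autocannon --help</code>); the <code>toAutocannonCommand</code> helper and the target URL are placeholders, not our real setup.</p>
<pre><code class="lang-typescript">type LoadTestSettings = { [flag: string]: number | boolean };

// The settings listed above.
const settings: LoadTestSettings = {
  connections: 50,
  connectionRate: 100,
  duration: 10,
  pipelining: 1,
  timeout: 30,
  renderStatusCodes: true,
  debug: true,
};

// Hypothetical helper: builds the equivalent CLI invocation.
function toAutocannonCommand(url: string, options: LoadTestSettings): string {
  const args: string[] = [];
  for (const flag of Object.keys(options)) {
    const value = options[flag];
    if (value === true) {
      args.push("--" + flag); // boolean switches take no value
    } else if (value !== false) {
      args.push("--" + flag, String(value));
    }
  }
  return ["npx", "autocannon"].concat(args, [url]).join(" ");
}

// Placeholder URL for a service running locally.
const command = toAutocannonCommand("http://localhost:8080/", settings);
console.log(command);
</code></pre>
<p>Keeping the settings in one object makes it easy to rerun exactly the same load before and after applying a fix.</p>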
<p>With these settings, we were able to confirm that the fix did decrease the timeout issues.</p>
<h2 id="heading-summary">Summary</h2>
<p>In conclusion, diagnosing and resolving intermittent 502 and 504 errors in our API Gateway was a challenging but enlightening experience. By forming and rigorously testing our hypotheses through load testing, we were able to pinpoint and address the misconfiguration in the service's Nginx. This process underscored the importance of methodical troubleshooting and the value of skepticism in engineering. Ultimately, our diligent efforts paid off, leading to a more reliable and robust system.</p>
]]></content:encoded></item><item><title><![CDATA[Debugging Drupal Redis Cache Hit Rate]]></title><description><![CDATA[I love puzzles. As a developer sometimes they end up on my lap and I get to play Sherlock Holmes a bit.
This post is to share the learnings on how I managed to track down and solve a problem that has been plaguing us for a while: a stubbornly low cac...]]></description><link>https://engineering.prosple.com/debugging-drupal-redis-cache-hit-rate</link><guid isPermaLink="true">https://engineering.prosple.com/debugging-drupal-redis-cache-hit-rate</guid><category><![CDATA[Drupal]]></category><category><![CDATA[PHP]]></category><category><![CDATA[caching]]></category><category><![CDATA[Redis]]></category><dc:creator><![CDATA[Duarte Garin]]></dc:creator><pubDate>Fri, 11 Dec 2020 01:00:00 GMT</pubDate><content:encoded><![CDATA[<p>I love puzzles. As a developer sometimes they end up on my lap and I get to play Sherlock Holmes a bit.</p>
<p>This post is to share the learnings on how I managed to track down and solve a problem that has been plaguing us for a while: a stubbornly low cache hit rate on Redis.</p>
<p>While some details are more relevant to the stack we used (Drupal specifically), I believe the concepts apply in general whenever you face an issue similar to this.</p>
<h3 id="heading-some-context-please">Some context please</h3>
<p>This particular issue is happening on our content API, which is essentially a Drupal 9 CMS with a GraphQL API.</p>
<p>Our CMS has two main use cases:</p>
<ol>
<li><p>To be used as a content management platform by the editor team</p>
</li>
<li><p>To be used as a content API by our frontend applications</p>
</li>
</ol>
<p>Given the large volume of traffic going through this system we use Redis to alleviate some of the pressure on our database.</p>
<p>This used to work fine and dandy, but for quite some time we’ve been seeing a degradation in the cache hit rate, and this is the bleak picture of our Redis cluster in Production 😞</p>
<p><img src="https://cdn-images-1.medium.com/max/800/1*VhbQzsP03J2ZADNsHshWFQ.png" alt /></p>
<p>Average cache hit rate in Redis</p>
<p>As you can see, we’re getting a pretty low cache hit rate, averaging 25% or so, not great at all.</p>
<p>Also, to put things into perspective our Redis cluster gets around 50k get calls per minute and processes around 2 terabytes of data! That’s a lot, so any gains here will be much appreciated.</p>
<h3 id="heading-when-did-this-start">When did this start?</h3>
<p>The first thing to do is obviously to try to identify when this whole mess started. Sometimes that’s hard to achieve, especially if it’s not something that happened suddenly, but rather a degradation that occurred slowly over time.</p>
<p>For us, this seemed to coincide with a very big release, when we migrated to Drupal 9.</p>
<p><img src="https://cdn-images-1.medium.com/max/800/1*LMnD4oE1p08jtp7yg9btFg.png" alt /></p>
<p>Traffic for 3 months in Redis</p>
<p>Great! Easy right? Just go to the release log, check what code changed, and done! Not so much.</p>
<p>The first issue is that this release involved not just the Drupal platform, but <strong>all</strong> contributed modules as well, which is third party code managed by composer. It’s like trying to find a needle in a haystack.</p>
<p>Still, as the brave warriors we are we tried (we had some usual suspects), but unfortunately it didn’t pan out and it was becoming a huge time sink.</p>
<p>Time to go deeper.</p>
<h3 id="heading-memory-usage-and-evictions">Memory usage and evictions</h3>
<p>Redis is an in-memory cache, meaning it stores everything in memory (right!).</p>
<p>One of the first things that got flagged to us was to check whether memory was full, as that was possibly a culprit.</p>
<p>It’s important to understand why that won’t be an issue for most cases using Redis.</p>
<p>The way Redis works is, it will keep caching things into memory until it gets full, at which point eviction policies will kick in, effectively removing things from the cache according to the configured <a target="_blank" href="https://docs.redislabs.com/latest/rs/administering/database-operations/eviction-policy/">eviction policy</a>.</p>
<p>Obviously if the expiry time is low enough, the items might get evicted sooner and you don’t get a full memory, but that won’t help with your cache hit rate.</p>
<p>For us, given we use a proactive cache invalidation approach based on tags (good stuff Drupal!), we use very high expiry times like a year and then proactively invalidate them, which means triggering the eviction policy is going to happen by design.</p>
<h3 id="heading-wait-what-are-we-trying-to-solve-here">Wait, what are we trying to solve here?</h3>
<p>This might sound painfully obvious, but it actually took me a bit to take a step back and ask the right question:</p>
<p><strong>If our cache hit rate is low, what are the calls causing all the MISSes?</strong></p>
<p>Basically, in a simplistic view, for every 100 calls made to Redis, 80 are a MISS, meaning they either expired or they aren’t there at all.</p>
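<p>That ratio can be computed from the <code>keyspace_hits</code> and <code>keyspace_misses</code> counters that Redis reports in its <code>INFO stats</code> output. Below is a minimal TypeScript sketch; the <code>cacheHitRate</code> helper is hypothetical, and the sample text simply mirrors the simplistic 100-call example above.</p>
<pre><code class="lang-typescript">// Computes hits / (hits + misses) from the text of `INFO stats`.
function cacheHitRate(infoStats: string): number {
  const counters: { [name: string]: number } = {};
  for (const line of infoStats.split("\n")) {
    const separator = line.indexOf(":");
    if (separator !== -1) {
      const name = line.slice(0, separator).trim();
      counters[name] = Number(line.slice(separator + 1));
    }
  }
  const hits = counters["keyspace_hits"] || 0;
  const misses = counters["keyspace_misses"] || 0;
  const total = hits + misses;
  return total === 0 ? 0 : hits / total;
}

// Illustrative numbers only: 100 calls, 80 of them a MISS.
const sampleStats = "# Stats\nkeyspace_hits:20\nkeyspace_misses:80\n";
console.log(cacheHitRate(sampleStats)); // 0.2
</code></pre>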
<p>But how can I actually figure that out?</p>
<h3 id="heading-redis-cli-monitor">Redis CLI monitor</h3>
<p>To debug any issue with Redis, Redis CLI will be your friend. It’s a very neat and easy to use CLI tool that can do a bunch of stuff from monitoring your redis traffic to giving you stats about what’s going on in your cluster.</p>
<p>It’s <strong>very important to note</strong> that you need to be <strong>extremely careful</strong> running monitor in a Production cluster as it will slow it down significantly.</p>
<p>So, the best thing to do is prepare a dev environment with the same conditions and run it there.</p>
<p>Once I had the env up and running, I started the monitor like so:</p>
<pre><code class="lang-plaintext">redis-cli -h &lt;host-goes-here&gt; -p 6379 monitor &gt; dump.txt
</code></pre>
<p>Basically I’m telling the CLI to monitor all redis traffic and dump it into a file so I can analyse it later.</p>
<p>You now start seeing traffic coming through Redis.</p>
<p>Here is a very small sample:</p>
<pre><code class="lang-plaintext">1607504386.620066 [0 10.0.10.106:48690] "mget" "drupal.redis.9.0.9..ac9491ca4fcc64451a506c053106451111ecfa7173a07e39c42fecaf20db6167:cachetags:x-redis-bin:container"
1607504386.620666 [0 10.0.10.106:48690] "get" "drupal.redis.9.0.9..ac9491ca4fcc64451a506c053106451111ecfa7173a07e39c42fecaf20db6167:container:_redis_last_delete_all"
1607504386.652331 [0 10.0.10.106:48690] "hgetall" "drupal.redis.9.0.9..ac9491ca4fcc64451a506c053106451111ecfa7173a07e39c42fecaf20db6167:config:system.performance"
</code></pre>
<p>Neat! So what do we do with this now?</p>
<p>The main thing we did was try to identify patterns. We knew the hit rate was very low and so most likely there was some type of traffic (the majority) constantly getting missed.</p>
<p>Loading the data into a tool like Splunk really helped identify trends. The problem was, we had too much data.</p>
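<p>Even without Splunk, a few lines of TypeScript can give a first picture of the patterns in a monitor dump. This is only a sketch: the <code>countByCommandAndBin</code> helper is hypothetical, it assumes keys follow the <code>prefix:bin:cache-id</code> layout seen in the sample above, and the hash in the sample keys is shortened to <code>abc123</code>.</p>
<pre><code class="lang-typescript">// Counts monitor lines per "command bin" pair, e.g. "mget cachetags".
function countByCommandAndBin(dump: string): { [group: string]: number } {
  const counts: { [group: string]: number } = {};
  for (const line of dump.split("\n")) {
    // Each quoted token is either the command or one of its arguments.
    const quoted = line.match(/"((?:[^"\\]|\\.)*)"/g);
    if (quoted === null || quoted.length === 1) {
      continue; // not a command with at least one key
    }
    const command = quoted[0].slice(1, -1).toLowerCase();
    const firstKey = quoted[1].slice(1, -1);
    // Assumed key layout: prefix:bin:cache-id, so the bin is the second segment.
    const bin = firstKey.split(":")[1] || "(none)";
    const group = command + " " + bin;
    counts[group] = (counts[group] || 0) + 1;
  }
  return counts;
}

const sampleDump = [
  '1607504386.620066 [0 10.0.10.106:48690] "mget" "drupal.redis.9.0.9..abc123:cachetags:x-redis-bin:container"',
  '1607504386.620666 [0 10.0.10.106:48690] "get" "drupal.redis.9.0.9..abc123:container:_redis_last_delete_all"',
  '1607504386.652331 [0 10.0.10.106:48690] "hgetall" "drupal.redis.9.0.9..abc123:config:system.performance"',
  '1607504386.762256 [0 10.0.10.106:48690] "mget" "drupal.redis.9.0.9..abc123:cachetags:x-redis-bin:default"',
].join("\n");

const counts = countByCommandAndBin(sampleDump);
// Most frequent groups first.
const ranked = Object.keys(counts).sort(function (a, b) {
  return counts[b] - counts[a];
});
console.log(ranked[0], counts[ranked[0]]); // mget cachetags 2
</code></pre>
<p>Grouping by command and bin rather than by full key keeps the output short enough to eyeball, which matters when the dump holds far more lines than this sample.</p>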
<h3 id="heading-narrow-things-down">Narrow things down</h3>
<p>Remember when I told you we had 2 use cases for the CMS? The editors logging in and creating content and the API.</p>
<p>Only after a while did it occur to me to monitor those two use cases separately to try and narrow things down and indeed, the outcome was helpful.</p>
<h4 id="heading-cms-editor-traffic">CMS Editor Traffic</h4>
<p>When I generated traffic via users logging in and using the CMS to edit content, our cache hit rate was amazing! Around 80–90% after the caches warmed up. This was very difficult to see in Production where all traffic is mashed into one.</p>
<p><img src="https://cdn-images-1.medium.com/max/800/1*4pJqnWmrS6QFeaokgrpvLQ.png" alt /></p>
<p>CMS Editor traffic</p>
<h4 id="heading-api-traffic">API Traffic</h4>
<p>Now let’s have a look at what is happening via our GraphQL endpoint.</p>
<p>Once we start generating traffic via GraphQL this is what happens:</p>
<p><img src="https://cdn-images-1.medium.com/max/800/1*g7q0_UVTA0PMjUQcz_4vDA.png" alt /></p>
<p>GraphQL traffic</p>
<p>Ouch.</p>
<p>Still, great news, we’ve narrowed things down a bit.</p>
<p>To narrow it down further, I repeated the test with a single request: send a GraphQL request, watch the hit rate drop to 20%, let it recover, then send another request and watch it drop to 20% again. The pattern held, so there was definitely something happening with GraphQL requests.</p>
<p>Now, obviously everyone has a different use case and the above likely won’t apply to yours. The important takeaway here is:</p>
<blockquote>
<p><strong>Redis traffic is hard to monitor</strong>, so try to narrow down the issue as much as possible to reduce the amount of traffic you have to monitor</p>
</blockquote>
<p>For us this meant I could now run <code>redis-cli monitor</code> against just a single GraphQL request (with everything else turned off so it doesn’t pollute the Redis log).</p>
<p>Again here is a small sample of traffic for that request:</p>
<pre><code class="lang-plaintext">1607504386.762256 [0 10.0.10.106:48690] "mget" "drupal.redis.9.0.9..ac9491ca4fcc64451a506c053106451111ecfa7173a07e39c42fecaf20db6167:cachetags:x-redis-bin:default"
1607504386.763416 [0 10.0.10.106:48690] "get" "drupal.redis.9.0.9..ac9491ca4fcc64451a506c053106451111ecfa7173a07e39c42fecaf20db6167:default:_redis_last_delete_all"
1607504386.829198 [0 10.0.10.106:48690] "hgetall" "drupal.redis.9.0.9..ac9491ca4fcc64451a506c053106451111ecfa7173a07e39c42fecaf20db6167:graphql_definitions:fields"
1607504386.870541 [0 10.0.10.106:48690] "mget" "drupal.redis.9.0.9..ac9491ca4fcc64451a506c053106451111ecfa7173a07e39c42fecaf20db6167:cachetags:config:core.base_field_override.node.article.promote" "drupal.redis.9.0.9..ac9491ca4fcc64451a506c053106451111ecfa7173a07e39c42fecaf20db6167:cachetags:config:core.base_field_override.node.author.promote" "drupal.redis.9.0.9..ac9491ca4fcc64451a506c053106451111ecfa7173a07e39c42fecaf20db6167:cachetags:config:core.base_field_override.node.author.title" "drupal.redis.9.0.9..ac9491ca4fcc64451a506c053106451111ecfa7173a07e39c42fecaf20db6167:cachetags:config:core.base_field_override.node.campus.promote" "drupal.redis.9.0.9..ac9491ca4fcc64451a506c053106451111ecfa7173a07e39c42fecaf20db6167:cachetags:config:core.base_field_override.node.campus.title"
</code></pre>
<p>When I analysed the whole thing in Splunk I noticed that there were 900+ calls to Redis in this request (which in itself seems too much). On top of that, 600+ of those were for this <code>*.cachetags:*</code> type of entry. That looks like a pattern worth checking.</p>
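<p>If you don’t have a tool like Splunk handy, the same kind of pattern hunting can be roughed out in a few lines of PHP. This is only a sketch: <code>summariseMonitorLog()</code> is a made-up helper name, and it assumes every line follows the <code>redis-cli monitor</code> format shown in the samples above.</p>
<pre><code class="lang-php">&lt;?php

/**
 * Summarises redis-cli monitor output: calls per command, plus how many
 * keys contain the ":cachetags:" marker seen in the samples above.
 */
function summariseMonitorLog(array $lines): array {
  $summary = ['calls' =&gt; 0, 'commands' =&gt; [], 'cachetag_keys' =&gt; 0];

  foreach ($lines as $line) {
    // Each line looks like: timestamp [db client] "command" "arg" ...
    if (!preg_match('/^\d+\.\d+ \[[^\]]*\] (.*)$/', trim($line), $matches)) {
      continue;
    }
    // Pull out every double-quoted token: the command first, then its arguments.
    preg_match_all('/"((?:[^"\\\\]|\\\\.)*)"/', $matches[1], $tokens);
    if (empty($tokens[1])) {
      continue;
    }
    $command = strtolower(array_shift($tokens[1]));
    $summary['calls']++;
    $summary['commands'][$command] = ($summary['commands'][$command] ?? 0) + 1;

    // Count the keys that look like cache tag lookups.
    foreach ($tokens[1] as $key) {
      if (strpos($key, ':cachetags:') !== false) {
        $summary['cachetag_keys']++;
      }
    }
  }

  return $summary;
}
</code></pre>
<p>Feeding it the dump, for example <code>summariseMonitorLog(file('redis-monitor.log'))</code> (the file name is hypothetical), gives a quick count of calls per command and of cache tag keys for one request.</p>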
<p>Now, I already know that this request had a cache hit rate of 20%, which means 80% of the traffic I captured (the full log that the sample above comes from) is a MISS.</p>
<p>Now I just need to figure out what it is (you can probably guess).</p>
<p>I took all those <code>*.cachetags:*</code> entries and issued gets for them in Redis and voila, 600+ nils, meaning they don’t exist in the Redis database.</p>
<h3 id="heading-actually-fixing-the-problem">Actually fixing the problem</h3>
<p>From this point on the conversation becomes more Drupal specific, namely what the heck are those 600+ cache tags doing in my requests?</p>
<p>Well, long story short, the GraphQL Drupal module was being naughty and adding a cache tag for <strong>every field</strong> we had (times the types of content we have) to our GraphQL schema and field definitions. This is so that if a field changes, our (auto-generated) schema changes as well. The concept seems correct but there are better ways to achieve it, namely with a single cache tag.</p>
<p>The issue is that we use Redis as a backend to check cache tag invalidation counts and if a tag doesn’t have any invalidations it doesn’t exist in Redis. If you have 600+ tags and no invalidations, that’s 600 calls that always end up as a MISS.</p>
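<p>To see why those misses hurt so much, here is the arithmetic as a small sketch. Redis reports <code>keyspace_hits</code> and <code>keyspace_misses</code> in its <code>INFO stats</code> output, and the hit rate is hits divided by hits plus misses. The numbers below are the rough counts from the request above, and treating each call as a single lookup is a simplification on my part.</p>
<pre><code class="lang-php">&lt;?php

/**
 * Hit rate as a fraction between 0 and 1, from the hit and miss counters.
 */
function cacheHitRate(int $hits, int $misses): float {
  $total = $hits + $misses;
  return $total === 0 ? 0.0 : $hits / $total;
}

// Roughly the numbers from the request above: ~900 lookups, ~600 of them
// for cache tags that are never in Redis.
$lookups = 900;
$guaranteedMisses = 600;

// Best case: every other lookup is a hit.
$bestCase = cacheHitRate($lookups - $guaranteedMisses, $guaranteedMisses);
printf("Best possible hit rate: %.0f%%\n", $bestCase * 100);
</code></pre>
<p>With two thirds of the lookups guaranteed to miss, the hit rate for this request can never get above roughly a third, no matter how warm the caches are.</p>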
<p>So we made the change to ensure our definitions don’t rely on those cache tags, which prevents those calls to Redis in the first place. Then we pushed the fix and ran the test again.</p>
<p><img src="https://cdn-images-1.medium.com/max/800/1*3d7zTqhUmDie6gkl10V3xQ.png" alt /></p>
<p>Staging traffic post fix</p>
<p>YES!!! It works!!!</p>
<p>Our cache hit rate is now averaging 90% on simulated traffic and our Redis traffic was reduced by over two thirds on GraphQL requests. Hurray!!</p>
<h3 id="heading-wrapping-up">Wrapping up</h3>
<p>In a nutshell, here are the things I learned from this:</p>
<ol>
<li><p>Redis performance is hard to debug</p>
</li>
<li><p>Narrowing things down helps a lot</p>
</li>
<li><p>Use and abuse the Redis CLI to track things down (monitor your traffic using <code>redis-cli monitor</code>, use stats to check hit and miss metrics, use exists to see if tags exist in the cache, and so on)</p>
</li>
<li><p>Keep cool and try one small thing at a time</p>
</li>
</ol>
<p>Also, I’d like to send a huge shoutout to everyone in the #australia-nz community, who are always willing (and able) to help me figure these things out. You guys rock!</p>
<p>🙏</p>
]]></content:encoded></item><item><title><![CDATA[How to override cache tags in a Drupal response object]]></title><description><![CDATA[While working on some improvements to the caching granularity in the Prosple platform, I found myself needing to override the cache tags being sent in a GraphQL response.
This is because Drupal has a lot of cache tags, and while they are all incredib...]]></description><link>https://engineering.prosple.com/how-to-override-cache-tags-in-a-drupal-response-object</link><guid isPermaLink="true">https://engineering.prosple.com/how-to-override-cache-tags-in-a-drupal-response-object</guid><category><![CDATA[Drupal]]></category><category><![CDATA[caching]]></category><category><![CDATA[PHP]]></category><dc:creator><![CDATA[Duarte Garin]]></dc:creator><pubDate>Mon, 24 Feb 2020 01:00:00 GMT</pubDate><content:encoded><![CDATA[<p>While working on some improvements to the caching granularity in the Prosple platform, I found myself needing to override the cache tags being sent in a GraphQL response.</p>
<p>This is because Drupal has <strong>a lot</strong> of cache tags, and while they are all incredibly useful you might find yourself in a situation where they never get used and, for caching systems like CDNs, they can cause you to go over the allowed limits for header sizes.</p>
<p>So how to do this?</p>
<h3 id="heading-add-an-event-subscriber-to-your-custom-module">Add an Event Subscriber to your custom module</h3>
<p>The first thing we need to do is to create an event subscriber.</p>
<p>You can do this by creating a new module or adding it to an existing custom one, which is what I’ll do here. The module is called <code>example</code>.</p>
<p>First, make sure that you have an <code>example.services.yml</code> file in your module root where you define your Event Subscriber. Here is one for the <code>example</code> module:</p>
<pre><code class="lang-plaintext">services:
  example_event_subscriber:
    class: '\Drupal\example\EventSubscriber\ExampleEventSubscriber'
    tags:
      - { name: 'event_subscriber' }
</code></pre>
<h3 id="heading-create-an-event-subscriber">Create an Event Subscriber</h3>
<p>Next, we are going to create the actual Event Subscriber:</p>
<pre><code class="lang-plaintext">&lt;?php

namespace Drupal\example\EventSubscriber;

use Symfony\Component\HttpKernel\Event\FilterResponseEvent;
use Symfony\Component\HttpKernel\KernelEvents;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;

/**
 * Class ExampleEventSubscriber.
 *
 * @package Drupal\example\EventSubscriber
 */
class ExampleEventSubscriber implements EventSubscriberInterface {

  /**
   * {@inheritdoc}
   */
  public static function getSubscribedEvents() {
    return [
      KernelEvents::RESPONSE =&gt; ['onKernelResponse', 100],
    ];
  }

  /**
   * React to a response event
   *
   * @param FilterResponseEvent $event
   *   The response event.
   */
  public function onKernelResponse(FilterResponseEvent $event) {
    /** @var \Drupal\Core\Cache\CacheableJsonResponse $response */
    $response = $event-&gt;getResponse();
    // Only override the tags on cacheable JSON responses; anything else is
    // left untouched.
    if ($response instanceof \Drupal\Core\Cache\CacheableJsonResponse) {
      $cache_tags = ['test'];
      $response-&gt;getCacheableMetadata()-&gt;setCacheTags($cache_tags);
    }
  }

}
</code></pre>
<p>So let’s look at this closely.</p>
<p>First, we implement the getSubscribedEvents method. You can see more details on how to create Event Subscribers in <a target="_blank" href="https://www.drupal.org/docs/8/creating-custom-modules/subscribe-to-and-dispatch-events">https://www.drupal.org/docs/8/creating-custom-modules/subscribe-to-and-dispatch-events</a></p>
<p>The important part here is the priority argument (which I’m setting to 100). Subscribers with a higher priority run first, so this ensures that this Event Subscriber runs before the <code>FinishResponseSubscriber</code>.</p>
<p>Now, within our <code>onKernelResponse</code> method we can get our Response object and override the cache tags.</p>
<p>In our case, we are checking that the response is a <code>CacheableJsonResponse</code>, which might not make sense for your use case.</p>
<p>What matters is what’s inside that conditional statement:</p>
<pre><code class="lang-plaintext">$cache_tags = ['test'];
$response-&gt;getCacheableMetadata()-&gt;setCacheTags($cache_tags);
</code></pre>
<p>I’m just setting a list of dummy tags here. You could read the tags already on the response with <code>$response-&gt;getCacheableMetadata()-&gt;getCacheTags()</code>, or just set your own list as I’m doing above.</p>
<p>Then I call the <code>setCacheTags</code> method of the <code>CacheableMetadata</code> class, passing the list of tags I want.</p>
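<p>If you would rather trim the existing tags than replace them outright, a small filter could do the job. The sketch below is plain PHP with no Drupal dependency; <code>filterCacheTags()</code> and the <code>node:</code> prefix are names I picked for illustration, not part of Drupal’s API.</p>
<pre><code class="lang-php">&lt;?php

/**
 * Keeps only the cache tags that start with one of the given prefixes.
 */
function filterCacheTags(array $tags, array $allowedPrefixes): array {
  $kept = array_filter($tags, function (string $tag) use ($allowedPrefixes): bool {
    foreach ($allowedPrefixes as $prefix) {
      if (strpos($tag, $prefix) === 0) {
        return true;
      }
    }
    return false;
  });

  // Re-index so the result is a plain list.
  return array_values($kept);
}

// Example tags, made up for this sketch.
$tags = ['node:12', 'node:34', 'config:system.performance', 'http_response'];
print_r(filterCacheTags($tags, ['node:']));
</code></pre>
<p>Inside the subscriber this could be wired up as <code>$metadata-&gt;setCacheTags(filterCacheTags($metadata-&gt;getCacheTags(), ['node:']))</code>, where <code>$metadata</code> is the result of <code>$response-&gt;getCacheableMetadata()</code>.</p>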
<p>That’s it!</p>
]]></content:encoded></item><item><title><![CDATA[How to turn your PHP website into a SAML Identity Provider in 30 minutes]]></title><description><![CDATA[The other day I was working with one of our partners, a member based organisation, integrating their portal with our platform.
The use case was simple: allow the partners members to access a Prosple career directory by using their existing credential...]]></description><link>https://engineering.prosple.com/how-to-turn-your-php-website-into-a-saml-identity-provider-in-30-minutes</link><guid isPermaLink="true">https://engineering.prosple.com/how-to-turn-your-php-website-into-a-saml-identity-provider-in-30-minutes</guid><category><![CDATA[SAML]]></category><category><![CDATA[PHP]]></category><category><![CDATA[SSO]]></category><category><![CDATA[authentication]]></category><dc:creator><![CDATA[Duarte Garin]]></dc:creator><pubDate>Thu, 13 Feb 2020 01:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1695104561690/57af4c04-1e57-496d-8a4d-59b5c8592e21.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The other day I was working with one of our partners, a member based organisation, integrating their portal with our platform.</p>
<p>The use case was simple: allow the partner’s members to access a Prosple career directory by using their existing credentials.</p>
<p>In order to do this, we needed to integrate with their Identity Provider (IdP) and they decided on SAML from the various protocols we support. There was only one catch. They didn’t have a SAML IdP and weren’t exactly sure how to add that capability to their existing Drupal website.</p>
<p>In order to help our partner with this I started doing some research on SAML implementation for <em>existing login</em> workflows in PHP and was surprised when all I found were solutions that involved replacing the login page (for example using SimpleSAMLphp).</p>
<p>I wanted something simple that could be easily plugged into a portal with an existing login page. I did a bit of research and found <a target="_blank" href="https://github.com/lightSAML/lightSAML">LightSAML</a>, which allowed me to quickly whip up a working IdP prototype in around half an hour. I found that pretty amazing and so decided to share my findings for those that find them useful.</p>
<p>Before we get straight into the nitty gritty however, it’s important to at least understand the basics of SAML and how it works.</p>
<h3 id="heading-what-is-saml">What is SAML?</h3>
<blockquote>
<p><strong>Security Assertion Markup Language</strong> (<strong>SAML</strong>, pronounced <em>SAM-el</em>) is an <a target="_blank" href="https://en.wikipedia.org/wiki/Open_standard">open standard</a> for exchanging <a target="_blank" href="https://en.wikipedia.org/wiki/Authentication">authentication</a> and <a target="_blank" href="https://en.wikipedia.org/wiki/Authorization">authorization</a> data between parties, in particular, between an <a target="_blank" href="https://en.wikipedia.org/wiki/Identity_provider_%28SAML%29">identity provider</a> and a <a target="_blank" href="https://en.wikipedia.org/wiki/Service_provider_%28SAML%29">service provider</a>. SAML is an <a target="_blank" href="https://en.wikipedia.org/wiki/XML">XML</a>-based <a target="_blank" href="https://en.wikipedia.org/wiki/Markup_language">markup language</a> for security assertions (statements that service providers use to make access-control decisions).</p>
</blockquote>
<p>Got it? In a nutshell what this means is that an application can delegate authentication to an external system (IdP), via a Service Provider (SP). This is done by exchanging some messages in XML.</p>
<p>In its simplest form, a basic SAML <em>Service Provider Initiated flow</em> looks like this:</p>
<ol>
<li><p>A user tries to access the application.</p>
</li>
<li><p>The application checks if the user is logged in via the SP and if not redirects the user to the IdP login page along with a <strong>SAML Request</strong>.</p>
</li>
<li><p>The user authenticates with the IdP which, if the login is successful, returns the user to the Assertion Consumer Service Url along with a <strong>SAML Response</strong> (this is usually signed with the IdP’s private key, so the SP can verify it against the IdP certificate).</p>
</li>
<li><p>The Service Provider decodes the response and provides the user access to the application.</p>
</li>
</ol>
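<p>To make step 2 a bit more concrete: when the SP uses the HTTP-Redirect binding, the SAML Request travels in the query string, deflated and base64-encoded. The sketch below shows that encoding and its reverse. The helper names, the IdP URL, the RelayState value and the heavily trimmed AuthnRequest are all made up for illustration.</p>
<pre><code class="lang-php">&lt;?php

/**
 * Encodes a SAML message for the HTTP-Redirect binding: raw DEFLATE, then
 * base64. URL-encoding happens when the query string is built.
 */
function encodeForRedirect(string $xml): string {
  return base64_encode(gzdeflate($xml));
}

/**
 * Reverses encodeForRedirect().
 */
function decodeFromRedirect(string $encoded): string {
  return gzinflate(base64_decode($encoded));
}

// A heavily trimmed AuthnRequest, for illustration only.
$authnRequest = '&lt;samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" ID="_abc123" Version="2.0"/&gt;';

// Hypothetical IdP login URL and RelayState value.
$redirectUrl = 'https://idp.example.com/login.php?' . http_build_query([
  'SAMLRequest' =&gt; encodeForRedirect($authnRequest),
  'RelayState' =&gt; '/careers',
]);

echo $redirectUrl, "\n";
</code></pre>
<p>In practice a SAML library does this for you on both sides; the point is only that the <strong>SAML Request</strong> is XML in a compact, URL-safe wrapper.</p>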
<p>At Prosple we strongly discourage the use of IdP initiated flows. I won’t cover that ground in this article, but check <a target="_blank" href="https://www.identityserver.com/articles/the-dangers-of-saml-idp-initiated-sso">The Dangers of SAML IdP-Initiated SSO</a> by Scott Brady for an overview on the matter if you’re interested.</p>
<h3 id="heading-php-saml-idp-demo">PHP SAML IdP Demo</h3>
<p>Alright, now that we’re clear on the basic concepts, let’s get started.</p>
<p>For the purpose of this demo, we will be making some assumptions:</p>
<ul>
<li><p>You have an existing PHP portal (e.g. Drupal, WordPress, etc.) with an existing user database, either in the CMS itself or in some sort of external system that your CMS integrates with.</p>
</li>
<li><p>You have a login page where users authenticate against your system.</p>
</li>
<li><p>You may have looked at solutions like SimpleSAMLphp, but you don’t want to create a new login page; you want to integrate the SAML workflow into your existing page.</p>
</li>
</ul>
<p>We’ll be using the <a target="_blank" href="https://github.com/lightSAML/lightSAML">LightSAML</a> Core PHP library since, unlike something like SimpleSAMLphp, it is designed to work as a pluggable API rather than a standalone application.</p>
<p>In their own words:</p>
<blockquote>
<p>LightSAML Core is a <a target="_blank" href="https://php.net/">PHP</a> library implementing <a target="_blank" href="https://www.oasis-open.org/standards#samlv2.0">OASIS</a> <a target="_blank" href="http://saml.xml.org/saml-specifications">SAML 2.0</a> protocol, fully OOP structured with DPI principles, reusable, and embeddable.</p>
</blockquote>
<p>Sounds great! Ready? Set? Go!</p>
<h3 id="heading-housekeeping-php-apache-docker">Housekeeping (PHP, Apache, Docker)</h3>
<p>In order to get up and running for our demo we need a server with PHP and Apache.</p>
<p>I’m going to be doing this in Docker, but feel free to use whatever PHP stack you want to run this codebase (XAMPP, MAMP, etc).</p>
<p>Let’s create our simple docker-compose file in the root of our project:</p>
<pre><code class="lang-plaintext">version: "3.2"
services:
  php:
    ports:
      - "8080:80"
    image: php:apache
    volumes:
      - .:/var/www/html/
</code></pre>
<p>Too easy. Now with a simple <code>docker-compose up</code> we’re up and running at <a target="_blank" href="http://localhost:8080/">http://localhost:8080/</a></p>
<h3 id="heading-installing-lightsaml">Installing LightSAML</h3>
<p>The first thing we want to do is install LightSAML.</p>
<p>For this we will use Composer.</p>
<p>Assuming you have Composer installed, from the root of your project run <code>composer require lightsaml/lightsaml</code></p>
<p>Done!</p>
<p>Now, since we want to be using the classes provided by the library, let’s set up some basic autoloading. We’ll do this by creating an <em>inc.php</em> file in the root of our project with the following code:</p>
<pre><code class="lang-plaintext">&lt;?php

// Autoloading libraries
require __DIR__ . '/vendor/autoload.php';
</code></pre>
<p>This merely includes the auto-generated Composer autoload php file and we can then include this in our codebase. Simple stuff.</p>
<h3 id="heading-login-page">Login Page</h3>
<p>Now, to the use case at hand. The first thing to cover is our login page. We’re going to create a basic dummy login page to simulate the IdP login page.</p>
<p>The code is simple:</p>
<pre><code class="lang-plaintext">&lt;?php

include "inc.php";

// Reading the HTTP Request
$request = \Symfony\Component\HttpFoundation\Request::createFromGlobals();

?&gt;

&lt;h1&gt;IdP Login Page&lt;/h1&gt;

&lt;form action="post-saml.php"&gt;
    &lt;div&gt;
        &lt;label&gt;Username:&lt;/label&gt;
        &lt;input name="username" type="text"&gt;
    &lt;/div&gt;
    &lt;div&gt;
        &lt;label&gt;Pass:&lt;/label&gt;
        &lt;input type="password" name="password"&gt;
    &lt;/div&gt;
    &lt;input type="submit"&gt;
    &lt;input type="hidden" name="SAMLRequest"
           value="&lt;?php print $request-&gt;get("SAMLRequest") ?&gt;"&gt;
    &lt;input type="hidden" name="RelayState"
           value="&lt;?php print $request-&gt;get("RelayState") ?&gt;"&gt;
&lt;/form&gt;
</code></pre>
<p>The PHP code includes the <em>inc.php</em> file and reads the HTTP request so we can extract some query string parameters. You don’t need to use Symfony for this, basic usage of PHP <em>$_GET</em> suffices, however given I already have the library handy I’ll do that for the purpose of this demo.</p>
<p>Next, we have a basic form simulating your IdP login page: a username input, a password input and a submit button.</p>
<p>Remember the workflow I described above? When users arrive at this login page from a Service Provider, they will arrive with both <strong>SAMLRequest</strong> and <strong>RelayState</strong> parameters. We will need these after the authentication is successful so we can respond to the Service Provider.</p>
<p>Depending on your use case and exactly how authentication works in your system, there are many ways to go about this. In this simple demo we just pass them as hidden fields, so they are sent along with the rest of the form data when the form is submitted and the user is taken to <strong>post-saml.php</strong> (the page where we construct the response and redirect the user back, more on this later).</p>
<p>For the purposes of integrating this in your own login page, <strong>just ensure you pass through SAMLRequest and RelayState</strong> to that page.</p>
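<p>If your login page doesn’t use Symfony, passing the two parameters through could look something like the sketch below. <code>samlHiddenFields()</code> is a hypothetical helper name; it reads from a plain array such as <code>$_GET</code> and escapes the values before printing them.</p>
<pre><code class="lang-php">&lt;?php

/**
 * Renders hidden inputs for the SAML parameters found in the query string.
 */
function samlHiddenFields(array $query): string {
  $html = '';
  foreach (['SAMLRequest', 'RelayState'] as $name) {
    // Skip parameters that are missing or are not plain strings.
    if (!isset($query[$name]) || !is_string($query[$name])) {
      continue;
    }
    // Escape the value so it cannot break out of the attribute.
    $value = htmlspecialchars($query[$name], ENT_QUOTES, 'UTF-8');
    $html .= '&lt;input type="hidden" name="' . $name . '" value="' . $value . '"&gt;' . "\n";
  }
  return $html;
}

// Inside your existing login form template:
echo samlHiddenFields($_GET);
</code></pre>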
<p>When the user accesses <a target="_blank" href="http://localhost:8080/login.php">http://localhost:8080/login.php</a> they will see something like this:</p>
<p><img src="https://cdn-images-1.medium.com/max/800/1*4DsaBq_CHN5ZsZYH6MRiPA.png" alt /></p>
<p>IdP Login page</p>
<p>Not very pretty, but it does the job.</p>
<p>When the user submits this form, they will be taken to <em>post-saml.php</em></p>
<h3 id="heading-enter-saml">Enter SAML</h3>
<p>Alright, at this point, you should have validated the user’s credentials and know if the login was successful or not (I won’t write code for that for obvious reasons).</p>
<p>Now it’s time to do the real heavy lifting, where we will take some information from the IdP and the authentication result and construct a SAML Response to send to the SP, redirecting the user to the post-back Url (the ACS or Assertion Consumer Service Url provided by the SP where the SAML response should be sent to).</p>
<p>Thankfully as you’ll see this is pretty easy with LightSAML.</p>
<h4 id="heading-creating-some-utilities">Creating some Utilities</h4>
<p>First we’re going to create some Utility classes.</p>
<p>We’ll start with an IdpProvider.php class. This is a dummy representation of your system. In a real integration you should instead call whatever APIs are available in your system to obtain the information required.</p>
<pre><code class="lang-plaintext">&lt;?php

class IdpProvider {


}
</code></pre>
<p>We’re also going to create a Utility class called IdpTools.php. This will be just a wrapper with some useful functions for things like reading SAML requests and creating SAML responses, just to keep things nice and tidy:</p>
<pre><code class="lang-plaintext">&lt;?php

class IdpTools{

}
</code></pre>
<p>We’ll place both of these in the <em>src/Utility</em> folder of our project.</p>
<h4 id="heading-creating-a-post-binding-page">Creating a POST Binding Page</h4>
<p>Now we can get started with our <em>post-saml.php</em> file which renders a page with a hidden form to POST data back to the SP.</p>
<p>Let’s start with importing our libraries as well as our Utility classes and instantiating them:</p>
<pre><code class="lang-plaintext">&lt;?php

include "inc.php";
include "src/Utility/IdpProvider.php";
include "src/Utility/IdpTools.php";

// Initiating our IdP Provider dummy connection.
$idpProvider = new IdpProvider();

// Instantiating our Utility class.
$idpTools = new IdpTools();
</code></pre>
<p>Great! Now it’s important to reiterate a point here. <strong>At this stage in our workflow we’ve already authenticated the user and the login was successful, so now we need to start preparing a response to the SP.</strong></p>
<h4 id="heading-reading-the-saml-request">Reading the SAML Request</h4>
<p>To do that, we need to read the SAMLRequest first:</p>
<pre><code class="lang-plaintext">// Receive the HTTP Request and extract the SAMLRequest.
$request = \Symfony\Component\HttpFoundation\Request::createFromGlobals();
$saml_request = $idpTools-&gt;readSAMLRequest($request);
</code></pre>
<p>We’re using Symfony to read the HTTP request and pass it to our Utility function <em>readSAMLRequest()</em> which we’ll implement in our IdpTools class.</p>
<p>Let’s do that now:</p>
<pre><code class="lang-plaintext">/**
 * Reads a SAMLRequest from the HTTP request and returns a messageContext.
 *
 * @param \Symfony\Component\HttpFoundation\Request $request
 *   The HTTP request.
 *
 * @return \LightSaml\Context\Profile\MessageContext
 *   The MessageContext that contains the SAML message.
 */
public function readSAMLRequest($request){

  // We use the Binding Factory to construct a new SAML Binding based on the
  // request.
  $bindingFactory = new \LightSaml\Binding\BindingFactory();
  $binding = $bindingFactory-&gt;getBindingByRequest($request);

  // We prepare a message context to receive our SAML Request message.
  $messageContext = new \LightSaml\Context\Profile\MessageContext();

  // The receive method fills in the messageContext with the SAML Request data.
  $binding-&gt;receive($request, $messageContext);

  return $messageContext;
}
</code></pre>
<p>This implementation comes straight from LightSAML’s cookbook in ‘<a target="_blank" href="https://www.lightsaml.com/LightSAML-Core/Cookbook/How-to-receive-SAML-message/">How to read a SAML message</a>’.</p>
<blockquote>
<p>Receiving a SAML message from the HTTP request with the SAML HTTP POST or Redirect binding, in LightSAML is done with the Binding set of classes. The <code>BindingFactory</code> can detect the binding type for the given HTTP request and instantiate corresponding Binding class, <code>HttpPostBinding</code> or <code>HttpRedirectBinding</code>, capable of receiving the SAML message (AuthnRequest, Response…).</p>
<p>First you create Symfony’s HttpFoundation Request, instantiate <code>BindingFactory</code> with that request and get the actual binding, and finally call the binding <code>receive()</code> method, that will return deserialized SAML document from the HTTP Request.</p>
</blockquote>
<p>The return of this method contains the SAML message itself, in this case the SAMLRequest.</p>
<h4 id="heading-getting-some-data-from-idp">Getting some data from IdP</h4>
<p>Continuing in our <em>post-saml.php</em> file, now that we have the SAMLRequest we need to get some data from it. Specifically, we want the <em>ID</em> of the request message (which we’ll use later) and the <em>Issuer</em> which is the identifier of the Service Provider.</p>
<pre><code class="lang-plaintext">// Getting a few details from the message like ID and Issuer.
$issuer = $saml_request-&gt;getMessage()-&gt;getIssuer()-&gt;getValue();
$id = $saml_request-&gt;getMessage()-&gt;getID();
</code></pre>
<p><strong>Note</strong>: If your login page can only process authentication from SAML Service Providers you can check the <em>Issuer</em> in the SAMLRequest in the login page and if it’s not a trusted SP, deny access. In our case this login page is used for other purposes so we won’t be doing that.</p>
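<p>A minimal sketch of that check, assuming you keep a plain list of trusted Service Provider entity IDs. The function name, the list and the example issuer are all hypothetical.</p>
<pre><code class="lang-php">&lt;?php

/**
 * Returns TRUE when the SAMLRequest issuer is one of the trusted SP entity IDs.
 */
function isTrustedIssuer(string $issuer, array $trustedEntityIds): bool {
  // Strict comparison: entity IDs must match exactly.
  return in_array($issuer, $trustedEntityIds, true);
}

// Hypothetical list of trusted Service Provider entity IDs.
$trusted = ['urn:service:provider:id'];

// In the demo this value would come from the SAMLRequest.
$issuer = 'urn:some:other:sp';
if (!isTrustedIssuer($issuer, $trusted)) {
  // A real page would stop processing the login here.
  http_response_code(403);
  echo "Unknown Service Provider\n";
}
</code></pre>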
<p>Now, we also want to get some information from the IdP related to the authenticated user:</p>
<pre><code class="lang-plaintext">// Simulate user information from IdP
$user_id = $request-&gt;get("username");
$user_email = $idpProvider-&gt;getUserEmail();
</code></pre>
<p>In our use case, I’m interested in the user_id and user_email, but this may vary case by case. You would get this from your own system, but for the purposes of this demo we get the username from the login form and the email from our <em>IdpProvider</em> class:</p>
<pre><code class="lang-plaintext">/**
 * Returns a dummy user email.
 *
 * @return string
 */
public function getUserEmail(){

  return "duarte.garin@samltuts.com";
}
</code></pre>
<h4 id="heading-constructing-the-saml-response">Constructing the SAML Response</h4>
<p>Now we are ready to construct our SAML Response.</p>
<p>We’ll add this call to our <em>post-saml.php</em> file:</p>
<pre><code class="lang-plaintext">// Construct a SAML Response.
$response = $idpTools-&gt;createSAMLResponse($idpProvider, $user_id, $user_email, $issuer, $id);
</code></pre>
<p>Let’s see the implementation for this:</p>
<pre><code class="lang-plaintext">/**
 * Constructs a SAML Response.
 * 
 * @param \IdpProvider $idpProvider
 * @param $user_id
 * @param $user_email
 * @param $issuer
 * @param $id
 */
public function createSAMLResponse($idpProvider, $user_id, $user_email, $issuer, $id){


  $acsUrl = $idpProvider-&gt;getServiceProviderAcs($issuer);

  // Preparing the response XML
    $serializationContext = new \LightSaml\Model\Context\SerializationContext();

    // We now start constructing the SAML Response using LightSAML.
    $response = new \LightSaml\Model\Protocol\Response();
    $response
        -&gt;addAssertion($assertion = new \LightSaml\Model\Assertion\Assertion())
        -&gt;setStatus(new \LightSaml\Model\Protocol\Status(
            new \LightSaml\Model\Protocol\StatusCode(
              \LightSaml\SamlConstants::STATUS_SUCCESS)
            )
        )
        -&gt;setID(\LightSaml\Helper::generateID())
        -&gt;setIssueInstant(new \DateTime())
        -&gt;setDestination($acsUrl)
        // We obtain the Entity ID from the Idp.
        -&gt;setIssuer(new \LightSaml\Model\Assertion\Issuer($idpProvider-&gt;getIdPId()))
    ;

    $assertion
        -&gt;setId(\LightSaml\Helper::generateID())
        -&gt;setIssueInstant(new \DateTime())
        // We obtain the Entity ID from the Idp.
        -&gt;setIssuer(new \LightSaml\Model\Assertion\Issuer($idpProvider-&gt;getIdPId()))
        -&gt;setSubject(
            (new \LightSaml\Model\Assertion\Subject())
                // Here we set the NameID that identifies the name of the user.
                -&gt;setNameID(new \LightSaml\Model\Assertion\NameID(
                  $user_id,
                    \LightSaml\SamlConstants::NAME_ID_FORMAT_UNSPECIFIED
                ))
                -&gt;addSubjectConfirmation(
                    (new \LightSaml\Model\Assertion\SubjectConfirmation())
                        -&gt;setMethod(\LightSaml\SamlConstants::CONFIRMATION_METHOD_BEARER)
                        -&gt;setSubjectConfirmationData(
                            (new \LightSaml\Model\Assertion\SubjectConfirmationData())
                                // We set the ResponseTo to be the id of the SAMLRequest.
                                -&gt;setInResponseTo($id)
                                -&gt;setNotOnOrAfter(new \DateTime('+1 MINUTE'))
                                // The recipient is set to the Service Provider ACS.
                                -&gt;setRecipient($acsUrl)
                        )
                )
        )
        -&gt;setConditions(
            (new \LightSaml\Model\Assertion\Conditions())
                -&gt;setNotBefore(new \DateTime())
                -&gt;setNotOnOrAfter(new \DateTime('+1 MINUTE'))
                -&gt;addItem(
                    // Use the Service Provider Entity ID as AudienceRestriction.
                    new \LightSaml\Model\Assertion\AudienceRestriction([$issuer])
                )
        )
        -&gt;addItem(
            (new \LightSaml\Model\Assertion\AttributeStatement())
                -&gt;addAttribute(new \LightSaml\Model\Assertion\Attribute(
                    \LightSaml\ClaimTypes::EMAIL_ADDRESS,
                  // Setting the user email address.
                  $user_email
                ))
        )
        -&gt;addItem(
            (new \LightSaml\Model\Assertion\AuthnStatement())
                -&gt;setAuthnInstant(new \DateTime('-10 MINUTE'))
                -&gt;setSessionIndex($assertion-&gt;getId())
                -&gt;setAuthnContext(
                    (new \LightSaml\Model\Assertion\AuthnContext())
                        -&gt;setAuthnContextClassRef(\LightSaml\SamlConstants::AUTHN_CONTEXT_PASSWORD_PROTECTED_TRANSPORT)
                )
        )
    ;

  // Sign the response.
  $response-&gt;setSignature(new \LightSaml\Model\XmlDSig\SignatureWriter($idpProvider-&gt;getCertificate(), $idpProvider-&gt;getPrivateKey()));

  // Serialize to XML.
  $response-&gt;serialize($serializationContext-&gt;getDocument(), $serializationContext);

  // Set the postback url obtained from the trusted SPs as the destination.
  $response-&gt;setDestination($acsUrl);

    return $response;
}
</code></pre>
<p>This looks like a lot but don’t worry as 90% of this is templated code. You can reuse this function in your own project so long as you replace the areas where we are passing variables.</p>
<p>You might also want to check <a target="_blank" href="https://www.lightsaml.com/LightSAML-Core/Cookbook/How-to-make-Response/">LightSAML cookbook on how to prepare a SAML Response</a>, which is basically where all this code comes from.</p>
<p>Let’s cover the important parts by commenting on sections of the code above.</p>
<p><strong>Obtaining the Assertion Consumer Service Url</strong></p>
<p>It’s important to know where we send this response to, right?</p>
<p>The standard name for this endpoint is <em>Assertion Consumer Service</em> or ACS for short. Depending on how you want to implement this and the capabilities of your SP, you can either store a mapping in your IdP between the trusted SP Ids and their ACS URLs, or between their Ids and their metadata endpoints.</p>
<p>Metadata endpoints in the SP expose information about them so that you can dynamically fetch them when needed (which is what we do at Prosple). For the purposes of this example however, we’ll do the former.</p>
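<p>For comparison, here is a minimal sketch of the metadata route, using only PHP’s built-in DOM extension. The metadata document and the <code>findPostAcsUrl()</code> helper are made up for illustration and are not part of the demo project. The metadata is inlined as a string so the snippet runs without a network call:</p>
<pre><code class="lang-php">&lt;?php
// Hypothetical SP metadata, inlined so the sketch runs without a network call.
$metadataXml = &lt;&lt;&lt;XML
&lt;md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata" entityID="urn:service:provider:id"&gt;
  &lt;md:SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol"&gt;
    &lt;md:AssertionConsumerService
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
        Location="https://service-provider.com/login/callback"
        index="0"/&gt;
  &lt;/md:SPSSODescriptor&gt;
&lt;/md:EntityDescriptor&gt;
XML;

/**
 * Returns the first HTTP-POST ACS location found in the metadata, or null.
 * (Illustrative helper, not a LightSAML function.)
 */
function findPostAcsUrl(string $metadataXml): ?string
{
    $doc = new DOMDocument();
    if (!$doc-&gt;loadXML($metadataXml)) {
        return null;
    }

    $xpath = new DOMXPath($doc);
    $xpath-&gt;registerNamespace('md', 'urn:oasis:names:tc:SAML:2.0:metadata');

    // Only consider ACS endpoints that accept the HTTP-POST binding.
    $nodes = $xpath-&gt;query(
        '//md:SPSSODescriptor/md:AssertionConsumerService'
        . '[@Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"]/@Location'
    );

    return $nodes-&gt;length &gt; 0 ? $nodes-&gt;item(0)-&gt;nodeValue : null;
}

echo findPostAcsUrl($metadataXml), PHP_EOL;
</code></pre>
<p>In a real setup the XML would be fetched from the SP’s metadata URL instead of a string, and only for SPs you already trust.</p>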
<p>Let’s add the mapping between SP entity IDs and ACS URLs to our IdpProvider class as a property:</p>
<pre><code class="lang-plaintext">// Defining some trusted Service Providers.
private $trusted_sps = [
  'urn:service:provider:id' =&gt; 'https://service-provider.com/login/callback'
];
</code></pre>
<p>And expose a method of retrieving this:</p>
<pre><code class="lang-plaintext">/**
 * Retrieves the Assertion Consumer Service.
 *
 * @param string $entityId
 *   The Service Provider Entity Id.
 * @return string
 *   The Assertion Consumer Service Url.
 */
public function getServiceProviderAcs($entityId){
  return $this-&gt;trusted_sps[$entityId];
}
</code></pre>
<p>And now call it in our SAML Response constructor function:</p>
<pre><code class="lang-plaintext">$acsUrl = $idpProvider-&gt;getServiceProviderAcs($issuer);
</code></pre>
<p>Great! Now we know where to send the response to!</p>
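<p>One thing the lookup above does not handle is an issuer that is not in the list. A possible variation, sketched here with a stand-in class (<code>IdpProviderSketch</code> is a hypothetical name, not the class from the demo), is to refuse unknown SPs explicitly rather than carry on without a destination:</p>
<pre><code class="lang-php">&lt;?php
// Stand-in for the article's IdpProvider class; only the ACS lookup is shown.
class IdpProviderSketch
{
    private $trusted_sps = [
        'urn:service:provider:id' =&gt; 'https://service-provider.com/login/callback',
    ];

    public function getServiceProviderAcs(string $entityId): string
    {
        if (!isset($this-&gt;trusted_sps[$entityId])) {
            // Unknown issuer: refuse rather than post a response anywhere.
            throw new InvalidArgumentException("Untrusted Service Provider: $entityId");
        }

        return $this-&gt;trusted_sps[$entityId];
    }
}

$idp = new IdpProviderSketch();
echo $idp-&gt;getServiceProviderAcs('urn:service:provider:id'), PHP_EOL;

try {
    $idp-&gt;getServiceProviderAcs('urn:someone:else');
} catch (InvalidArgumentException $e) {
    echo 'Rejected: ', $e-&gt;getMessage(), PHP_EOL;
}
</code></pre>
<p>Whether you throw, return an error page, or log and stop is up to you; the point is that a response should only ever go to an ACS URL you have on record.</p>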
<h4 id="heading-issuer-or-idp-id">Issuer or IdP Id</h4>
<pre><code class="lang-plaintext">// We obtain the Entity ID from the Idp.
-&gt;setIssuer(new \LightSaml\Model\Assertion\Issuer($idpProvider-&gt;getIdPId()))
</code></pre>
<p>This is where we set our Issuer, which in this case (unlike in the request, where the issuer was the SP) is the identifier of our IdP:</p>
<pre><code class="lang-plaintext">/**
 * Returning a dummy IdP identifier.
 *
 * @return string
 */
public function getIdPId(){
  return "https://www.idp.com";
}
</code></pre>
<p>Obviously, in your use case you need to obtain this from your system.</p>
<p>The issuer needs to be set in two places: on the Response message itself and on the assertion node inside it.</p>
<h4 id="heading-setting-the-nameid">Setting the NameID</h4>
<p>This is a very important part of the response, and it lives in the subject section of the assertion.</p>
<p>There are various formats for this. Because this is a member organisation, let’s assume we have numerical uids like <code>2349</code>, so we’ll use the <em>Unspecified NameID Format</em>. You can see the various formats in the <a target="_blank" href="http://docs.oasis-open.org/security/saml/v2.0/saml-core-2.0-os.pdf">SAML spec</a>.</p>
<p>Here is how it looks in our code (notice we are passing the <code>$user_id</code>):</p>
<pre><code class="lang-plaintext">-&gt;setSubject(
    (new \LightSaml\Model\Assertion\Subject())
        // Here we set the NameID that identifies the user.
        -&gt;setNameID(new \LightSaml\Model\Assertion\NameID(
            $user_id,
            \LightSaml\SamlConstants::NAME_ID_FORMAT_UNSPECIFIED
        ))
</code></pre>
<h4 id="heading-setting-responseto">Setting InResponseTo</h4>
<p>We also need to indicate that this <em>SAML Response</em> answers the <em>SAML Request</em> we received. This is done in the <em>InResponseTo</em> field.</p>
<p>We do this by setting the field to the <em>ID</em> of the <em>SAMLRequest</em>:</p>
<pre><code class="lang-plaintext">// We set InResponseTo to be the id of the SAMLRequest.
-&gt;setInResponseTo($id)
</code></pre>
<h4 id="heading-setting-the-recipient">Setting the Recipient</h4>
<p>This is where we define the recipient of our message. Remember the ACS endpoint we obtained above? This is where we use it:</p>
<pre><code class="lang-plaintext">// The recipient is set to the Service Provider ACS.
-&gt;setRecipient($acsUrl)
</code></pre>
<h4 id="heading-setting-audience-restriction">Setting Audience Restriction</h4>
<p>This defines the intended audience of the assertion in our response message.</p>
<p>Since we’re replying to our SP, it makes sense that the SP is the audience, so we pass its entity ID here:</p>
<pre><code class="lang-plaintext">// Use the Service Provider Entity ID as AudienceRestriction.
new \LightSaml\Model\Assertion\AudienceRestriction([$issuer])
</code></pre>
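<p>To see why these three fields matter, it helps to look at them from the receiving end. The sketch below is a simplified, hypothetical SP-side check written with PHP’s DOM extension. The sample XML and the <code>checkResponseFields()</code> helper are assumptions for illustration, and a real SP would also verify the signature and the validity timestamps:</p>
<pre><code class="lang-php">&lt;?php
// Trimmed-down, made-up response containing only the fields discussed above.
$sampleResponse = &lt;&lt;&lt;XML
&lt;samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
    xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
    ID="_resp1" InResponseTo="_request123"&gt;
  &lt;saml:Assertion ID="_assertion1"&gt;
    &lt;saml:Subject&gt;
      &lt;saml:SubjectConfirmation Method="urn:oasis:names:tc:SAML:2.0:cm:bearer"&gt;
        &lt;saml:SubjectConfirmationData InResponseTo="_request123"
            Recipient="https://service-provider.com/login/callback"/&gt;
      &lt;/saml:SubjectConfirmation&gt;
    &lt;/saml:Subject&gt;
    &lt;saml:Conditions&gt;
      &lt;saml:AudienceRestriction&gt;
        &lt;saml:Audience&gt;urn:service:provider:id&lt;/saml:Audience&gt;
      &lt;/saml:AudienceRestriction&gt;
    &lt;/saml:Conditions&gt;
  &lt;/saml:Assertion&gt;
&lt;/samlp:Response&gt;
XML;

/**
 * Returns a list of problems found; an empty array means all three checks passed.
 * (Illustrative helper, not a LightSAML function.)
 */
function checkResponseFields(string $xml, string $requestId, string $acsUrl, string $spEntityId): array
{
    $doc = new DOMDocument();
    $doc-&gt;loadXML($xml);

    $xpath = new DOMXPath($doc);
    $xpath-&gt;registerNamespace('samlp', 'urn:oasis:names:tc:SAML:2.0:protocol');
    $xpath-&gt;registerNamespace('saml', 'urn:oasis:names:tc:SAML:2.0:assertion');

    $problems = [];

    // The response must answer the request this SP actually sent.
    if ($xpath-&gt;evaluate('string(/samlp:Response/@InResponseTo)') !== $requestId) {
        $problems[] = 'InResponseTo does not match the request ID';
    }

    // The assertion must be addressed to this SP's ACS endpoint.
    if ($xpath-&gt;evaluate('string(//saml:SubjectConfirmationData/@Recipient)') !== $acsUrl) {
        $problems[] = 'Recipient is not our ACS URL';
    }

    // The assertion must name this SP as its audience.
    if ($xpath-&gt;evaluate('string(//saml:Audience)') !== $spEntityId) {
        $problems[] = 'Audience is not our entity ID';
    }

    return $problems;
}

var_dump(checkResponseFields(
    $sampleResponse,
    '_request123',
    'https://service-provider.com/login/callback',
    'urn:service:provider:id'
));
</code></pre>
<p>If any of the three values is wrong on the IdP side, a check like this is where the login would typically be rejected.</p>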
<h4 id="heading-setting-the-email-address">Setting the Email address</h4>
<p>If your SP identifies users by email address, the email is an important attribute to pass in the response. We’ll use the email we obtained during the IdP authentication here:</p>
<pre><code class="lang-plaintext">-&gt;addItem(
    (new \LightSaml\Model\Assertion\AttributeStatement())
        -&gt;addAttribute(new \LightSaml\Model\Assertion\Attribute(
            \LightSaml\ClaimTypes::EMAIL_ADDRESS,
            // Setting the user email address.
            $user_email
        ))
)
</code></pre>
<p>Almost done! We now have a complete <em>SAML Response</em> message. However, two important steps are still missing: signing it and sending it.</p>
<h3 id="heading-signing-the-saml-response-message">Signing the SAML Response Message</h3>
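<p>It’s good practice for the IdP to sign its responses. The signature is created with the IdP’s private key, and the matching certificate, which can be self-signed, is given to the SP so that it can verify the message really came from the IdP and was not altered on the way.</p>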
<p>Let’s add that to your <em>post-saml.php</em> file:</p>
<pre><code class="lang-plaintext">// Sign the response.
$response-&gt;setSignature(new \LightSaml\Model\XmlDSig\SignatureWriter($idpProvider-&gt;getCertificate(), $idpProvider-&gt;getPrivateKey()));
</code></pre>
<p>See the <a target="_blank" href="https://www.lightsaml.com/LightSAML-Core/Cookbook/How-to-sign-SAML-message/">LightSAML Cookbook on how to sign a SAML message</a>.</p>
<p>We’re getting both the certificate (which is shared with the SP) and the private key from the IdpProvider class. Here is the implementation for both:</p>
<pre><code class="lang-plaintext">/**
 * Retrieves the certificate from the IdP.
 *
 * @return \LightSaml\Credential\X509Certificate
 */
public function getCertificate(){
  return \LightSaml\Credential\X509Certificate::fromFile('cert/saml_test_certificate.crt');
}

/**
 * Retrieves the private key from the Idp.
 *
 * @return \RobRichards\XMLSecLibs\XMLSecurityKey
 */
public function getPrivateKey(){
  return \LightSaml\Credential\KeyHelper::createPrivateKey('cert/saml_test_certificate.key', '', true);
}
</code></pre>
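<p>To make the roles of the two files concrete, here is a small sketch using PHP’s OpenSSL extension directly. It is not how LightSAML signs XML (that goes through XML-DSig), and it assumes a working OpenSSL configuration so that a throwaway key pair can be generated in memory. It only illustrates that the private key signs and the public half, which is what the certificate carries, verifies:</p>
<pre><code class="lang-php">&lt;?php
// Throwaway key pair generated in memory; the article loads its key and
// certificate from files instead.
$privateKey = openssl_pkey_new([
    'private_key_bits' =&gt; 2048,
    'private_key_type' =&gt; OPENSSL_KEYTYPE_RSA,
]);
$publicKeyPem = openssl_pkey_get_details($privateKey)['key'];

$message = '&lt;Response&gt;example&lt;/Response&gt;';

// IdP side: sign with the private key, which never leaves the IdP.
openssl_sign($message, $signature, $privateKey, OPENSSL_ALGO_SHA256);

// SP side: verify with the public key taken from the shared certificate.
$untouched = openssl_verify($message, $signature, $publicKeyPem, OPENSSL_ALGO_SHA256);
$tampered = openssl_verify($message . 'x', $signature, $publicKeyPem, OPENSSL_ALGO_SHA256);

// openssl_verify() returns 1 for a valid signature and 0 for an invalid one.
var_dump($untouched === 1, $tampered === 0);
</code></pre>
<p>Nothing is encrypted here: anyone can read the message, but only the holder of the private key could have produced a signature that verifies.</p>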
<p>I won’t cover how to generate self-signed certificates in this article, as that’s a well-covered topic on the web.</p>
<h3 id="heading-preparing-the-post-binding">Preparing the POST Binding</h3>
<p>Alright, at this point we have everything we need. Now we just need to render the POST binding, which is a hidden form that is automatically submitted with the SAMLResponse and RelayState. Let’s add this to our <em>post-saml.php</em> file:</p>
<pre><code class="lang-plaintext">// Prepare the POST binding (form).
$bindingFactory = new \LightSaml\Binding\BindingFactory();
$postBinding = $bindingFactory-&gt;create(\LightSaml\SamlConstants::BINDING_SAML2_HTTP_POST);
$messageContext = new \LightSaml\Context\Profile\MessageContext();
$messageContext-&gt;setMessage($response);

// Ensure we include the RelayState.
$message = $messageContext-&gt;getMessage();
$message-&gt;setRelayState($request-&gt;get('RelayState'));
$messageContext-&gt;setMessage($message);
</code></pre>
<pre><code class="lang-plaintext">// Return the Response.
/** @var \Symfony\Component\HttpFoundation\Response $httpResponse */
$httpResponse = $postBinding-&gt;send($messageContext);
print $httpResponse-&gt;getContent();
</code></pre>
<p>Again, refer to the useful <a target="_blank" href="https://www.lightsaml.com/LightSAML-Core/Cookbook/How-to-send-SAML-message/">LightSAML cookbook on how to send a SAML message</a> for more info.</p>
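<p>If you are curious what the binding roughly produces, the sketch below builds a comparable auto-submitting form by hand. The <code>renderPostBindingForm()</code> helper is hypothetical and the markup LightSAML actually emits may differ; the parts that matter are the base64-encoded <code>SAMLResponse</code> field, the optional <code>RelayState</code> field, and the form action pointing at the ACS URL:</p>
<pre><code class="lang-php">&lt;?php
/**
 * Builds a self-submitting HTML form for the HTTP-POST binding.
 * (Illustrative helper, not a LightSAML function.)
 */
function renderPostBindingForm(string $acsUrl, string $responseXml, ?string $relayState): string
{
    // The XML message travels base64-encoded in the SAMLResponse field.
    $fields = ['SAMLResponse' =&gt; base64_encode($responseXml)];
    if ($relayState !== null) {
        // RelayState is passed back to the SP unchanged.
        $fields['RelayState'] = $relayState;
    }

    $inputs = '';
    foreach ($fields as $name =&gt; $value) {
        $inputs .= sprintf(
            '&lt;input type="hidden" name="%s" value="%s"/&gt;',
            htmlspecialchars($name, ENT_QUOTES),
            htmlspecialchars($value, ENT_QUOTES)
        );
    }

    // The onload handler submits the form; the noscript button is a fallback.
    return sprintf(
        '&lt;html&gt;&lt;body onload="document.forms[0].submit()"&gt;'
        . '&lt;form method="post" action="%s"&gt;%s'
        . '&lt;noscript&gt;&lt;input type="submit" value="Continue"/&gt;&lt;/noscript&gt;'
        . '&lt;/form&gt;&lt;/body&gt;&lt;/html&gt;',
        htmlspecialchars($acsUrl, ENT_QUOTES),
        $inputs
    );
}

$html = renderPostBindingForm(
    'https://service-provider.com/login/callback',
    '&lt;Response/&gt;',
    'abc123'
);
echo $html, PHP_EOL;
</code></pre>
<p>Escaping every value with <code>htmlspecialchars()</code> matters here because the RelayState comes from the incoming request and should not be trusted as-is.</p>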
<p>That’s it!</p>
<h3 id="heading-lights-camera-action">Lights, Camera, Action!</h3>
<p>Everything is now in place, so it’s time to test this out.</p>
<p>For the purpose of this test I’m going to use Auth0 as the Service Provider.</p>
<p>Here is how it will work:</p>
<ol>
<li><p>I will trigger a flow from Auth0 (SP), which will take the user to the IdP login page with the SAMLRequest and RelayState parameters.</p>
</li>
<li><p>I will fill in my (dummy) credentials on the login page and pretend that authentication was done in the IdP.</p>
</li>
<li><p>The form will then take me to the POST Binding page, which creates a hidden form with the signed SAML Response and RelayState, autosubmits it, and takes me back to the SP.</p>
</li>
<li><p>The SP will verify the signature on the SAML Response and grant me access.</p>
</li>
</ol>
<p>Let’s see it in action in this video:</p>
<h3 id="heading-thats-all-folks">That’s all folks!</h3>
<p>I hope you found this useful, and that it saves someone the time of researching and fiddling with code snippets in order to get a basic understanding of how to do this.</p>
<p>The entire code for this little demo is available here:</p>
<p><a target="_blank" href="https://github.com/Prosple/saml_idp_example">https://github.com/Prosple/saml_idp_example</a></p>
]]></content:encoded></item></channel></rss>