Skip to content

Quick intro to Genetic algorithms (with a JavaScript example)

This article was written over 18 months ago and may contain information that is out of date. Some content may be relevant but please refer to the relevant official documentation or available resources for the latest information.

The purpose of this article is to give the reader a quick insight into what genetic algorithms are, and how they work.

In short: What are genetic algorithms?

Genetic algorithms are a metaheuristic inspired by Charles Darwin's theory of natural selection. They replicate the operations: mutation, crossover and selection. These features make them suitable for use in optimization problems. Because of this, they find applications in a lot of disciplines like chemistry, climatology, robotics, gaming, etc. They are part of evolutionary computation.

The example problem

Before implementing the algorithm, let's take a look of the problem we are going to solve.

You might have heard about the infinite monkey theorem which states that, given an infinite amount of time, a monkey will almost certainly type logical, meaningful text while randomly pressing the keys on a keyboard. Of course, the lifetime of a monkey will probably not be long enough for this to happen. Still, the chances are greater than zero. Very close to zero, but not zero. Anyway, let's stick to the metaphor.

Another way we can imagine this: imagine we have a device that produces random strings. At some point in the future, regardless how long it takes, it will output our desired text.

The purpose of the genetic algorithm we are going to develop is to find an optimal way to achieve the result we are looking for rather than brute forcing it (which will inevitably take a ludicrous amount of time).

This problem is widely used as an example for GA, so for the sake of consistency and simplicity, we will stick to it.

Parts of GA

We already mentioned some of the operations involved in the process. Let's break down the different steps involved in the GA:

  1. Initializing the population - This is the first step. We will pick a set of members with different traits (or genes). The key here is to have a big enough variety in order to evolve to the target for which we are aiming.
  2. Selection - The process of selecting those members of our population who are fit enough to be reproduced. This happens by predetermined criteria.
  3. Reproduction - After we select the fittest members from our current generation, we will perform a crossover -- mixing their traits/genes. Optionally, we will mutate the newly produced members by some random factor. That way, we will further support the variety in our population.
  4. Repeat 2 and 3 - Simply said -- we will perform selection, crossover and mutation for every new generation until the population evolves to such an extent that we have the desired candidate.

Implementation

Okay, before we start with the algorithm, let's first introduce two utility functions that we are going to use throughout the example. The first one will be used for generating a random integer value in a provided interval, and the second -- generating a letter from a-z:

// Src: MDN
function random(min, max) {
  min = Math.ceil(min);
  max = Math.floor(max);

  // The maximum is exclusive and the minimum is inclusive
  return Math.floor(Math.random() * (max - min)) + min;
}

function generateLetter() {
  const code = random(97, 123); // ASCII char codes
  return String.fromCharCode(code);
}

Now that we have our utils ready, we'll start with the first step of the genetic algorithms, which is:

Initializing the population

Let's introduce our population member or the monkey that we referenced previously. It will be represented by an object called Member which contains an array with the pressed keys, and the target string. The constructor will generate a set of randomly pressed keys:

class Member {
  constructor(target) {
    this.target = target;
    this.keys = [];

    for (let i = 0; i < target.length; i += 1) {
      this.keys[i] = generateLetter();
    }
  }
}

For example, new Member('qwerty') can result in a member who pressed a, b, c, d, e, f or y, a, w, r, t, w.

Next, after we have our member created, let's move to the population object, which will keep the set of members we want to evolve:

class Population {
  constructor(size, target) {
    size = size || 1;
    this.members = [];

    for (let i = 0; i < size; i += 1) {
      this.members.push(new Member(target));
    }
  }
}

The Population should accept size and target (target string). In the constructor, we are going to generate size members all with randomly generated words, but with equal length.

This should be enough to create a diverse initial population!

Selection

It's time for us to add some methods to the classes we created. We will start with the Member. Do you remember that we had to develop some criteria based on how we determine the fitness of our members? This is exactly what we are going to do by introducing the fitness method.

Let me elaborate:

class Member {
  // ...

  fitness() {
    let match = 0;

    for (let i = 0; i < this.keys.length; i += 1) {
      if (this.keys[i] === this.target[i]) {
        match += 1;
      }
    }

    return match / this.target.length;
  }
}

As you can see, what we are attempting to do is to find how many of the randomly pressed keys (by a member) match the letters in our target string. The more, the better! Since we want to have a coefficient as an output, we will divide the matches by the length of the target.

Going back to our Population class, we will implement the actual selection operation. Logically, we would like to have the fitness calculated to every member of the population.

Based on that value, we want to increase the chances of the fittest members being selected, and respectively, decrease the chances of selection for the least fit candidates for the new generation. This can happen by adding an additional array called the mating pool. In it, we will add every member, multiplied by its-fitness-coefficient. To grasp this concept, let's take a look at this example:

Member1 (fitness = 4)
Member2 (fitness = 2)
Member3 (fitness = 1)

== Output ==

MatingPool = [
  Member1, Member1, Member1, Member1,
  Member2, Member2,
  Member3
]

At this point, you should be able to guess the idea behind this approach -- to increase the chances of the fittest members being randomly selected. Note that there are different ways for building your mating pool distribution.

And the code:

class Population {
  // ...

  _selectMembersForMating() {
    const matingPool = [];

    this.members.forEach((m) => {
      // The fitter it is, the more often it will be present in the mating pool
      // i.e. increasing the chances of selection
      // If fitness == 0, add just one member
      const f = Math.floor(m.fitness() * 100) || 1;

      for (let i = 0; i < f; i += 1) {
        matingPool.push(m);
      }
    });

    return matingPool;
  }
}

Now that we have readied our selection process, it is time to reproduce!

Reproduction

In this section of the implementation, we will focus on the crossover, and mutation operations. This will require further extending of our Member class. I will start by creating the crossover method. Its role is to combine/mix the traits, genes, or in our case, keys, of two parents randomly selected from the mating pool. The output child might contain only keys from parent A or keys from parent B, but in most cases, it will contain keys from both the parents. This happens by selecting a random index from the array that contains them:

class Member {
  // ...

  crossover(partner) {
    const { length } = this.target;
    const child = new Member(this.target);
    const midpoint = random(0, length);

    for (let i = 0; i < length; i += 1) {
      if (i > midpoint) {
        child.keys[i] = this.keys[i];
      } else {
        child.keys[i] = partner.keys[i];
      }
    }

    return child;
  }
}

Since we are working with the Member, let's add our last method to it. This is the mutate method. Its purpose is to give some uniqueness, and randomness, to the newly created child. It's worth mentioning that the mutation is not something that every genetic algorithm incorporates. That's why we will use a rate which will make the process controlled:

class Member {
  // ...

  mutate(mutationRate) {
    for (let i = 0; i < this.keys.length; i += 1) {
      // If below predefined mutation rate,
      // generate a new random letter on this position.
      if (Math.random() < mutationRate) {
        this.keys[i] = generateLetter();
      }
    }
  }
}

// We will also have to update the Population constructor
class Population {
  constructor(size, target, mutationRate) {
    size = size || 1;
    this.members = [];
    this.mutationRate = mutationRate // <== New property

    // ...
  }

  // ...
}

It's time to tie these two in our Population class. Previously, we added the _selectMembersForMating. Let's make use of it:

class Population {
  // ...

  _reproduce(matingPool) {
    for (let i = 0; i < this.members.length; i += 1) {
      // Pick 2 random members/parents from the mating pool
      const parentA = matingPool[random(0, matingPool.length)];
      const parentB = matingPool[random(0, matingPool.length)];

      // Perform crossover
      const child = parentA.crossover(parentB);

      // Perform mutation
      child.mutate(this.mutationRate);

      this.members[i] = child;
    }
  }
}

Here, we only pick 2 random parents. Then, we perform the crossover. After that comes the mutation. And then finally, we replace the member with the new child. That's it!

We are almost done. For the purposes of this demo, we are going to add control over the number of generations (or iterations) that we want to have. That way, we can closely monitor how our population changes over time. In a real world scenario, we would like to create new generations until we reach our final goal or target. Anyway, let's add the final evolve method to the Population class for that goal:

class Population {
  // ...

  evolve(generations) {
    for (let i = 0; i < generations; i += 1) {
      const pool = this._selectMembersForMating();
      this._reproduce(pool);
    }
  }
}

In the end, we will add a helper function that will make the running process a bit easier for us:

function generate(populationSize, target, mutationRate, generations) {
  // Create a population and evolve for N generations
  const population = new Population(populationSize, target, mutationRate);
  population.evolve(generations);

  // Get the typed words from all members, and find if someone was able to type the target
  const membersKeys = population.members.map((m) => m.keys.join(''));
  const perfectCandidatesNum = membersKeys.filter((w) => w === target);

  // Print the results
  console.log(membersKeys);
  console.log(`${perfectCandidatesNum ? perfectCandidatesNum.length : 0} member(s) typed "${target}"`);
}

generate(20, 'hit', 0.05, 5);

Running this, we will create a population of 20 members whose target is to type "hit". We'll execute it for 5 generations with a mutation rate of 5%.

Note: Keep in mind that, for our demo, the target string must contain only lowercase a-z letters.

Results

Having this particular population evolve for 5 generations will probably end up with 0 members who typed "hit". However, making that number 20 might land us a good candidate:

  5 Generations  |  20 Generations
-----------------|------------------
       ...       |        ...
                 |
      'gio'      |       'fit'
      'yix'      |       'hrt'
      'gio'      |    >> 'hit' <<
      'kbo'      |       'hiy'

Online demo

You can find the full source code on GitHub

If you want to dig more into genetic algorithms ...

Since the article only lightly touches on what genetic algorithms are, I highly recommend you check out The Nature of Code by Daniel Shiffman (from which this text was based on). It is a great resource for those who want a deeper insight.

This Dot is a consultancy dedicated to guiding companies through their modernization and digital transformation journeys. Specializing in replatforming, modernizing, and launching new initiatives, we stand out by taking true ownership of your engineering projects.

We love helping teams with projects that have missed their deadlines or helping keep your strategic digital initiatives on course. Check out our case studies and our clients that trust us with their engineering.

You might also like

Next.js + MongoDB Connection Storming cover image

Next.js + MongoDB Connection Storming

Building a Next.js application connected to MongoDB can feel like a match made in heaven. MongoDB stores all of its data as JSON objects, which don’t require transformation into JavaScript objects like relational SQL data does. However, when deploying your application to a serverless production environment such as Vercel, it is crucial to manage your database connections properly. If you encounter errors like these, you may be experiencing Connection Storming: * MongoServerSelectionError: connect ECONNREFUSED &lt;IP_ADDRESS>:&lt;PORT> * MongoNetworkError: failed to connect to server [&lt;hostname>:&lt;port>] on first connect * MongoTimeoutError: Server selection timed out after &lt;x> ms * MongoTopologyClosedError: Topology is closed, please connect * Mongo Atlas: Connections % of configured limit has gone above 80 Connection storming occurs when your application has to mount a connection to Mongo for every serverless function or API endpoint call. Vercel executes your application’s code in a highly concurrent and isolated fashion. So, if you create new database connections on each request, your app might quickly exceed the connection limit of your database. We can leverage Vercel’s fluid compute model to keep our database connection objects warm across function invocations. Traditional serverless architecture was designed for quick, stateless web app transactions. Now, especially with the rise of LLM-oriented applications built with Next.js, interactions with applications are becoming more sequential. We just need to ensure that we assign our MongoDB connection to a global variable. Protip: Use global variables Vercel’s fluid compute model means all memory, including global constants like a MongoDB client, stays initialized between requests as long as the instance remains active. By assigning your MongoDB client to a global constant, you avoid redundant setup work and reduce the overhead of cold starts. This enables a more efficient approach to reusing connections for your application’s MongoDB client. The example below demonstrates how to retrieve an array of users from the users collection in MongoDB and either return them through an API request to /api/users or render them as an HTML list at the /users route. To support this, we initialize a global clientPromise variable that maintains the MongoDB connection across warm serverless executions, avoiding re-initialization on every request. ` Using this database connection in your API route code is easy: ` You can also use this database connection in your server-side rendered React components. ` In serverless environments like Vercel, managing database connections efficiently is key to avoiding connection storming. By reusing global variables and understanding the serverless execution model, you can ensure your Next.js app remains stable and performant....

The Importance of a Scientific Mindset in Software Engineering: Part 2 (Debugging) cover image

The Importance of a Scientific Mindset in Software Engineering: Part 2 (Debugging)

The Importance of a Scientific Mindset in Software Engineering: Part 2 (Debugging) In the first part of my series on the importance of a scientific mindset in software engineering, we explored how the principles of the scientific method can help us evaluate sources and make informed decisions. Now, we will focus on how these principles can help us tackle one of the most crucial and challenging tasks in software engineering: debugging. In software engineering, debugging is often viewed as an art - an intuitive skill honed through experience and trial and error. In a way, it is - the same as a GP, even a very evidence-based one, will likely diagnose most of their patients based on their experience and intuition and not research scientific literature every time; a software engineer will often rely on their experience and intuition to identify and fix common bugs. However, an internist faced with a complex case will likely not be able to rely on their intuition alone and must apply the scientific method to diagnose the patient. Similarly, a software engineer can benefit from using the scientific method to identify and fix the problem when faced with a complex bug. From that perspective, treating engineering challenges like scientific inquiries can transform the way we tackle problems. Rather than resorting to guesswork or gut feelings, we can apply the principles of the scientific method—forming hypotheses, designing controlled experiments, gathering and evaluating evidence—to identify and eliminate bugs systematically. This approach, sometimes referred to as "scientific debugging," reframes debugging from a haphazard process into a structured, disciplined practice. It encourages us to be skeptical, methodical, and transparent in our reasoning. For instance, as Andreas Zeller notes in the book _Why Programs Fail_, the key aspect of scientific debugging is its explicitness: Using the scientific method, you make your assumptions and reasoning explicit, allowing you to understand your assumptions and often reveals hidden clues that can lead to the root cause of the problem on hand. Note: If you'd like to read an excerpt from the book, you can find it on Embedded.com. Scientific Debugging At its core, scientific debugging applies the principles of the scientific method to the process of finding and fixing software defects. Rather than attempting random fixes or relying on intuition, it encourages engineers to move systematically, guided by data, hypotheses, and controlled experimentation. By adopting debugging as a rigorous inquiry, we can reduce guesswork, speed up the resolution process, and ensure that our fixes are based on solid evidence. Just as a scientist begins with a well-defined research question, a software engineer starts by identifying the specific symptom or error condition. For instance, if our users report inconsistencies in the data they see across different parts of the application, our research question could be: _"Under what conditions does the application display outdated or incorrect user data?"_ From there, we can follow a structured debugging process that mirrors the scientific method: - 1. Observe and Define the Problem: First, we need to clearly state the bug's symptoms and the environment in which it occurs. We should isolate whether the issue is deterministic or intermittent and identify any known triggers if possible. Such a structured definition serves as the groundwork for further investigation. - 2. Formulate a Hypothesis: A hypothesis in debugging is a testable explanation for the observed behavior. For instance, you might hypothesize: _"The data inconsistency occurs because a caching layer is serving stale data when certain user profiles are updated."_ The key is that this explanation must be falsifiable; if experiments don't support the hypothesis, it must be refined or discarded. - 3. Collect Evidence and Data: Evidence often includes logs, system metrics, error messages, and runtime traces. Similar to reviewing primary sources in academic research, treat your raw debugging data as crucial evidence. Evaluating these data points can reveal patterns. In our example, such patterns could be whether the bug correlates with specific caching mechanisms, increased memory usage, or database query latency. During this step, it's essential to approach data critically, just as you would analyze the quality and credibility of sources in a research literature review. Don't forget that even logs can be misleading, incomplete, or even incorrect, so cross-referencing multiple sources is key. - 4. Design and Run Experiments: Design minimal, controlled tests to confirm or refute your hypothesis. In our example, you may try disabling or shortening the cache's time-to-live (TTL) to see if more recent data is displayed correctly. By manipulating one variable at a time - such as cache invalidation intervals - you gain clearer insights into causation. Tools such as profilers, debuggers, or specialized test harnesses can help isolate factors and gather precise measurements. - 5. Analyze Results and Refine Hypotheses: If the experiment's outcome doesn't align with your hypothesis, treat it as a stepping stone, not a dead end. Adjust your explanation, form a new hypothesis, or consider additional variables (for example, whether certain API calls bypass caching). Each iteration should bring you closer to a better understanding of the bug's root cause. Remember, the goal is not to prove an initial guess right but to arrive at a verifiable explanation. - 6. Implement and Verify the Fix: Once you're confident in the identified cause, you can implement the fix. Verification doesn't stop at deployment - re-test under the same conditions and, if possible, beyond them. By confirming the fix in a controlled manner, you ensure that the solution is backed by evidence rather than wishful thinking. - Personally, I consider implementing end-to-end tests (e.g., with Playwright) that reproduce the bug and verify the fix to be a crucial part of this step. This both ensures that the bug doesn't reappear in the future due to changes in the codebase and avoids possible imprecisions of manual testing. Now, we can explore these steps in more detail, highlighting how the scientific method can guide us through the debugging process. Establishing Clear Debugging Questions (Formulating a Hypothesis) A hypothesis is a proposed explanation for a phenomenon that can be tested through experimentation. In a debugging context, that phenomenon is the bug or issue you're trying to resolve. Having a clear, falsifiable statement that you can prove or disprove ensures that you stay focused on the real problem rather than jumping haphazardly between possible causes. A properly formulated hypothesis lets you design precise experiments to evaluate whether your explanation holds true. To formulate a hypothesis effectively, you can follow these steps: 1. Clearly Identify the Symptom(s) Before forming any hypothesis, pin down the specific issue users are experiencing. For instance: - "Users intermittently see outdated profile information after updating their accounts." - "Some newly created user profiles don't reflect changes in certain parts of the application." Having a well-defined problem statement keeps your hypothesis focused on the actual issue. Just like a research question in science, the clarity of your symptom definition directly influences the quality of your hypothesis. 2. Draft a Tentative Explanation Next, convert your symptom into a statement that describes a _possible root cause_, such as: - "Data inconsistency occurs because the caching layer isn't invalidating or refreshing user data properly when profiles are updated." - "Stale data is displayed because the cache timeout is too long under certain load conditions." This step makes your assumption about the root cause explicit. As with the scientific method, your hypothesis should be something you can test and either confirm or refute with data or experimentation. 3. Ensure Falsifiability A valid hypothesis must be falsifiable - meaning it can be proven _wrong_. You'll struggle to design meaningful experiments if a hypothesis is too vague or broad. For example: - Not Falsifiable: "Occasionally, the application just shows weird data." - Falsifiable: "Users see stale data when the cache is not invalidated within 30 seconds of profile updates." Making your hypothesis specific enough to fail a test will pave the way for more precise debugging. 4. Align with Available Evidence Match your hypothesis to what you already know - logs, stack traces, metrics, and user reports. For example: - If logs reveal that cache invalidation events aren't firing, form a hypothesis explaining why those events fail or never occur. - If metrics show that data served from the cache is older than the configured TTL, hypothesize about how or why the TTL is being ignored. If your current explanation contradicts existing data, refine your hypothesis until it fits. 5. Plan for Controlled Tests Once you have a testable hypothesis, figure out how you'll attempt to _disprove_ it. This might involve: - Reproducing the environment: Set up a staging/local system that closely mimics production. For instance with the same cache layer configurations. - Varying one condition at a time: For example, only adjust cache invalidation policies or TTLs and then observe how data freshness changes. - Monitoring metrics: In our example, such monitoring would involve tracking user profile updates, cache hits/misses, and response times. These metrics should lead to confirming or rejecting your explanation. These plans become your blueprint for experiments in further debugging stages. Collecting and Evaluating Evidence After formulating a clear, testable hypothesis, the next crucial step is to gather data that can either support or refute it. This mirrors how scientists collect observations in a literature review or initial experiments. 1. Identify "Primary Sources" (Logs, Stack Traces, Code History): - Logs and Stack Traces: These are your direct pieces of evidence - treat them like raw experimental data. For instance, look closely at timestamps, caching-related events (e.g., invalidation triggers), and any error messages related to stale reads. - Code History: Look for related changes in your source control, e.g. using Git bisect. In our example, we would look for changes to caching mechanisms or references to cache libraries in commits, which could pinpoint when the inconsistency was introduced. Sometimes, reverting a commit that altered cache settings helps confirm whether the bug originated there. 2. Corroborate with "Secondary Sources" (Documentation, Q&A Forums): - Documentation: Check official docs for known behavior or configuration details that might differ from your assumptions. - Community Knowledge: Similar issues reported on GitHub or StackOverflow may reveal known pitfalls in a library you're using. 3. Assess Data Quality and Relevance: - Look for Patterns: For instance, does stale data appear only after certain update frequencies or at specific times of day? - Check Environmental Factors: For instance, does the bug happen only with particular deployment setups, container configurations, or memory constraints? - Watch Out for Biases: Avoid seeking only the data that confirms your hypothesis. Look for contradictory logs or metrics that might point to other root causes. You keep your hypothesis grounded in real-world system behavior by treating logs, stack traces, and code history as primary data - akin to raw experimental results. This evidence-first approach reduces guesswork and guides more precise experiments. Designing and Running Experiments With a hypothesis in hand and evidence gathered, it's time to test it through controlled experiments - much like scientists isolate variables to verify or debunk an explanation. 1. Set Up a Reproducible Environment: - Testing Environments: Replicate production conditions as closely as possible. In our example, that would involve ensuring the same caching configuration, library versions, and relevant data sets are in place. - Version Control Branches: Use a dedicated branch to experiment with different settings or configuration, e.g., cache invalidation strategies. This streamlines reverting changes if needed. 2. Control Variables One at a Time: - For instance, if you suspect data inconsistency is tied to cache invalidation events, first adjust only the invalidation timeout and re-test. - Or, if concurrency could be a factor (e.g., multiple requests updating user data simultaneously), test different concurrency levels to see if stale data issues become more pronounced. 3. Measure and Record Outcomes: - Automated Tests: Tests provide a great way to formalize and verify your assumptions. For instance, you could develop tests that intentionally update user profiles and check if the displayed data matches the latest state. - Monitoring Tools: Monitor relevant metrics before, during, and after each experiment. In our example, we might want to track cache hit rates, TTL durations, and query times. - Repeat Trials: Consistency across multiple runs boosts confidence in your findings. 4. Validate Against a Baseline: - If baseline tests manifest normal behavior, but your experimental changes manifest the bug, you've isolated the variable causing the issue. E.g. if the baseline tests show that data is consistently fresh under normal caching conditions but your experimental changes cause stale data. - Conversely, if your change eliminates the buggy behavior, it supports your hypothesis - e.g. that the cache configuration was the root cause. Each experiment outcome is a data point supporting or contradicting your hypothesis. Over time, these data points guide you toward the true cause. Analyzing Results and Iterating In scientific debugging, an unexpected result isn't a failure - it's valuable feedback that brings you closer to the right explanation. 1. Compare Outcomes to the hypothesis. For instance: - Did user data stay consistent after you reduced the cache TTL or fixed invalidation logic? - Did logs show caching events firing as expected, or did they reveal unexpected errors? - Are there only partial improvements that suggest multiple overlapping issues? 2. Incorporate Unexpected Observations: - Sometimes, debugging uncovers side effects - e.g. performance bottlenecks exposed by more frequent cache invalidations. Note these for future work. - If your hypothesis is disproven, revise it. For example, the cache may only be part of the problem, and a separate load balancer setting also needs attention. 3. Avoid Confirmation Bias: - Don't dismiss contrary data. For instance, if you see evidence that updates are fresh in some modules but stale in others, you may have found a more nuanced root cause (e.g., partial cache invalidation). - Consider other credible explanations if your teammates propose them. Test those with the same rigor. 4. Decide If You Need More Data: - If results aren't conclusive, add deeper instrumentation or enable debug modes to capture more detailed logs. - For production-only issues, implement distributed tracing or sampling logs to diagnose real-world usage patterns. 5. Document Each Iteration: - Record the results of each experiment, including any unexpected findings or new hypotheses that arise. - Through iterative experimentation and analysis, each cycle refines your understanding. By letting evidence shape your hypothesis, you ensure that your final conclusion aligns with reality. Implementing and Verifying the Fix Once you've identified the likely culprit - say, a misconfigured or missing cache invalidation policy - the next step is to implement a fix and verify its resilience. 1. Implementing the Change: - Scoped Changes: Adjust just the component pinpointed in your experiments. Avoid large-scale refactoring that might introduce other issues. - Code Reviews: Peer reviews can catch overlooked logic gaps or confirm that your changes align with best practices. 2. Regression Testing: - Re-run the same experiments that initially exposed the issue. In our stale data example, confirm that the data remains fresh under various conditions. - Conduct broader tests - like integration or end-to-end tests - to ensure no new bugs are introduced. 3. Monitoring in Production: - Even with positive test results, real-world scenarios can differ. Monitor logs and metrics (e.g. cache hit rates, user error reports) closely post-deployment. - If the buggy behavior reappears, revisit your hypothesis or consider additional factors, such as unpredicted user behavior. 4. Benchmarking and Performance Checks (If Relevant): - When making changes that affect the frequency of certain processes - such as how often a cache is refreshed - be sure to measure the performance impact. Verify you meet any latency or resource usage requirements. - Keep an eye on the trade-offs: For instance, more frequent cache invalidations might solve stale data but could also raise system load. By systematically verifying your fix - similar to confirming experimental results in research - you ensure that you've addressed the true cause and maintained overall software stability. Documenting the Debugging Process Good science relies on transparency, and so does effective debugging. Thorough documentation guarantees your findings are reproducible and valuable to future team members. 1. Record Your Hypothesis and Experiments: - Keep a concise log of your main hypothesis, the tests you performed, and the outcomes. - A simple markdown file within the repo can capture critical insights without being cumbersome. 2. Highlight Key Evidence and Observations: - Note the logs or metrics that were most instrumental - e.g., seeing repeated stale cache hits 10 minutes after updates. - Document any edge cases discovered along the way. 3. List Follow-Up Actions or Potential Risks: - If you discover additional issues - like memory spikes from more frequent invalidation - note them for future sprints. - Identify parts of the code that might need deeper testing or refactoring to prevent similar issues. 4. Share with Your Team: - Publish your debugging report on an internal wiki or ticket system. A well-documented troubleshooting narrative helps educate other developers. - Encouraging open discussion of the debugging process fosters a culture of continuous learning and collaboration. By paralleling scientific publication practices in your documentation, you establish a knowledge base to guide future debugging efforts and accelerate collective problem-solving. Conclusion Debugging can be as much a rigorous, methodical exercise as an art shaped by intuition and experience. By adopting the principles of scientific inquiry - forming hypotheses, designing controlled experiments, gathering evidence, and transparently documenting your process - you make your debugging approach both systematic and repeatable. The explicitness and structure of scientific debugging offer several benefits: - Better Root-Cause Discovery: Structured, hypothesis-driven debugging sheds light on the _true_ underlying factors causing defects rather than simply masking symptoms. - Informed Decisions: Data and evidence lead the way, minimizing guesswork and reducing the chance of reintroducing similar issues. - Knowledge Sharing: As in scientific research, detailed documentation of methods and outcomes helps others learn from your process and fosters a collaborative culture. Ultimately, whether you are diagnosing an intermittent crash or chasing elusive performance bottlenecks, scientific debugging brings clarity and objectivity to your workflow. By aligning your debugging practices with the scientific method, you build confidence in your solutions and empower your team to tackle complex software challenges with precision and reliability. But most importantly, do not get discouraged by the number of rigorous steps outlined above or by the fact you won't always manage to follow them all religiously. Debugging is a complex and often frustrating process, and it's okay to rely on your intuition and experience when needed. Feel free to adapt the debugging process to your needs and constraints, and as long as you keep the scientific mindset at heart, you'll be on the right track....

Advanced Authentication and Onboarding Workflows with Docusign Extension Apps cover image

Advanced Authentication and Onboarding Workflows with Docusign Extension Apps

Advanced Authentication and Onboarding Workflows with Docusign Extension Apps Docusign Extension Apps are a relatively new feature on the Docusign platform. They act as little apps or plugins that allow building custom steps in Docusign agreement workflows, extending them with custom functionality. Docusign agreement workflows have many built-in steps that you can utilize. With Extension Apps, you can create additional custom steps, enabling you to execute custom logic at any point in the agreement process, from collecting participant information to signing documents. An Extension App is a small service, often running in the cloud, described by the Extension App manifest. The manifest file provides information about the app, including the app's author and support pages, as well as descriptions of extension points used by the app or places where the app can be integrated within an agreement workflow. Most often, these extension points need to interact with an external system to read or write data, which cannot be done anonymously, as all data going through Extension Apps is usually sensitive. Docusign allows authenticating to external systems using the OAuth 2 protocol, and the specifics about the OAuth 2 configuration are also placed in the manifest file. Currently, only OAuth 2 is supported as the authentication scheme for Extension Apps. OAuth 2 is a robust and secure protocol, but not all systems support it. Some systems use alternative authentication schemes, such as the PKCE variant of OAuth 2, or employ different authentication methods (e.g., using secret API keys). In such cases, we need to use a slightly different approach to integrate these systems with Docusign. In this blog post, we'll show you how to do that securely. We will not go too deep into the implementation details of Extension Apps, and we assume a basic familiarity with how they work. Instead, we'll focus on the OAuth 2 part of Extension Apps and how we can extend it. Extending the OAuth 2 Flow in Extension Apps For this blog post, we'll integrate with an imaginary task management system called TaskVibe, which offers a REST API to which we authenticate using a secret API key. We aim to develop an extension app that enables Docusign agreement workflows to communicate with TaskVibe, allowing tasks to be read, created, and updated. TaskVibe does not support OAuth 2. We need to ensure that, once the TaskVibe Extension App is connected, the user is prompted to enter their secret API key. We then need to store this API key securely so it can be used for interacting with the TaskVibe API. Of course, the API key can always be stored in the database of the Extension App. Sill, then, the Extension App has a significant responsibility for storing the API key securely. Docusign already has the capability to store secure tokens on its side and can utilize that instead. After all, most Extension Apps are meant to be stateless proxies to external systems. Updating the Manifest To extend OAuth 2, we will need to hook into the OAuth 2 flow by injecting our backend's endpoints into the authorization URL and token URL parts of the manifest. In any other external system that supports OAuth 2, we would be using their OAuth 2 endpoints. In our case, however, we must use our backend endpoints so we can emulate OAuth 2 to Docusign. ` The complete flow will look as follows: In the diagram, we have four actors: the end-user on behalf of whom we are authenticating to TaskVibe, DocuSign, the Extension App, and TaskVibe. We are only in control of the Extension App, and within the Extension App, we need to adhere to the OAuth 2 protocol as expected by Docusign. 1. In the first step, Docusign will invoke the /authorize endpoint of the Extension App and provide the state, client_id, and redirect_uri parameters. Of these three parameters, state and redirect_uri are essential. 2. In the /authorize endpoint, the app needs to store state and redirect_uri, as they will be used in the next step. It then needs to display a user-facing form where the user is expected to enter their TaskVibe API key. 3. Once the user submits the form, we take the API key and encode it in a JWT token, as we will send it over the wire back to Docusign in the form of the code query parameter. This is the "custom" part of our implementation. In a typical OAuth 2 flow, the code is generated by the OAuth 2 server, and the client can then use it to request the access token. In our case, we'll utilize the code to pass the API key to Docusign so it can send it back to us in the next step. Since we are still in control of the user session, we redirect the user to the redirect URI provided by Docusign, along with the code and the state as query parameters. 4. The redirect URI on Docusign will display a temporary page to the user, and in the background, attempt to retrieve the access token from our backend by providing the code and state to the /api/token endpoint. 5. The /api/token endpoint takes the code parameter and decodes it to extract the TaskVibe API secret key. It can then verify if the API key is even valid by making a dummy call to TaskVibe using the API key. If the key is valid, we encode it in a new JWT token and return it as the access token to Docusign. 6. Docusign stores the access token securely on its side and uses it when invoking any of the remaining extension points on the Extension App. By following the above step, we ensure that the API key is stored in an encoded format on Docusign, and the Extension App effectively extends the OAuth 2 flow. The app is still stateless and does not have the responsibility of storing any secure information locally. It acts as a pure proxy between Docusign and TaskVibe, as it's meant to be. Writing the Backend Most Extension Apps are backend-only, but ours needs to have a frontend component for collecting the secret API key. A good fit for such an app is Next.js, which allows us to easily set up both the frontend and the backend. We'll start by implementing the form for entering the secret API key. This form takes the state, client ID, and redirect URI from the enclosing page, which takes these parameters from the URL. The form is relatively simple, with only an input field for the API key. However, it can also be used for any additional onboarding questions. If you will ever need to store additional information on Docusign that you want to use implicitly in your Extension App workflow steps, this is a good place to store it alongside the API secret key on Docusign. ` Submitting the form invokes a server action on Next.js, which takes the entered API key, the state, and the redirect URI. It then creates a JWT token using Jose that contains the API key and redirects the user to the redirect URI, sending the JWT token in the code query parameter, along with the state query parameter. This JWT token can be short-lived, as it's only meant to be a temporary holder of the API key while the authentication flow is running. This is the server action: ` After the user is redirected to Docusign, Docusign will then invoke the /api/token endpoint to obtain the access token. This endpoint will also be invoked occasionally after the authentication flow, before any extension endpoint is invoked, to get the latest access token using a refresh token. Therefore, the endpoint needs to cover two scenarios. In the first scenario, during the authentication phase, Docusign will send the code and state to the /api/token endpoint. In this scenario, the endpoint must retrieve the value of the code parameter (storing the JWT value), parse the JWT, and extract the API key. Optionally, it can verify the API key's validity by invoking an endpoint on TaskVibe using that key. Then, it should return an access token and a refresh token back to Docusign. Since we are not using refresh tokens in our case, we can create a new JWT token containing the API key and return it as both the access token and the refresh token to Docusign. In the second scenario, Docusign will send the most recently obtained refresh token to get a new access token. Again, because we are not using refresh tokens, we can return both the retrieved access token and the refresh token to Docusign. The api/token endpoint is implemented as a Next.js route handler: ` In all the remaining endpoints defined in the manifest file, Docusign will provide the access token as the bearer token. It's up to each endpoint to then read this value, parse the JWT, and extract the secret API key. Conclusion In conclusion, your Extension App does not need to be limited by the fact that the external system you are integrating with does not have OAuth 2 support or requires additional onboarding. We can safely build upon the existing OAuth 2 protocol and add custom functionality on top of it. This is also a drawback of the approach - it involves custom development, which requires additional work on our part to ensure all cases are covered. Fortunately, the scope of the Extension App does not extend significantly. All remaining endpoints are implemented in the same manner as any other OAuth 2 system, and the app remains a stateless proxy between Docusign and the external system as all necessary information, such as the secret API key and other onboarding details, is stored as an encoded token on the Docusign side. We hope this blog post was helpful. Keep an eye out for more Docusign content soon, and if you need help building an Extension App of your own, feel free to reach out. The complete source code for this project is available on StackBlitz....

The Quirks And Gotchas of PHP cover image

The Quirks And Gotchas of PHP

The Quirks And Gotchas of PHP If you come from a JavaScript background, you'll likely be familiar with some of its famous quirks, such as 1 + "1" equaling "11". Well, PHP has its own set of quirks and gotchas, too. Some are oddly similar to JavaScript's, while others can surprise a JavaScript developer. Let's start with the more familiar ones. 1. Type Juggling and Loose Comparisons Like JavaScript, PHP has two types of comparison operators: strict and loose. The loose comparison operator in PHP uses ==, while the strict comparison operator uses ===. Here's an example of a loose vs. strict comparison in PHP: ` PHP is a loosely typed language, meaning it will automatically convert variables from one type to another when necessary, just like JavaScript. This is not only when doing comparisons but also, for example, when doing numeric operations. Such conversions can lead to some unexpected results if you're not careful: ` As you can see, the type system has gotten a bit stricter in PHP 8, so it won't let you commit some of the "atrocities" that were possible in earlier versions, throwing a TypeError instead. PHP 8 introduced many changes that aim to eliminate some of the unpredictable behavior; we will cover some of them throughout this article. 1.1. Truthiness of Strings This is such a common gotcha in PHP that it deserves its own heading. By default, PHP considers an empty string as false and a non-empty string as true: ` But wait, there's more! PHP also considers the string "0" as false: ` You might think we're done here, but no! Try comparing a string such as "php" to 0: ` Until PHP7, any non-numeric string was converted to 0 when cast to an integer to compare it to the other integer. That's why this example will be evaluated as true. This quirk has been fixed in PHP 8. For a comprehensive comparison table of PHP's truthiness, check out the PHP documentation. 1.2. Switch Statements Switch statements in PHP use loose comparisons, so don't be surprised if you see some unexpected behavior when using them: ` The New Match Expression in PHP 8 PHP 8 introduced the match expression, which is similar to switch but uses strict comparisons (i.e., === under the hood) and returns a value: ` Unlike switch, there is no "fall-through" behavior in match, and each branch must return a value, making match a great alternative when you need a more precise or concise form of branching—especially if you want to avoid the loose comparisons of a traditional switch. 1.3 String to Number Conversion In earlier versions of PHP, string-to-number conversions were often done silently, even if the string wasn’t strictly numeric (like '123abc'). In PHP 7, this would typically result in 123 plus a Notice: ` In PHP 8, you’ll still get int(123), but now with a Warning, and in other scenarios (like extremely malformed strings), you might see a TypeError. This stricter behavior can reveal hidden bugs in code that relied on implicit type juggling. Stricter Type Checks & Warnings in PHP 8 - Performing arithmetic on non-numeric strings: As noted, in older versions, something like "123abc" + 0 would silently drop the non-numeric part, often producing 123 plus a PHP Notice. In PHP 8, such operations throw a more visible Warning or TypeError, depending on the exact scenario. - Null to Non-Nullable Internal Arguments: Passing null to a function parameter that’s internally declared as non-nullable will trigger a TypeError in PHP 8. Previously, this might have been silently accepted or triggered only a warning. - Internal Function Parameter Names: PHP 8 introduced named arguments but also made internal parameter names part of the public API. If you use named arguments with built-in functions, be aware that renaming or reordering parameters in future releases might break your code. Always match official parameter names as documented in the PHP manual. Union Types & Mixed Since PHP 8.0, we can declare union types, which allows you to specify that a parameter or return value can be one of multiple types. For example: ` Specifying the union of types your function accepts can help clarify your code’s intent and reveal incompatibilities if your existing code relies on looser type checking, preventing some of the conversion quirks we’ve discussed. 2. Operator Precedence and Associativity Operator precedence can lead to confusing situations if you’re not careful with parentheses. For instance, the . operator (string concatenation similar to + in JavaScript) has left-to-right associativity, but certain logical operators have lower precedence than assignment or concatenation, leading to puzzling results in PHP 7 and earlier: ` PHP 8 has fixed this issue by making the + and - operators take a higher precedence. 3. Variable Variables and Variable Functions Now, we're getting into unfamiliar territory as JavaScript Developers. PHP allows you to define variable variables and variable functions. This can be a powerful feature, but it can also lead to some confusing code: ` In this example, the variable $varName contains the string 'hello'. By using $$varName, we're creating a new variable with the name 'hello' and assigning it the value 'world'. Similarly, you can create variable functions: ` 4. Passing Variables by Reference You can pass variables by reference using the & operator in PHP. This means that any changes made to the variable inside the function will be reflected outside the function: ` While this example is straightforward, not knowing the pass-by-reference feature can lead to some confusion, and bugs can arise when you inadvertently pass variables by reference. 5. Array Handling PHP arrays are a bit different from JavaScript arrays. They can be used as both arrays and dictionaries, and they have some quirks that can catch you off guard. For example, if you try to access an element that doesn't exist in an array, PHP will return null instead of throwing an error: ` Furthermore, PHP arrays can contain both numerical and string keys at the same time, but numeric string keys can sometimes convert to integers, depending on the context> ` In this example: - "1" (string) and 1 (integer) collide, resulting in the array effectively having only one key: 1. - true is also cast to 1 as an integer, so it overwrites the same key. And last, but not least, let's go back to the topic of passing variables by reference. You can assign an array element by reference, which can feel quite unintuitive: ` 6 Checking for Variable Truthiness (isset, empty, and nullsafe operator) In PHP, you can use the empty() function to check if a variable is empty. But what does "empty" mean in PHP? The mental model of what's considered "empty" in PHP might differ from what you're used to in JavaScript. Let's clarify this: The following values are considered empty by the empty() function: - "" (an empty string) - 0 (0 as an integer) - 0.0 (0 as a float) - "0" (0 as a string) - null - false - [] (an empty array) This means that the following values are not considered empty: - "0" (a string containing "0") - " " (a string containing a space) - 0.0 (0 as a float) - new stdClass() (an empty object) Keep this in mind when using empty() in your code, otherwise, you might end up debugging some unexpected behavior. Undefined Variables and isset() Another little gotcha is that you might expect empty() to return true for undefined variables too - they contain nothing after all, right? Unfortunately, empty() will throw a notice in such case. To account for undefined variables, you may want to use the isset() function, which checks if a variable is set and not null: ` The Nullsafe Operator If you have a chain of properties or methods that you want to access, you may tend to check each step with isset() to avoid errors: ` In fact, because isset() is a special language construct and it doesn't fully evaluate an undefined part of the chain, it can be used to evaluate the whole chain at once: ` That's much nicer! However, it could be even more elegant with the nullsafe operator (?->) introduced in PHP 8: ` If you’ve used optional chaining in JavaScript or other languages, this should look familiar. It returns null if any part of the chain is null, which is handy but can also hide potential logic mistakes — if your application logic expects objects to exist, silently returning null may lead to subtle bugs. Conclusion While PHP shares a few loose typing quirks with JavaScript, it also has its own distinctive behaviors around type juggling, operator precedence, passing by reference, and array handling. Becoming familiar with these nuances — and with the newer, more predictable features in PHP 8 — will help you avoid subtle bugs and write clearer, more robust code. PHP continues to evolve, so always consult the official documentation to stay current on best practices and language changes....

Let's innovate together!

We're ready to be your trusted technical partners in your digital innovation journey.

Whether it's modernization or custom software solutions, our team of experts can guide you through best practices and how to build scalable, performant software that lasts.

Prefer email? hi@thisdot.co