Seeding Initial Data in Amplify

Challenges With Amplify

Amplify is a powerful framework and toolchain that enables developers to build frontend applications with a serverless backend using all major languages and platforms. It relieves developers of the effort of writing a backend from scratch - the entire backend stack is provided by AWS.

However, one of the things that does not come with Amplify out of the box, at least at the moment, is a way to perform initial database seeding right after setting up an environment. Database seeding is the process of populating an empty database with data. This can be fake placeholder data to use in development, or necessary data such as a set of initial configuration properties required for the application to function properly.

In this blog post, we'll walk you through our attempt to write our own database seeding script, guided by a couple of principles. First of all, the script should be generic - it should not be specific to a particular project, and anyone should be able to copy it into their project with no or minimal adjustments. Secondly, the database seeding process should be automatic and ideally not require any manual steps - the user should just execute the script, and the database should be populated once it finishes running. The script is responsible for reading the Amplify environment and putting the data into the proper tables.

Prerequisites

Before proceeding further, note that some of the concepts presented here assume at least basic Amplify experience. You will also need an existing Amplify project. For this purpose, we scaffolded a simple Angular project following the Amplify starter project instructions, and the entire project is available on GitHub. The tutorial also assumes that a GraphQL API is generated as part of the project, just like in the starter project.

Looking at the official AWS guide for writing items to a DynamoDB database in batch, we know that we need several pieces of information before making a connection to the database:

  • AWS credentials
  • The profile under which AWS credentials are stored
  • Region
  • Amplify environment name
  • Database ID (Tables created by Amplify have the format of tableName-databaseId-environmentName)

A recommended way of reading AWS credentials is through a shared credentials file. The credentials file is located in the .aws folder of your user folder and contains your access key ID and secret access key, which are used to sign programmatic requests to AWS. It should have been created when you executed amplify configure, before initializing the Amplify project. The file has the following format:

[your-profile]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY

There is one additional file that contains your region preferences. It is named config, and it is also located in the .aws folder:

[profile your-profile]
region = eu-central-1

Extracting Information From Amplify Environment

Now that we have our credentials, we need to find the remaining information. Going through the amplify folder (generated by the Amplify CLI), we see that all the information we need is already there, but scattered across multiple files.

For example:

  • The environment name is located in amplify/.config/local-env-info.json
  • The profile name is located in amplify/.config/local-aws-info.json, assuming that the Amplify project was initialized using a local AWS profile name and not IAM keys
  • The database table ID and region are located in amplify/backend/amplify-meta.json
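
For reference, here is roughly what the relevant parts of these files look like. Only the fields we rely on are shown, and the values are illustrative - the exact contents will differ per project and Amplify CLI version.

amplify/.config/local-env-info.json:

{
  "envName": "dev"
}

amplify/.config/local-aws-info.json:

{
  "dev": {
    "profileName": "your-profile"
  }
}

amplify/backend/amplify-meta.json (excerpt):

{
  "providers": {
    "awscloudformation": {
      "Region": "eu-central-1"
    }
  },
  "api": {
    "yourapi": {
      "providerPlugin": "awscloudformation",
      "output": {
        "GraphQLAPIIdOutput": "abcd1234efgh5678"
      }
    }
  }
}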

The risk in this approach is that this logic could change in the future if AWS decides to change the format of any of the above files, which aren't really meant for this kind of use. However, the benefit is ease and simplicity of use, and we feel that the benefits outweigh the risks. Should the files change in the future, we'll simply update the seeding script.

Writing the Script

Reading the environment and the profile name from local-env-info.json and local-aws-info.json is easy. Reading the database table ID and region is a bit trickier since we need to know the GraphQL API name first. Each GraphQL API generates its own set of DynamoDB tables, so knowing the GraphQL API name will easily get us the database table ID as well. Fortunately, amplify-meta.json has the list of APIs under the api field, and we'll iterate over this object to get the API names.

Within each API object, there are two fields that we are interested in. The first one is the name of the provider, found in the providerPlugin field. We'll use this to fetch the region from the providers root field in amplify-meta.json. The other one is output.GraphQLAPIIdOutput, which is the GraphQL API ID - the same ID is used as the database ID for the tables.

With everything said so far in mind, the first version of the script looks like this:

const AWS = require('aws-sdk');
const localEnvInfo = require('../../amplify/.config/local-env-info.json');
const localAwsInfo = require('../../amplify/.config/local-aws-info.json');
const amplifyMeta = require('../../amplify/backend/amplify-meta.json');

const environmentName = localEnvInfo.envName;
const profileName = localAwsInfo[environmentName]?.profileName;

if (!profileName) {
  throw Error('Please reinitialize your Amplify project using your AWS profile');
}

for (const apiName of Object.keys(amplifyMeta.api)) {
  const providerName = amplifyMeta.api[apiName].providerPlugin;
  const databaseId = amplifyMeta.api[apiName].output.GraphQLAPIIdOutput;
  const region = amplifyMeta.providers[providerName].Region;

  AWS.config.credentials = new AWS.SharedIniFileCredentials({ profile: profileName });
  AWS.config.update({ region: region });
  const documentClient = new AWS.DynamoDB.DocumentClient();

  // ToDo: read seed data and write to DynamoDB
}

The script reads the local Amplify environment from local-env-info.json, local-aws-info.json, and amplify-meta.json, performs some safety checks, and then iterates over each API found in amplify-meta.json. Finally, it creates an instance of the DocumentClient class, which is used to write to DynamoDB.

Organizing Seed Data

The missing step is reading and writing the seed data. Again, we don't want any manual configuration here - the script should just run, read the seed data, and use the Amplify information we extracted earlier to insert the data into the DynamoDB database. For this to work, we'll organize our directories and files in a way that maps easily to objects on AWS. The directory structure we're aiming for is this:

tools/
β”œβ”€ seeder/
β”‚  β”œβ”€ index.js (this is our seeding script)
β”‚  β”œβ”€ fixtures/
β”‚  β”‚  β”œβ”€ [API 1 name]/
β”‚  β”‚  β”‚  β”œβ”€ [Table 1 name].json
β”‚  β”‚  β”‚  β”œβ”€ [Table 2 name].json
β”‚  β”‚  β”‚  β”œβ”€ ...
β”‚  β”‚  β”œβ”€ [API 2 name]/
β”‚  β”‚  β”œβ”€ .../

For each API found in amplify-meta.json, the script will find the respective directory under fixtures and iterate over the JSON files found in that directory. Each file's name corresponds to the name of the table we're populating, and the content of the file is a JSON array of table items. For example, the JSON file used to populate the Restaurants table from our Angular starter project would be named Restaurants.json and would have the following contents:

[
  {
    "city": "Menton",
    "description": "",
    "name": "Mirazur"
  },
  {
    "city": "Copenhagen",
    "description": "",
    "name": "Noma"
  },
  {
    "city": "Axpe",
    "description": "",
    "name": "Asador Etxebarri"
  }
]

DynamoDB's DocumentClient

The next step is implementing the logic that reads the JSON files and writes their contents to DynamoDB. DocumentClient's batchWrite method allows providing multiple items for multiple tables, all in the same request. This means the method can be called only once, with all items pushed in a single batch, which keeps the number of requests to a minimum. (Note that DynamoDB limits a single batch write to 25 put or delete requests in total - plenty for a small seed set; we'll come back to this at the end.)

The structure of this write request looks as follows:

const writeParams = {
  RequestItems: {
    table1Name: [
      {
        PutRequest: {
          Item: {
            id: 1,
            field1: 'field 1 value',
            field2: 'field 2 value',
            //...
          },
        }
      },
      {
        PutRequest: {
          Item: {
            id: 2,
            field1: 'field 1 value',
            field2: 'field 2 value',
            //...
          },
        }
      },
      //...
    ]
  }
};

The table name here is the complete table name, which has the form baseTableName-databaseId-environmentName when using Amplify. baseTableName normally comes from the GraphQL schema type, databaseId is equal to the GraphQL API ID, while environmentName is the Amplify environment name (such as dev).

In the PutRequest, we should specify all fields that would normally be populated when using the GraphQL API. This includes some fields that Amplify uses internally, like:

  • id (primary key, which must be unique and can be generated using uuidv4() for example)
  • __typename (equal to type in the GraphQL schema, which is normally equal to baseTableName)
  • _lastChangedAt (can be the current date, written as Unix timestamp)
  • _version (can be 1)
  • createdAt (can be the current date, written as ISO-8601 string)
  • updatedAt (can be the current date, written as ISO-8601 string)

All other fields can be provided from the seed data.
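
Put together, a single item written to the Restaurants table from our example would end up looking roughly like this (the id and the timestamps are purely illustrative):

{
  "id": "4b4f2b3a-6c0e-4a3f-9c1d-2e5a7b8c9d0e",
  "__typename": "Restaurants",
  "_lastChangedAt": 1634567890123,
  "_version": 1,
  "createdAt": "2021-10-18T14:38:10.123Z",
  "updatedAt": "2021-10-18T14:38:10.123Z",
  "city": "Menton",
  "description": "",
  "name": "Mirazur"
}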

Written in code, this piece of logic looks like this:

// Needed at the top of the seeding script, next to the other requires:
// const { readdirSync } = require('fs');
// const { join, parse } = require('path');
// const { v4: uuidv4 } = require('uuid');

const writeParams = {
  RequestItems: {}
};
const baseTableNames = readdirSync(join(__dirname, 'fixtures', apiName)).map((filename) => parse(filename).name);
for (const baseTableName of baseTableNames) {
  const fullTableName = `${baseTableName}-${databaseId}-${environmentName}`;
  const tableItems = require(`./fixtures/${apiName}/${baseTableName}.json`);
  writeParams.RequestItems[fullTableName] = tableItems.map((tableItem) => ({
    PutRequest: {
      Item: {
        id: uuidv4(),
        __typename: baseTableName,
        _lastChangedAt: new Date().getTime(),
        _version: 1,
        createdAt: new Date().toISOString(),
        updatedAt: new Date().toISOString(),
        ...tableItem
      }
    }
  }));
}

documentClient.batchWrite(writeParams, (error, data) => {
  if (error) {
    console.error('Error in batch write', error);
  } else {
    console.log('Successfully executed batch write', data);
  }
});
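
As mentioned earlier, one caveat of batchWrite is that a single call accepts at most 25 put or delete requests in total, and writes that get throttled are returned in the UnprocessedItems part of the response rather than failing the call. The script above is fine for a handful of seed items, but if your seed data grows, the requests would need to be split into chunks. Here is a minimal sketch of how that could look, reusing the writeParams and documentClient from above:

// A sketch: split the accumulated put requests into batches of at most 25,
// since a single batch write accepts no more than 25 requests in total.
const BATCH_SIZE = 25;

// Flatten writeParams into individual (table name, put request) pairs.
const allRequests = [];
for (const [tableName, putRequests] of Object.entries(writeParams.RequestItems)) {
  for (const putRequest of putRequests) {
    allRequests.push({ tableName, putRequest });
  }
}

// Issue one batchWrite call per chunk of 25 requests.
for (let i = 0; i < allRequests.length; i += BATCH_SIZE) {
  const chunkParams = { RequestItems: {} };
  for (const { tableName, putRequest } of allRequests.slice(i, i + BATCH_SIZE)) {
    chunkParams.RequestItems[tableName] = chunkParams.RequestItems[tableName] || [];
    chunkParams.RequestItems[tableName].push(putRequest);
  }

  documentClient.batchWrite(chunkParams, (error, data) => {
    if (error) {
      console.error('Error in batch write', error);
    } else if (data.UnprocessedItems && Object.keys(data.UnprocessedItems).length > 0) {
      // Unprocessed items should be retried, ideally with exponential backoff.
      console.warn('Some items were not written', data.UnprocessedItems);
    } else {
      console.log('Successfully executed batch write', data);
    }
  });
}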

Conclusion

And that's it - the script is complete. In just over 50 lines of code, we now have an efficient way of populating multiple DynamoDB tables with initial data. The initial data itself is stored in JSON files, which are quite readable and easy for anyone to customize to their project's needs.

We hope you enjoyed this tutorial. Should you have any questions, do not hesitate to drop us a line.