
What is a Monorepo and What Are the Advantages of Using It in Your Project?

This article was written over 18 months ago and may contain information that is out of date. Some content may be relevant but please refer to the relevant official documentation or available resources for the latest information.

Monorepos are very popular in the tech industry, and many large companies like Microsoft and Google use them. But what exactly is a monorepo, and would it work for your project?

In this article, we are going to explore what the monorepo architecture is, and learn about some of the commonly used tools for monorepos.


What is a monorepo?

A monorepo is a single version-controlled repository that contains multiple independent projects. Some monorepos might contain just two or three projects, while others contain hundreds of projects built with different technologies. This differs from the polyrepo structure, where projects are spread out amongst multiple repositories. It is also important to note that a monorepo is not the same as a monolith: a monolith is one giant project made up of tightly coupled sub-projects, while a monorepo holds multiple independent projects inside one repository.
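To make the distinction concrete, here is a rough sketch of the two layouts (the organization and project names are hypothetical, used only for illustration):

```text
# Polyrepo: one repository per project
acme/website    acme/api    acme/cli

# Monorepo: one repository, many independent projects
acme/platform
├── website/
├── api/
└── cli/
```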

Unified build and CI/CD process

In a monorepo structure, all of your CI/CD (continuous integration/continuous delivery) configuration lives in the same repository. This gives teams more flexibility: you can deploy your services together, or choose to deploy each project separately.
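As one illustration of deploying projects separately from a single repository, a pipeline can be scoped to one project by filtering on paths. This is a hypothetical GitHub Actions sketch, not taken from any specific repo; the `website` package name and `packages/website` path are assumptions:

```yaml
# Hypothetical workflow, e.g. .github/workflows/deploy-website.yml
name: deploy-website
on:
  push:
    branches: [main]
    paths:
      - "packages/website/**" # only run when the website project changes
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: yarn install
      - run: yarn workspace website build
      # a real pipeline would add a deploy step here
```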

Increased cross-team communication

In a polyrepo structure, you will have multiple teams working on different projects spread across many repositories. Oftentimes, these teams will not be aware of the status of repositories outside of their main one. With a monorepo structure, teams are more aware of the changes being made across the repository and might spot issues in other projects.

When all of your projects are under a single repository, it makes it easier to establish a consistent code style and set of guidelines across all projects. Some of the guidelines can include naming conventions, best practices for code reviews, branching policies, etc.

In situations where there is a breaking change on the main branch, a monorepo can make it easier to identify where the change came from and which team is responsible for fixing it. A monorepo can also help teams align on versioning strategies for their projects.

Steeper learning curve and issues of inclusivity for newer developers

Inclusivity of all developers on a team is really important to the success and outcome of a software project. Many diverse perspectives and experience levels lead to a stronger finished product. But a monorepo structure can be intimidating to novice developers. It might be their first time seeing a list of independent projects within the same repository, and they might not know how to contribute in a meaningful way, especially in an open source setting. It can also be difficult for newer developers to understand the Git history in a monorepo: a tool like git blame has to sift through a lot of unrelated commits just to generate that information for you. For these reasons, it is important for the team to guide newer developers through the monorepo structure and create issues where all levels can meaningfully contribute.

A look into the starter.dev monorepo

Let's take a look at a couple of This Dot Labs' open source monorepos and understand why the decision was made to use a monorepo.

Please note: We will not be exploring all of the possible tools used for monorepos. If you are interested in exploring more monorepo tools, then please check out this curated list.

What is starter.dev?

starter.dev makes it easy to set up a new project with zero configuration involved. Once you install the starter.dev package (npm i @this-dot/create-starter), you can run the npx @this-dot/create-starter command, and it will provide you with a list of starter kits to choose from. Each starter kit includes testing, linting, code examples, and a basic configuration setup.


starter.dev is an open source project that consists of a documentation website repository and a GitHub showcases repository dedicated to all of the different starter kits.

Why did the team choose a monorepo structure?

When it came time to plan out the project, the team had to decide whether to maintain each kit in isolation or create a folder for each kit under the same repository. The team chose a monorepo structure because it made it easier to build the CLI (command-line interface) while still allowing the kits to be built in isolation.

What are Yarn Workspaces?

The documentation website repository uses Yarn Workspaces, which allows you to set up multiple packages and install them all at once with a single yarn install command.

In the root directory, we have starters and packages directories. If we look at the starters directory, we will see all of the starter kits. Each starter kit has its own package.json, basic configuration, README, and code examples.
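A kit's manifest might look something like the following. This is a hypothetical sketch to show the shape of a per-kit package.json; the package name, version, and script commands are illustrative assumptions, not copied from the repository:

```json
{
  "name": "@this-dot/example-starter",
  "version": "0.1.0",
  "scripts": {
    "dev": "vite",
    "lint": "eslint .",
    "test": "vitest"
  }
}
```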


Inside the root package.json file, you will notice the "workspaces": ["packages/*"] key:

{
  "name": "starter.dev",
  "version": "0.1.0",
  "private": true,
  "scripts": {
    "lint": "yarn workspace website lint && yarn workspace @this-dot/create-starter lint",
    "build:cli": "yarn workspace @this-dot/create-starter build",
    "build:website": "yarn workspace website build",
    "start:cli": "yarn workspace @this-dot/create-starter start",
    "start:website": "yarn workspace website preview",
    "dev:website": "yarn workspace website dev",
    "watch:cli": "yarn workspace @this-dot/create-starter watch"
  },
  "workspaces": [
    "packages/*"
  ]
}

That will tell Yarn to import all of the packages that are listed inside of the packages folder.
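To see how the workspaces field wires things together, here is a minimal sketch that recreates that layout on disk. The demo-root and @demo/cli package names are hypothetical; running yarn install from the root of such a repo is what links the member packages together:

```shell
# Create a minimal workspace layout (hypothetical names, not from starter.dev).
mkdir -p demo/packages/cli

# Root manifest: "workspaces" tells Yarn where to find member packages.
cat > demo/package.json <<'EOF'
{
  "name": "demo-root",
  "private": true,
  "workspaces": ["packages/*"]
}
EOF

# A member package that `yarn install` (run from demo/) would pick up and link.
cat > demo/packages/cli/package.json <<'EOF'
{
  "name": "@demo/cli",
  "version": "0.0.1"
}
EOF
```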

Unified CI/CD pipeline with Amplify

The starter.dev GitHub showcases repository has a unified CI/CD pipeline and uses AWS Amplify. Inside the root directory, there is an Amplify YAML file where each application has its own set of build commands. Each project listed in the monorepo has its own deployment process:

version: 1
applications:
  - appRoot: next-react-query-tailwind
    frontend:
      phases:
        preBuild:
          commands:
            - nvm install --lts=gallium
            - yarn install
        build:
          commands:
            - yarn run build
      artifacts:
        baseDirectory: .next
        files:
          - "**/*"
      cache:
        paths:
          - node_modules/**/*

  - appRoot: angular-apollo-tailwind
    frontend:
      phases:
        preBuild:
          commands:
            - nvm install --lts=gallium
            - yarn install
            - yarn generate
        build:
          commands:
            - node -v
            - yarn run build
      artifacts:
        baseDirectory: dist/starter-dev-angular
        files:
          - "**/*"
      cache:
        paths:
          - node_modules/**/*

How does the team manage issues?

If you look at the issues tab, you will notice that all of the developers follow the same naming convention for issues: [starter kit] title for issue.

This works well for a variety of reasons:

  1. Any developer interested in contributing to the project can see at a quick glance which issues fit their skill and interest level.
  2. It will be easier to track the status of the issues on a project board and see which areas need more attention.

Conclusion

We have explored the monorepo structure and talked about a few of its features. We have also compared it to other architectures like monoliths and polyrepos to see how it differs.

Then we took an in-depth look into starter.dev and how it uses Yarn Workspaces to help organize the project.

I hope you enjoyed this article, and best of luck on your programming journey.

This Dot is a consultancy dedicated to guiding companies through their modernization and digital transformation journeys. Specializing in replatforming, modernizing, and launching new initiatives, we stand out by taking true ownership of your engineering projects.

We love helping teams with projects that have missed their deadlines or helping keep your strategic digital initiatives on course. Check out our case studies and our clients that trust us with their engineering.

