May 22nd, 2023 × #nodejs #webdev #javascript
Why Is node_modules So Big?
Discussion on why Node.js node_modules folders get so large and what actually takes up most of the space inside them.
- Node modules folder size
- What's in node_modules
- Shipping source files
- Syntax podcast transcripts
- Gzipping reduces size
- Text files are large
- Using DaisyDisk to analyze
- Publishing ESM vs CJS
- Keeping source files
- AWS SDK types size
- TypeScript types for auto-complete
- Polyfills for browser compatibility
- core-js polyfill size
- Babel plugins have large types
- Shipping CJS, ESM and browser builds
- Markdown files are large
- Licenses must be included
- Translation strings take space
- Floppy disk size reference
- Bundling reduces size
- Actual runtime code is small
- Use .npmignore
- Babel preset-env reduces polyfills
- npkill finds old unused modules
- Find and remove node_modules
- npkill UI for removing modules
Transcript
Wes Bos
Bos, and Scott Tolinski.
Node modules folder size
Scott Tolinski
Welcome to Syntax. On this Monday Hasty Treat, we're gonna be talking about node modules, and why is that folder so big? What's going on inside of that giant behemoth of a folder inside of every single one of your projects? My name is Scott Tolinski. I'm a developer from Denver. With me, as always, is Wes, the node modules boss.
Wes Bos
Yeah. So it's kind of a bit of a meme in the web development world that the node modules folder is massive. Anytime anyone brings up working in JavaScript, it's sort of the punching bag of, yeah, LOL, it's so huge. And then there's that meme with the heaviest things on Earth, and node modules is the biggest one. Yeah. It's like the "how do you center stuff in CSS" of JavaScript. And I laugh at those, but also, sometimes it's like, well, your end JavaScript is not actually going to be that big.
What's in node_modules
Wes Bos
I would say almost all of it is not the actual JavaScript that will make it into your bundle or your runtime or something like that. So I thought, let's do a show and just talk about what is actually in there. Why are these things so big? Because people like to complain about it being so big. And certainly there are issues that could be solved, but there's a lot of stuff in there that makes your development environment very smooth and simple and makes sure that it runs on older versions of Node and whatnot. So let's get on into it.
Shipping source files
Wes Bos
The spoiler alert is that it's text. Text is surprisingly large, especially when you get into things like descriptions and polyfills. For example, I'm working on the Syntax transcripts right now.
Wes Bos
And we have, like, annotations for every single word of when they start and finish.
Wes Bos
And some of these files are, like 7 megs. And I was like, we can't load that on the website.
Syntax podcast transcripts
Wes Bos
But yeah, that's way too big to load on the website. But it actually gzips very well, because gzip is the compression that servers can send and the browser can unpack.
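Word-level transcript JSON repeats the same keys for every word, which is exactly the kind of redundancy gzip removes. A minimal sketch of that effect follows. The fake data, the `word`/`start`/`end` key names, and the 6,000-word count are illustrative assumptions, not the actual Syntax transcript format, and the exact ratio will vary with the data.

```python
import gzip
import json


def build_transcript(word_count: int) -> list[dict]:
    """Build fake word-level annotations; every entry repeats the same keys."""
    return [
        {
            "word": f"word{i % 500}",
            "start": round(i * 0.31, 2),
            "end": round(i * 0.31 + 0.25, 2),
        }
        for i in range(word_count)
    ]


def raw_and_gzipped_size(words: list[dict]) -> tuple[int, int]:
    """Return (raw bytes, gzipped bytes) for the JSON encoding of `words`."""
    raw = json.dumps(words).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


raw_size, gz_size = raw_and_gzipped_size(build_transcript(6000))
print(f"raw: {raw_size} bytes, gzipped: {gz_size} bytes, "
      f"ratio: {raw_size / gz_size:.1f}x")
```

The repeated keys cost almost nothing after compression; most of what remains is the part that actually differs from word to word, such as the timestamps.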
Gzipping reduces size
Wes Bos
And it's actually like 100 times smaller. I don't know exactly how much smaller. I still have to trim it down, but it was like 300 KB or something like that when I gzipped it, because the JSON keys of start and end are the same thing on every single word, right? And the browser only has to send that once for every time that it is, even if it's in there 6,000 times.
Scott Tolinski
Yeah. And, you know, one common way that developers can illustrate just how big text is in general is log files. Right? We've all had the experience where a log file gets written and written and written, and all of a sudden your hard drive is out of space, and you're wondering where that went. And it turns out it's just from writing a lot of text to log files. So at the end of the day, anytime you're dealing with anything that gets stored in a file, the bigger it is, the more space it's going to take up. And eventually it could take up a lot of space. So, yeah, there doesn't need to be a lot of multimedia to have very large files. If you've ever used software like DaisyDisk or any of those things that'll help you... I love DaisyDisk. Yeah. DaisyDisk is fantastic if you're a Mac user. I'm very positive there are similar things for Windows, but what it does is it's a visualizer that shows you how big all of your folders are, and then you can dive in and see where exactly the space on your hard disk is being taken up. Anybody who's ever used any of those will instantly see how big their cached node modules or even their project node modules folders actually are on their hard disk. They are taking up a lot of space, which is one of the reasons why we often don't commit them to GitHub or anything like that. They're big. And we can generate them automatically anytime we run an npm install.
Using DaisyDisk to analyze
Wes Bos
Anyways, I actually used DaisyDisk to research the show. Oh, really? I opened up a couple of instances of it, and I took, like, Boss Monster, which is my course platform. I took the whole back end node modules and popped that in there. And then I took the front end, the React application node modules, and a couple other projects, and I just opened them up and said, all right, what's actually big in here, and where is it coming from? So the first one is source files. When somebody publishes something to npm, it's generally run through a bundler. You can publish straight-up JavaScript to npm.
Publishing ESM vs CJS
Wes Bos
However, often people are writing it in TypeScript, or people are writing it in ESM and they want to be able to ship a CommonJS version, or whatever. Someone wrote it in CoffeeScript and they need to transpile it to JavaScript. So generally they are shipping the built version, but a lot of people don't take the source files out of the published version either. Whether that's because they like to be able to reference it (you pop open your node modules and you can see the actual source) or because people don't know that there is a file called .npmignore that will exclude it. Like, I've published a couple node modules, and I had some screenshots for the GitHub readme. And that screenshot was, I don't know, like 400 KB, and it was being added to the package. And that's totally unnecessary, because there's no world where somebody needs to see that screenshot from the readme in their node modules folder, and that's half a meg right there. You know? So times that by 1,000 node modules, everybody adding an accidental extra 200 KB, and it really adds up. And I have had node modules folders that are gigs and gigs, which is a good spot to find space if you run out of hard drive space.
Keeping source files
Scott Tolinski
Yeah. Nuke what's in there, because chances are, especially the one that's in your user folder, or wherever it may live for cached modules, it can end up getting pretty big. You know, personally, I like it sometimes when they keep the source files in node modules like you mentioned. A lot of people do that, even if it is taking up my hard drive space.
Scott Tolinski
When I'm working in a third-party package, I really do appreciate being able to dive into the source code for what I'm working on to further understand it, because not all documentation is great. Not only that, but it honestly helps me become a better developer to be able to read source code. And being able to read the source code of the thing I'm using, especially in components, if I bring in a Svelte component or something like that, being able to dive into what that component actually looks like under the hood will allow you to understand its implementation a little bit more and how to use it properly.
Wes Bos
Under the hood. We did a check. I did a search of the last 130 Syntax podcasts.
Wes Bos
And we said "under the hood" in a quarter of them, which is significant. Definitely significant. Next up, we have types. This actually is pretty large. One of my projects had the AWS SDK, and it had 14 megs alone of just types. So you want your TypeScript to just work, and that's 14 megs. That's probably an outlier. But the AWS SDK is a single package, and AWS has, like, a million services, and I was installing the transcription package.
Wes Bos
But you install literally the SDK for everything on AWS: S3 and cloud, whatever they have. And I was like, hokey doodle, 14 megs of types.
AWS SDK types size
Scott Tolinski
Hokey doodle. We should check how many times hokey doodle has been said on this show.
Wes Bos
That's a good idea.
Wes Bos
So, yeah, you need types. And I kinda hate it when you have to go install the types yourself. So a lot of packages just ship them by default, because a lot of developers are using TypeScript. Or even if you're not using TypeScript, those types are still used by VS Code to give you really nice autocompletion. So those are what take up a good chunk as well.
TypeScript types for auto-complete
Scott Tolinski
Yep. Yep. And some of it is source maps. Source maps are really just a map from the source code to the compiled code, so that when you debug, the debugger can know where the issue is in the actual source code, not the compiled code. And the source map, as we've seen occasionally... I had a tweet about it a little while ago that was like, what's up with all these semicolons? There's just, like, a whole document full of semicolons, and that takes up a lot of space. Source maps are just kinda wild if you ever look into one to see exactly what's going on. Maybe I'll find a link to that tweet, because somebody explained what's up with all of those semicolons in a source map. But, you know, the source map files are text, and they have to be shipped if you want to have a better debugging experience.
Wes Bos
Next up is polyfills. So if you want your JavaScript to work on older browsers or older versions of Node,
Polyfills for browser compatibility
Wes Bos
often they will ship what's called core-js, and core-js is a polyfill to make sure that newer JavaScript features work on older browsers and Node runtimes and all that stuff. So even if you are running on the most modern version, they will often include core-js as a dependency. And then when we npm install it, that comes along in literally every package.
Wes Bos
Not every package, but core-js is one of the most popular node modules out there. Let's take a look at it on npm.
core-js polyfill size
Wes Bos
Oh, man, I have this little thing for Raycast called package, and it brings me right to the npm URL. Oh, bless. Love it. Yeah. 38... Is that an extension, or did you write that? No, it's an extension that you can just install. You know what I would like is one that would link to Bundlephobia.
Scott Tolinski
Oh, yeah. Do you use Bundlephobia? Bundlephobia shows you how big an npm package is, which I guess is relevant to this episode. Maybe we should link to Bundlephobia. It allows you to look up and see. You know, I use Bundlephobia a lot myself when I'm trying to debug something that I'm shipping, to see how big it actually is, or what I'm shipping to users. And also, speaking of Raycast, as an aside, Wes, I accidentally set up a snippet for Raycast, not thinking that "se" would autocomplete to my email. Now I can't type any words with "se" in them without it autocompleting my email and me getting really frustrated. So I'm going to delete that at this very second.
Wes Bos
Prefix all my expansions with a colon.
Wes Bos
And that... Oh, that's... Yeah. That way, I never... Why are you so smart? I've been doing that for probably, like, 10 years, and it's awesome, because very rarely do I ever hit a weird thing where it expands unintentionally.
Wes Bos
So just colon, and then whatever it is. All of my expansions are prefixed with that.
Scott Tolinski
You could use something else, like a plus or something, but I'm gonna use colon. Yeah. I have another one, which is cc, for the fake credit card number for Braintree and Stripe.
Wes Bos
Oh. Colon cc. Let me tell you what I've done in BetterTouchTool.
Wes Bos
So BetterTouchTool will allow you to... It pastes the fake Stripe credit card, 4242 4242 4242 4242, hits the tab key, pastes in the month and day expiry, hits the tab key, pastes in the CCV, hits the tab key twice, puts in a postal code, and then hits the tab key and enters it.
Scott Tolinski
So if you're testing Stripe, that's a pain, which you do. Yeah. It's just something, if you're ever testing your credit card number or anything like that: you do it once, it doesn't work. You do it again, it doesn't work. It ends up becoming one of those processes where you just try it over and over again. Alright. Back to node modules.
Wes Bos
Where were we? Polyfills. Oh, it's Babel. Every single Babel plugin, which is included in most node modules, includes, like, 8 KB of code and almost 860 KB of types. And I couldn't really figure this one out. I think it's because every Babel plugin ships every other Babel plugin's types, but every Babel plugin has almost a meg of types in it. Jeez. Which is quite a bit, because you might have 20, 30, 40 Babel plugins per project. So that's easily 30 megs right there, added to your project every time that you install it.
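As back-of-the-envelope arithmetic, the "easily 30 megs" estimate can be checked with a few lines. This is only a sketch: it assumes every plugin carries the full 860 KB quoted above (an observation from one project, not a measured constant) and that 1 MB is 1024 KB.

```python
# "almost 860 k of types" per plugin, the figure quoted in the episode
TYPES_KB_PER_PLUGIN = 860


def types_overhead_mb(plugin_count: int,
                      kb_per_plugin: int = TYPES_KB_PER_PLUGIN) -> float:
    """Estimate megabytes of type definitions for a given number of plugins."""
    return plugin_count * kb_per_plugin / 1024


for count in (20, 30, 40):
    print(f"{count} plugins: about {types_overhead_mb(count):.0f} MB of types")
```

Under those assumptions, 30 plugins come to roughly 25 MB, so "easily 30 megs" is the right order of magnitude once a project has 35 or more plugins.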
Babel plugins have large types
Scott Tolinski
That is significant.
Wes Bos
Yeah, I didn't realize. No, I didn't either. That's why I love this DaisyDisk.
Wes Bos
Just drop it in and you can take a look. Oh, wow, I didn't realize that was there, right? Because you almost never look at the exported value. You look on GitHub, but you don't see what's actually published to npm. And now npm finally has, like, a search. You can click on the tab and see the code and see what's there. It will tell you where everything is, but it's not very good.
Shipping CJS, ESM and browser builds
Wes Bos
So I kinda wish that was better.
Scott Tolinski
Yeah. Well, and also, okay, we're living in this world where we have ESM, ECMAScript modules, as well as CJS, CommonJS. You know, they're not super compatible all the time. So one of the strategies for dealing with that is just shipping them both, which, again, is duplicating files, duplicating the same thing. Here's 2 different ways to import the same code that's compiled from some source code. So now we ship ESM and we ship CJS, and everybody's happy, but you have doubled up the files.
Wes Bos
Yeah. Yeah. That's the ecosystem, right? You've got to make it work with both of them. So you ship literally 2 versions of each of them, or sometimes there are even 3 versions being shipped. What was the reason for 3? Yeah, I think maybe there was a browser version, a Node ESM version, and then a Node CommonJS version.
Wes Bos
And then sometimes there's even just a version you can include with a script tag, you know? So that's pretty crazy.
Markdown files are large
Wes Bos
Next one is Markdown. Markdown, again, is text, and it takes up a lot of room.
Wes Bos
So I opened up my React project for the player for my Boss Monster, and I did a search for all Markdown files, and then I tallied up the total: 10 megs of Markdown. Jeez. They're readmes, licenses.
Wes Bos
Like, you have to ship licenses with your open source code. That's kind of the rule, right? You can't just get rid of those things. The licenses are what say you're allowed to redistribute code or you're allowed to link back to code.
Wes Bos
So the license sort of has to come along. Changelogs for every single thing that has ever changed. Readmes, all of those good things.
Scott Tolinski
It's pretty crazy. It is. That is significant.
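The "search for all the Markdown files and tally up the total" step can be sketched as a small script that walks a folder and sums file sizes by category. The category names and the file-name rules in `classify` are illustrative assumptions, not what DaisyDisk or npm does; adjust them to whatever you want to measure.

```python
import os
from collections import defaultdict


def classify(filename: str) -> str:
    """Map a file name to a rough category (the categories are illustrative)."""
    name = filename.lower()
    if name.endswith(".d.ts"):
        return "types"
    if name.endswith(".map"):
        return "source maps"
    if name.endswith((".md", ".markdown")) or name.startswith("license"):
        return "docs and licenses"
    if name.endswith((".js", ".mjs", ".cjs")):
        return "javascript"
    return "other"


def tally_bytes(root: str) -> dict[str, int]:
    """Sum file sizes under `root`, grouped by category."""
    totals: dict[str, int] = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if os.path.islink(path):
                continue  # skip symlinks so linked files are not counted twice
            totals[classify(filename)] += os.path.getsize(path)
    return dict(totals)


if __name__ == "__main__":
    # Prints nothing if there is no node_modules folder in the current directory.
    ranked = sorted(tally_bytes("node_modules").items(), key=lambda kv: -kv[1])
    for kind, size in ranked:
        print(f"{kind:>18}: {size / 1_000_000:.1f} MB")
```

Run from a project root, this gives a rough breakdown of how much of the folder is documentation, types, and source maps versus JavaScript.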
Licenses must be included
Scott Tolinski
Yeah.
Scott Tolinski
Also, translations.
Translation strings take space
Scott Tolinski
You know how many languages... if you're using international things like that. So TypeScript has 4 megs of translation strings alone.
Scott Tolinski
Wow.
Scott Tolinski
I found myself just thinking, like, wow, this is all significant. All of the stuff that isn't the actual code we're using at any given point. Right? Just dump it. Yeah.
Wes Bos
If you want to know, like, what is "this type doesn't have significant enough overlap"? What is that TypeScript error where you're trying to change the type of something, you're trying to cast it, and it says you can't do that. They have that in Turkish and Chinese and every other one. Right? Mhmm.
Wes Bos
It doesn't make any sense that you would go, like, oh, I'm installing TypeScript, now let me go install the Chinese translations for that. It just has to work, right? So that's why every single person that installs TypeScript gets every single language that it's ever been translated to. And that's 4 megs, right? None of these things are very significant.
Floppy disk size reference
Wes Bos
Like, the TypeScript translations could almost be put on a floppy disk.
Wes Bos
It's not big. Yeah. But they add up.
Scott Tolinski
Speaking of all of this, though, it kinda makes you wonder if somebody like pnpm or any of these installers could have some sort of configuration option to nuke out certain things. Now, granted, this sounds like a tricky situation.
Scott Tolinski
Yeah. Like, imagine you had a package installer that just removed any changelog or readme or license file automatically.
Scott Tolinski
Automatically removed CJS if you're using an ESM project. Automatically removed any of the little things that we've been talking about here that you might not be using. I think that's probably a bad idea. Yeah. Who knows?
Wes Bos
I feel like you would just run into... like, nobody even knows if they're importing CommonJS or ESM. They wouldn't know which versions to import.
Scott Tolinski
What if it was just changelogs or readmes and licenses, like, automatically nuke those files out? It's probably not saving enough to make a difference, but...
Wes Bos
And, like, pnpm already saves you a lot of space, because it just links them to the versions that are already installed on your computer rather than duplicating that code over and over again. That's why the pnpm install is so much faster than a regular npm install. Word. So what is the solution to all of this madness? The reality is that almost all of this code isn't needed to actually run your application.
Wes Bos
So first of all, chill out. It's not that big of a deal. Yeah, for sure. Second of all, bundle your code.
Bundling reduces size
Wes Bos
Even if you do care about what that looks like, you could bundle your code and ship it. Like, the example is (we have a show coming up on this) that Node allows you to package your entire application into a single executable file, and that includes your code, that includes the Node.js runtime, and that's going to include all of your dependencies.
Wes Bos
And if you are worried that, okay, this thing needs to be as small as possible and I can't be having 4 megs of TypeScript translations being put into the final thing, then you can just run your code through any bundler, Vite or esbuild or any of these tools.
Wes Bos
And out the other end, you're going to get, like, under a meg, probably just 50 or 100 KB, depending on how large the application is. The actual code that runs your application is not very large in comparison to the rest of this stuff. Yeah. I mean, none of it really matters these days. Right?
Actual runtime code is small
Scott Tolinski
Like, my computer's got 2 terabytes of hard drive space. What's a, you know, couple hundred megs?
Wes Bos
Oh, man. When I was on 1 terabyte, I was struggling, but it's also because Scott and I, we record video for a living. Right? Yeah. You wanna see what takes up space? It's video. It's true. Like, 1 hour of screen recording can be a 30 gig file for me. You know? Yeah. It's ridiculous. And then here we are whining about 4 megs of translation strings. Right?
Scott Tolinski
Well, Wes, let me tell you. I've been working in video since, like, 2004, maybe even longer.
Scott Tolinski
And back when my hard drive was 40 gigabytes, doing anything in video was... I had a LaCie 200 gigabyte external hard drive that was, like, the greatest thing in the entire world, because now all of a sudden I could do more than 1 video project at a time without deleting.
Wes Bos
It's the best being able to have it. I even love it on my computer. Like when we went SSD and Apple was trying to sell you a computer with 256 gigs on it. It's just a joke for that type of thing. But now that SSDs are somewhat coming down... Apple still charges you through the roof for 2 terabytes, but it's so nice just having every project I've worked on in the last year, and then I just archive the rest.
Wes Bos
We'll talk about that on a different show. But, anyways, what else do we have here? Use .npmignore. Submit pull requests to people. Say, hey, I don't think you need to ship this out the other end to npm. So you can add an .npmignore. Use pnpm; we talked about that. Clean up after you drop older versions of Node or older versions of browsers. This is something that Babel's preset-env is very good at, in that you tell it "I'm supporting these browsers" and it will figure out the polyfills you need, and it won't add in any more.
Babel preset-env reduces polyfills
Scott Tolinski
Yeah. You could also just use pnpm, which isn't necessarily an option for everybody, considering that moving your package manager is a bit of a buy-in. But you know what? I've been using pnpm for a little while, and the switch was not that painful. Maybe there's a couple things here and there. What pnpm does is it stores all of them in a root folder under your username and caches them, so that you only have to download each of them once. Then it does, what's it called? Symlinks for everything. Yep. You have a monorepo? It's doing symlinks. It's always symlinking everything together, so that way you're not loading 1 package 100 times on your machine. You're just loading it once per version, so to say. So pnpm is a great solution for that, and it's very fast because of that. It has to download less. It's more efficient because it doesn't have to take up as much hard drive space, and there are a lot of great features on top of it. So if you're a Yarn folk or an npm folk, give pnpm an actual try, because it's really easy to switch over to. Mhmm. And I think you might like it. It's been a nice little shift for me. I've been on it for definitely over a year and a half now and have had nothing but great things to say.
Wes Bos
And then finally, there's a tool called npkill. You just run npx npkill, and it will search your hard drive for node modules folders. It will tell you how long ago you last worked on that project, and it will tell you how large those are.
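To make the idea concrete, here is a rough sketch of that kind of scan: find every node_modules folder under a root, then report its size and how stale the project looks. This is not how npkill itself is implemented. In particular, using the modification time of the containing project folder as a stand-in for "last worked on" is an assumption, and the function names are hypothetical.

```python
import os
import time


def dir_size(path: str) -> int:
    """Total size in bytes of regular files under `path` (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def find_node_modules(root: str) -> list[tuple[str, int, float]]:
    """Return (path, size in bytes, days since the project folder changed)."""
    results = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if "node_modules" in dirnames:
            target = os.path.join(dirpath, "node_modules")
            age_days = (time.time() - os.path.getmtime(dirpath)) / 86400
            results.append((target, dir_size(target), age_days))
            # Prune so nested node_modules are counted with their parent.
            dirnames.remove("node_modules")
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling `find_node_modules` on a projects folder returns the largest folders first, which is usually the order you want when hunting for disk space.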
npkill finds old unused modules
Wes Bos
And if you are looking to free up some space on your hard drive... I remember I did it once, and it was the 1st time I'd done it, like, ever.
Wes Bos
And I had 30 gigs of node modules that I deleted, which is significant. That's fairly large, right? So that's great, because you can always get them back. You can always npm install and get all that stuff back, and it's a great way to clean up your computer if you do need some more space.
Scott Tolinski
It's also just really easy to run a quick command line find for node modules to simply find and remove all node modules on your computer. It's exceedingly simple: find node modules, remove them. If you find a directory named node modules on your computer, remove it. Easy script.
Find and remove node_modules
Wes Bos
That always scares me, in that I would delete the wrong thing. Or, also, it's a pain because I am working on 6 or 7 projects, and I don't wanna delete those node modules just yet.
Scott Tolinski
No, you'd run it from the directory that you're currently in.
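The "find node modules, remove them" script, scoped to one directory as described, could look roughly like the sketch below. It is written in Python rather than as a shell one-liner, and the function name and the dry-run default are assumptions added here so that nothing is deleted until you ask for it.

```python
import os
import shutil


def remove_node_modules(root: str, dry_run: bool = True) -> list[str]:
    """Find node_modules directories under `root`; delete them unless dry_run."""
    matches = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if "node_modules" in dirnames:
            target = os.path.join(dirpath, "node_modules")
            matches.append(target)
            # Prune: nested copies disappear together with their parent.
            dirnames.remove("node_modules")
            if not dry_run:
                if os.path.islink(target):
                    os.unlink(target)  # remove the link, not what it points to
                else:
                    shutil.rmtree(target)
    return matches
```

Calling `remove_node_modules("exp")` only lists what would go; passing `dry_run=False` actually deletes. Because the walk starts at `root`, projects outside that directory are never touched.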
Scott Tolinski
I'm in a directory, I would like to remove it from this directory. And I even use it just to nuke out my node modules sometimes, instead of just an rm -rf, because you can run the same thing for, you know, all of your site experiments. Like, I don't know how you structure your sites folder.
Scott Tolinski
But for me, I have an EXP, which is my experiments folder. Anytime I'm working on anything that I just want a quick hack on, I throw it in there, and sometimes you forget about that stuff. Yeah. So I just regularly run a nuke on all of the node modules within this experiments folder. Nuke it. That's fine.
Wes Bos
Yeah. Experiments is one that... again, it scares me. I know that it works fine, but I love that npkill has, like, the best UI. Like, use your arrow keys, hit space to delete it.
npkill UI for removing modules
Wes Bos
And, I don't need that. Oh, man. It's the best. Come to me when you delete the wrong directory by accident.
Scott Tolinski
Hey, man. You're talking to the guy who ran git clean on his home directory. So yeah.
Wes Bos
Alright. That's it for today. Thank you everybody for tuning in. Catch you on Wednesday.
Wes Bos
Peace.
Scott Tolinski
Head on over to syntax.fm for a full archive of all of our shows, and don't forget to subscribe in your podcast player or drop a review if you like this show.