Modules
Node has a simple module loading system. In Node, files and modules are in
one-to-one correspondence. As an example, foo.js loads the module circle.js in
the same directory.
The contents of foo.js:
var circle = require('./circle.js');
console.log('The area of a circle of radius 4 is '
            + circle.area(4));
The contents of circle.js:
var PI = Math.PI;
exports.area = function (r) {
  return PI * r * r;
};
exports.circumference = function (r) {
  return 2 * PI * r;
};
The module circle.js has exported the functions area() and circumference(). To
export a function or object, add it as a property of the special exports
object.
Variables local to the module will be private. In this example the variable PI
is private to circle.js.
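The privacy of module-local variables can be checked directly. The following is a minimal sketch, assuming a Node version that provides fs.mkdtempSync; it writes a throwaway copy of circle.js into a temporary directory (the directory prefix is an arbitrary choice for this illustration) and inspects what the caller can see.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Write a throwaway copy of circle.js into a temporary directory.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'modules-demo-'));
fs.writeFileSync(path.join(dir, 'circle.js'), [
  'var PI = Math.PI;',
  'exports.area = function (r) { return PI * r * r; };',
  'exports.circumference = function (r) { return 2 * PI * r; };'
].join('\n'));

var circle = require(path.join(dir, 'circle.js'));

// Only the properties added to exports are visible to the caller.
var exportedNames = Object.keys(circle).sort();
console.log(exportedNames);

// PI was declared with var inside the module, so it is not exposed.
console.log(typeof circle.PI);
```

The first line printed lists area and circumference only; the second reports that circle.PI is undefined.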
Core Modules
Node has several modules compiled into the binary. These modules are described in greater detail elsewhere in this documentation.
The core modules are defined in node's source in the lib/ folder.
Core modules are always preferentially loaded if their identifier is passed to
require(). For instance, require('http') will always return the built-in HTTP
module, even if there is a file by that name.
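One way to observe this preference is to ask require.resolve() what a core identifier maps to. This is a small sketch, assuming a reasonably recent Node in which require.resolve() returns the bare identifier for a core module instead of a file path.

```javascript
// Load the built-in HTTP module by its identifier.
var http = require('http');

// For a core module, the resolved "filename" is the identifier itself,
// not a path on disk (assumption: behavior of recent Node versions).
var resolved = require.resolve('http');
console.log(resolved);

// The object returned is the built-in module.
var hasCreateServer = typeof http.createServer === 'function';
console.log(hasCreateServer);
```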
File Modules
If the exact filename is not found, then node will attempt to load the
required filename with the added extension of .js, and then .node.
.js files are interpreted as JavaScript text files, and .node files are
interpreted as compiled addon modules loaded with dlopen.
A module prefixed with '/' is an absolute path to the file. For example,
require('/home/marco/foo.js') will load the file at /home/marco/foo.js.
A module prefixed with './' is relative to the file calling require().
That is, circle.js must be in the same directory as foo.js for
require('./circle') to find it.
Without a leading '/' or './' to indicate a file, the module is either a
"core module" or is loaded from a node_modules folder.
Loading from node_modules Folders
If the module identifier passed to require() is not a native module, and does
not begin with '/', '../', or './', then node starts at the parent directory
of the current module, and adds /node_modules, and attempts to load the module
from that location.
If it is not found there, then it moves to the parent directory, and so on, until either the module is found, or the root of the tree is reached.
For example, if the file at '/home/ry/projects/foo.js' called
require('bar.js'), then node would look in the following locations, in this
order:
/home/ry/projects/node_modules/bar.js
/home/ry/node_modules/bar.js
/home/node_modules/bar.js
/node_modules/bar.js
This allows programs to localize their dependencies, so that they do not clash.
Optimizations to the node_modules Lookup Process
When there are many levels of nested dependencies, it is possible for these
file trees to get fairly long. The following optimizations are thus made to
the process.
First, /node_modules is never appended to a folder already ending in
/node_modules.
Second, if the file calling require() is already inside a node_modules
hierarchy, then the top-most node_modules folder is treated as the root of the
search tree.
For example, if the file at
'/home/ry/projects/foo/node_modules/bar/node_modules/baz/quux.js' called
require('asdf.js'), then node would search the following locations:
/home/ry/projects/foo/node_modules/bar/node_modules/baz/node_modules/asdf.js
/home/ry/projects/foo/node_modules/bar/node_modules/asdf.js
/home/ry/projects/foo/node_modules/asdf.js
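The two lookup examples above can be reproduced with a short function. This is an illustrative sketch, not node's actual implementation: nodeModulesPaths is a hypothetical name, and the code assumes an absolute POSIX-style directory (the folder containing the file that calls require()). It follows the prose rules: walk up from the starting folder, never append to a folder named node_modules, and stop at the top-most node_modules folder if there is one.

```javascript
// Hypothetical helper: list the node_modules folders searched when a file
// in `startDir` requires a non-core, non-relative module.
function nodeModulesPaths(startDir) {
  var parts = startDir.split('/');
  var top = parts.indexOf('node_modules');
  // Without a node_modules ancestor, walk all the way up to '/'.
  // Otherwise stop at the folder that contains the top-most node_modules.
  var stop = top === -1 ? 0 : Math.max(0, top - 1);
  var dirs = [];
  for (var i = parts.length - 1; i >= stop; i--) {
    // Never append /node_modules to a folder already named node_modules.
    if (parts[i] === 'node_modules') continue;
    dirs.push(parts.slice(0, i + 1).concat('node_modules').join('/'));
  }
  return dirs;
}

var plain = nodeModulesPaths('/home/ry/projects');
var nested = nodeModulesPaths(
  '/home/ry/projects/foo/node_modules/bar/node_modules/baz');

console.log(plain);
console.log(nested);
```

The first call yields the four folders from the foo.js example, ending with /node_modules; the second yields the three folders from the quux.js example and goes no higher than /home/ry/projects/foo/node_modules.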
Folders as Modules
It is convenient to organize programs and libraries into self-contained
directories, and then provide a single entry point to that library.
There are three ways in which a folder may be passed to require() as an
argument.
The first is to create a package.json file in the root of the folder, which
specifies a main module. An example package.json file might look like this:
{ "name" : "some-library",
"main" : "./lib/some-library.js" }
If this was in a folder at ./some-library, then require('./some-library')
would attempt to load ./some-library/lib/some-library.js.
This is the extent of Node's awareness of package.json files.
If there is no package.json file present in the directory, then node will
attempt to load an index.js or index.node file out of that directory. For
example, if there was no package.json file in the above example, then
require('./some-library') would attempt to load:
./some-library/index.js
./some-library/index.node
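Both folder conventions can be exercised against a temporary directory. The sketch below assumes a Node version that supports fs.mkdtempSync and the recursive option of fs.mkdirSync; the folder name other-library and the loadedFrom property are made up for the illustration.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

var root = fs.mkdtempSync(path.join(os.tmpdir(), 'folders-demo-'));

// Folder 1: package.json names the entry point through "main".
var withMain = path.join(root, 'some-library');
fs.mkdirSync(path.join(withMain, 'lib'), { recursive: true });
fs.writeFileSync(path.join(withMain, 'package.json'),
  JSON.stringify({ name: 'some-library', main: './lib/some-library.js' }));
fs.writeFileSync(path.join(withMain, 'lib', 'some-library.js'),
  "exports.loadedFrom = 'lib/some-library.js';");

// Folder 2 (hypothetical name): no package.json, so index.js is used.
var withIndex = path.join(root, 'other-library');
fs.mkdirSync(withIndex);
fs.writeFileSync(path.join(withIndex, 'index.js'),
  "exports.loadedFrom = 'index.js';");

// Passing the folder itself to require() picks the right entry point.
var fromMain = require(withMain).loadedFrom;
var fromIndex = require(withIndex).loadedFrom;
console.log(fromMain);
console.log(fromIndex);
```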
Caching
Modules are cached after the first time they are loaded. This means (among
other things) that every call to require('foo') will get exactly the same
object returned, if it would resolve to the same file.
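A quick way to see the cache at work is to load the same file twice and compare the results. This is a minimal sketch; counter.js and its count property are hypothetical names used only here.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Create a tiny module in a temporary directory.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
var file = path.join(dir, 'counter.js');
fs.writeFileSync(file, 'exports.count = 0;');

var first = require(file);
first.count += 1;

// The second call does not re-run counter.js; it returns the cached object.
var second = require(file);
var sameObject = first === second;

console.log(sameObject);
console.log(second.count);
```

Because both calls resolve to the same file, the change made through first is visible through second.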
All Together...
To get the exact filename that will be loaded when require() is called, use
the require.resolve() function.
Putting together all of the above, here is the high-level algorithm in
pseudocode of what require.resolve does:
require(X) from module at path Y
1. If X is a core module,
   a. return the core module
   b. STOP
2. If X begins with `./`, `../`, or `/`,
   a. LOAD_AS_FILE(Y + X)
   b. LOAD_AS_DIRECTORY(Y + X)
3. LOAD_NODE_MODULES(X, dirname(Y))
4. THROW "not found"
LOAD_AS_FILE(X)
1. If X is a file, load X as JavaScript text. STOP
2. If X.js is a file, load X.js as JavaScript text. STOP
3. If X.node is a file, load X.node as binary addon. STOP
LOAD_AS_DIRECTORY(X)
1. If X/package.json is a file,
   a. Parse X/package.json, and look for "main" field.
   b. let M = X + (json main field)
   c. LOAD_AS_FILE(M)
2. LOAD_AS_FILE(X/index)
LOAD_NODE_MODULES(X, START)
1. let DIRS=NODE_MODULES_PATHS(START)
2. for each DIR in DIRS:
   a. LOAD_AS_FILE(DIR/X)
   b. LOAD_AS_DIRECTORY(DIR/X)
NODE_MODULES_PATHS(START)
1. let PARTS = path split(START)
2. let ROOT = index of first instance of "node_modules" in PARTS, or 0
3. let I = count of PARTS - 1
4. let DIRS = []
5. while I > ROOT,
   a. if PARTS[I] = "node_modules", let I = I - 1 and CONTINUE
   b. DIR = path join(PARTS[0 .. I] + "node_modules")
   c. DIRS = DIRS + DIR
   d. let I = I - 1
6. return DIRS
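The LOAD_AS_FILE and LOAD_AS_DIRECTORY steps translate fairly directly into JavaScript. The sketch below is an approximation for illustration, not node's loader: resolveAsFile, resolveAsDirectory, and isFile are hypothetical names, and the functions return the filename that would be loaded (or null) instead of loading it.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

function isFile(p) {
  try { return fs.statSync(p).isFile(); } catch (e) { return false; }
}

// LOAD_AS_FILE: try X, then X.js, then X.node.
function resolveAsFile(x) {
  var candidates = [x, x + '.js', x + '.node'];
  for (var i = 0; i < candidates.length; i++) {
    if (isFile(candidates[i])) return candidates[i];
  }
  return null;
}

// LOAD_AS_DIRECTORY: honor package.json "main", then fall back to index.
function resolveAsDirectory(x) {
  var pkgFile = path.join(x, 'package.json');
  if (isFile(pkgFile)) {
    var main = JSON.parse(fs.readFileSync(pkgFile, 'utf8')).main;
    if (main) {
      var found = resolveAsFile(path.join(x, main));
      if (found) return found;
    }
  }
  return resolveAsFile(path.join(x, 'index'));
}

// Try the helpers against a throwaway directory.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'resolve-demo-'));
fs.writeFileSync(path.join(dir, 'circle.js'), '');
fs.mkdirSync(path.join(dir, 'pkg'));
fs.writeFileSync(path.join(dir, 'pkg', 'index.js'), '');

var asFile = resolveAsFile(path.join(dir, 'circle'));
var asDir = resolveAsDirectory(path.join(dir, 'pkg'));
var missing = resolveAsFile(path.join(dir, 'missing'));
console.log(path.basename(asFile));
console.log(path.relative(dir, asDir));
console.log(missing);
```

Returning the filename keeps the sketch close to what require.resolve() reports; a real loader would go on to compile the .js file or load the .node addon.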
Loading from the require.paths Folders
In node, require.paths is an array of strings that represent paths to be
searched for modules when they are not prefixed with '/', './', or '../'. For
example, if require.paths were set to:
[ '/home/micheil/.node_modules',
'/usr/local/lib/node_modules' ]
Then calling require('bar/baz.js') would search the following locations:
1: '/home/micheil/.node_modules/bar/baz.js'
2: '/usr/local/lib/node_modules/bar/baz.js'
The require.paths array can be mutated at run time to alter this behavior.
It is set initially from the NODE_PATH environment variable, which is a
colon-delimited list of absolute paths. In the previous example, the NODE_PATH
environment variable might have been set to:
/home/micheil/.node_modules:/usr/local/lib/node_modules
Loading from the require.paths locations is only performed if the module could
not be found using the node_modules algorithm above.
Global modules are lower priority than bundled dependencies.
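To make the NODE_PATH example concrete, the following sketch splits the colon-delimited value and lists the files that would be tried for require('bar/baz.js'). The function name globalCandidates is hypothetical, and the code only computes paths; it does not touch require.paths or the filesystem.

```javascript
var path = require('path');

// The example value from the text; ':' is the delimiter described above.
var nodePath = '/home/micheil/.node_modules:/usr/local/lib/node_modules';

// Hypothetical helper: candidate files for a request, in search order.
function globalCandidates(nodePathValue, request) {
  return nodePathValue.split(':').filter(Boolean).map(function (dir) {
    return path.posix.join(dir, request);
  });
}

var candidates = globalCandidates(nodePath, 'bar/baz.js');
console.log(candidates);
```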
Note: Please Avoid Modifying require.paths
For compatibility reasons, require.paths is still given first priority in the
module lookup process. However, it may disappear in a future release.
While it seemed like a good idea at the time, and enabled a lot of useful
experimentation, in practice a mutable require.paths list is often a
troublesome source of confusion and headaches.
Setting require.paths to some other value does nothing.
This does not do what one might expect:
require.paths = [ '/usr/lib/node' ];
All that does is lose the reference to the actual node module lookup paths, and create a new reference to some other thing that isn't used for anything.
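The difference between mutating the array and reassigning the property can be shown with plain objects. This is only an analogy under stated assumptions: internalPaths stands in for the list the loader actually consults, fakeRequire.paths stands in for require.paths, and '/opt/lib/node_modules' is an arbitrary example path.

```javascript
// Hypothetical stand-ins for the loader's list and for require.paths.
var internalPaths = ['/usr/local/lib/node_modules'];
var fakeRequire = { paths: internalPaths };

// Mutating the shared array is visible to the "loader".
fakeRequire.paths.push('/opt/lib/node_modules');

// Reassigning only rebinds the property; the loader's list is untouched.
fakeRequire.paths = ['/usr/lib/node'];

var stillShared = fakeRequire.paths === internalPaths;
console.log(internalPaths);
console.log(stillShared);
```

After the reassignment, internalPaths still holds the two earlier entries and knows nothing about '/usr/lib/node'.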
Putting relative paths in require.paths is... weird.
If you do this:
require.paths.push('./lib');
then it does not add the full resolved path to where ./lib is on the
filesystem. Instead, it literally adds './lib', meaning that if you do
require('y.js') in /a/b/x.js, then it'll look in /a/b/lib/y.js. If you then
did require('y.js') in /l/m/n/o/p.js, then it'd look in /l/m/n/o/lib/y.js.
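The effect described above amounts to joining the relative entry onto the directory of whichever file calls require(). A small sketch, with candidateFor as a hypothetical helper that only computes the path:

```javascript
var path = require('path');

// Hypothetical helper: where a relative require.paths entry points
// for a given calling file and request.
function candidateFor(entry, callerFile, request) {
  return path.posix.join(path.posix.dirname(callerFile), entry, request);
}

var fromX = candidateFor('./lib', '/a/b/x.js', 'y.js');
var fromP = candidateFor('./lib', '/l/m/n/o/p.js', 'y.js');
console.log(fromX);
console.log(fromP);
```

The same './lib' entry resolves to a different folder for each caller, which is why the technique is fragile.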
In practice, people have used this as an ad hoc way to bundle dependencies, but this technique is brittle.
Zero Isolation
There is (by regrettable design) only one require.paths array used by all
modules.
As a result, if one node program comes to rely on this behavior, it may permanently and subtly alter the behavior of all other node programs in the same process. As the application stack grows, we tend to assemble functionality, and it is a problem when those parts interact in ways that are difficult to predict.
Addenda: Package Manager Tips
The semantics of Node's require() function were designed to be general enough
to support a number of sane directory structures. Package manager programs
such as dpkg, rpm, and npm will hopefully find it possible to build native
packages from Node modules without modification.
Below we give a suggested directory structure that could work:
Let's say that we wanted to have the folder at
/usr/lib/node/<some-package>/<some-version> hold the contents of a specific
version of a package.
Packages can depend on one another. In order to install package foo, you may
have to install a specific version of package bar. The bar package may itself
have dependencies, and in some cases, these dependencies may even collide or
form cycles.
Since Node looks up the realpath of any modules it loads (that is, resolves
symlinks), and then looks for their dependencies in the node_modules folders
as described above, this situation is very simple to resolve with the
following architecture:
/usr/lib/node/foo/1.2.3/ - Contents of the foo package, version 1.2.3.
/usr/lib/node/bar/4.3.2/ - Contents of the bar package that foo depends on.
/usr/lib/node/foo/1.2.3/node_modules/bar - Symbolic link to /usr/lib/node/bar/4.3.2/.
/usr/lib/node/bar/4.3.2/node_modules/* - Symbolic links to the packages that bar depends on.
Thus, even if a cycle is encountered, or if there are dependency conflicts, every module will be able to get a version of its dependency that it can use.
When the code in the foo package does require('bar'), it will get the version
that is symlinked into /usr/lib/node/foo/1.2.3/node_modules/bar.
Then, when the code in the bar package calls require('quux'), it'll get the
version that is symlinked into /usr/lib/node/bar/4.3.2/node_modules/quux.
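The symlinked layout can be tried out in a temporary directory standing in for /usr/lib/node. This is a sketch under several assumptions: the platform allows creating directory symlinks, the Node version supports fs.mkdtempSync and recursive fs.mkdirSync, and the quux version number 0.0.1 and the install helper are invented for the illustration.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Temporary stand-in for /usr/lib/node.
var root = fs.mkdtempSync(path.join(os.tmpdir(), 'layout-demo-'));

// Hypothetical helper: create <root>/<name>/<version>/index.js.
function install(name, version, source) {
  var dir = path.join(root, name, version);
  fs.mkdirSync(path.join(dir, 'node_modules'), { recursive: true });
  fs.writeFileSync(path.join(dir, 'index.js'), source);
  return dir;
}

var foo = install('foo', '1.2.3',
  "module.exports = 'foo 1.2.3 -> ' + require('bar');");
var bar = install('bar', '4.3.2',
  "module.exports = 'bar 4.3.2 -> ' + require('quux');");
var quux = install('quux', '0.0.1',
  "module.exports = 'quux 0.0.1';");

// Each package sees its dependencies through symlinks in its node_modules.
fs.symlinkSync(bar, path.join(foo, 'node_modules', 'bar'), 'dir');
fs.symlinkSync(quux, path.join(bar, 'node_modules', 'quux'), 'dir');

var chain = require(foo);
console.log(chain);
```

Each package finds its own dependency through the link in its own node_modules folder, so the printed chain runs from foo through bar to quux.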
Furthermore, to make the module lookup process more efficient, rather than
putting packages directly in /usr/lib/node, we could put them in
/usr/lib/node_modules/<name>/<version>. Then node will not bother looking for
missing dependencies in /usr/node_modules or /node_modules.
In order to make modules available to the node REPL, it might be useful to
also add the /usr/lib/node_modules folder to the $NODE_PATH environment
variable. Since the module lookups using node_modules folders are all
relative, and based on the real path of the files making the calls to
require(), the packages themselves can be anywhere.