0
0
mirror of https://github.com/nodejs/node.git synced 2024-11-29 15:06:33 +01:00
nodejs/doc/api/modules.markdown
Ryan Dahl 23b8931b62 Merge branch 'v0.4'
Conflicts:
	src/node.js
	src/node_version.h
2011-06-29 14:50:03 +02:00

14 KiB

Modules

Node has a simple module loading system. In Node, files and modules are in one-to-one correspondence. As an example, foo.js loads the module circle.js in the same directory.

The contents of foo.js:

var circle = require('./circle.js');
console.log( 'The area of a circle of radius 4 is '
           + circle.area(4));

The contents of circle.js:

var PI = Math.PI;

exports.area = function (r) {
  return PI * r * r;
};

exports.circumference = function (r) {
  return 2 * PI * r;
};

The module circle.js has exported the functions area() and circumference(). To export an object, add to the special exports object.

Variables local to the module will be private. In this example the variable PI is private to circle.js.

Core Modules

Node has several modules compiled into the binary. These modules are described in greater detail elsewhere in this documentation.

The core modules are defined in node's source in the lib/ folder.

Core modules are always preferentially loaded if their identifier is passed to require(). For instance, require('http') will always return the built in HTTP module, even if there is a file by that name.

File Modules

If the exact filename is not found, then node will attempt to load the required filename with the added extension of .js, and then .node.

.js files are interpreted as JavaScript text files, and .node files are interpreted as compiled addon modules loaded with dlopen.

A module prefixed with '/' is an absolute path to the file. For example, require('/home/marco/foo.js') will load the file at /home/marco/foo.js.

A module prefixed with './' is relative to the file calling require(). That is, circle.js must be in the same directory as foo.js for require('./circle') to find it.

Without a leading '/' or './' to indicate a file, the module is either a "core module" or is loaded from a node_modules folder.

Loading from node_modules Folders

If the module identifier passed to require() is not a native module, and does not begin with '/', '../', or './', then node starts at the parent directory of the current module, and adds /node_modules, and attempts to load the module from that location.

If it is not found there, then it moves to the parent directory, and so on, until the root of the tree is reached.

For example, if the file at '/home/ry/projects/foo.js' called require('bar.js'), then node would look in the following locations, in this order:

  • /home/ry/projects/node_modules/bar.js
  • /home/ry/node_modules/bar.js
  • /home/node_modules/bar.js
  • /node_modules/bar.js

This allows programs to localize their dependencies, so that they do not clash.

Folders as Modules

It is convenient to organize programs and libraries into self-contained directories, and then provide a single entry point to that library. There are three ways in which a folder may be passed to require() as an argument.

The first is to create a package.json file in the root of the folder, which specifies a main module. An example package.json file might look like this:

{ "name" : "some-library",
  "main" : "./lib/some-library.js" }

If this was in a folder at ./some-library, then require('./some-library') would attempt to load ./some-library/lib/some-library.js.

This is the extent of Node's awareness of package.json files.

If there is no package.json file present in the directory, then node will attempt to load an index.js or index.node file out of that directory. For example, if there was no package.json file in the above example, then require('./some-library') would attempt to load:

  • ./some-library/index.js
  • ./some-library/index.node

Caching

Modules are cached after the first time they are loaded. This means (among other things) that every call to require('foo') will get exactly the same object returned, if it would resolve to the same file.

Multiple calls to require('foo') may not cause the module code to be executed multiple times. This is an important feature. With it, "partially done" objects can be returned, thus allowing transitive dependencies to be loaded even when they would cause cycles.

If you want to have a module execute code multiple times, then export a function, and call that function.

Module Caching Caveats

Modules are cached based on their resolved filename. Since modules may resolve to a different filename based on the location of the calling module (loading from node_modules folders), it is not a guarantee that require('foo') will always return the exact same object, if it would resolve to different files.

module.exports

The exports object is created by the Module system. Sometimes this is not acceptable, many want their module to be an instance of some class. To do this assign the desired export object to module.exports. For example suppose we were making a module called a.js

var EventEmitter = require('events').EventEmitter;

module.exports = new EventEmitter();

// Do some work, and after some time emit
// the 'ready' event from the module itself.
setTimeout(function() {
  module.exports.emit('ready');
}, 1000);

Then in another file we could do

var a = require('./a');
a.on('ready', function() {
  console.log('module a is ready');
});

Note that assignment to module.exports must be done immediately. It cannot be done in any callbacks. This does not work:

x.js:

setTimeout(function() {
  module.exports = { a: "hello" };
}, 0);

y.js:

var x = require('./x');
console.log(x.a);

All Together...

To get the exact filename that will be loaded when require() is called, use the require.resolve() function.

Putting together all of the above, here is the high-level algorithm in pseudocode of what require.resolve does:

require(X) from module at path Y
1. If X is a core module,
   a. return the core module
   b. STOP
2. If X begins with './' or '/' or '../'
   a. LOAD_AS_FILE(Y + X)
   b. LOAD_AS_DIRECTORY(Y + X)
3. LOAD_NODE_MODULES(X, dirname(Y))
4. THROW "not found"

LOAD_AS_FILE(X)
1. If X is a file, load X as JavaScript text.  STOP
2. If X.js is a file, load X.js as JavaScript text.  STOP
3. If X.node is a file, load X.node as binary addon.  STOP

LOAD_AS_DIRECTORY(X)
1. If X/package.json is a file,
   a. Parse X/package.json, and look for "main" field.
   b. let M = X + (json main field)
   c. LOAD_AS_FILE(M)
2. LOAD_AS_FILE(X/index)

LOAD_NODE_MODULES(X, START)
1. let DIRS=NODE_MODULES_PATHS(START)
2. for each DIR in DIRS:
   a. LOAD_AS_FILE(DIR/X)
   b. LOAD_AS_DIRECTORY(DIR/X)

NODE_MODULES_PATHS(START)
1. let PARTS = path split(START)
2. let ROOT = index of first instance of "node_modules" in PARTS, or 0
3. let I = count of PARTS - 1
4. let DIRS = []
5. while I > ROOT,
   a. if PARTS[I] = "node_modules" CONTINUE
   c. DIR = path join(PARTS[0 .. I] + "node_modules")
   b. DIRS = DIRS + DIR
   c. let I = I - 1
6. return DIRS

Loading from the require.paths Folders

In node, require.paths is an array of strings that represent paths to be searched for modules when they are not prefixed with '/', './', or '../'. For example, if require.paths were set to:

[ '/home/micheil/.node_modules',
  '/usr/local/lib/node_modules' ]

Then calling require('bar/baz.js') would search the following locations:

  • 1: '/home/micheil/.node_modules/bar/baz.js'
  • 2: '/usr/local/lib/node_modules/bar/baz.js'

The require.paths array can be mutated at run time to alter this behavior.

It is set initially from the NODE_PATH environment variable, which is a colon-delimited list of absolute paths. In the previous example, the NODE_PATH environment variable might have been set to:

/home/micheil/.node_modules:/usr/local/lib/node_modules

Loading from the require.paths locations is only performed if the module could not be found using the node_modules algorithm above. Global modules are lower priority than bundled dependencies.

Note: Please Avoid Modifying require.paths

require.paths may disappear in a future release.

While it seemed like a good idea at the time, and enabled a lot of useful experimentation, in practice a mutable require.paths list is often a troublesome source of confusion and headaches.

Setting require.paths to some other value does nothing.

This does not do what one might expect:

require.paths = [ '/usr/lib/node' ];

All that does is lose the reference to the actual node module lookup paths, and create a new reference to some other thing that isn't used for anything.

Putting relative paths in require.paths is... weird.

If you do this:

require.paths.push('./lib');

then it does not add the full resolved path to where ./lib is on the filesystem. Instead, it literally adds './lib', meaning that if you do require('y.js') in /a/b/x.js, then it'll look in /a/b/lib/y.js. If you then did require('y.js') in /l/m/n/o/p.js, then it'd look in /l/m/n/o/lib/y.js.

In practice, people have used this as an ad hoc way to bundle dependencies, but this technique is brittle.

Zero Isolation

There is (by regrettable design), only one require.paths array used by all modules.

As a result, if one node program comes to rely on this behavior, it may permanently and subtly alter the behavior of all other node programs in the same process. As the application stack grows, we tend to assemble functionality, and those parts interact in ways that are difficult to predict.

Accessing the main module

When a file is run directly from Node, require.main is set to its module. That means that you can determine whether a file has been run directly by testing

require.main === module

For a file foo.js, this will be true if run via node foo.js, but false if run by require('./foo').

Because module provides a filename property (normally equivalent to __filename), the entry point of the current application can be obtained by checking require.main.filename.

AMD Compatibility

Node's modules have access to a function named define, which may be used to specify the module's return value. This is not necessary in node programs, but is present in the node API in order to provide compatibility with module loaders that use the Asynchronous Module Definition pattern.

The example module above could be structured like so:

define(function (require, exports, module) {
  var PI = Math.PI;

  exports.area = function (r) {
    return PI * r * r;
  };

  exports.circumference = function (r) {
    return 2 * PI * r;
  };
});
  • Only the last argument to define() matters. Other module loaders sometimes use a define(id, [deps], cb) pattern, but since this is not relevant in node programs, the other arguments are ignored.
  • If the define callback returns a value other than undefined, then that value is assigned to module.exports.
  • Important: Despite being called "AMD", the node module loader is in fact synchronous, and using define() does not change this fact. Node executes the callback immediately, so please plan your programs accordingly.

Accessing the main module

When a file is run directly from Node, require.main is set to its module. That means that you can determine whether a file has been run directly by testing

require.main === module

For a file foo.js, this will be true if run via node foo.js, but false if run by require('./foo').

Because module provides a filename property (normally equivalent to __filename), the entry point of the current application can be obtained by checking require.main.filename.

Addenda: Package Manager Tips

The semantics of Node's require() function were designed to be general enough to support a number of sane directory structures. Package manager programs such as dpkg, rpm, and npm will hopefully find it possible to build native packages from Node modules without modification.

Below we give a suggested directory structure that could work:

Let's say that we wanted to have the folder at /usr/lib/node/<some-package>/<some-version> hold the contents of a specific version of a package.

Packages can depend on one another. In order to install package foo, you may have to install a specific version of package bar. The bar package may itself have dependencies, and in some cases, these dependencies may even collide or form cycles.

Since Node looks up the realpath of any modules it loads (that is, resolves symlinks), and then looks for their dependencies in the node_modules folders as described above, this situation is very simple to resolve with the following architecture:

  • /usr/lib/node/foo/1.2.3/ - Contents of the foo package, version 1.2.3.
  • /usr/lib/node/bar/4.3.2/ - Contents of the bar package that foo depends on.
  • /usr/lib/node/foo/1.2.3/node_modules/bar - Symbolic link to /usr/lib/node/bar/4.3.2/.
  • /usr/lib/node/bar/4.3.2/node_modules/* - Symbolic links to the packages that bar depends on.

Thus, even if a cycle is encountered, or if there are dependency conflicts, every module will be able to get a version of its dependency that it can use.

When the code in the foo package does require('bar'), it will get the version that is symlinked into /usr/lib/node/foo/1.2.3/node_modules/bar. Then, when the code in the bar package calls require('quux'), it'll get the version that is symlinked into /usr/lib/node/bar/4.3.2/node_modules/quux.

Furthermore, to make the module lookup process even more optimal, rather than putting packages directly in /usr/lib/node, we could put them in /usr/lib/node_modules/<name>/<version>. Then node will not bother looking for missing dependencies in /usr/node_modules or /node_modules.

In order to make modules available to the node REPL, it might be useful to also add the /usr/lib/node_modules folder to the $NODE_PATH environment variable. Since the module lookups using node_modules folders are all relative, and based on the real path of the files making the calls to require(), the packages themselves can be anywhere.