List content of a directory with Node.js

Just like the dir command in MS Windows (or, more specifically, in DOS) or the ls command on Unix/Linux, we would like to implement a Node.js script that, given a directory, lists the content of the directory with some more information about each entry.

We already know how to get information from the inode of a file or directory, so we only need to generate the list of entries in a directory and then call fs.stat for each entry.

This script will accept (and even require) a path to a directory on the command line, and then it will list the content of the directory. (Without recursion.)

examples/node/read_dir.js

var fs = require('fs');


if (process.argv.length <= 2) {
    console.log("Usage: " + __filename + " path/to/directory");
    process.exit(-1);
}

var path = process.argv[2];

fs.readdir(path, function(err, items) {
    console.log(items);

    for (var i=0; i<items.length; i++) {
        console.log(items[i]);
    }
});

If you have read the article about fetching system information for a single file, then you already know the first part of the script. The interesting new part is this one:

fs.readdir(path, function(err, items) {
    console.log(items);

    for (var i=0; i<items.length; i++) {
        console.log(items[i]);
    }
});

Here we use the readdir method of the fs module, which takes a path and a callback function as parameters. It reads the content of the directory into memory and, when done, executes the callback with 2 parameters. The first one is the error object, which is set only if there was an error. If everything went well, the second parameter will be an array with the names of all the items (files, directories, symbolic links, etc.) that were found in the directory.

At that point, inside the callback function, we can either print the whole array (if we just want to enjoy our success), or we can iterate over the list with a for loop and do something with each item. For example, we could print each item.

The listing contains everything except . which points to the current directory and .. which represents the parent directory.

This is how it looked:

$ node examples/node/read_dir.js ~/work/code-maven.com/examples/

[ 'blocking-read-file.js',
  'node_hello_world.js',
  'node_hello_world_port.js',
  'non-blocking-read-file.js',
  'process_exit.js',
  'raw_command_line_arguments.js',
  'read_dir.js',
  'stats.js' ]
blocking-read-file.js
node_hello_world.js
node_hello_world_port.js
non-blocking-read-file.js
process_exit.js
raw_command_line_arguments.js
read_dir.js
stats.js
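
The script above ignores the err parameter, so a mistyped path would leave items undefined and the for loop would throw an exception. Here is a minimal sketch of checking it first. The helper name print_entries and its done callback are made up for this illustration and are not part of the original script.

```javascript
var fs = require('fs');

// Hypothetical helper (not part of the original script):
// prints every entry of dir, or reports why the directory could not be read.
function print_entries(dir, done) {
    fs.readdir(dir, function(err, items) {
        if (err) {
            // err.code is a short string such as 'ENOENT' (no such directory)
            console.error("Could not read " + dir + ": " + err.code);
            return done(err);
        }
        for (var i = 0; i < items.length; i++) {
            console.log(items[i]);
        }
        done(null, items);
    });
}

// Falls back to the current directory if no path was given.
print_entries(process.argv[2] || '.', function() {});
```

Returning right after reporting the error keeps the rest of the callback from touching items when there is nothing to list.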

Listing the details of each entry

Now that we know how to get the list of entries in a directory, and we already know how to fetch the details of a file, we can combine the two.

examples/node/list_dir_direct.js

var fs = require('fs');

if (process.argv.length <= 2) {
    console.log("Usage: " + __filename + " path/to/directory");
    process.exit(-1);
}

var path = process.argv[2];

fs.readdir(path, function(err, items) {
    for (var i=0; i<items.length; i++) {
        var file = path + '/' + items[i];
        console.log("Start: " + file);

        fs.stat(file, function(err, stats) {
            console.log(file);
            console.log(stats["size"]);
        });
    }
});


This code is quite straightforward. It is also wrong, as we'll see soon.

Inside the callback of the readdir method, we have the for loop. On every iteration of that loop we print out the name of the current file (after we have attached it to the full path of the directory), mostly for debugging purposes, and we call fs.stat. That method in turn accepts a callback function. In that function we print out the name of the file, this time as part of the result, and then print out the size of the thing. (We could print out all the other details as we did in the other article, but the size will be enough for now.)

The output:

$ node examples/node/list_dir_direct.js ~/work/code-maven.com/examples/

Start: /home/gabor/work/code-maven.com/examples//blocking-read-file.js
Start: /home/gabor/work/code-maven.com/examples//node_hello_world.js
Start: /home/gabor/work/code-maven.com/examples//node_hello_world_port.js
Start: /home/gabor/work/code-maven.com/examples//non-blocking-read-file.js
Start: /home/gabor/work/code-maven.com/examples//process_exit.js
Start: /home/gabor/work/code-maven.com/examples//raw_command_line_arguments.js
Start: /home/gabor/work/code-maven.com/examples//read_dir.js
Start: /home/gabor/work/code-maven.com/examples//stats.js

/home/gabor/work/code-maven.com/examples//stats.js
97
/home/gabor/work/code-maven.com/examples//stats.js
243
/home/gabor/work/code-maven.com/examples//stats.js
270
/home/gabor/work/code-maven.com/examples//stats.js
151
/home/gabor/work/code-maven.com/examples//stats.js
18
/home/gabor/work/code-maven.com/examples//stats.js
324
/home/gabor/work/code-maven.com/examples//stats.js
27
/home/gabor/work/code-maven.com/examples//stats.js
1382

The debugging printout printed the names as expected, but then inside the callback of fs.stat() we keep printing out the same filename. Compare the results to the output of ls -l:

$ ls -l ~/work/code-maven.com/examples/
total 64
-rw-r--r--  1 gabor  staff    97 Jan 29 14:26 blocking-read-file.js
-rw-r--r--  1 gabor  staff   243 Jan 27 12:34 node_hello_world.js
-rw-r--r--  1 gabor  staff   270 Jan 27 12:34 node_hello_world_port.js
-rw-r--r--  1 gabor  staff   151 Jan 29 14:26 non-blocking-read-file.js
-rw-r--r--  1 gabor  staff    18 Jan 31 08:24 process_exit.js
-rw-r--r--  1 gabor  staff    27 Jan 29 14:54 raw_command_line_arguments.js
-rw-r--r--  1 gabor  staff   324 Jan 31 15:26 read_dir.js
-rw-r--r--  1 gabor  staff  1382 Jan 31 10:45 stats.js

The sizes match the ones reported by ls, because they were printed in the same order as we called fs.stat(), but the content of the file variable was the same in every callback. This happens because a variable declared with var belongs to the whole enclosing function, not to a single iteration of the loop, so all the callbacks share one file variable. By the time the first callback was executed, the loop had already finished and the file variable held the last entry in the directory.
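
The same effect can be reproduced without touching the file system. In this small sketch setTimeout stands in for fs.stat (an assumption made only to keep the example self-contained), and the names and seen arrays are invented for the illustration.

```javascript
// setTimeout stands in for fs.stat here: both run their callback later,
// after the for loop has already finished.
var names = ['a.js', 'b.js', 'c.js'];   // made-up file names
var seen = [];                          // what each callback found in file

for (var i = 0; i < names.length; i++) {
    var file = names[i];                // one shared variable, not one per iteration
    setTimeout(function() {
        seen.push(file);
        console.log(file);              // prints c.js three times
    }, 0);
}
```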

So if we want to combine the name of the file and the result of the fs.stat() call, then we need to rely on the order of the calls. But can we rely on that? In this particular case of calling stat on the items of a single directory it might work, as we can expect the events to be handled one after the other. If the operation was more complex, and especially if there were internal callbacks as well, then we could no longer rely on the callbacks being called in the same order as the operations were started.
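
To see that callbacks need not arrive in the order in which the calls were made, here is a small sketch in which setTimeout with different delays plays the role of slow and fast operations. The delays, the labels and the start helper are all invented for this illustration.

```javascript
var finished = [];   // labels in the order their callbacks ran

// Hypothetical stand-in for an asynchronous operation that takes delay ms.
function start(label, delay) {
    setTimeout(function() {
        finished.push(label);
        console.log("Done: " + label);
    }, delay);
}

start('first', 60);
start('second', 0);
start('third', 30);
// The callbacks run in this order: second, third, first
```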

So we need to find a way to pass the file parameter to the internal callback.

Generate callbacks

In this solution, instead of adding a hard-coded callback function we will call a function named generate_callback() that will generate a callback for us.

So every time we are about to call fs.stat(), before fs.stat() is actually executed, JavaScript will call the generate_callback() function with the current value of file. generate_callback() will create a new function and return it. This newly generated function will become the callback of the fs.stat() method.

examples/node/list_dir_generate.js

var fs = require('fs');

if (process.argv.length <= 2) {
    console.log("Usage: " + __filename + " path/to/directory");
    process.exit(-1);
}

var path = process.argv[2];

fs.readdir(path, function(err, items) {
    for (var i=0; i<items.length; i++) {
        var file = path + '/' + items[i];

        console.log("Start: " + file);
        fs.stat(file, generate_callback(file));
    }
});

function generate_callback(file) {
    return function(err, stats) {
        console.log(file);
        console.log(stats["size"]);
    };
}

The result:

$ node examples/node/list_dir_generate.js ~/work/code-maven.com/examples/
Start: /Users/gabor/work/code-maven.com/examples//blocking-read-file.js
Start: /Users/gabor/work/code-maven.com/examples//node_hello_world.js
Start: /Users/gabor/work/code-maven.com/examples//node_hello_world_port.js
Start: /Users/gabor/work/code-maven.com/examples//non-blocking-read-file.js
Start: /Users/gabor/work/code-maven.com/examples//process_exit.js
Start: /Users/gabor/work/code-maven.com/examples//raw_command_line_arguments.js
Start: /Users/gabor/work/code-maven.com/examples//read_dir.js
Start: /Users/gabor/work/code-maven.com/examples//stats.js

/Users/gabor/work/code-maven.com/examples//blocking-read-file.js
97
/Users/gabor/work/code-maven.com/examples//node_hello_world.js
243
/Users/gabor/work/code-maven.com/examples//node_hello_world_port.js
270
/Users/gabor/work/code-maven.com/examples//non-blocking-read-file.js
151
/Users/gabor/work/code-maven.com/examples//process_exit.js
18
/Users/gabor/work/code-maven.com/examples//raw_command_line_arguments.js
27
/Users/gabor/work/code-maven.com/examples//read_dir.js
324
/Users/gabor/work/code-maven.com/examples//stats.js
1382

The variable file that is now seen by the callback function is the parameter of generate_callback(). It holds the value file had when the callback was generated, which is the name of the file that fs.stat() received as its parameter.

Nameless function generator

Finally let's look at a solution in which we have eliminated the need to have an external function called generate_callback.

The function is still there, it just does not have a name. Instead of declaring it separately, we have included it in the expression in fs.stat(). I am not sure if I like this or if I prefer the lengthier, but probably more readable version with the generate_callback function.

examples/node/list_dir_noname.js

var fs = require('fs');

if (process.argv.length <= 2) {
    console.log("Usage: " + __filename + " path/to/directory");
    process.exit(-1);
}

var path = process.argv[2];

fs.readdir(path, function(err, items) {
    for (var i=0; i<items.length; i++) {
        var file = path + '/' + items[i];

        console.log("Start: " + file);
        fs.stat(file, function(f) {
            return function(err, stats) {
                console.log(f);
                console.log(stats["size"]);
            };
        }(file));
    }
});
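
For comparison, here is a sketch of a variation that is not in the original scripts. In newer versions of JavaScript a variable declared with let inside the loop body is created anew for every iteration, so no generator function is needed. The helper name list_sizes and its report callback are made up for this example. It also uses path.join to avoid the doubled slash in the output and checks the err parameters.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical helper: calls report(file, size) once for every entry of dir.
function list_sizes(dir, report) {
    fs.readdir(dir, function(err, items) {
        if (err) {
            console.error("Could not read " + dir + ": " + err.code);
            return;
        }
        for (var i = 0; i < items.length; i++) {
            // let creates a fresh file variable for every iteration,
            // so each callback below keeps its own value.
            let file = path.join(dir, items[i]);
            fs.stat(file, function(err, stats) {
                if (err) {
                    console.error("Could not stat " + file + ": " + err.code);
                    return;
                }
                report(file, stats.size);
            });
        }
    });
}

// Falls back to the current directory if no path was given.
list_sizes(process.argv[2] || '.', function(file, size) {
    console.log(file);
    console.log(size);
});
```

Which form reads better is a matter of taste. The let version is the shortest, but it needs a Node.js version that supports block-scoped variables.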
