file-entry-cache

file-entry-cache

A lightweight cache for file metadata, ideal for processes that work on a specific set of files and only need to reprocess files that have changed since the last run

codecov tests npm npm license

Features

  • Lightweight cache for file metadata
  • Ideal for processes that work on a specific set of files
  • Persists cache to Disk via reconcile() or persistInterval on cache options.
  • Uses checksum to determine if a file has changed
  • Supports relative and absolute paths
  • Ability to rename keys in the cache. Useful when renaming directories.
  • ESM and CommonJS support with Typescript

Table of Contents

Installation

npm install file-entry-cache

Getting Started

import fileEntryCache from 'file-entry-cache';
const cache = fileEntryCache.create('cache1');
let fileDescriptor = cache.getFileDescriptor('file.txt');
console.log(fileDescriptor.changed); // true as it is the first time
fileDescriptor = cache.getFileDescriptor('file.txt');
console.log(fileDescriptor.changed); // false as it has not changed
// do something to change the file
fs.writeFileSync('file.txt', 'new data foo bar');
// check if the file has changed
fileDescriptor = cache.getFileDescriptor('file.txt');
console.log(fileDescriptor.changed); // true

Save it to Disk and Reconsile files that are no longer found

import fileEntryCache from 'file-entry-cache';
const cache = fileEntryCache.create('cache1');
let fileDescriptor = cache.getFileDescriptor('file.txt');
console.log(fileDescriptor.changed); // true as it is the first time
fileEntryCache.reconcile(); // save the cache to disk and remove files that are no longer found

Load the cache from a file:

import fileEntryCache from 'file-entry-cache';
const cache = fileEntryCache.createFromFile('/path/to/cache/file');
let fileDescriptor = cache.getFileDescriptor('file.txt');
console.log(fileDescriptor.changed); // false as it has not changed from the saved cache.

Changes from v9 to v10

There have been many features added and changes made to the file-entry-cache class. Here are the main changes:

  • Added cache object to the options to allow for more control over the cache
  • Added hashAlgorithm to the options to allow for different checksum algorithms. Note that if you load from file it most likely will break if the value was something before.
  • Updated more on using Relative or Absolute paths. We now support both on getFileDescriptor(). You can read more on this in the Get File Descriptor section.
  • Migrated to Typescript with ESM and CommonJS support. This allows for better type checking and support for both ESM and CommonJS.
  • Once options are passed in they get assigned as properties such as hashAlgorithm and currentWorkingDirectory. This allows for better control and access to the options. For the Cache options they are assigned to cache such as cache.ttl and cache.lruSize.
  • Added cache.persistInterval to allow for saving the cache to disk at a specific interval. This will save the cache to disk at the interval specified instead of calling reconsile() to save. (off by default)
  • Added getFileDescriptorsByPath(filePath: string): FileEntryDescriptor[] to get all the file descriptors that start with the path specified. This is useful when you want to get all the files in a directory or a specific path.
  • Added renameAbsolutePathKeys(oldPath: string, newPath: string): void will rename the keys in the cache from the old path to the new path. This is useful when you rename a directory and want to update the cache without reanalyzing the files.
  • Using flat-cache v6 which is a major update. This allows for better performance and more control over the cache.
  • On FileEntryDescriptor.meta if using typescript you need to use the meta.data to set additional information. This is to allow for better type checking and to avoid conflicts with the meta object which was any.

Global Default Functions

  • create(cacheId: string, cacheDirectory?: string, useCheckSum?: boolean, currentWorkingDirectory?: string) - Creates a new instance of the FileEntryCache class
  • createFromFile(cachePath: string, useCheckSum?: boolean, currentWorkingDirectory?: string) - Creates a new instance of the FileEntryCache class and loads the cache from a file.

FileEntryCache Options (FileEntryCacheOptions)

  • currentWorkingDirectory? - The current working directory. Used when resolving relative paths.
  • useCheckSum? - If true it will use a checksum to determine if the file has changed. Default is false
  • hashAlgorithm? - The algorithm to use for the checksum. Default is md5 but can be any algorithm supported by crypto.createHash
  • cache.ttl? - The time to live for the cache in milliseconds. Default is 0 which means no expiration
  • cache.lruSize? - The number of items to keep in the cache. Default is 0 which means no limit
  • cache.useClone? - If true it will clone the data before returning it. Default is false
  • cache.expirationInterval? - The interval to check for expired items in the cache. Default is 0 which means no expiration
  • cache.persistInterval? - The interval to save the data to disk. Default is 0 which means no persistence
  • cache.cacheDir? - The directory to save the cache files. Default is ./cache
  • cache.cacheId? - The id of the cache. Default is cache1
  • cache.parse? - The function to parse the data. Default is flatted.parse
  • cache.stringify? - The function to stringify the data. Default is flatted.stringify

API

  • constructor(options?: FileEntryCacheOptions) - Creates a new instance of the FileEntryCache class
  • useCheckSum: boolean - If true it will use a checksum to determine if the file has changed. Default is false
  • hashAlgorithm: string - The algorithm to use for the checksum. Default is md5 but can be any algorithm supported by crypto.createHash
  • currentWorkingDirectory: string - The current working directory. Used when resolving relative paths.
  • getHash(buffer: Buffer): string - Gets the hash of a buffer used for checksums
  • createFileKey(filePath: string): string - Creates a key for the file path. This is used to store the data in the cache based on relative or absolute paths.
  • deleteCacheFile(filePath: string): void - Deletes the cache file
  • destroy(): void - Destroys the cache. This will also delete the cache file. If using cache persistence it will stop the interval.
  • removeEntry(filePath: string): void - Removes an entry from the cache. This can be relative or absolute paths.
  • reconcile(): void - Saves the cache to disk and removes any files that are no longer found.
  • hasFileChanged(filePath: string): boolean - Checks if the file has changed. This will return true if the file has changed.
  • getFileDescriptor(filePath: string, options?: { useCheckSum?: boolean, currentWorkingDirectory?: string }): FileEntryDescriptor - Gets the file descriptor for the file. Please refer to the entire section on Get File Descriptor for more information.
  • normalizeEntries(entries: FileEntryDescriptor[]): FileEntryDescriptor[] - Normalizes the entries to have the correct paths. This is used when loading the cache from disk.
  • analyzeFiles(files: string[]) will return AnalyzedFiles object with changedFiles, notFoundFiles, and notChangedFiles as FileDescriptor arrays.
  • getUpdatedFiles(files: string[]) will return an array of FileEntryDescriptor objects that have changed.
  • getFileDescriptorsByPath(filePath: string): FileEntryDescriptor[] will return an array of FileEntryDescriptor objects that starts with the path specified.
  • renameAbsolutePathKeys(oldPath: string, newPath: string): void - Renames the keys in the cache from the old path to the new path. This is useful when you rename a directory and want to update the cache without reanalyzing the files.

Get File Descriptor

The getFileDescriptor(filePath: string, options?: { useCheckSum?: boolean, currentWorkingDirectory?: string }): FileEntryDescriptor function is used to get the file descriptor for the file. This function will return a FileEntryDescriptor object that has the following properties:

  • key: string - The key for the file. This is the relative or absolute path of the file.
  • changed: boolean - If the file has changed since the last time it was analyzed.
  • notFound: boolean - If the file was not found.
  • meta: FileEntryMeta - The meta data for the file. This has the following prperties: size, mtime, ctime, hash, data. Note that data is an object that can be used to store additional information.
  • err - If there was an error analyzing the file.

We have added the ability to use relative or absolute paths. If you pass in a relative path it will use the currentWorkingDirectory to resolve the path. If you pass in an absolute path it will use the path as is. This is useful when you want to use relative paths but also want to use absolute paths.

If you do not pass in currentWorkingDirectory in the class options or in the getFileDescriptor function it will use the process.cwd() as the default currentWorkingDirectory.

const fileEntryCache = new FileEntryCache();
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt', { currentWorkingDirectory: '/path/to/directory' });

Since this is a relative path it will use the currentWorkingDirectory to resolve the path. If you want to use an absolute path you can do the following:

const fileEntryCache = new FileEntryCache();
const filePath = path.resolve('/path/to/directory', 'file.txt');
const fileDescriptor = fileEntryCache.getFileDescriptor(filePath);

This will save the key as the absolute path.

If there is an error when trying to get the file descriptor it will return an ``notFoundanderr` property with the error.

const fileEntryCache = new FileEntryCache();
const fileDescriptor = fileEntryCache.getFileDescriptor('no-file');
if (fileDescriptor.err) {
    console.error(fileDescriptor.err);
}

if (fileDescriptor.notFound) {
    console.error('File not found');
}

Using Checksums to Determine if a File has Changed (useCheckSum)

By default the useCheckSum is false. This means that the FileEntryCache will use the mtime and ctime to determine if the file has changed. If you set useCheckSum to true it will use a checksum to determine if the file has changed. This is useful when you want to make sure that the file has not changed at all.

const fileEntryCache = new FileEntryCache();
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt', { useCheckSum: true });

You can pass useCheckSum in the FileEntryCache options, as a property .useCheckSum to make it default for all files, or in the getFileDescriptor function. Here is an example where you set it globally but then override it for a specific file:

const fileEntryCache = new FileEntryCache({ useCheckSum: true });
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt', { useCheckSum: false });

Setting Additional Meta Data

In the past we have seen people do random values on the meta object. This can cause issues with the meta object. To avoid this we have data which can be anything.

const fileEntryCache = new FileEntryCache();
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt');
fileDescriptor.meta.data = { myData: 'myData' }; //anything you want

How to Contribute

You can contribute by forking the repo and submitting a pull request. Please make sure to add tests and update the documentation. To learn more about how to contribute go to our main README https://github.com/jaredwray/cacheable. This will talk about how to Open a Pull Request, Ask a Question, or Post an Issue.

License and Copyright

MIT © Jared Wray