file-entry-cache
A lightweight cache for file metadata, ideal for processes that work on a specific set of files and only need to reprocess files that have changed since the last run.
Features
- Lightweight cache for file metadata
- Ideal for processes that work on a specific set of files
- Persists cache to disk via reconcile() or persistInterval on cache options
- Uses checksum to determine if a file has changed
- Supports relative and absolute paths with configurable current working directory
- Portable cache files when using relative paths
- ESM and CommonJS support with TypeScript
Table of Contents
- Installation
- Getting Started
- Changes from v10 to v11
- Changes from v9 to v10
- Global Default Functions
- FileEntryCache Options (FileEntryCacheOptions)
- API
- Get File Descriptor
- Path Security and Traversal Prevention
- Using Checksums to Determine if a File has Changed (useCheckSum)
- Setting Additional Meta Data
- How to Contribute
- License and Copyright
Installation
npm install file-entry-cache
Getting Started
import fs from 'node:fs';
import fileEntryCache from 'file-entry-cache';
const cache = fileEntryCache.create('cache1');
// Using relative paths
let fileDescriptor = cache.getFileDescriptor('./src/file.txt');
console.log(fileDescriptor.changed); // true as it is the first time
console.log(fileDescriptor.key); // './src/file.txt' (stored as provided)
fileDescriptor = cache.getFileDescriptor('./src/file.txt');
console.log(fileDescriptor.changed); // false as it has not changed
// do something to change the file
fs.writeFileSync('./src/file.txt', 'new data foo bar');
// check if the file has changed
fileDescriptor = cache.getFileDescriptor('./src/file.txt');
console.log(fileDescriptor.changed); // true
Save it to disk and reconcile files that are no longer found:
import fileEntryCache from 'file-entry-cache';
const cache = fileEntryCache.create('cache1');
let fileDescriptor = cache.getFileDescriptor('./src/file.txt');
console.log(fileDescriptor.changed); // true as it is the first time
cache.reconcile(); // save the cache to disk and remove files that are no longer found
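The idea behind reconcile() can be pictured with Node built-ins alone. The sketch below is an assumption about the general technique (drop entries whose files are gone, then write the rest to disk) and not the library's implementation; reconcileSketch and the JSON file layout are hypothetical, and it uses CommonJS require so it runs as a plain script.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: drop entries whose file no longer exists, then save the rest as JSON.
function reconcileSketch(entries, cwd, cacheFile) {
  const kept = {};
  for (const [key, meta] of Object.entries(entries)) {
    if (fs.existsSync(path.resolve(cwd, key))) {
      kept[key] = meta;
    }
  }
  fs.mkdirSync(path.dirname(cacheFile), { recursive: true });
  fs.writeFileSync(cacheFile, JSON.stringify(kept));
  return kept;
}

// Demo in a throwaway directory: one tracked file exists, the other was never created.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'reconcile-sketch-'));
fs.writeFileSync(path.join(demoDir, 'kept.txt'), 'still here');
const reconciled = reconcileSketch(
  { './kept.txt': { size: 10 }, './deleted.txt': { size: 3 } },
  demoDir,
  path.join(demoDir, '.cache', 'cache1.json'),
);
console.log(Object.keys(reconciled));
```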
Load the cache from a file:
import fileEntryCache from 'file-entry-cache';
const cache = fileEntryCache.createFromFile('/path/to/cache/file');
let fileDescriptor = cache.getFileDescriptor('./src/file.txt');
console.log(fileDescriptor.changed); // false as it has not changed from the saved cache.
Changes from v10 to v11
BREAKING CHANGES:
- strictPaths now defaults to true - Path traversal protection is enabled by default for security. To restore v10 behavior, explicitly set strictPaths: false
NEW FEATURES:
- Added cwd option - You can now specify a custom current working directory for resolving relative paths
- Added strictPaths option - Provides protection against path traversal attacks (enabled by default)
- Improved cache portability - When using relative paths with the same cwd, cache files are portable across different environments
Changes from v9 to v10
Many features were added and changes made to the file-entry-cache class in v10. Here are the main changes:
- Added cache object to the options to allow for more control over the cache
- Added hashAlgorithm to the options to allow for different checksum algorithms. Note that loading a cache file that was saved with a different algorithm will most likely break.
- Migrated to TypeScript with ESM and CommonJS support. This allows for better type checking and support for both ESM and CommonJS.
- Once options are passed in they get assigned as properties such as hashAlgorithm. The cache options are assigned to cache, such as cache.ttl and cache.lruSize.
- Added cache.persistInterval to allow for saving the cache to disk at a specific interval. This will save the cache to disk at the interval specified instead of calling reconcile() to save. (off by default)
- Added getFileDescriptorsByPath(filePath: string): FileEntryDescriptor[] to get all the file descriptors that start with the path specified. This is useful when you want to get all the files in a directory or a specific path.
- Using flat-cache v6 which is a major update. This allows for better performance and more control over the cache.
- On FileEntryDescriptor.meta, if using TypeScript you need to use meta.data to set additional information. This is to allow for better type checking and to avoid conflicts with the meta object, which was any.
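Interval-based persistence, as described for cache.persistInterval, can be pictured roughly as follows. This is a minimal sketch under assumptions, not the library's code: IntervalPersister, its save() method, and the JSON layout are hypothetical, and only Node built-ins are used.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical class: saves an in-memory map to disk on a timer until destroy() is called.
class IntervalPersister {
  constructor(cacheFile, intervalMs) {
    this.cacheFile = cacheFile;
    this.entries = new Map();
    this.timer = undefined;
    if (intervalMs > 0) {
      // 0 means "off", mirroring the documented default
      this.timer = setInterval(() => this.save(), intervalMs);
      this.timer.unref(); // do not keep the process alive just for persistence
    }
  }

  save() {
    fs.writeFileSync(this.cacheFile, JSON.stringify([...this.entries]));
  }

  destroy() {
    clearInterval(this.timer); // stop the interval, then clear memory
    this.timer = undefined;
    this.entries.clear();
  }
}

const persistDir = fs.mkdtempSync(path.join(os.tmpdir(), 'persist-sketch-'));
const persistFile = path.join(persistDir, 'cache.json');
const persister = new IntervalPersister(persistFile, 1000);
persister.entries.set('./src/file.txt', { size: 42 });
persister.save(); // the timer would do this every 1000 ms; called directly for the demo
persister.destroy();
```

The explicit destroy() matters in long-running processes: without it the timer keeps firing after the cache is no longer needed.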
Global Default Functions
- create(cacheId: string, cacheDirectory?: string, useCheckSum?: boolean, cwd?: string) - Creates a new instance of the FileEntryCache class
- createFromFile(cachePath: string, useCheckSum?: boolean, cwd?: string) - Creates a new instance of the FileEntryCache class and loads the cache from a file.
FileEntryCache Options (FileEntryCacheOptions)
- useModifiedTime? - If true it will use the modified time to determine if the file has changed. Default is true
- useCheckSum? - If true it will use a checksum to determine if the file has changed. Default is false
- hashAlgorithm? - The algorithm to use for the checksum. Default is md5 but can be any algorithm supported by crypto.createHash
- cwd? - The current working directory for resolving relative paths. Default is process.cwd()
- strictPaths? - If true restricts file access to within cwd boundaries, preventing path traversal attacks. Default is true
- cache.ttl? - The time to live for the cache in milliseconds. Default is 0 which means no expiration
- cache.lruSize? - The number of items to keep in the cache. Default is 0 which means no limit
- cache.useClone? - If true it will clone the data before returning it. Default is false
- cache.expirationInterval? - The interval to check for expired items in the cache. Default is 0 which means no expiration
- cache.persistInterval? - The interval to save the data to disk. Default is 0 which means no persistence
- cache.cacheDir? - The directory to save the cache files. Default is ./cache
- cache.cacheId? - The id of the cache. Default is cache1
- cache.parse? - The function to parse the data. Default is flatted.parse
- cache.stringify? - The function to stringify the data. Default is flatted.stringify
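Since hashAlgorithm accepts any algorithm supported by crypto.createHash, it can help to see what that means in plain Node. The sketch below only uses Node's crypto module; hashBuffer is a hypothetical helper, and the hex output format is an assumption for illustration rather than a statement about what getHash returns.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: hex digest of a buffer with a configurable algorithm.
function hashBuffer(buffer, algorithm = 'md5') {
  return crypto.createHash(algorithm).update(buffer).digest('hex');
}

const sample = Buffer.from('hello');
const md5Digest = hashBuffer(sample); // default algorithm
const sha256Digest = hashBuffer(sample, 'sha256');

// crypto.getHashes() lists the algorithm names this Node build accepts
const availableAlgorithms = crypto.getHashes();

console.log(md5Digest.length, sha256Digest.length, availableAlgorithms.length);
```

Digests from different algorithms are not comparable, which is why a saved cache should be read back with the same hashAlgorithm it was written with.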
API
- constructor(options?: FileEntryCacheOptions) - Creates a new instance of the FileEntryCache class
- useCheckSum: boolean - If true it will use a checksum to determine if the file has changed. Default is false
- hashAlgorithm: string - The algorithm to use for the checksum. Default is md5 but can be any algorithm supported by crypto.createHash
- getHash(buffer: Buffer): string - Gets the hash of a buffer used for checksums
- cwd: string - The current working directory for resolving relative paths. Default is process.cwd()
- strictPaths: boolean - If true restricts file access to within cwd boundaries. Default is true
- createFileKey(filePath: string): string - Returns the cache key for the file path (returns the path exactly as provided).
- deleteCacheFile(): boolean - Deletes the cache file from disk
- destroy(): void - Destroys the cache. This will clear the cache in memory. If using cache persistence it will stop the interval.
- removeEntry(filePath: string): void - Removes an entry from the cache.
- reconcile(): void - Saves the cache to disk and removes any files that are no longer found.
- hasFileChanged(filePath: string): boolean - Checks if the file has changed. This will return true if the file has changed.
- getFileDescriptor(filePath: string, options?: { useModifiedTime?: boolean, useCheckSum?: boolean }): FileEntryDescriptor - Gets the file descriptor for the file. Please refer to the entire section on Get File Descriptor for more information.
- normalizeEntries(files?: string[]): FileDescriptor[] - Normalizes the entries. If no files are provided, it will return all cached entries.
- analyzeFiles(files: string[]) - Returns an AnalyzedFiles object with changedFiles, notFoundFiles, and notChangedFiles as FileDescriptor arrays.
- getUpdatedFiles(files: string[]) - Returns an array of FileEntryDescriptor objects that have changed.
- getFileDescriptorsByPath(filePath: string): FileEntryDescriptor[] - Returns an array of FileEntryDescriptor objects whose paths start with the path prefix specified.
- getAbsolutePath(filePath: string): string - Resolves a relative path to absolute using the configured cwd. Returns absolute paths unchanged. When strictPaths is enabled, throws an error if the path resolves outside cwd.
- getAbsolutePathWithCwd(filePath: string, cwd: string): string - Resolves a relative path to absolute using a custom working directory. When strictPaths is enabled, throws an error if the path resolves outside the provided cwd.
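To make the shapes returned by analyzeFiles and getFileDescriptorsByPath more concrete, here is a small sketch that works on plain descriptor-like objects. Both helpers are hypothetical, and the rule that notFound takes precedence over changed is an assumption made for this illustration.

```javascript
// Hypothetical helper: split descriptors into the three groups named above.
function partitionDescriptors(descriptors) {
  const result = { changedFiles: [], notFoundFiles: [], notChangedFiles: [] };
  for (const descriptor of descriptors) {
    if (descriptor.notFound) {
      result.notFoundFiles.push(descriptor);
    } else if (descriptor.changed) {
      result.changedFiles.push(descriptor);
    } else {
      result.notChangedFiles.push(descriptor);
    }
  }
  return result;
}

// Hypothetical helper: keep descriptors whose key starts with a path prefix.
function filterByPathPrefix(descriptors, prefix) {
  return descriptors.filter((descriptor) => descriptor.key.startsWith(prefix));
}

const sampleDescriptors = [
  { key: './src/a.js', changed: true },
  { key: './src/b.js', changed: false },
  { key: './docs/gone.md', changed: true, notFound: true },
];
const groups = partitionDescriptors(sampleDescriptors);
const srcOnly = filterByPathPrefix(sampleDescriptors, './src/');
console.log(groups.changedFiles.length, groups.notFoundFiles.length, srcOnly.length);
```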
Get File Descriptor
The getFileDescriptor(filePath: string, options?: { useCheckSum?: boolean, useModifiedTime?: boolean }): FileEntryDescriptor function is used to get the file descriptor for the file. This function will return a FileEntryDescriptor object that has the following properties:
- key: string - The cache key for the file. This is exactly the path that was provided (relative or absolute).
- changed: boolean - If the file has changed since the last time it was analyzed.
- notFound: boolean - If the file was not found.
- meta: FileEntryMeta - The meta data for the file. This has the following properties: size, mtime, hash, data. Note that data is an object that can be used to store additional information.
- err - If there was an error analyzing the file.
Path Handling and Current Working Directory
The cache stores paths exactly as they are provided (relative or absolute). When checking if files have changed, relative paths are resolved using the configured cwd (current working directory):
import fileEntryCache, { FileEntryCache } from 'file-entry-cache';
// Default: uses process.cwd()
const cache1 = fileEntryCache.create('cache1');
// Custom working directory
const cache2 = fileEntryCache.create('cache2', './cache', false, '/project/root');
// Or with options object
const cache3 = new FileEntryCache({ cwd: '/project/root' });
// The cache key is always the provided path
const descriptor = cache2.getFileDescriptor('./src/file.txt');
console.log(descriptor.key); // './src/file.txt'
// But file operations resolve from: '/project/root/src/file.txt'
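The split between the stored key and the path used for file access can be reproduced with Node's path module. This is a sketch of the idea only: resolveForLookup is a hypothetical helper, and path.posix is used so the result is the same on every platform.

```javascript
const path = require('node:path');

// Hypothetical helper: the key stays as provided; only the lookup path depends on cwd.
function resolveForLookup(key, cwd) {
  return path.posix.isAbsolute(key) ? key : path.posix.resolve(cwd, key);
}

const key = './src/file.txt';
const onMachineA = resolveForLookup(key, '/home/user/project');
const onMachineB = resolveForLookup(key, '/workspace/project');
const alreadyAbsolute = resolveForLookup('/etc/hosts', '/workspace/project');

console.log(key, onMachineA, onMachineB, alreadyAbsolute);
```

Because the key never contains the cwd, two machines with different project locations still agree on the key.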
Cache Portability
Using relative paths with a consistent cwd (defaults to process.cwd()) makes cache files portable across different machines and environments. This is especially useful for CI/CD pipelines and team development.
// On machine A (project at /home/user/project)
const cacheA = fileEntryCache.create('build-cache', './cache', false, '/home/user/project');
cacheA.getFileDescriptor('./src/index.js'); // Resolves to /home/user/project/src/index.js
cacheA.reconcile();
// On machine B (project at /workspace/project)
const cacheB = fileEntryCache.create('build-cache', './cache', false, '/workspace/project');
cacheB.getFileDescriptor('./src/index.js'); // Resolves to /workspace/project/src/index.js
// Cache hit! File hasn't changed since machine A
Maximum Portability with Checksums
For maximum cache portability across different environments, use checksums (useCheckSum: true) along with relative paths and cwd, which defaults to process.cwd(). This ensures that cache validity is determined by file content rather than modification times, which can vary across systems:
// Development machine
const devCache = fileEntryCache.create(
'.buildcache',
'./cache', // cache directory
true // Use checksums for content-based comparison
);
// Process files using relative paths
const descriptor = devCache.getFileDescriptor('./src/index.js');
if (descriptor.changed) {
console.log('Building ./src/index.js...');
// Build process here
}
devCache.reconcile(); // Save cache
// CI/CD Pipeline or another developer's machine
const ciCache = fileEntryCache.create(
'.buildcache',
'./node_modules/.cache',
true, // Same checksum setting
process.cwd() // Different absolute path, same relative structure
);
// Same relative path works across environments
const descriptor2 = ciCache.getFileDescriptor('./src/index.js');
if (!descriptor2.changed) {
console.log('Using cached result for ./src/index.js');
// Skip rebuild - file content unchanged
}
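Why relative keys plus content hashes travel well can be checked without the library. The sketch below simulates two checkouts in temporary directories; hashRelative and the saved object are hypothetical stand-ins for a cache entry, and md5 is used only because it is the documented default algorithm.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: content hash of a file addressed by a relative key under cwd.
function hashRelative(key, cwd) {
  const content = fs.readFileSync(path.resolve(cwd, key));
  return crypto.createHash('md5').update(content).digest('hex');
}

// Two checkouts of the "same project" at different absolute locations
const checkoutA = fs.mkdtempSync(path.join(os.tmpdir(), 'checkout-a-'));
const checkoutB = fs.mkdtempSync(path.join(os.tmpdir(), 'checkout-b-'));
for (const root of [checkoutA, checkoutB]) {
  fs.mkdirSync(path.join(root, 'src'));
  fs.writeFileSync(path.join(root, 'src', 'index.js'), 'export default 1;\n');
}

// "Machine A" records a hash under the relative key
const saved = { './src/index.js': hashRelative('./src/index.js', checkoutA) };

// "Machine B" recomputes it from its own cwd and compares
const unchangedOnB = saved['./src/index.js'] === hashRelative('./src/index.js', checkoutB);

// Editing the file on B is detected regardless of timestamps
fs.writeFileSync(path.join(checkoutB, 'src', 'index.js'), 'export default 2;\n');
const changedOnB = saved['./src/index.js'] !== hashRelative('./src/index.js', checkoutB);

console.log(unchangedOnB, changedOnB);
```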
Handling Project Relocations
The cache remains valid even when projects are moved or renamed:
// Original location: /projects/my-app
const cache1 = fileEntryCache.create('.cache', './cache', true, '/projects/my-app');
cache1.getFileDescriptor('./src/app.js');
cache1.reconcile();
// After moving project to: /archived/2024/my-app
const cache2 = fileEntryCache.create('.cache', './cache', true, '/archived/2024/my-app');
cache2.getFileDescriptor('./src/app.js'); // Still finds cached entry!
// Cache valid as long as relative structure unchanged
If there is an error when trying to get the file descriptor, the returned descriptor includes the notFound and err properties, with err holding the error.
const fileEntryCache = new FileEntryCache();
const fileDescriptor = fileEntryCache.getFileDescriptor('no-file');
if (fileDescriptor.err) {
console.error(fileDescriptor.err);
}
if (fileDescriptor.notFound) {
console.error('File not found');
}
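Reporting failures as data on the descriptor, rather than throwing, can be sketched with fs.statSync. This is an illustration under assumptions: describeFile is hypothetical, and treating only ENOENT as "not found" is a choice made for this sketch.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: build a descriptor-like object and report failures as data.
function describeFile(key, cwd) {
  const descriptor = { key, changed: false, meta: {} };
  try {
    const stat = fs.statSync(path.resolve(cwd, key));
    descriptor.meta = { size: stat.size, mtime: stat.mtime.getTime() };
  } catch (error) {
    descriptor.err = error;
    descriptor.notFound = error.code === 'ENOENT';
  }
  return descriptor;
}

// Demo in a fresh, empty directory so 'no-file' is guaranteed to be missing
const emptyDir = fs.mkdtempSync(path.join(os.tmpdir(), 'descriptor-sketch-'));
const missing = describeFile('no-file', emptyDir);

fs.writeFileSync(path.join(emptyDir, 'present.txt'), 'abc');
const present = describeFile('present.txt', emptyDir);

console.log(missing.notFound, present.meta.size);
```

Callers can then branch on err and notFound exactly as in the example above, without wrapping every lookup in try/catch.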
Path Security and Traversal Prevention
The strictPaths option provides security against path traversal attacks by restricting file access to within the configured cwd boundaries. This is enabled by default (since v11) to ensure secure defaults when processing untrusted input or when running in security-sensitive environments.
Basic Usage
// strictPaths is enabled by default for security
const cache = new FileEntryCache({
cwd: '/project/root'
});
// This will work - file is within cwd
const descriptor = cache.getFileDescriptor('./src/index.js');
// This will throw an error - attempts to access parent directory
try {
cache.getFileDescriptor('../../../etc/passwd');
} catch (error) {
console.error(error); // Path traversal attempt blocked
}
// To allow parent directory access (not recommended for untrusted input)
const unsafeCache = new FileEntryCache({
cwd: '/project/root',
strictPaths: false // Explicitly disable protection
});
Security Features
When strictPaths is enabled:
- Path Traversal Prevention: Blocks attempts to access files outside the working directory using ../ sequences
- Null Byte Protection: Automatically removes null bytes from paths to prevent injection attacks
- Path Normalization: Cleans and normalizes paths to prevent bypass attempts
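The three protections listed above can be combined in a few lines of plain Node. The sketch below shows one common way to implement such a check and is not taken from the library's source; resolveInside is a hypothetical helper, the error text simply mirrors the message used elsewhere in this README, and path.posix keeps the behavior identical across platforms.

```javascript
const path = require('node:path');

// Hypothetical helper: resolve a user-supplied path and refuse anything outside cwd.
function resolveInside(cwd, userPath) {
  const cleaned = userPath.replace(/\0/g, ''); // strip null bytes
  const resolved = path.posix.resolve(cwd, cleaned); // also normalizes ../ and ./
  const relative = path.posix.relative(cwd, resolved);
  const escapes =
    relative === '..' || relative.startsWith('../') || path.posix.isAbsolute(relative);
  if (escapes) {
    throw new Error(`Path traversal attempt blocked: ${userPath}`);
  }
  return resolved;
}

// A path that wanders but stays inside cwd is normalized and accepted
const allowed = resolveInside('/project/root', './src/../src/index.js');

// A path that climbs out of cwd is rejected
let blockedMessage = '';
try {
  resolveInside('/project/root', '../../../etc/passwd');
} catch (error) {
  blockedMessage = error.message;
}

console.log(allowed, blockedMessage);
```

Checking the relative path after resolution, instead of searching the raw input for ../, is what prevents bypasses such as nested or redundant segments.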
Use Cases
Build Tools with Untrusted Input
// Secure build tool configuration
const cache = fileEntryCache.create(
'.buildcache',
'./cache',
true, // useCheckSum
process.cwd()
);
// Enable strict path checking for security
cache.strictPaths = true;
// Process user-provided file paths safely
function processUserFile(userProvidedPath) {
try {
const descriptor = cache.getFileDescriptor(userProvidedPath);
// Safe to process - file is within boundaries
return descriptor;
} catch (error) {
if (error.message.includes('Path traversal attempt blocked')) {
console.warn('Security: Blocked access to:', userProvidedPath);
return null;
}
throw error;
}
}
CI/CD Environments
// Strict security for CI/CD pipelines
const cache = new FileEntryCache({
cwd: process.env.GITHUB_WORKSPACE || process.cwd(),
strictPaths: true, // Prevent access outside workspace
useCheckSum: true // Content-based validation
});
// All file operations are now restricted to the workspace
cache.getFileDescriptor('./src/app.js'); // ✓ Allowed
cache.getFileDescriptor('/etc/passwd'); // ✗ Blocked (absolute path outside cwd)
cache.getFileDescriptor('../../../root'); // ✗ Blocked (path traversal)
Dynamic Security Control
const cache = new FileEntryCache({ cwd: '/safe/directory' });
// Start with relaxed mode for trusted operations
cache.strictPaths = false;
processInternalFiles();
// Enable strict mode for untrusted input
cache.strictPaths = true;
processUserUploadedPaths();
// Return to relaxed mode if needed
cache.strictPaths = false;
Default Behavior
As of v11, strictPaths is enabled by default to provide secure defaults. This means:
- Path traversal attempts using ../ are blocked
- File access is restricted to within the configured cwd
- Null bytes in paths are automatically sanitized
Migrating from v10 or Earlier
If you're upgrading from v10 or earlier and need to maintain the previous behavior (for example, if your code legitimately accesses parent directories), you can explicitly disable strict paths:
const cache = new FileEntryCache({
cwd: process.cwd(),
strictPaths: false // Restore v10 behavior
});
However, we strongly recommend keeping strictPaths: true and adjusting your code to work within the security boundaries, especially when processing any untrusted input.
Using Checksums to Determine if a File has Changed (useCheckSum)
By default useCheckSum is false. This means that the FileEntryCache will use the mtime and ctime to determine if the file has changed. If you set useCheckSum to true it will use a checksum to determine if the file has changed. This is useful when you want to detect changes by file content rather than by timestamps.
const fileEntryCache = new FileEntryCache();
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt', { useCheckSum: true });
You can set useCheckSum in the FileEntryCache options, set the .useCheckSum property to make it the default for all files, or pass it to the getFileDescriptor function. Here is an example where you set it globally but then override it for a specific file:
const fileEntryCache = new FileEntryCache({ useCheckSum: true });
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt', { useCheckSum: false });
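The practical difference between timestamp-based and checksum-based detection shows up when a file is rewritten with identical content. The sketch below demonstrates that with Node built-ins; snapshot is a hypothetical helper and the comparison logic is an assumption for illustration, not the library's internals.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: capture both kinds of evidence about a file.
function snapshot(filePath) {
  const stat = fs.statSync(filePath);
  const hash = crypto.createHash('md5').update(fs.readFileSync(filePath)).digest('hex');
  return { size: stat.size, mtimeMs: stat.mtimeMs, hash };
}

const sketchDir = fs.mkdtempSync(path.join(os.tmpdir(), 'checksum-sketch-'));
const target = path.join(sketchDir, 'file.txt');
fs.writeFileSync(target, 'same content');
const before = snapshot(target);

// Rewrite identical content and push the modified time one minute forward,
// as a fresh checkout or a "touch" might.
fs.writeFileSync(target, 'same content');
const later = new Date(before.mtimeMs + 60_000);
fs.utimesSync(target, later, later);
const after = snapshot(target);

const changedByTime = after.mtimeMs !== before.mtimeMs; // timestamp comparison flags a change
const changedByHash = after.hash !== before.hash; // content comparison does not

console.log(changedByTime, changedByHash);
```

The trade-off is cost: hashing reads the whole file, while a timestamp check only needs a stat call.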
Setting Additional Meta Data
In the past we have seen people set arbitrary values directly on the meta object. This can conflict with the properties that meta already uses. To avoid this, use the data property, which can hold anything.
const fileEntryCache = new FileEntryCache();
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt');
fileDescriptor.meta.data = { myData: 'myData' }; //anything you want
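The kind of conflict that meta.data avoids can be shown with a plain object. The descriptor below is hypothetical and only shaped like the one described earlier; the field values are made up for the illustration.

```javascript
// Hypothetical descriptor, shaped like the one described in Get File Descriptor.
const sketchDescriptor = {
  key: './src/file.txt',
  changed: true,
  meta: { size: 128, mtime: 1700000000000, hash: 'abc123', data: {} },
};

// Risky: a custom field named like a built-in one silently overwrites it.
const clobbered = { ...sketchDescriptor.meta, size: 'large' };

// Safer: custom fields live under meta.data, so built-in fields stay intact.
sketchDescriptor.meta.data = { size: 'large', lintPassed: true };

console.log(typeof clobbered.size, typeof sketchDescriptor.meta.size);
```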
How to Contribute
You can contribute by forking the repo and submitting a pull request. Please make sure to add tests and update the documentation. To learn more about how to contribute, see our main README at https://github.com/jaredwray/cacheable, which explains how to Open a Pull Request, Ask a Question, or Post an Issue.