There is a hard limit on the number of files that MyDMS can manage, due to the way in which the original underlying file system structure for storing documents was designed.
Currently, MyDMS creates a new directory as a container for every single file that it stores. Each directory never contains more than one file, and directories are never re-used. However, some file systems, such as UFS (and its derivatives), have a hard-coded limit on the number of sub-directories a directory can contain (strictly speaking, it is a limit on the number of hard links, but sub-directories are registered within directory inodes as hard links). For UFS, this is a limit of 32767 directories, including the special cases, “.” and “..”.
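To make the arithmetic concrete, here is a minimal Python sketch (not part of MyDMS, which is written in PHP) that estimates how many more sub-directories a directory could hold under the UFS figure quoted above. The function name subdirs_remaining and the constant UFS_LINK_MAX are hypothetical, and the sketch assumes that “.” and “..” both count against the 32767 limit, as described above.

```python
import os
import tempfile

UFS_LINK_MAX = 32767  # limit quoted above, including "." and ".."


def subdirs_remaining(directory, link_max=UFS_LINK_MAX):
    """Return how many more sub-directories `directory` could hold."""
    used = sum(
        1
        for entry in os.scandir(directory)
        if entry.is_dir(follow_symlinks=False)
    )
    # Two slots are taken by "." and "..", per the description above.
    return link_max - 2 - used


# Usage: a throwaway directory with three sub-directories and one file.
base = tempfile.mkdtemp()
for name in ("a", "b", "c"):
    os.mkdir(os.path.join(base, name))
open(os.path.join(base, "plain-file.txt"), "w").close()  # files do not count

remaining = subdirs_remaining(base)
```

With an empty directory this gives 32765, which matches the limit used in the allocation example later in this post.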
In an ideal world, it would be better to tear down the existing mechanism and start again from scratch. However, this would inevitably break backwards compatibility, so it is necessary to devise an alternative strategy for managing the directory structure.
A proposed solution has been developed that uses a nested directory structure and relies on the database to help control the allocation of directory names. The database has two new tables, defined as follows:
CREATE TABLE `tblDirPath` (
  `dirID` int(11) NOT NULL auto_increment,
  `dirPath` varchar(255) NOT NULL,
  PRIMARY KEY (`dirPath`,`dirID`)
);
CREATE TABLE `tblPathList` (
  `id` int(11) NOT NULL auto_increment,
  `parentPath` varchar(255) NOT NULL,
  PRIMARY KEY (`id`)
);
dirID represents a leaf directory within the content file structure, and parentPath is the path to its parent subdirectory, relative to settings->_contentDir. dirPath is equivalent to parentPath. When a request for a new storage location is made, MyDMS reads the last directory entry recorded in the database table. If the value of dirID is equal to the directory link limit, then parentPath is updated so that it represents a new parent directory. For example:
limit = 32765;
parentPath = "1/2/3";
In this case, if dirID = 32765, then parentPath is updated to “1/2/4” and dirID is reset to 0.
If parentPath = “32765/32765” and dirID = 32765, then parentPath would change to “0/0/0”, adding a further level of nesting, and dirID would be set to 0.
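The roll-over rule in the two examples above can be sketched as follows. This is an illustrative Python sketch, not the PHP code in MyDMS, and the function names advance_parent_path and next_location are hypothetical. It reproduces the two cases given above; the behaviour for an intermediate carry (for instance “1/2/32765” becoming “1/3/0”) is an assumption, since that case is not described here, as is the choice to increment dirID by one when the limit has not yet been reached.

```python
LIMIT = 32765  # directory link limit used in the example above


def advance_parent_path(parent_path, limit=LIMIT):
    """Return the parent path that follows parent_path, odometer-style."""
    parts = [int(p) for p in parent_path.split("/")]
    for i in range(len(parts) - 1, -1, -1):
        if parts[i] < limit:
            parts[i] += 1
            # Assumption: components to the right of the carry restart at 0.
            for j in range(i + 1, len(parts)):
                parts[j] = 0
            return "/".join(str(p) for p in parts)
    # Every component is at the limit: add one more level of nesting.
    return "/".join(["0"] * (len(parts) + 1))


def next_location(parent_path, dir_id, limit=LIMIT):
    """Return the (parentPath, dirID) pair for the next storage location."""
    if dir_id < limit:
        # Assumption: within a parent, dirID simply counts upwards.
        return parent_path, dir_id + 1
    return advance_parent_path(parent_path, limit), 0


print(next_location("1/2/3", 32765))        # ('1/2/4', 0)
print(next_location("32765/32765", 32765))  # ('0/0/0', 0)
```

In MyDMS itself the current values are read from the database tables rather than passed in as arguments; the sketch only shows the path arithmetic.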
In addition, the size of the “dir” field in tblDocumentContent has been increased to 255 to allow for longer directory paths:
ALTER TABLE `tblDocumentContent` CHANGE `dir` `dir` VARCHAR( 255 ) NOT NULL ;
A new makeDir() function has also been introduced to allow MyDMS to create complete directory paths, equivalent to “mkdir -p”. This functionality is available in PHP 5, but is not provided in PHP 4.
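For readers unfamiliar with “mkdir -p”, the idea is to create any missing parent directories before creating the target itself. The following is a minimal Python sketch of that technique, written by hand rather than with a built-in recursive option, to mirror what a PHP 4 compatible helper has to do. It is not the actual makeDir() code; the name make_dir and the default mode are assumptions, and it ignores races with other processes creating the same path.

```python
import os
import tempfile


def make_dir(path, mode=0o755):
    """Create `path` and any missing parents, in the spirit of `mkdir -p`."""
    path = os.path.normpath(path)
    if os.path.isdir(path):
        return  # nothing to do, the directory already exists
    parent = os.path.dirname(path)
    if parent and parent != path and not os.path.isdir(parent):
        make_dir(parent, mode)  # create the parents first
    os.mkdir(path, mode)


# Usage: build a nested storage path under a throwaway directory.
base = tempfile.mkdtemp()
target = os.path.join(base, "1", "2", "4")
make_dir(target)
make_dir(target)  # calling it again on an existing path is a no-op
```

Recursing on the parent before creating the leaf means each level is created exactly once, and an already existing path is left untouched.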
The new mechanism is currently undergoing testing on one of the large MyDMS instances deployed within my company. This instance has tens of thousands of documents and unfortunately hit the file system limit 2 days ago. We introduced a work-around by moving the content directories to a backup location and creating soft links in the original directory that point to the backups.
Small MyDMS instances (fewer than 32765 discrete files) are not affected by the limitation on hard links within a directory, so the use of the new file structure is optional. A flag in the Settings file allows users to choose which storage method they prefer. Users with large DMS deployments should seriously consider this new system. Backwards compatibility is completely preserved: there is no loss of existing files or data and no need to create a data migration strategy. Existing files are kept in situ.
This change, along with some other minor alterations, will be packaged up and released as 1.7.0 in due course.