This study of Unix file sizes was performed with the assistance of the Internet community in November 1993. An article was posted to Usenet requesting people run a simple shell script and mail me back the results. I was amazed by the response I received, and I ended up with file size data covering:
Summary:
The following table gives a breakdown of file sizes and the amount of space they consume.
max. file size (bytes)   #files   %files   cum. %files   disk space (MB)   %space   cum. %space
0 147479 1.2 1.2 0.0 0.0 0.0
1 3288 0.0 1.2 0.0 0.0 0.0
2 5740 0.0 1.3 0.0 0.0 0.0
4 10234 0.1 1.4 0.0 0.0 0.0
8 21217 0.2 1.5 0.1 0.0 0.0
16 67144 0.6 2.1 0.9 0.0 0.0
32 231970 1.9 4.0 5.8 0.0 0.0
64 282079 2.3 6.3 14.3 0.0 0.0
128 278731 2.3 8.6 26.1 0.0 0.0
256 512897 4.2 12.9 95.1 0.0 0.1
512 1284617 10.6 23.5 566.7 0.2 0.3
1024 1808526 14.9 38.4 1442.8 0.6 0.8
2048 2397908 19.8 58.1 3554.1 1.4 2.2
4096 1717869 14.2 72.3 4966.8 1.9 4.1
8192 1144688 9.4 81.7 6646.6 2.6 6.7
16384 865126 7.1 88.9 10114.5 3.9 10.6
32768 574651 4.7 93.6 13420.4 5.2 15.8
65536 348280 2.9 96.5 16162.6 6.2 22.0
131072 194864 1.6 98.1 18079.7 7.0 29.0
262144 112967 0.9 99.0 21055.8 8.1 37.1
524288 58644 0.5 99.5 21523.9 8.3 45.4
1048576 32286 0.3 99.8 23652.5 9.1 54.5
2097152 16140 0.1 99.9 23230.4 9.0 63.5
4194304 7221 0.1 100.0 20850.3 8.0 71.5
8388608 2475 0.0 100.0 14042.0 5.4 77.0
16777216 991 0.0 100.0 11378.8 4.4 81.3
33554432 479 0.0 100.0 11456.1 4.4 85.8
67108864 258 0.0 100.0 12555.9 4.8 90.6
134217728 61 0.0 100.0 5633.3 2.2 92.8
268435456 29 0.0 100.0 5649.2 2.2 95.0
536870912 12 0.0 100.0 4419.1 1.7 96.7
1073741824 7 0.0 100.0 5004.5 1.9 98.6
2147483647 3 0.0 100.0 3620.8 1.4 100.0
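As a rough illustration of how a table of this shape can be derived from a raw list of file sizes, here is a minimal Python sketch. It is not the shell script used for the survey, which is not reproduced here. The function names are hypothetical, and it assumes that each bucket holds files up to and including its power-of-two bound. It does not special-case the final 2147483647 row.

```python
from collections import Counter


def bucket_upper_bound(size):
    """Smallest power of two >= size; empty files get their own bucket 0."""
    if size == 0:
        return 0
    return 1 << (size - 1).bit_length()


def size_table(sizes):
    """Return rows of (bound, files, %files, cum %files, bytes, %space, cum %space)."""
    counts = Counter()
    space = Counter()
    for s in sizes:
        b = bucket_upper_bound(s)
        counts[b] += 1
        space[b] += s

    total_files = sum(counts.values())
    total_bytes = sum(space.values())

    def pct(part, whole):
        # Guard against an empty sample or a sample of only empty files.
        return 100.0 * part / whole if whole else 0.0

    rows = []
    cum_files = 0
    cum_bytes = 0
    for b in sorted(counts):
        cum_files += counts[b]
        cum_bytes += space[b]
        rows.append((
            b,
            counts[b],
            pct(counts[b], total_files),
            pct(cum_files, total_files),
            space[b],
            pct(space[b], total_bytes),
            pct(cum_bytes, total_bytes),
        ))
    return rows
```

The sizes themselves could come from any source, for example one `os.lstat(path).st_size` call per regular file found by `os.walk`.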
A number of observations can be made:
Such a heavily skewed distribution of file sizes suggests that, if one were to design a file system from scratch, it might make sense to employ radically different strategies for small and large files.
The seductive power of mathematics allows us to treat a 200-byte file and a 2MB file in the same way. But do we really want to? Are there any problems in engineering where the same techniques would be used in handling physical objects that span 6 orders of magnitude?
A quote from sci.physics that has stuck with me: `When things change by 2 orders of magnitude, you are actually dealing with fundamentally different problems'.
People I trust say they would have expected the tail of the above distribution to have been even longer. There are at least some files in the 1-2G range. They point out that DBMS shops with really large files might have been less inclined to respond to a survey like this than some other sites. This would bias the disk space figures, but it would have no appreciable effect on file counts. The results gathered would still be valuable because many static disk layout issues are determined by the distribution of small files and are largely independent of the potential existence of massive files.
The following historical values for the design of the BSD FFS are given in `The Design and Implementation of the 4.3BSD UNIX Operating System':
fragment size (bytes)   overhead (%)
512 4.2
1024 9.1
2048 19.7
4096 42.9
Files have clearly gotten larger since then; I obtained the following results:
fragment size (bytes)   overhead (%)
128 0.3
256 0.6
512 1.1
1024 2.5
2048 5.4
4096 12.3
8192 27.8
16384 61.2
By default the BSD FFS typically uses a 1k fragment size. Perhaps this size is no longer optimal and should be increased.
It is interesting to note that even though most files are less than 2k in size, having a 2k block size wastes very little space, because disk space consumption is so totally dominated by large files.
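The text does not spell out how the fragment overhead figures were computed. One plausible reading, sketched below in Python, is that every file is rounded up to a whole number of fragments and the wasted space is expressed as a percentage of the bytes actually stored. Both that definition and the function name are assumptions. The sketch also ignores indirect blocks and the FFS distinction between blocks and fragments.

```python
def fragment_overhead_percent(sizes, fragment_size):
    """Space lost to rounding each file up to whole fragments, as a % of file bytes.

    Assumed definition: 100 * (allocated - used) / used.
    """
    used = sum(sizes)
    if used == 0:
        return 0.0
    # -(-s // f) is integer ceiling division: fragments needed for s bytes.
    allocated = sum(-(-s // fragment_size) * fragment_size for s in sizes)
    return 100.0 * (allocated - used) / used
```

Because the numerator is summed over all files while the denominator is dominated by the few large ones, a sample with many small files can still show a low overhead, which is the effect described above.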
It is important not to run out of inodes, since any remaining disk space is then effectively wasted. Despite this, allocating one inode for every 2k is excessive.
For each file system studied I worked out the minimum sized disk it could be placed on. Most disks needed to be only marginally larger than the size of their files, but a few disks, having much smaller files than average, needed a much larger disk, because a smaller disk would not have had enough inodes.
bytes per inode   overhead (%)
1024 12.5
2048 6.3
3072 4.5
4096 4.2
5120 4.4
6144 4.9
7168 5.5
8192 6.3
9216 7.2
10240 8.3
11264 9.5
12288 10.9
13312 12.7
14336 14.6
15360 16.7
16384 19.1
17408 21.7
18432 24.4
19456 27.4
20480 30.5
Clearly, the current default of one inode for every 2k of data allocates too many inodes; the bytes per inode ratio is too small. Earlier results suggested that allocating one inode for every 5-6k was in some sense optimal, and allocating one inode for every 8k would only be 0.4% worse. The new data suggests one inode for every 4k is optimal, and allocating one inode for every 8k would be 2.1% worse.
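The minimum-disk calculation can be sketched as follows. This is only a guess at the model, not the analysis actually used. It assumes the disk must be large enough both to hold the file data and to supply one inode per file at the chosen ratio, and that each inode costs a fixed number of bytes. The 128-byte inode size is an inference from the first two rows of the table (128/1024 = 12.5% and 128/2048 = 6.25%), not a figure given in the text. All names here are hypothetical.

```python
INODE_BYTES = 128  # assumed on-disk inode size; inferred, not stated in the text


def min_disk_bytes(n_files, data_bytes, bytes_per_inode, inode_bytes=INODE_BYTES):
    """Smallest disk that holds the data and has at least one inode per file."""
    # The data area must fit the files and also be big enough that
    # data_area / bytes_per_inode yields an inode for every file.
    data_area = max(data_bytes, n_files * bytes_per_inode)
    n_inodes = -(-data_area // bytes_per_inode)  # ceiling division
    return data_area + n_inodes * inode_bytes


def inode_overhead_percent(file_systems, bytes_per_inode):
    """Overhead across (n_files, data_bytes) pairs, as a % of file bytes."""
    used = sum(data for _, data in file_systems)
    if used == 0:
        return 0.0
    needed = sum(min_disk_bytes(n, data, bytes_per_inode)
                 for n, data in file_systems)
    return 100.0 * (needed - used) / used
```

Under this model a low ratio wastes space on unused inodes, while a high ratio forces file systems with small files onto oversized disks, which would produce a minimum somewhere in between, as in the table.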
The analysis technique I used is very sensitive to even a few file systems with very small files.
The main source of file systems with lots of small files would appear to be netnews servers. The typical Usenet message would appear to be 1-2k in length. Ignoring such file systems would drastically alter the conclusions I reach. If, as I believe might already be the case, news servers are manually tuned to have a lower than normal bytes per inode ratio, it would then be possible to justify setting the default ratio much higher.
Clearly it is best if the file system dynamically allocates inodes; I believe AIX does this, for instance. Systems that statically allocate inodes should probably increase the bytes per inode ratio, but it is not clear to exactly what value. The engineer in me says `it is important to play this one conservatively: stick to 6k', the artist goes `as Chris Torek says: aesthetics count, 8k'.
[Another way to have analyzed this data would have been based on a histogram of the bytes per inode ratio for each file system. I don't have time to do this right now, but if anyone does I would be interested in seeing the results.]
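For anyone wanting to try the histogram approach, a minimal Python sketch follows. It assumes the per-file-system data is available as (number of files, total bytes) pairs, which is a hypothetical input format, and it groups the ratios into fixed-width buckets whose width is an arbitrary choice.

```python
from collections import Counter


def ratio_histogram(file_systems, bucket_bytes=1024):
    """Count file systems by average bytes per file, in fixed-width buckets.

    Keys are the lower bound of each bucket; empty file systems are skipped.
    """
    hist = Counter()
    for n_files, data_bytes in file_systems:
        if n_files == 0:
            continue
        ratio = data_bytes / n_files
        hist[int(ratio // bucket_bytes) * bucket_bytes] += 1
    return dict(sorted(hist.items()))
```

File systems that land in the lowest buckets are the ones that would run out of inodes first under a high default ratio.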