Unix File Size Survey - 1993

Last updated 1994-09-11.

This study of Unix file sizes was performed with the assistance of the Internet community in November 1993. An article was posted to Usenet asking people to run a simple shell script and mail me the results. I was amazed by the response I received, and I ended up with file size data covering:

Summary:


File sizes

There is no such thing as an average file system. Some file systems have lots of little files. Others have a few big files. However, as a mental model, the notion of an average file system is invaluable.

The following table gives a breakdown of file sizes and the amount of space they consume.

   file size       #files  %files  %files   disk space  %space  %space
(max. bytes)                         cum.         (MB)            cum.
           0       147479     1.2     1.2          0.0     0.0     0.0
           1         3288     0.0     1.2          0.0     0.0     0.0
           2         5740     0.0     1.3          0.0     0.0     0.0
           4        10234     0.1     1.4          0.0     0.0     0.0
           8        21217     0.2     1.5          0.1     0.0     0.0
          16        67144     0.6     2.1          0.9     0.0     0.0
          32       231970     1.9     4.0          5.8     0.0     0.0
          64       282079     2.3     6.3         14.3     0.0     0.0
         128       278731     2.3     8.6         26.1     0.0     0.0
         256       512897     4.2    12.9         95.1     0.0     0.1
         512      1284617    10.6    23.5        566.7     0.2     0.3
        1024      1808526    14.9    38.4       1442.8     0.6     0.8
        2048      2397908    19.8    58.1       3554.1     1.4     2.2
        4096      1717869    14.2    72.3       4966.8     1.9     4.1
        8192      1144688     9.4    81.7       6646.6     2.6     6.7
       16384       865126     7.1    88.9      10114.5     3.9    10.6
       32768       574651     4.7    93.6      13420.4     5.2    15.8
       65536       348280     2.9    96.5      16162.6     6.2    22.0
      131072       194864     1.6    98.1      18079.7     7.0    29.0
      262144       112967     0.9    99.0      21055.8     8.1    37.1
      524288        58644     0.5    99.5      21523.9     8.3    45.4
     1048576        32286     0.3    99.8      23652.5     9.1    54.5
     2097152        16140     0.1    99.9      23230.4     9.0    63.5
     4194304         7221     0.1   100.0      20850.3     8.0    71.5
     8388608         2475     0.0   100.0      14042.0     5.4    77.0
    16777216          991     0.0   100.0      11378.8     4.4    81.3
    33554432          479     0.0   100.0      11456.1     4.4    85.8
    67108864          258     0.0   100.0      12555.9     4.8    90.6
   134217728           61     0.0   100.0       5633.3     2.2    92.8
   268435456           29     0.0   100.0       5649.2     2.2    95.0
   536870912           12     0.0   100.0       4419.1     1.7    96.7
  1073741824            7     0.0   100.0       5004.5     1.9    98.6
  2147483647            3     0.0   100.0       3620.8     1.4   100.0
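
A table of this shape can be rebuilt from a raw list of file sizes. The following is a minimal sketch in Python, not the script used in the survey. The names `bucket_limit` and `size_histogram` are hypothetical, and it assumes each file is counted in the smallest power-of-two limit that is at least its size, and that disk space means the sum of the file lengths in bytes. It does not cap the last bucket at 2147483647 as the table does.

```python
from collections import Counter


def bucket_limit(size):
    """Return the smallest limit in 0, 1, 2, 4, 8, ... that is >= size."""
    if size <= 0:
        return 0
    return 1 << (size - 1).bit_length()


def size_histogram(sizes):
    """Build rows of (limit, files, %files, cum %files, bytes, %space, cum %space)."""
    counts = Counter()
    space = Counter()
    for s in sizes:
        limit = bucket_limit(s)
        counts[limit] += 1
        space[limit] += s

    total_files = sum(counts.values())
    total_bytes = sum(space.values())

    rows = []
    cum_files = 0
    cum_bytes = 0
    for limit in sorted(counts):
        cum_files += counts[limit]
        cum_bytes += space[limit]
        rows.append((
            limit,
            counts[limit],
            100.0 * counts[limit] / total_files,
            100.0 * cum_files / total_files,
            space[limit],
            100.0 * space[limit] / total_bytes if total_bytes else 0.0,
            100.0 * cum_bytes / total_bytes if total_bytes else 0.0,
        ))
    return rows
```

Calling `size_histogram` on the sizes gathered from one or more file systems gives one row per occupied bucket, in increasing order of size limit.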

A number of observations can be made:

Such a heavily skewed distribution of file sizes suggests that, if one were to design a file system from scratch, it might make sense to employ radically different strategies for small and large files.

The seductive power of mathematics allows us to treat a 200-byte file and a 2MB file in the same way. But do we really want to? Are there any problems in engineering where the same techniques would be used in handling physical objects that span 6 orders of magnitude?

A quote from sci.physics that has stuck with me: `When things change by 2 orders of magnitude, you are actually dealing with fundamentally different problems'.

People I trust say they would have expected the tail of the above distribution to have been even longer. There are at least some files in the 1-2G range. They point out that DBMS shops with really large files might have been less inclined to respond to a survey like this than some other sites. This would bias the disk space figures, but it would have no appreciable effect on file counts. The results gathered would still be valuable because many static disk layout issues are determined by the distribution of small files and are largely independent of the potential existence of massive files.


Block sizes

The last block of a file is normally only partially occupied. As disk block sizes are increased, so too is the amount of wasted disk space.

The following historical values for the design of the BSD FFS are given in `The Design and Implementation of the 4.3BSD UNIX Operating System':

fragment size   overhead
   (bytes)        (%)
      512         4.2
     1024         9.1
     2048        19.7
     4096        42.9

Files have clearly gotten larger since then; I obtained the following results:

fragment size   overhead
   (bytes)        (%)
      128         0.3
      256         0.6
      512         1.1
     1024         2.5
     2048         5.4
     4096        12.3
     8192        27.8
    16384        61.2
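
Overhead figures of this kind can be estimated from a list of file sizes by rounding each file up to a whole number of fragments. The following is a minimal sketch, not the calculation used for either table. The function name `fragment_overhead` is hypothetical, and it assumes overhead is the wasted space expressed as a percentage of the actual data. It ignores indirect blocks and other file system metadata.

```python
def fragment_overhead(sizes, fragment_size):
    """Percentage of extra space lost by rounding each file up to whole fragments."""
    data = sum(sizes)
    if data == 0:
        return 0.0
    # -(-s // f) is ceiling division: the number of fragments a file of s bytes needs.
    allocated = sum(-(-s // fragment_size) * fragment_size for s in sizes)
    return 100.0 * (allocated - data) / data
```

Because the result is weighted by bytes rather than by file count, a handful of large files can keep the figure low even when most files are smaller than one fragment.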

By default the BSD FFS typically uses a 1k fragment size. Perhaps this size is no longer optimal and should be increased.

It is interesting to note that even though most files are less than 2k in size, having a 2k block size wastes very little space, because disk space consumption is so totally dominated by large files.


Inode ratios

The BSD FFS statically allocates inodes. By default one inode is allocated for every 2K of disk space. Since an inode consumes 128 bytes this means that by default 6.25% of disk space is consumed by inodes.
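
The 6.25% figure is simply the inode size divided by the bytes per inode ratio. As a quick check, here is a small sketch; the function name `inode_overhead_pct` is hypothetical, and the 128-byte inode size is the one stated above.

```python
INODE_SIZE = 128  # bytes per on-disk inode, as stated in the text


def inode_overhead_pct(bytes_per_inode, inode_size=INODE_SIZE):
    """Percentage of the disk consumed by statically allocated inodes."""
    return 100.0 * inode_size / bytes_per_inode


for bpi in (2048, 4096, 8192):
    print(bpi, inode_overhead_pct(bpi))
```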

It is important not to run out of inodes, since any remaining disk space is then effectively wasted. Despite this, allocating 1 inode for every 2K is excessive.

For each file system studied I worked out the minimum sized disk it could be placed on. Most disks needed to be only marginally larger than the size of their files, but a few disks, having much smaller files than average, needed a much larger disk, because a small disk had insufficient inodes.

bytes per   overhead
  inode       (%)
   1024      12.5
   2048       6.3
   3072       4.5
   4096       4.2
   5120       4.4
   6144       4.9
   7168       5.5
   8192       6.3
   9216       7.2
  10240       8.3
  11264       9.5
  12288      10.9
  13312      12.7
  14336      14.6
  15360      16.7
  16384      19.1
  17408      21.7
  18432      24.4
  19456      27.4
  20480      30.5
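
The minimum disk calculation described above can be sketched as follows. This is one reading of the method, not the code used to produce the table. The names `min_disk_size` and `total_overhead_pct` are hypothetical. It assumes a 128-byte inode, that overhead is the fraction of the minimum disk not holding file data, and it ignores fragment rounding and other metadata. Each file system is given as a pair (number of files, total bytes of file data), and the ratio must be larger than the inode size.

```python
INODE_SIZE = 128  # bytes per on-disk inode, as stated in the text


def min_disk_size(n_files, data_bytes, bytes_per_inode):
    """Smallest disk with enough room for the data and enough inodes for the files."""
    inode_fraction = INODE_SIZE / bytes_per_inode
    # The disk must hold the data plus the inode area carved out of it.
    space_bound = data_bytes / (1.0 - inode_fraction)
    # The disk must be large enough to be given one inode per file.
    inode_bound = n_files * bytes_per_inode
    return max(space_bound, inode_bound)


def total_overhead_pct(file_systems, bytes_per_inode):
    """Percentage of total minimum disk space not occupied by file data."""
    data = sum(d for _, d in file_systems)
    disk = sum(min_disk_size(n, d, bytes_per_inode) for n, d in file_systems)
    return 100.0 * (disk - data) / disk if disk else 0.0
```

Under these assumptions a file system of large files is limited by space, so its overhead is just the inode area, while a file system of small files becomes limited by its inode count as the ratio grows. The sum over many file systems therefore falls at first and then rises again.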

Clearly, the current default of one inode for every 2k of data is too small. Earlier results suggested that allocating one inode for every 5-6k was in some sense optimal, and allocating one inode for every 8k would only be 0.4% worse. The new data suggests one inode for every 4k is optimal, and allocating one inode for every 8k would be 2.1% worse.

The analysis technique I used is very sensitive to even a few file systems with very small files.

The main source of file systems with lots of small files would appear to be netnews servers. The typical Usenet message would appear to be 1-2k in length. Ignoring such file systems would drastically alter the conclusions I reach. If, as I believe might already be the case, news servers are manually tuned to have a lower than normal bytes per inode ratio, it would then be possible to justify setting the default ratio much higher.

Clearly it is best if the file system allocates inodes dynamically; I believe AIX does this, for instance. Systems that statically allocate inodes should probably increase the bytes per inode ratio, but it is not clear to exactly what value. The engineer in me says `it is important to play this one conservatively: stick to 6k', the artist goes `as Chris Torek says: aesthetics count, 8k'.

[Another way to have analyzed this data would have been based on a histogram of the bytes per inode ratio for each file system. I don't have time to do this right now, but if anyone does I would be interested in seeing the results.]


File size data

All the file size data that was collected is publicly available.
Consider this data the property of the Internet and feel guilty if you don't share the results of any analysis you perform with the rest of the community.