Using SFS on OS/161

Disk Devices and Images

The disks you configure in System/161 are probed by OS/161 and numbered in the order they are found on the system bus. The first disk is lhd0 (LAMEbus Hard Drive #0), the second is lhd1, etc. The data for each disk is kept in the file named by sys161.conf; in the default configuration the data for lhd0 appears in LHD0.img and the data for lhd1 appears in LHD1.img. (Older default configurations used DISK1.img and DISK2.img instead, which was unnecessarily confusing.) The disk image files contain a header created by System/161, which is supposed to make it hard to trash files by accidentally using them as disk images.

Creating Disk Images

If the image files do not exist, use disk161 to create them, e.g.:

   disk161 create LHD0.img 5M
and similarly, to change the size:
   disk161 resize LHD0.img 10M
Alternatively, for compatibility with older usage, if you specify the size in sys161.conf, System/161 creates them when it starts up. If you are using System/161 prior to 1.99.09, there is no disk161 tool; instead, set the size in sys161.conf. In these older versions, the disk size is taken from sys161.conf and the size of the file itself is ignored entirely.

Formatting the Disk

Before you can use SFS on a disk you need to format the disk to prepare the on-disk structures. The program mksfs does this task. It can be run either from inside OS/161 or from outside. You need to give it a volume name; this can be used to address the volume later from inside OS/161. This handout will assume your volume is going to be called "mydisk".

To format lhd1 from inside OS/161, do this:

   os161$ /sbin/mksfs lhd1raw: mydisk
The "raw" suffix on lhd1 gives mksfs access to the raw physical blocks of the disk. The name lhd1: instead refers to the filesystem volume mounted on the disk, so it cannot be used until the disk has been formatted and mounted. See the section below on naming for further discussion.

To format from outside OS/161, use the native-host-compiled version of mksfs, host-mksfs, which is compiled as part of the system build and placed in the hostbin directory. This knows about System/161's disk headers, but it doesn't know how to read sys161.conf or compute OS/161's device names, so you need to give it the name of the disk image file you want to put a filesystem on.

   hostbin/host-mksfs LHD1.img mydisk

Mounting

Once you have a filesystem, you can mount it from inside OS/161 using the "mount" command in the OS/161 menu:

   OS/161 kernel [? for menu]: mount sfs lhd1
The first argument sfs is the filesystem type (currently only sfs, but one could add others), and the second argument is the device name. OS/161 will print
   vfs: Mounted mydisk: on lhd1
and you're in business.

Accessing

At this point you can access your SFS filesystem. To make this easier it is often convenient to make the root directory of your filesystem the current directory, which you can do with cd from either the shell or the menu. Both the names lhd1: and mydisk: name this directory, so you can do either

   OS/161 kernel [? for menu]: cd lhd1:
or
   OS/161 kernel [? for menu]: cd mydisk:
Likewise, files on the filesystem can be named with either lhd1: or mydisk: as a prefix. The command
   os161$ /bin/cp sys161.conf mydisk:sys161.conf
copies sys161.conf to the root directory of your new SFS volume. For further discussion, see the section on naming below.

SFS as Root Directory

OS/161 has an extra concept called the "bootfs", which is the volume used when you give a path name beginning with a slash. This allows using Unix-style paths like /bin/sh as well as OS/161-style paths like emu0:bin/sh.

The default bootfs is emu0:. You can change it from the kernel menu:

   OS/161 kernel [? for menu]: bootfs lhd1
Unless you have copied the userland programs to your SFS volume, this will make paths like /bin/cp stop working. Use device-qualified paths such as emu0:bin/cp instead.

Copying the userland programs to your SFS volume is a hassle, but doing so and then setting your filesystem as the bootfs is a good way to test it.

Unmounting

To make sure the state of your filesystem on disk is consistent, it should always be unmounted when you are done with it. This can be done from the kernel menu:

   OS/161 kernel [? for menu]: unmount lhd1
and OS/161 will say
   vfs: Unmounted lhd1:
If you do a clean shutdown (with q or /sbin/poweroff and friends) the VFS code will attempt to unmount all mounted filesystems. However, if you ^C out of System/161 or if your kernel dies, you are out of luck.

You cannot unmount a volume that has files still open on it; it will complain that it is busy. If this is happening to you, and you think the volume is really not still in use, first make sure that the filesystem is not (or is no longer) the bootfs and that it is not the current directory of any process or thread, including the menu. If it still refuses to unmount, it is time to go looking for bugs in your vnode reference counting.

Note that you can usually avoid the worst consequences of failure to unmount by running sync to flush the buffer cache and other structures back to disk. This can be done either from the menu or the shell.

Crash Recovery and Checking

OS/161 now includes a checking tool for SFS, called sfsck. Like mksfs and the other SFS tools, it can be run either from inside OS/161 or from outside. It checks the filesystem image for various problems and inconsistencies and reports what it finds. Sometimes it can fix the problems it finds; sometimes not. Its primary purpose is as a diagnostic tool.

After crashing with an SFS volume mounted, always run sfsck on that volume. Because OS/161 by default makes no attempt to guarantee that the state of the filesystem after a crash will be recoverable, and (in the baseline form of the assignments at least; your course may vary) you are not required to implement such guarantees either, sfsck may or may not be able to patch up what it finds, and what remains afterward may or may not still contain useful data. However, if sfsck manages to clean up the volume, you can probably at least mount it again without getting in trouble. Mounting a corrupt or inconsistent SFS volume will often lead to panics. It is possible to waste a lot of time trying to figure out what went wrong if you forget this. Don't get caught by it. (This is true of many real file systems as well, although it is obviously an undesirable property.)

Moreover, when you are actively working on FS code, always run sfsck on your volume after testing and unmounting. Your filesystem should be consistent on disk after it is unmounted cleanly. If it is not, you have a bug, possibly a serious one.

Note that because of limitations in the SFS on-disk structure, many forms of filesystem corruption that you may have seen reported in the past by fsck tools for other file system types, such as unreferenced inodes, will show up as "blocks erroneously shown used (or free) in the bitmap". Unlike in e.g. FFS, in SFS such bitmap problems often reflect something seriously wrong.

Further note: the mksfs we ship you does not create . and .. directory entries in the root directory, because the SFS we ship you does not know about directories. sfsck will complain about this if you run it on a fresh volume and you have not adjusted mksfs.

Diagnosis/Debugging Tips

The dumpsfs program prints selected portions of an SFS volume. This can be very helpful when debugging.

It accepts a number of options to select what to dump. The default is to dump inode 1 (the root directory inode). The other options are:
   -s        Dump the superblock.
   -b        Dump the free block bitmap.
   -i num    Dump specified inode.
   -I        Dump indirect blocks when dumping inodes.
   -f        Dump file contents when dumping inodes.
   -d        Dump directory contents when dumping inodes.
   -r        Recurse into directory contents.
   -a        Whole volume; equivalent to -sbdfr -i 1.

Also, to dump a particular block from the volume in raw hex, you can do this (for block 10 in this example):

   cluster% dd if=LHD1.img bs=512 skip=11 count=1 | hexdump -C
(The number of blocks to skip over is one greater than you'd expect because of the System/161 disk header.)
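
The off-by-one in the dd command can be captured in a small helper, which is handy if you write your own host-side tools for poking at image files. This is only a sketch: the function and macro names are made up for illustration, and the 512-byte block size and one-block header are assumptions taken directly from the dd example above rather than from the System/161 sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, taken from the dd example: 512-byte blocks and a
 * System/161 header occupying one block at the front of the image. */
#define IMG_BLOCK_SIZE    512u
#define IMG_HEADER_BLOCKS 1u

/* Number of blocks to skip in the image file to reach a volume block
 * (hypothetical helper; this is the value passed to dd as skip=). */
static uint64_t image_skip_blocks(uint64_t volume_block)
{
    return volume_block + IMG_HEADER_BLOCKS;
}

/* Byte offset of a volume block within the image file
 * (hypothetical helper; suitable for lseek or fseeko). */
static uint64_t image_byte_offset(uint64_t volume_block)
{
    /* Guard against overflow in the multiplication below. */
    assert(volume_block < UINT64_MAX / IMG_BLOCK_SIZE);
    return image_skip_blocks(volume_block) * IMG_BLOCK_SIZE;
}
```

For block 10, image_skip_blocks returns 11, matching skip=11 in the dd command, and image_byte_offset returns 5632.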

Naming

As has been mentioned previously, OS/161 uses device:path names rather than Unix's pure (and singly-rooted) path names.

There are two reasons for this. The first is that to use more than one volume in a singly rooted tree one needs to implement mount points: magic nodes in the tree where the root directory of one volume appears as a subdirectory somewhere inside another volume. Implementing mount points is not that difficult. However, because they interact directly with the hierarchical directory implementation in the file system assignment, you would either have to implement mount points in VFS before you could even touch an SFS volume, which would be a nuisance, or have to give up on emufs entirely in order to use SFS, which would be a huge hassle.

The second reason is that with a singly-rooted tree the root is a special case. In most Unixes there is a whole bunch of extra magic to allow mounting the root directory during boot... and even more magic to allow unmounting it during shutdown. All of this can be bypassed in OS/161; instead of a root directory there's a table of device names, which is simple and easy to manage.

A third minor advantage of this format is that when opening a hardware device that requires open-time parameters, such as an audio driver, these parameters can be passed as a "path" instead of needing to be either set later via ioctl or encoded by having lots of extra device names. However, nothing in OS/161 takes advantage of this.

When given a path name, the VFS code first splits the device name off the front and looks it up in the table of device names (this happens in vfslist.c). The starting point for path lookup is then the vnode for the device in question or, if the name is a filesystem name (a volume label, or the non-raw name of a disk device), the vnode for the root directory of that volume.

If there is no leading slash and no device either, the path is taken to be relative to the current directory, and the current thread/process's current directory is used as the starting point for lookup.

Path lookup then continues on the filesystem involved. (You may be writing this code for SFS in the file system assignment.)

In Unix, paths are always either relative to the current directory, or "absolute", starting from /. In order to use Unix software (and Unix conventions) that all assume this single root, OS/161 has a thing called the bootfs, which is an extra vnode reference stored in the VFS code. It is used as the starting point when a path begins with / and has no device at the front. The default bootfs is emu0:; the bootfs can be changed from the kernel menu.
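
The three starting points described above (a named device or volume, the bootfs, and the current directory) can be illustrated with a small standalone parser. This is a minimal sketch and not the code in the OS/161 VFS layer: the names split_path and path_start are hypothetical, and the rule that a colon only counts as a device separator if it appears before the first slash is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of where path lookup begins. */
enum path_start {
    START_DEVICE,   /* "dev:rest" : named device or volume */
    START_BOOTFS,   /* "/rest"    : root of the bootfs */
    START_CWD       /* "rest"     : current directory */
};

/*
 * Split `path` into an optional device name and the remaining path.
 * For START_DEVICE the device name (without the colon) is copied into
 * `dev`; otherwise `dev` is set to the empty string. `*rest` points
 * into `path`. Returns 0 on success, -1 if the device name does not
 * fit in `dev`.
 */
static int split_path(const char *path, char *dev, size_t devlen,
                      enum path_start *start, const char **rest)
{
    const char *colon, *slash;

    assert(path != NULL && dev != NULL && start != NULL && rest != NULL);

    colon = strchr(path, ':');
    slash = strchr(path, '/');

    /* Assumption: a colon after the first slash is part of a file name. */
    if (colon != NULL && (slash == NULL || colon < slash)) {
        size_t n = (size_t)(colon - path);
        if (n >= devlen) {
            return -1;
        }
        memcpy(dev, path, n);
        dev[n] = '\0';
        *start = START_DEVICE;
        *rest = colon + 1;
        return 0;
    }

    if (devlen > 0) {
        dev[0] = '\0';
    }
    if (path[0] == '/') {
        *start = START_BOOTFS;
        *rest = path + 1;       /* skip the leading slash */
    } else {
        *start = START_CWD;
        *rest = path;
    }
    return 0;
}
```

With this sketch, emu0:bin/sh splits into device emu0 and path bin/sh, /bin/sh starts at the bootfs with path bin/sh, and sys161.conf is looked up relative to the current directory.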

Unix devices are by convention found in the /dev directory, where special files (yes, they are formally called "special files", and they really exist on disk as inodes) are created that hold a code number that tells the kernel which device driver to access. For example, in Linux the null device looks like this:

   crw-rw-rw-  1 root root 1, 3 Mar  3 14:59 /dev/null
It is a character device (c) and it has "major number" 1 and "minor number" 3. This tells the kernel that accesses to this name should be handled by character device driver number 1, with argument 3. (If it seems fragile to you to have magic numbers like this lying around, you aren't alone. There is now a 15-20 year history of unsuccessful attempts to tidy this up.)

In traditional Unix (but not Linux) there are typically also two names for each disk device. (Or, more accurately, each partition of a disk device; OS/161 does not have partitions.) However, unlike in OS/161 where the non-"raw" name is used as an alias for the volume name in the file name space, the non-raw name in Unix is a hook that exists only to name the device for mounting. So if you have e.g. wd0h (partition h of the first wd device) you mount it using the name /dev/wd0h, but all other operations (mkfs/newfs, fsck, dumpfs, debugfs, whatever) should be done on the raw device /dev/rwd0h. And then in Unix once the mount is complete the volume appears in a designated place in the directory tree, rather than being accessible via the device name.