webpointmorpheus Linux Info
Filesystems - Management & Administration

©2005 - material compiled by Bob Carnaghi, www.webpointmorpheus.com

Introduction
Within any defined structure, there must be some form of organization to store and access the contents of the structural unit. Computer systems are no exception to this rule. The computer must be able to store and retrieve data, and this is typically done from some type of disk: hard drive, CD-ROM, floppy, USB stick, etc. In order to place data on the drive, there must be some type of filesystem on the drive. The filesystem is no more than a set of rules for how the data is to be organized, stored, and retrieved. Historically, there have been different filesystem types based upon proprietary concerns. More recently, there has been an effort to get these filesystem types to at least recognize, if not work with, each other. The table in the Filesystem Types section below shows different filesystem types, each with its own set of advantages. Once there is a filesystem on the disk, files can be placed within the filesystem for retrieval and use.
Establishing a Filesystem
When a new disk device is to be brought online into the system and used for filesystem interaction, it must first be partitioned and then formatted with a particular filesystem. The table in the Filesystem Types section below outlines different filesystems and their use. In Linux, the fdisk command will show and manipulate the disk partition table.
Hard disks can be divided into one or more logical disks called partitions. On disks that use the traditional PC partitioning scheme, the partition table can be found in sector 0 of the disk. Within each partition, the filesystem keeps its own bookkeeping structures. On native Linux filesystems this is the inode table, which defines where the files are stored, their permission and owner information, and how they are to be accessed.
The mkfs command is used to build a Linux filesystem on a device, usually a hard disk partition, after the disk has been partitioned as outlined above. This command has several variants such as mkdosfs, mke2fs, mkfs.bfs, mkfs.ext2, mkfs.ext3, mkfs.minix, mkfs.msdos, mkfs.vfat, mkfs.xfs, and mkfs.xiafs, which directly create the different filesystems.
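As a rough illustration of the mkfs step, the sketch below builds an ext2 filesystem inside an ordinary file rather than on a real partition, so no disk on the system is touched. The file name disk.img and its 8 MiB size are assumptions made for the example, and the filesystem is only created if the mke2fs program happens to be installed.

```shell
# Scratch directory and a hypothetical image file that stands in for a partition.
workdir=$(mktemp -d)
img="$workdir/disk.img"

# Create an 8 MiB file of zeros (8192 blocks of 1024 bytes).
dd if=/dev/zero of="$img" bs=1024 count=8192 2>/dev/null

# mke2fs often lives in /sbin, which may not be on an ordinary user's PATH.
PATH="$PATH:/sbin:/usr/sbin"
if command -v mke2fs >/dev/null 2>&1; then
    # -q: quiet output, -F: proceed even though the target is not a block device.
    mke2fs -q -F "$img" && echo "ext2 filesystem created in $img"
else
    echo "mke2fs not found; skipping filesystem creation"
fi
```

On a real system the target would be a partition device file rather than an image file, and the command would normally need root privileges.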
Linux Filesystems & the FHS
The Linux operating system operates from a filesystem structure that is based upon a common standard known as the Filesystem Hierarchy Standard (FHS). This standard aims at uniformity among different distributions of the operating system, so that software written for one distro can be easily adapted to another distro. The filesystem itself has as its base (or root) the '/' directory, from which all files branch. There are no drive letters in the Linux filesystem structure; instead there are 'mount points', where additional devices and filesystems are mounted. Indeed, any other hard drives or hard drive partitions are mounted to the filesystem structure in such a manner that they are accessed from what appears to be a single directory tree.
FHS Directory Definitions
  • /bin - Location of binary commands used by all users
  • /boot - Location of the Linux kernel and bootloader files
  • /dev - Location of device files
  • /etc - Location of system configuration files
  • /home - Location (default) for user data files
  • /lib - Location for shared program libraries (needed by the programs in /bin & /sbin) and kernel modules
  • /mnt - Location of mount points for temporarily mounted devices and filesystems
  • /opt - Location for additional software that is installed on the system
  • /proc - Location of kernel and process information
  • /root - Home directory for the root user
  • /sbin - Location of system administration binary command programs
  • /tmp - Location of temporary files
  • /usr - Location of most system commands and utilities, typically with the following subdirectories
    1. /usr/bin - Location of user binary commands
    2. /usr/games - Location of educational and entertainment type games
    3. /usr/include - Location of C program header files
    4. /usr/lib - Location of program libraries
    5. /usr/local - Location of most additional programs installed locally by the system administrator
    6. /usr/sbin - Location of system binary commands
    7. /usr/share - Location of files that are independent of the system architecture
    8. /usr/src - Location of source code for various programs
    9. /usr/X11R6 - Location of the X Window System
  • /var - Location of log files, print spooler, and other data that tends to vary
Filesystem Types
Various Linux filesystem types are listed below.
Linux Filesystems
  • bfs - The Boot filesystem. Typically a UNIX filesystem that is small & bootable, which holds system startup files.
  • cdfs - The Compact Disk filesystem. A filesystem that makes possible the viewing of all tracks and data on a CD-ROM as normal files. For additional info, read about the El Torito Specification.
  • ext2 - The second version of the extended filesystem, which is the traditional native Linux filesystem. It grew out of the extended filesystem that replaced the Minix filesystem on early Linux systems.
  • ext3 - The third version of the extended filesystem, which offers a journaling enhancement over the ext2 filesystem.
  • hfs - The Hierarchical File System, which is native to Apple Macintosh computers.
  • hpfs - The High Performance File System, which was initially a collaborative effort between IBM and Microsoft. This filesystem is now proprietary to IBM systems, and is used on large disk volumes.
  • iso9660 - The filesystem that is used to access data on CD-ROMs. Read the El Torito Specification for more info.
  • minix - The Minix filesystem, the original filesystem used by Linux during the initial kernel development.
  • msdos - The MS-DOS FAT filesystem.
  • ntfs - New Technology File System, the Microsoft proprietary outcome of the collaborative effort with IBM that created the hpfs. This filesystem was introduced with Windows NT 3.1, and continues to evolve today.
  • reiserfs - The ReiserFS filesystem, which is similar to the ext3 filesystem in that it offers journaling, along with database support.
  • udf - The Universal Disk Format filesystem, which is used to write to CD-R, CD-RW, and DVD devices.
  • vfat - The MS-DOS FAT filesystem with long filename support.
  • vxfs - The Veritas File System, which is common on UNIX systems for its journaling, large file support, and access control list capabilities.
 
File Types & Directories
Note that there is a difference between filesystem types and file types. The filesystem type is constant through the disk or partition that it is applied to. File types are individual to each file, and a single file may contain other files of different types. Listed immediately below are the major file types.
  • Text files - typically configuration or plain data files
  • Binary data files - graphics files, or compiled files that may store functions
  • Executable program files - files that are capable of acting as a program or daemon
  • Directory files - a special type file that serves to collect or organize other files into an orderly unit
  • Linked files - files that have an association with one another (shortcuts)
  • Special device files - files that represent devices such as hard drives and serial ports (typically in the /dev directory)
  • Named pipe and socket files - special files that allow one process to write to the file while a different process reads the information in the file
Device Files
All devices on a Linux system are represented by a device file. Device files are located in the /dev directory. This allows a device to be specified by using the pathname to the file that represents it in the /dev directory. Devices are typically character or block devices, which defines how data is transferred to the device. A long listing of the device files will show a major number and a minor number for each. The major number refers to the device driver as referenced by the kernel. Major numbers are not necessarily unique to one device, as the same driver may serve more than one device. The minor number identifies the individual device to that driver, and is unique within the driver. Most devices listed in /dev are never used. To see the device drivers that are currently registered with the kernel, along with their major numbers, refer to /proc/devices.
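The major and minor numbers described above can be seen with an ordinary long listing. The following is a small sketch that assumes a typical Linux system where /dev/null and /dev/zero exist; the variable name dev_type is only illustrative.

```shell
# Long listing of two device files found on virtually every Linux system.
# The first character of each line is the type: c = character, b = block.
# Where a regular file would show its size, ls prints "major, minor" instead.
ls -l /dev/null /dev/zero

# Extract just the type character for /dev/null.
dev_type=$(ls -l /dev/null | cut -c1)
echo "/dev/null is of type: $dev_type"

# Show the first few drivers registered with the kernel and their major numbers.
if [ -r /proc/devices ]; then
    head -n 10 /proc/devices
fi
```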
Linux Device Files
  • /dev/fd0 (block) - The first floppy drive on the system.
  • /dev/fd1 (block) - The second floppy drive on the system.
  • /dev/hda1 (block) - The first partition on the primary master drive of the system IDE controller, typically the first IDE hard disk.
  • /dev/hdb1 (block) - The first partition on the primary slave drive of the system IDE controller, typically the second IDE hard disk.
  • /dev/hdc1 (block) - The first partition on the secondary master drive of the system IDE controller, typically the third IDE hard disk.
  • /dev/hdd1 (block) - The first partition on the secondary slave drive of the system IDE controller, typically the fourth IDE hard disk.
  • /dev/sda1 (block) - The first primary partition on the first SCSI (Small Computer Systems Interface) drive.
  • /dev/sdb1 (block) - The first primary partition on the second SCSI (Small Computer Systems Interface) drive.
  • /dev/tty1 (character) - The first local terminal on the system, accessed with Ctrl+Alt+F1.
  • /dev/tty2 (character) - The second local terminal on the system, accessed with Ctrl+Alt+F2.
  • /dev/ttyS0 (character) - The first serial port on the system (COM1).
  • /dev/ttyS1 (character) - The second serial port on the system (COM2).
  • /dev/psaux (character) - PS/2 mouse port.
  • /dev/lp0 (character) - The first parallel port on the system (LPT1).
  • /dev/null (character) - The big pit. The device file that represents nothing. Data sent to /dev/null is sent into the Great Cyber Nothingness.
  • /dev/st0 (character) - First SCSI tape device on the system.
  • /dev/usb/* (character) - USB device files.
 
Directories
Directories are essentially files that are used as containers to organize other files. In order to list the contents of a directory, the 'read' permission must be set; in order to enter the directory or reach the files within it, the 'execute' permission must be set. Everything on the Linux system, indeed in computer filesystems in general, is based upon a relative or absolute location, as outlined below.
  • Relative pathname - the path is relative to the current directory. To navigate up and over, the path is such: ../new location, etc.
  • Absolute pathname - the path is relative to the root of the filesystem, such: /directory 1/directory 2/current location, etc.
The BASH shell contains a variable that is known as the PATH variable. Type echo $PATH at the command prompt to see the contents of this variable. The listing shows the directories that are to be searched for binary or script files when commands are entered.
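A short sketch of the two pathname styles follows. It works in a scratch directory, and the directory names projects, docs, and src are made up for the example.

```shell
# Build a small, hypothetical directory tree in a scratch location.
workdir=$(mktemp -d)
mkdir -p "$workdir/projects/docs" "$workdir/projects/src"

# Start in the docs directory.
cd "$workdir/projects/docs"

# Relative pathname: up one level, then over into src.
cd ../src
via_relative=$(pwd)

# Absolute pathname: the same destination spelled out from the top.
cd "$workdir/projects/src"
via_absolute=$(pwd)

echo "relative route ended in: $via_relative"
echo "absolute route ended in: $via_absolute"

# Show the PATH search directories, one per line.
echo "$PATH" | tr ':' '\n'
```

Both routes end in the same directory; only the starting point of the description differs.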
Home directories
Each user on a Linux system has a home directory. The full collection of home directories is typically located in the /home directory, and each is given the name of its user. The path to a user's home directory would look like /home/'username'. The 'root' home directory is treated separately, and is in the /root location. Often the entire collection of home directories is kept on a separate partition or hard drive, and mounted such that backing up the directory is easy. This method also accommodates a system wherein there are many users with a large amount of data that is written to disk and changes frequently. The tilde '~' symbol is often used to refer to a user's home directory.
Mounting Devices and Filesystems
Filesystems and devices must be 'mounted' to the existing Linux root filesystem. The mount command serves to attach the filesystem found on some device to a specific location in the Linux file tree. The umount command will detach it again. Typically, the /mnt directory will have mount points for externally mounted filesystems. There will also be instances where entire directories (such as the /home directory) will be a mounted filesystem, often contained on a separate drive. Filesystems can be automatically mounted by the system when entered in the /etc/fstab file.
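Mounting normally requires root privileges, so in the sketch below the privileged commands appear only as comments. The device /dev/sdb1, the mount point /mnt/data, and the ext3 type are hypothetical values chosen for illustration. The live part of the example uses df, which needs no special privileges.

```shell
# Attaching and detaching a filesystem by hand (as root; names are hypothetical):
#   mount -t ext3 /dev/sdb1 /mnt/data
#   umount /mnt/data
#
# A matching /etc/fstab line, so the system mounts it automatically at boot:
#   /dev/sdb1   /mnt/data   ext3   defaults   0   2

# Ask which filesystem the root directory lives on.
# -P selects the portable output format: one header line, then one line per filesystem.
root_fs=$(df -P / | awk 'NR==2 {print $1}')
echo "The root directory / lives on: $root_fs"
```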
Monitoring & Maintaining Filesystems
Filesystems can and will develop problems. The most common type of filesystem problem arises when the computer powers down without unmounting the filesystems, which can leave files that were open at the time in an inconsistent state on the disk. Other problems can develop when disks get old, become fragmented, are subjected to high temperatures, etc.
The fsck command is used to check and optionally repair one or more Linux filesystems. The filesystem can be given as a device name such as /dev/hda1 or /dev/sdc2, or as a mount point such as /, /usr, /home, etc. Typically, the fsck program will try to check filesystems on different physical disk drives in parallel to reduce the total amount of time needed to check all of the filesystems. The filesystem should first be unmounted in order to be checked.
File Permissions
Permissions have two general categories of consideration. When applied to users, there are three classifications: the owner, the group, and others. When applied to files and directories, the owner, group, and others can each have read, write, and/or execute permissions. When showing a long listing of the contents of a directory, the permissions are stated as -{123}{456}{789}, with the set {123} applied to the owner, the set {456} applied to the group, and the set {789} applied to others. Positions 1, 4, and 7, when set, give read permission (r) to the file or directory, and have an octal value of 4. Positions 2, 5, and 8, when set, give write permission (w), and have an octal value of 2. Positions 3, 6, and 9, when set, give execute permission (x), and have an octal value of 1. The values within each set are added together, so rwx is 7, r-x is 5, and r-- is 4.
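The octal arithmetic can be checked against a real listing. This is a minimal sketch using a scratch file; the name report.txt and the mode 754 are arbitrary choices for the example.

```shell
workdir=$(mktemp -d)
touch "$workdir/report.txt"

# owner  rwx = 4+2+1 = 7
# group  r-x = 4+0+1 = 5
# others r-- = 4+0+0 = 4
chmod 754 "$workdir/report.txt"

# Keep the first ten characters of the listing:
# the file type flag followed by the nine permission positions.
perms=$(ls -l "$workdir/report.txt" | cut -c1-10)
echo "$perms"
```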
When a file or directory is created, the creator becomes the owner of that file or directory, and the primary group of the creator becomes the group owner of the file or directory. By default, new files start out with rw-rw-rw- and new directories start out with rwxrwxrwx. The umask variable (also a command that shows and/or sets the variable to a new value) takes away certain permissions from these defaults. The most typical umask value is 022, which takes away write permission for group and others on files, rendering the file read only to all but the owner. That same umask value takes away write permission for group and others on directories, leaving them with read and execute permissions, thereby permitting all to navigate the directory by default, but not write to it.
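The effect of a umask of 022 can be observed directly. The sketch below sets the umask for the current shell only and creates a file and a directory in a scratch location; the names newfile and newdir are illustrative.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Take away write permission for group and others in this shell.
umask 022

touch newfile    # starts from rw-rw-rw- (666), masked by 022
mkdir newdir     # starts from rwxrwxrwx (777), masked by 022

file_perms=$(ls -ld newfile | cut -c1-10)
dir_perms=$(ls -ld newdir | cut -c1-10)
echo "file:      $file_perms"
echo "directory: $dir_perms"
```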
Special Permissions
SUID (set user id) - no effect on a directory. When placed on a file, the program runs with the privileges of the owner of the file during its execution, rather than those of the person who executes it. SUID is only honored on binary programs, not shell scripts or regular files.
SGID (set group id) - affects both files and directories. When applied to a binary file, the person who executes the file will become part of the group that owns the file during its execution. Therefore, the group permissions for the file are applied during execution, not the group permissions of the person executing the file. When applied to a directory, SGID sets the group owner for files created in the directory to the group of the directory, not to the group of the creator.
Sticky bit - applied to directories, the sticky bit gives the option for several members to write to the directory, yet only delete those files in the directory that they own. Good for common project directories with files from several users.
A file with every standard permission set and no special permissions has an octal value of 0777; the leading digit holds the special permissions. In a listing, note the leading character of the permission string: a dash (-) states that the entry is a regular file, and a 'd' states that it is a directory. The special permissions will show up as follows (an uppercase S or T appears instead when the underlying execute permission is not set):
  • -rw-r--r-- default listing for a regular file with umask in place
  • drwxr-xr-x default listing for a directory with umask in place
  • -rwsr-xr-x listing for a file with SUID set
  • -rwxr-sr-x listing for a file with SGID set
  • drwxrwxrwt listing for a directory with the sticky bit set
  • crw------- listing for a character device*
  • brw-rw---- listing for a block device**
  • lrwxrwxrwx listing for a link
  • * specifies that data is transferred to the device character by individual character
  • ** specifies that data is transferred to the device in chunks or blocks, using physical memory to buffer the data
When setting the special permissions using the octal format, SUID is given a value of 4 in the first place of the 4 digit octal string: 4777. SGID is given a value of 2 in the first place of the 4 digit octal string: 2777. The sticky bit is given a value of 1 in the first place of the 4 digit octal string: 1777.
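The four digit octal form can be tried on scratch files without any risk to the system. In this sketch the names prog and shared are hypothetical, and the file prog is empty, so the special bits only change how the listing looks.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch prog
mkdir shared

# SUID (4) in front of rwxr-xr-x (755): an s replaces the owner's x.
chmod 4755 prog
suid_listing=$(ls -ld prog | cut -c1-10)

# SGID (2) in front of rwxr-xr-x (755): an s replaces the group's x.
chmod 2755 prog
sgid_listing=$(ls -ld prog | cut -c1-10)

# Sticky bit (1) in front of rwxrwxrwx (777): a t replaces the others' x.
chmod 1777 shared
sticky_listing=$(ls -ld shared | cut -c1-10)

echo "SUID:   $suid_listing"
echo "SGID:   $sgid_listing"
echo "sticky: $sticky_listing"
```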
Filename Extensions
The Linux system doesn't necessarily depend on a filename extension to know what program uses or created the file. Filename extensions are typically a convenience to the user, and will show in a listing what the file is for or about. The table below lists some of the more common filename extensions.
Some files on the filesystem are 'hidden' from normal view, and only appear when using the command 'ls -a'. These files start with a 'dot' - '.', such as '.bashrc', etc. In order to view the hidden files with a full listing of the directory contents, use the ls -al command.
Common Filename Extensions
  • .c - C source code files
  • .cc, .cpp - C++ source code files
  • .html, .htm, .shtml - HTML files
  • .php, .asp, .cfm - Program specific PHP, ASP, Cold Fusion files
  • .ps - PostScript files
  • .txt - Text files
  • .tar - Archive files, typically created by the tar utility
  • .gz, .bz2, .Z - Compressed files
  • .tar.gz, .tgz, .tar.bz2, .tar.Z - Files that are archived and compressed
  • .conf, .cfg - Configuration files, typically text files
  • .so - Shared object library files
  • .o - Object files that are compiled
  • .pl - Perl files
  • .tcl - TCL (Tool Command Language) files
  • .jpg, .jpeg, .png, .tif, .tiff, .xpm, .gif, .psd - Binary files that contain graphical images
  • .sh - Scripts that are configured to run as shell scripts
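Two points from this section, that extensions are only a convenience and that dot files are hidden from a plain listing, can be tried in a scratch directory. The file names greet.txt and .hidden_settings are made up for this sketch.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# A shell script stored under a misleading extension still runs as a script.
echo 'echo "hello from a .txt file"' > greet.txt
greeting=$(sh greet.txt)
echo "$greeting"

# A file whose name starts with a dot is skipped by plain ls and shown by ls -a.
touch .hidden_settings
plain=$(ls)
with_hidden=$(ls -a)

echo "ls shows:"
echo "$plain"
echo "ls -a shows:"
echo "$with_hidden"
```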
 
File & Filesystem Commands
Listed below are some of the most basic and common Linux file & filesystem commands.
Essential Linux File & Filesystem Commands
General & Movement Commands
  • cd - change directory command
  • ls - list command, shows contents of the requested directory
  • pwd - 'print working directory', shows the current directory
  • touch - creates a file, or updates its modification time
  • file - analyzes files and returns information about the file
Displaying the Contents of Files
  • cat - abbreviation for concatenate - prints the contents of the requested file to the screen - often the output is piped to another command such as less or more
  • tac - opposite of the cat command, prints the file to the screen in reverse, last line first
  • head - shows the first portion of a requested file
  • tail - shows the end of a requested file - note the tail -f option, which shows updated log files in real time
  • more - displays content to the screen one screenful at a time
  • less - displays content to the screen one screenful at a time, and also allows scrolling backward
Displaying the Contents of Binary Files
  • strings - prints the strings of printable characters in files, typically for determining the contents of non-text files
  • od - dumps files in octal and other formats
Directory Commands
  • mkdir - make directory command
  • rmdir - remove directory command
  • mv - move command
  • cp - copy command
  • rm - remove command
  • alias - creates command aliases that can be used at the command line in a terminal
  • ln (-s option) - link command, creates hard links, or symbolic links with the -s option
  • ls (-a, -F, -l, -i, etc.) - list command, shows files and directories with any requested information
  • whoami - shows the username of the currently logged in user
  • groups - lists the groups to which a user belongs
  • chown - changes ownership of files and directories
  • chgrp - changes group ownership of files and directories
  • chmod - changes permissions on files and directories
  • umask - a command and variable that sets which permissions are withheld from newly created files and directories
Finding Files on Disk
  • find - searches for files in a directory hierarchy - lots of options, including actions to be taken when specific files are found
  • locate - finds files by name by reading a pre-compiled database prepared by the updatedb utility
Notes:
  1. The BASH shell offers a tab completion feature that will attempt to fill in partial file names and path info. Type a few letters at the command prompt, then press tab. Press tab twice for a menu of choices.
  2. Most shell commands are capable of 'wildcard expansion', as outlined below.
  3. Most text editors are capable of using 'regular expressions', which are outlined below.
 
Wildcard Metacharacters
Wildcard metacharacters are used to match certain portions of filenames. By using the wildcards as listed, searches and multiple listings are possible based upon a wide range of capabilities. For more info, see the BASH page, particularly the BASH resources.
  •  *  - Will match 0 or more characters in a filename
  •  ?  - Will match only one character in a filename
  • [xyz] - Will match only one character in a filename provided it is x,y,or z
  • [x-z] - Will match only one character in a filename provided it is between x and z
  • [!x-z] - Will match only one character in a filename provided it is not between x and z
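The wildcards listed above can be tried against a handful of empty files. This is a sketch in a scratch directory; the file names are arbitrary and were chosen so that each pattern selects a different subset.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch wa.txt xa.txt ya.txt za.txt xb.log

# The shell expands each pattern before echo ever runs.
any_txt=$(echo *.txt)            # * matches 0 or more characters
one_char=$(echo x?.*)            # ? matches exactly one character
from_set=$(echo [xy]a.txt)       # one character, either x or y
from_range=$(echo [x-z]a.txt)    # one character between x and z
not_range=$(echo [!x-z]a.txt)    # one character NOT between x and z

echo "*.txt       -> $any_txt"
echo "x?.*        -> $one_char"
echo "[xy]a.txt   -> $from_set"
echo "[x-z]a.txt  -> $from_range"
echo "[!x-z]a.txt -> $not_range"
```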
Searching Within Text Files
Within the Linux world of searching for files and filtering them, the grep command is the Swiss Army Knife. The grep command searches a named input file, or standard input, for lines within the file that contain a match to a pattern given as an input parameter. By default, grep prints the matching lines that are found within the file. There are two variant programs egrep and fgrep which are available. egrep is the same as grep -E, and fgrep is the same as grep -F.
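The following sketch exercises grep and its -E and -F variants on a small sample file. The file services.txt and its contents are invented for the example.

```shell
workdir=$(mktemp -d)
cat > "$workdir/services.txt" <<'EOF'
httpd running
sshd running
cupsd stopped
a.b literal dot
axb not a dot
EOF

# Default behaviour: print every line that contains the pattern.
grep running "$workdir/services.txt"

# -c counts the matching lines instead of printing them.
running_count=$(grep -c running "$workdir/services.txt")

# -E (same as egrep): extended patterns, here an alternation.
either=$(grep -E 'httpd|cupsd' "$workdir/services.txt")

# -F (same as fgrep): the pattern is a fixed string, so the dot is literal.
literal=$(grep -F 'a.b' "$workdir/services.txt")

# With no file named, grep filters standard input.
piped=$(printf 'one\ntwo\n' | grep two)

echo "lines with 'running': $running_count"
echo "$either"
echo "$literal"
echo "$piped"
```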
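The sed command is a stream editor that can filter and transform text line by line, using patterns like those accepted by grep.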
Regular Expressions
Regular expressions are a means of finding a match to certain strings or portions of strings within files.
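A few of the most common regular expression building blocks are sketched below using grep -c to count matching lines. The word list is invented for the example, and only basic constructs are shown: the anchors ^ and $, the single-character dot, a bracketed character class, and the * repetition operator.

```shell
workdir=$(mktemp -d)
words="$workdir/words.txt"
cat > "$words" <<'EOF'
cat
cart
concat
cut
Cat
EOF

starts_with_c=$(grep -c '^c' "$words")        # ^ anchors the match to the start of the line
ends_with_at=$(grep -c 'at$' "$words")        # $ anchors the match to the end of the line
c_any_t=$(grep -c '^c.t$' "$words")           # . stands for any single character
either_case=$(grep -c '^[Cc]at$' "$words")    # [Cc] matches one character from the set
optional_r=$(grep -c '^car*t$' "$words")      # r* matches zero or more r characters

echo "start with c:      $starts_with_c"
echo "end with at:       $ends_with_at"
echo "c, any char, t:    $c_any_t"
echo "cat in either case: $either_case"
echo "cat or cart:       $optional_r"
```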
Editing Text Files
The vi editor is the oldest (some will say the 'crankiest') text editor available in Linux. It takes a little getting used to, but is a great utility. The newer version, vim, is an enhanced text editor that is upwards compatible with vi. It can be used to edit all kinds of plain text as well as script files and programs. A built-in tutorial is available on systems that have vim installed: type vimtutor at the command line. Check out www.vim.org for the latest version and info.
There are several other text editors available and common to Linux. Which of these are present depends upon what was installed initially. See the list below.
  1. mcedit (Midnight Commander Editor) text editor
  2. GNU Emacs (Editor MACroS) editor
  3. nedit text editor
  4. gedit text editor - typically installed with the GNOME environment
  5. kedit text editor - typically installed with the KDE environment
Finding Files on the Filesystem
To find files that are located somewhere on the filesystem, use the locate command. The locate command is actually a shortcut to the slocate command, and searches a pre-populated database of the files on the system which is indexed for fast searching. The database for the locate command is typically updated daily, and is located at '/var/lib/slocate/slocate.db'. To update the database manually, run updatedb, or slocate -u.
A slower yet more versatile method of finding files on the system is to use the find command. find does not search a pre-populated database, but searches directories recursively. The format for the find command is: find 'where-to-look' '-criteria' 'what-to-find'. A typical find command would be thus: find /etc -name httpd.conf. The find command would start at the /etc location and search each directory recursively until the whole tree has been examined, reporting every match or returning no result. There are several search criteria available for use with the find command, and wildcards are permitted. Enclose the wildcard characters within quotes to prevent interpretation by the shell. There is a criteria flag (-regex) that will pass regular expressions to the find command.
The which command will locate executable files only, and only searches the directories in the system PATH. Type echo $PATH to see the directories that are listed in the PATH system variable.
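The find syntax can be tried on a small scratch tree instead of the real /etc. The directory layout and file names below are hypothetical and exist only for this sketch.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/etc/httpd/conf" "$workdir/etc/ssh"
touch "$workdir/etc/httpd/conf/httpd.conf" \
      "$workdir/etc/ssh/sshd_config" \
      "$workdir/etc/hosts.conf"

# where-to-look, criteria, what-to-find: search by exact name.
found=$(find "$workdir/etc" -name httpd.conf)
echo "$found"

# The wildcard is quoted so that find, not the shell, interprets it.
conf_count=$(find "$workdir/etc" -name '*.conf' | wc -l)
echo "files ending in .conf: $conf_count"

# A different criterion: -type d restricts the results to directories.
dir_count=$(find "$workdir/etc" -type d | wc -l)
echo "directories in the tree: $dir_count"
```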
Linking Files
There are typically two types of linked files: symbolic or soft links, and hard links. When considering linking files, the following definitions are important to understand:
  • superblock - the filesystem location that contains definitions about the filesystem itself: number of inodes and data blocks, how big is the allocation unit for the filesystem, etc.
  • inode table - a collection of information nodes that store information in single inodes about files that are located on the filesystem. The information that is contained in the inodes is as follows: unique inode number, file size, data block locations, last date modified, permissions, and ownership.
  • data blocks - allocation units that contain data that is stored on the filesystem.
A symbolic link is only a pointer to another file. Symbolic links can span partitions and filesystems. The actual data that is stored in the data blocks is the same, regardless of the number of symbolic links or their locations.
A hard link creates a situation wherein the data is actually shared between two file names. Hard linked files must be on the same partition. Hard links share the same inode number, and therefore the same data blocks on the disk; there is only one copy of the data. When the data is changed through one name, the change is seen through the other.
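The difference between the two kinds of link shows up in the inode numbers. The sketch below works in a scratch directory; the names original.txt, hard.txt, and soft.txt are chosen only for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "first version" > original.txt

ln original.txt hard.txt       # hard link: a second name for the same inode
ln -s original.txt soft.txt    # symbolic link: a separate file that points at a name

# ls -i prints the inode number in front of each name.
inode_original=$(ls -i original.txt | awk '{print $1}')
inode_hard=$(ls -i hard.txt | awk '{print $1}')
inode_soft=$(ls -i soft.txt | awk '{print $1}')
echo "original: $inode_original  hard: $inode_hard  soft: $inode_soft"

# Changing the data through one name is visible through the other.
echo "second version" > original.txt
via_hard=$(cat hard.txt)
echo "hard.txt now contains: $via_hard"

# Removing the original name leaves the data reachable through the hard link,
# while the symbolic link is left pointing at a name that no longer exists.
rm original.txt
after_rm=$(cat hard.txt)
if [ -L soft.txt ] && [ ! -e soft.txt ]; then
    echo "soft.txt is now a dangling symbolic link"
fi
```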
Other Documents in this Series
  1. Introduction and History
  2. Installation, Advanced Installation, and Usage
  3. The Linux Kernel and the Boot Process
  4. Filesystems - Management & Administration
  5. The BASH and Other Shells
  6. System Initialization and the X Environment
  7. Linux Processes
  8. Linux Administration, Peripherals, and Hardware
  9. Software Installation and Management
  10. Backups and Log Files
  11. Performance and Problems
  12. Network Configuration
  13. Security
  14. Key Linux Commands
  15. Essential Linux Definitions
This page was last modified: Wednesday January 03, 2007 10:53 AM