FreeBSD Version 10 released

News from BSD

Virtualization

The new version of the FreeBSD operating system offers its own virtualization solution: Bhyve [6], a type 2 virtualization software. That is, it is based on a full operating system and uses its device drivers. In contrast, a type 1 hypervisor runs directly on the hardware.

Bhyve (Figure 2) is a hardware virtual machine (HVM). Thus far, the hypervisor only supports Intel's VT-x technology; support for Secure Virtual Machine (SVM) by AMD is still work in progress. Bhyve uses Intel's Extended Page Tables to manage the memory addresses of virtual machines.

Figure 2: The structure of the new FreeBSD hypervisor, Bhyve.

The FreeBSD hypervisor emulates an I/O APIC (Advanced Programmable Interrupt Controller) and thus supports APIC for guest systems. It also provides an AHCI emulation, which currently works only partially. The developers are currently working primarily on non-blocking read and write access and support for suspend and resume.

Bhyve consists of a kernel module, vmm.ko, the libvmmapi.so library, and the applications bhyve(8), bhyveload(8), and bhyvectl(8). The components can manage with as little as 250KB of memory.

The bhyveload(8) tool loads a FreeBSD guest directly on the virtual machine. This is done quickly with a simple command that starts FreeBSD from an ISO image:

bhyveload -m 1024 -d ./freebsd.iso freebsd-vm
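Loading the guest is only one step in its life cycle. The following dry-run sketch strings the usual steps together and prints each command instead of running it unless DRY_RUN=0 is set. The bhyveload line is the one shown above; the disk image guest.img, the tap0 interface, and the bhyve(8) device flags are assumptions modeled on the FreeBSD Handbook and may differ on a given release.

```shell
#!/bin/sh
# Dry-run sketch of a bhyve guest life cycle. Values marked "assumed"
# are illustrative and do not come from the article.

VM="freebsd-vm"          # guest name used in the article
ISO="./freebsd.iso"      # boot image used in the article
DISK="./guest.img"       # assumed: disk image for the guest
TAP="tap0"               # assumed: host tap interface for networking

# Print the command; execute it only when DRY_RUN=0 is set.
run() {
    echo "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

start_guest() {
    run kldload vmm                          # load the hypervisor kernel module
    run bhyveload -m 1024 -d "$ISO" "$VM"    # load the FreeBSD guest
    run bhyve -c 1 -m 1024M -A -H -P \
        -s 0:0,hostbridge -s 1:0,lpc \
        -s 2:0,virtio-net,"$TAP" \
        -s 3:0,virtio-blk,"$DISK" \
        -s 4:0,ahci-cd,"$ISO" \
        -l com1,stdio "$VM"                  # run the guest (flags assumed)
}

stop_guest() {
    run bhyvectl --destroy --vm="$VM"        # release the VM's resources
}

start_guest
stop_guest
```

Running the script as is only lists the commands, which makes it easy to review them before setting DRY_RUN=0 as root on a FreeBSD host.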

The performance of FreeBSD as a guest in a virtualized environment is improved by the virtio driver package, which has made its way from the ports to the base system in the new version of FreeBSD. The kernel module contained in the virtio package, virtio-kmod, provides virtualized FreeBSD direct access to the host resources, thanks to paravirtualized APIs. Without virtio-kmod, the host would need to emulate a network card, hard disk controllers, and other hardware components for the guest operating system. Emulating this functionality and implementing the back end consumes time and resources and, thus, slows down the guest.

The virtio package also supports memory ballooning. This technique provides the memory released by the guest system to other guests.

The entries shown in Listing 1 from the /boot/loader.conf file enable virtio. This changes the names of the virtual disks and network cards to, say, /dev/vtbd0 and /dev/vtnet0, which means adjusting the /etc/fstab file to match. Figure 3 shows the typical kernel boot messages.

Listing 1

Virtio in /boot/loader.conf

#Init VirtIO package
virtio_load="YES"
virtio_pci_load="YES"
# Block devices
virtio_blk_load="YES"
# network hardware
if_vtnet_load="YES"
# Memory Ballooning
virtio_balloon_load="YES"
# SCSI support
virtio_scsi_load="YES"

Figure 3: Boot messages from the kernel when using virtio with a network card.
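Adjusting /etc/fstab by hand is error-prone when several entries are involved. As a minimal sketch, the rename could be done with sed, assuming the guest's disk previously appeared as ada0 (an assumption; the old name depends on the emulated controller). The example works on sample text rather than the live /etc/fstab.

```shell
#!/bin/sh
# Sketch: rewrite an old disk device name to the virtio name in fstab text.
# OLD_DISK is an assumption; check the real name in your own /etc/fstab.

OLD_DISK="ada0"
NEW_DISK="vtbd0"

# Read fstab text on stdin and write the adjusted text to stdout.
# Only lines that start with the device path are changed, so comments stay as is.
rename_fstab_disk() {
    sed "s|^/dev/${OLD_DISK}|/dev/${NEW_DISK}|"
}

# Sample input standing in for /etc/fstab.
cat <<EOF | rename_fstab_disk
# Device      Mountpoint  FStype  Options  Dump  Pass#
/dev/ada0p2   /           ufs     rw       1     1
/dev/ada0p3   none        swap    sw       0     0
EOF
```

On a real guest, it is safer to keep a copy of the original file and feed that copy to the function, writing the result back only after checking it.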

VirtualBox also works with virtio. To use it, select Paravirtualized Network (virtio-net) as the adapter type in the advanced network settings of the VirtualBox Manager (Figure 4).

Figure 4: Thanks to the Paravirtualized Network option, virtio works in VirtualBox.

FreeBSD 10 also introduces some innovations in storage medium management. In addition to performance optimizations, useful tools from ports have made their way into the base system. This includes the growfs(8) command. It lets users change the size of a UFS2 filesystem – the default on FreeBSD. This tool is especially useful if you need to transfer a backup of a smaller slice to a bigger one. growfs then offers the option of growing the filesystem up to the slice boundary without unmounting. Of course, a backup is recommended before the change.
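A cautious way to use growfs(8) is to preview the result first. The dry-run sketch below prints the two calls: one in test mode (-N), which only reports the new filesystem parameters, and one that performs the change without prompting (-y). The device name is hypothetical, and the sketch assumes the underlying slice has already been enlarged.

```shell
#!/bin/sh
# Dry-run sketch: preview and then grow a UFS2 filesystem.
# FS_DEV is a hypothetical device name, not taken from the article.

FS_DEV="/dev/ada0s1a"

# Print the command; execute it only when DRY_RUN=0 is set.
run() {
    echo "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

grow_ufs() {
    run growfs -N "$FS_DEV"   # test mode: print new parameters, change nothing
    run growfs -y "$FS_DEV"   # grow the filesystem without asking for confirmation
}

grow_ufs
```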

Even the iSCSI system (Internet Small Computer System Interface) has made its way into the base system in the form of a kernel module. The system consists of an iSCSI target and initiator.

iSCSI transports SCSI data over IP networks, encapsulating the data in TCP/IP packets and using ports 860 and 3260. iSCSI enables access to a storage area network via a virtual point-to-point connection without setting up separate storage devices. Also, existing network switches can be used for iSCSI; iSCSI does not require special hardware for node connections.

Disk access is blockwise, which makes it suitable for databases. Access via iSCSI is also transparent: For example, iSCSI devices show up on FreeBSD as normal SCSI block devices (/dev/da*) and can be used as local SCSI disks.
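On the initiator side, attaching a target takes only a few commands. The following dry-run sketch prints them in order; the portal address and the target name are hypothetical placeholders, and the sketch assumes the iscsid service and iscsictl(8) from the base system.

```shell
#!/bin/sh
# Dry-run sketch of attaching an iSCSI target from a FreeBSD initiator.
# PORTAL and TARGET are hypothetical values, not taken from the article.

PORTAL="10.10.10.10"
TARGET="iqn.2012-06.com.example:target0"

# Print the command; execute it only when DRY_RUN=0 is set.
run() {
    echo "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

attach_target() {
    run service iscsid onestart                 # start the initiator daemon
    run iscsictl -A -p "$PORTAL" -t "$TARGET"   # add a session to the target
    run iscsictl -L                             # list sessions and their da* devices
}

detach_target() {
    run iscsictl -R -t "$TARGET"                # remove the session again
}

attach_target
detach_target
```

Once the session is established, the new disk should appear as a /dev/da* device and can be partitioned like a local SCSI disk.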

FUSE (Filesystem in Userspace) has also been promoted from the ports to the base system. FUSE is a kernel module that shifts filesystem drivers from kernel mode to user mode. This approach allows non-privileged users to mount their own filesystems.

Because FUSE uses user mode, just like any other application program, a variety of drivers have emerged. Some of these filesystem drivers create completely different data structures in the form of filesystems instead of hard disks and other storage media.

FUSE is used on FreeBSD to mount the following filesystems:

  • Windows NTFS-3G: This is a FUSE implementation of the Windows NTFS filesystem, which is used by Windows XP and Windows Server 2003, among others. The FUSE driver supports read and write operations, as well as almost all POSIX filesystem functions. Only file permission and ownership changes are not provided.
  • Linux-ext4: FreeBSD supports the popular Linux ext4 filesystem with FUSE; however, the module only offers read access.
  • FUSEPod: This extension can be used to mount an Apple iPod or iPhone and provide access to all the files on the device.

ZFS

ZFS, as a filesystem with integrated volume management, has been part of FreeBSD since version 7; it has been suitable for production use for some time [7]. FreeBSD 10 introduces the ZFS NOP (No Write operation) function, which substantially increases the speed of the filesystem.

The performance gain is a result of saved writes. Without NOP, ZFS creates a checksum for each data block and writes the block, even if its content remains the same. With NOP, however, the system compares the checksum of the block to be written with that of the existing block on disk. If the checksums are identical, the system does not write the data block. Cryptographic methods are used here to create checksums, thereby enhancing data integrity.
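The idea of comparing checksums before writing can be illustrated at file level. The sketch below is a toy model, not how ZFS implements the feature: ZFS compares against the checksum it already stores for the block, whereas this example recomputes the digest of the existing file. The function names are made up for illustration.

```shell
#!/bin/sh
# Toy illustration of the NOP-write idea at file level; this is not the
# ZFS implementation.

# Print a SHA-256 digest of stdin using whichever tool is installed:
# sha256sum (GNU coreutils) or sha256 (FreeBSD base system).
digest() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum | cut -d ' ' -f 1
    else
        sha256
    fi
}

# nop_write SRC DST: copy SRC over DST only if the checksums differ.
# Prints "written" or "skipped" to show which path was taken.
nop_write() {
    src=$1
    dst=$2
    if [ -f "$dst" ] && [ "$(digest < "$src")" = "$(digest < "$dst")" ]; then
        echo "skipped"
    else
        cp "$src" "$dst"
        echo "written"
    fi
}

# Demo in a scratch directory.
tmp=$(mktemp -d /tmp/nopwrite.XXXXXX)
printf 'block A\n' > "$tmp/new"
nop_write "$tmp/new" "$tmp/disk"   # destination does not exist yet: written
nop_write "$tmp/new" "$tmp/disk"   # identical content: skipped
rm -r "$tmp"
```

The second call saves the write entirely, which is the effect the NOP function exploits when applications rewrite unchanged data.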

Furthermore, the ZFS developers have added data compression to the Level 2 cache. The Level 2 cache (L2ARC) provides fast read access. For optimal coverage of a ZPool – a kind of virtual grouping of several block devices on ZFS – the size of the L2ARC grows in proportion to the ZPool size. On very large systems, however, this quickly results in storage capacity bottlenecks. For this reason, FreeBSD 10 compresses the data in the cache and thus reduces its size.

ZFS already compressed the stored data in previous versions of FreeBSD on request, previously using the Lempel Ziv Jeff Bonwick method (lzjb). FreeBSD 10 uses the faster LZ4 algorithm with significant performance gains: For easily compressible data, the compression speed is 50 percent higher, and during decompression, an 80 percent boost is achieved compared with lzjb. When processing non-compressible data, the newly implemented algorithm is still about three times faster. On systems with slower CPUs, especially, the speed gains are clearly noticeable.

The following command enables compression with the new algorithm for the users dataset in a ZPool named pool:

# zfs set compression=lz4 pool/users
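To confirm that the setting took effect, the compression and compressratio properties can be queried afterward. The following dry-run sketch prints both commands for the dataset used above; the helper names are illustrative.

```shell
#!/bin/sh
# Dry-run sketch: enable LZ4 on the dataset from the article and check the result.

DATASET="pool/users"

# Print the command; execute it only when DRY_RUN=0 is set.
run() {
    echo "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

check_compression() {
    run zfs set compression=lz4 "$DATASET"
    run zfs get compression,compressratio "$DATASET"   # setting and achieved ratio
}

check_compression
```

Note that compressratio only reflects data written after the property was set, because existing blocks are not recompressed.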

Fast solid state disks (SSDs) are especially well suited to caching. To fully benefit from these drives, ZFS implements support for the ATA trim command. ZFS uses trim to tell a solid state disk that deleted or otherwise vacant blocks are no longer used. Without trim, ZFS only uses the administrative structures to show that the corresponding regions are available again, but the SSD controller does not receive this information.

When files are deleted, the ATA trim command tells the drive that the affected blocks are invalid and their data is obsolete. As a result, the drive does not rewrite this data, which reduces write activity on the SSD and thus its wear. The marked blocks are then finally released during the next erase operation.

ZFS support for trim includes the following configuration options (sysctl Management Information Base, or MIB):

  • vfs.zfs.trim.enabled: If this sysctl MIB is set to zero, trim support is disabled. It is enabled by default.
  • vfs.zfs.trim.max_interval: This sysctl MIB defines how many seconds must elapse between two trim calls.
  • vfs.zfs.trim.timeout: This parameter sets a delay value in seconds for the first execution of trim.
  • vfs.zfs.trim.txg_delay: Specifies the maximum period the data of a Transaction Group (TXG) can remain in memory before it is written. This is an important value for ZFS because the filesystem delays write operations until sufficient data is available. Initially, this value was set to 30 seconds; however, this caused problems on slow systems with only one hard disk. Writing of data and processing of the trim command coincided frequently, thus blocking the system.
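The current values of these MIBs can be read with sysctl(8). The short sketch below loops over the four names listed above and prints each value; where a MIB or the sysctl tool is not available, it prints "n/a" instead. The function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: report the ZFS trim settings listed above. On systems without
# these MIBs (or without sysctl) each value is shown as "n/a".

TRIM_MIBS="vfs.zfs.trim.enabled vfs.zfs.trim.max_interval vfs.zfs.trim.timeout vfs.zfs.trim.txg_delay"

show_trim_settings() {
    for mib in $TRIM_MIBS; do
        # -n prints only the value; errors are silenced and mapped to "n/a".
        value=$(sysctl -n "$mib" 2>/dev/null) || value="n/a"
        echo "$mib=$value"
    done
}

show_trim_settings
```

Assuming vfs.zfs.trim.enabled is a boot-time tunable on the release in use, disabling trim would mean adding the line vfs.zfs.trim.enabled=0 to /boot/loader.conf and rebooting.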

Network

The FreeBSD developers have also done justice to the rise of multicore and multiprocessor systems by adapting the kernel and drivers. To exploit the performance potential of the future, they have revised the pf packet filter.

This tool, which was originally ported from OpenBSD, was designed for single-CPU systems and briefly stopped a data stream at the beginning of the filtering process. The filter rules were then applied and, at the end of the process, the data stream was released. The SMP-friendly version changes this sequential process. Now, multiple parallel threads process the data stream, which increases the processing speed significantly and reduces system load.

FreeBSD 10 not only eliminates old problems with multiprocessor operation in the field of wireless LAN but also adds new hardware components to the driver for Atheros wireless cards. ath(4) now supports all Atheros PCI/PCIe network adapters up to and including the AR9287 chipset. However, support for network cards with AR5513 MIMO 802.11abg chips and AR5523/AR5212 chips, which are used on USB WiFi sticks and plug-in cards, is still missing, as is support for the AR7010 and AR9271 series, which are used for USB WiFi sticks.

FreeBSD 10 also includes a number of enhancements for the IEEE 802.11n standard – the latest version of the WiFi standard. These enhancements also serve as the basis for meshed WLANs, that is, networks of communicating wireless nodes as per the IEEE 802.11s standard, also known as ad hoc networks.

Ad hoc networks connect mobile devices (nodes), such as mobile phones and notebooks, without relying on a fixed infrastructure, such as wireless access points. The data are passed from node to node until they reach their destination. Thus, the data load in such networks is distributed better than in networks with a central node. Special routing methods are used to implement this principle; they allow the network to adapt continuously as nodes move, join, or leave (Figure 5).

Figure 5: The structure of a wireless mesh network is similar to a cellular network.

Nothing prevents practical use of the IEEE 802.11s implementation in FreeBSD 10; however, Linux compatibility has not yet been added.

In terms of support for the IEEE 802.11s standard, the following picture emerges: Drivers for wireless cards with the Atheros chipset (ath(4)), Ralink chipset (ral(4)), and chips by Marvell (mwl(4)) support WLAN meshes.

Things are not as rosy for WiFi adapters with Intel PRO wireless chipsets, because much of their functionality lies in closed-source firmware, which also prevents future improvements. This situation applies to the ipw(4), iwi(4), iwn(4), and wi(4) drivers. The developers are working hard on all other WiFi drivers to make them fit for wireless meshes.

The following example shows how just a few commands can be used to build a wireless mesh. The commands run on each node and set up a mesh that runs on channel 36 and goes by the name admin-mag-mesh:

# ifconfig wlan0 create wlandev ath0 wlanmode mesh channel 36 meshid admin-mag-mesh
# ifconfig wlan0 <IP address/netmask>
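To make the mesh interface survive a reboot, the same settings could be placed in /etc/rc.conf. The fragment below is a sketch that assumes the standard wlans_, create_args_, and ifconfig_ variables from rc.conf(5) and mirrors the two commands above; the address placeholder is left unfilled, as in the original command.

```shell
# /etc/rc.conf fragment (sketch): recreate the mesh interface at boot.
# Assumes the ath0 device and the mesh parameters from the commands above.
wlans_ath0="wlan0"
create_args_wlan0="wlanmode mesh channel 36 meshid admin-mag-mesh"
ifconfig_wlan0="inet <IP address/netmask>"
```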

The following command prints a list of all nodes in the mesh:

# ifconfig wlan0 list sta
ADDR   CHAN ... STATE RATE ...
<MAC0> 36   ... IDLE  0M   ...
<MAC1> 36   ... ESTAB 6M   ... WME MESHCONF
<MAC2> 36   ... ESTAB 6M   ... WME MESHCONF
<MAC3> 36   ... ESTAB 6M   ... WME MESHCONF

The <MAC0> line shows your own mesh node; the other lines list the other nodes.
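For monitoring, it can be handy to reduce this list to a single number. The sketch below counts the peers whose state is ESTAB; because the listing elides some columns, it scans every field instead of relying on a fixed column position. The function name is made up, and the sample input is modeled on the listing above.

```shell
#!/bin/sh
# Sketch: count peers whose link state is ESTAB in "list sta" output.

# Read "ifconfig wlan0 list sta" output on stdin and print the number of
# lines that contain a field equal to ESTAB (the header line never matches).
count_established() {
    awk '{ for (i = 1; i <= NF; i++) if ($i == "ESTAB") { n++; break } }
         END { print n + 0 }'
}

# Sample input modeled on the listing above.
count_established <<'EOF'
ADDR   CHAN ... STATE RATE ...
<MAC0> 36   ... IDLE  0M   ...
<MAC1> 36   ... ESTAB 6M   ... WME MESHCONF
<MAC2> 36   ... ESTAB 6M   ... WME MESHCONF
<MAC3> 36   ... ESTAB 6M   ... WME MESHCONF
EOF
```

On a live node, the function would read the real output instead: ifconfig wlan0 list sta | count_established.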
