Exploring SQL Server on Linux
Data Dance
On November 16, 2016, Microsoft announced the beginning of a new story: SQL Server is now supported on Linux and Docker. Of course, Microsoft emerged long ago from its anti-Linux "dark age," and they have been moving steadily toward more support for open source since releasing the first pieces of .NET to open source in 2006. Since Satya Nadella took over as CEO in 2014, the pace of change has accelerated. Microsoft is now a Platinum sponsor of several open source foundations, including the famous Linux Foundation and the MariaDB Foundation, and the Azure cloud platform now comes with support for a number of important open source services and tools.
Whether you believe Microsoft's embrace of open source is good or bad news, from a technical standpoint, MS SQL Server on Linux has arrived, and it is an important development for many sys admins who operate Linux servers in heterogeneous environments. This article describes how Microsoft got SQL Server to run on Linux and show you how to set up your own SQL Server instance on a Linux system.
Cross-Platform SQL Server
How Microsoft managed to make SQL Server a cross-platform product is a very interesting story. The story started in 2008, with the launch of the Drawbridge project [1]. Drawbridge proposed some concepts and architecture considerations that have some similarities with what we know today as containers. The project proposed a new form of virtualization (or containerization) that combines the two core architectures shown in the Figure 1: the picoprocess (a process-based isolation container with a minimal kernel API surface) and a Library OS (an enlightened version of Windows) to efficiently run any application.
The Drawbridge architecture helped Microsoft to bring SQL Server to Linux with minimal effort – and with the long-term objective to keep identical semantics and performance across different platforms. The story would be different if Microsoft had needed to fork SQL Server with separate versions, each with a separate resource-management API. With Drawbridge as a base concept, an isolated application can benefit from system-like functionality such as threading, virtual memory management, and I/O through both LibOS and the ABI boundary, which, in turn, interacts with the underlying kernel using only 50 API calls.
But that's not all. SQL Server is a complex and performance-critical application with its own resource-management layer called SQLOS. The SQL Server team tried to avoid using the Windows operating system as much as possible, but the cost was a strong dependency with a low-level feature from the Windows kernel itself. Thus, they decided to redesign SQLOS and to merge it with the initial LibOS code from Drawbridge, producing the SQL Platform Abstraction Layer (SQLPAL), which is shown in the Figure 2.
The host extension provides a bridge between the underlying platform (either Windows or Linux). Everything above the host extension is the common code running in Windows PE format, whereas Linux understands ELF format. As explained by Bob Dorr (Microsoft principal software engineer) this is not a big deal as long as you can abstract logic for the binary formats and provide ABI/API functionality [2]. You just need to run CPU assembly instructions that are the same on Linux as Windows.
However, system resources and latency-sensitive codes, such as disk I/O, have been simplified between SQL Server and underlying operation APIs, enabling SQL Server to make more efficient and direct calls on the host extension (reducing code translations). Currently, the merging of LibOS and SQLOS is still in progress, and I expect to see a reduction in the legacy layer over the time. At the time of writing, some core-based engine feature limitations remain, and only the SQL Server integration service is supported on Linux, as shown in Figure 3. Other BI modules are currently out of scope.
SQL Server and Linux Distributions
You can install SQL Server on several different Linux distributions. The main prerequisite is the distribution must include systemd [3].
However, if you really want to stay safe, you might want to rely on one of the three distributions officially supported by Microsoft:
- RHEL 7.3+
- SELS v12 SP2+
- Ubuntu 16.04+
In my case, CentOS is definitely my favorite distribution for testing SQL Server on Linux, and I have never faced any compatibility issues so far. The official filesystems supported by SQL Server are ext4 and XFS, as well as NFS 4.2+ if you plan to use network storage. There are also some minimal resource requirements for SQL Server on Linux. You'll need at least 2 CPU cores (x64 only) and 2GB of RAM. The last point may likely draw your attention, because Linux requires more memory than Windows. (Linux requires 2GB, and even 3.25GB with the first CTP versions of SQL Server 2017, whereas Windows only needs 1GB.) This difference in the memory requirements is due to the current PAL architecture, which includes some Windows binaries and services that the SQL Server team will try to eliminate over time. As a first step, the required memory has already been reduced by 38 percent from the old CTP versions to the current one.
Installing SQL Server on Linux
Microsoft made the choice of package-oriented installation for SQL Server that provides several advantages. First, you may use your preferred package manager and their related repository with rpm
, dpkg
, yum
, apt-get
, or zypper
commands to download and install SQL Server components. You also have the choice of the installation method (offline or online). Applying cumulative updates (remember that service packs no longer exist) remains an easy task using package manager commands. But this installation architecture has also introduced some challenges, and the first main side effect is likely named instances that are no longer supported on Linux on the same host; the official workaround is to rely on Docker containers. For an online installation, you will need to configure two repos for the SQL Server binaries and the SQL Server command-line tools. On my Linux distribution (CentOS), my configured repos are as shown in Listing 1.
Listing 1
Configured Repos
01 $ sudo yum repolist | grep microsoft 02 packages-microsoft-com-mssql-server-2017 packages-microsoft-com-mssql-ser 30 03 packages-microsoft-com-prod packages-microsoft-com-prod 165
Another advantage of package-oriented installation is that the installation itself benefits from a modular approach. You could choose to install only the mssql-server.x86_64 package for the relational database engine or to add to the installation list the mssql-server-ha.x86_64 package, if you plan to use high-availability features like SQL Server AlwaysOn FCIs or SQL Server AlwaysOn availability groups. In other words, you don't have to install all SQL Server packages if you don't need them. A quick search of mssql-server packages on my Linux server shows 5 packages that are self-explanatory (Listing 2).
Listing 2
Packages
01 $ yum search mssql-server 02 Loaded plugins: fastestmirror 03 Determining fastest mirrors 04 * base: centos.quelquesmots.fr 05 * extras: mirrors.atosworldline.com 06 * updates: centos.crazyfrogs.org 07 ============================================ N/S matched: mssql-server ============================================= 08 mssql-server.x86_64 : Microsoft SQL Server Relational Database Engine 09 mssql-server-agent.x86_64 : Microsoft SQL Server Agent 10 mssql-server-fts.x86_64 : Microsoft SQL Server Full Text Search 11 mssql-server-ha.x86_64 : High Availability support for Microsoft SQL Server Relational Database Engine 12 mssql-server-is.x86_64 : Microsoft SQL Server Integration Services
To install the SQL Server relational database engine on Linux, first download the package and then install it on the system, along with the package dependencies bizp2
, libsss
, and gdb
(see Listing 3).
Listing 3
Installing SQL Server
01 $ yum install mssql-server -y 02 ? 03 Running transaction test 04 Transaction test succeeded 05 Running transaction 06 Installing : bzip2-1.0.6-13.el7.x86_64 1/4 07 Installing : libsss_nss_idmap-1.16.0-19.el7.x86_64 2/4 08 Installing : gdb-7.6.1-110.el7.x86_64 3/4 09 Installing : mssql-server-14.0.3026.27-2.x86_64 4/4 10 +--------------------------------------------------------------+ 11 Please run 'sudo /opt/mssql/bin/mssql-conf setup' 12 to complete the setup of Microsoft SQL Server 13 +--------------------------------------------------------------+ 14 SQL Server needs to be restarted in order to apply this setting. Please run 15 'systemctl restart mssql-server.service'. 16 Verifying : mssql-server-14.0.3026.27-2.x86_64 1/4 17 Verifying : gdb-7.6.1-110.el7.x86_64 2/4 18 Verifying : libsss_nss_idmap-1.16.0-19.el7.x86_64 3/4 19 Verifying : bzip2-1.0.6-13.el7.x86_64 4/4 20 21 Installed: 22 mssql-server.x86_64 0:14.0.3026.27-2 23 24 Dependency Installed: 25 bzip2.x86_64 0:1.0.6-13.el7 gdb.x86_64 0:7.6.1-110.el7 libsss_nss_idmap.x86_64 0:1.16.0-19.el7 26 27 Complete!
The installation process will install two main directories and the corresponding files:
/opt/mssql
(for SQL Server binaries and configuration tools)/var/opt/mssql
(for SQL Server database files, logs, and secrets)
The next step is configuring SQL Server settings through the mssql-conf
utility (counterpart of the configuration manager on Windows). To run the configuration tool, use:
$ sudo /opt/mssql/bin/mssql-conf setup
This tool can be used either in interactive mode or in addition to preconfigured environment variables [4], providing another way to install SQL Server with automated installation tools such as Ansible.
After configuring the SQL Server instance, check the SQL Server service state. Because SQL Server relies on systemd, you can use the systemctl
command, as shown in Listing 4.
Listing 4
Checking the Service State
01 $ sudo systemctl status mssql-server 02 ? mssql-server.service - Microsoft SQL Server Database Engine 03 Loaded: loaded (/usr/lib/systemd/system/mssql-server.service; enabled; vendor preset: disabled) 04 Active: active (running) since Wed 2018-06-06 16:55:41 CEST; 5min ago 05 Docs: https://docs.microsoft.com/en-us/sql/linux 06 Main PID: 3082 (sqlservr) 07 CGroup: /system.slice/mssql-server.service 08 ??3082 /opt/mssql/bin/sqlservr 09 ??3103 /opt/mssql/bin/sqlservr
As Listing 4 shows, my SQL Server instance is up, with the main PID 3082. To connect to the SQL Server instance, you need to install a client tool. You now have the choice of using cross-platform tools, including an old-generation tool (C/C++ based) or a new-generation tool (mainly based on Python). Several of these tools are open source and available on GitHub. The choice of a tool depends on different factors, such as the target operating system and what kind of use you expect. At the time of writing, some of the tools are intended to replace old tools, but they are still limited enough to prevent full adoption. But things are changing very quickly nowadays, and whatever I tell you now could be out of date soon. The good news is you will be able to continue using existing client tools, although the new tools will offer some advantages once the full range of features are implemented. Figure 4 shows a list of command-line interface (CLI) tools you could use to manage SQL Server on different platforms.
Both sqlcmd
and bcp
are part of the basic client tools, which you can install with a single command. Note that there are no OLEDB drivers on Linux, and they have been replaced by ODBC:
$ sudo yum install mssql-tools unixODBC-devel -y
After installing all the packages, make sqlcmd
/bcp
accessible from the Bash shell for both login sessions and interactive/non-login sessions (Listing 5).
Listing 5
Making SQL Server Accessible from Bash
01 $ echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bash_profile 02 $ echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc 03 $ source ~/.bashrc 04 05 $ sqlcmd -S . -Usa -PXxXxXx 06 1> select @@version 07 2> go 08 09 ------------------------------------------------------------------- 10 11 Microsoft SQL Server 2017 (RTM-CU7) (KB4229789) - 14.0.3026.27 (X64) 12 May 10 2018 12:38:11 13 Copyright (C) 2017 Microsoft Corporation 14 Developer Edition (64-bit) on Linux (CentOS Linux 7 (Core))
The basic installation is now complete, but a lot of work remains to make it production-ready. Concerning the configuration of the SQL Server instance, most of the existing best practices are still valuable on Linux. However, at the system level, the story is a little bit different. See Microsoft's recommendations for configuring SQL Server on Linux [5].
Buy this article as PDF
(incl. VAT)
Buy ADMIN Magazine
Subscribe to our ADMIN Newsletters
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Most Popular
Support Our Work
ADMIN content is made possible with support from readers like you. Please consider contributing when you've found an article to be beneficial.