Lead Image © yewkeo, 123RF.com

Lead Image © yewkeo, 123RF.com

Exploring SQL Server on Linux

Data Dance

Article from ADMIN 48/2018
By
SQL Server runs on Linux now. We'll show you how Microsoft developers made their massive database system Linux ready, and we'll help you get started with setting up SQL Server on your own Linux system.

On November 16, 2016, Microsoft announced the beginning of a new story: SQL Server is now supported on Linux and Docker. Of course, Microsoft emerged long ago from its anti-Linux "dark age," and they have been moving steadily toward more support for open source since releasing the first pieces of .NET to open source in 2006. Since Satya Nadella took over as CEO in 2014, the pace of change has accelerated. Microsoft is now a Platinum sponsor of several open source foundations, including the famous Linux Foundation and the MariaDB Foundation, and the Azure cloud platform now comes with support for a number of important open source services and tools.

Whether you believe Microsoft's embrace of open source is good or bad news, from a technical standpoint, MS SQL Server on Linux has arrived, and it is an important development for many sys admins who operate Linux servers in heterogeneous environments. This article describes how Microsoft got SQL Server to run on Linux and show you how to set up your own SQL Server instance on a Linux system.

Cross-Platform SQL Server

How Microsoft managed to make SQL Server a cross-platform product is a very interesting story. The story started in 2008, with the launch of the Drawbridge project [1]. Drawbridge proposed some concepts and architecture considerations that have some similarities with what we know today as containers. The project proposed a new form of virtualization (or containerization) that combines the two core architectures shown in the Figure 1: the picoprocess (a process-based isolation container with a minimal kernel API surface) and a Library OS (an enlightened version of Windows) to efficiently run any application.

Figure 1: A Windows Library OS in context.

The Drawbridge architecture helped Microsoft to bring SQL Server to Linux with minimal effort – and with the long-term objective to keep identical semantics and performance across different platforms. The story would be different if Microsoft had needed to fork SQL Server with separate versions, each with a separate resource-management API. With Drawbridge as a base concept, an isolated application can benefit from system-like functionality such as threading, virtual memory management, and I/O through both LibOS and the ABI boundary, which, in turn, interacts with the underlying kernel using only 50 API calls.

But that's not all. SQL Server is a complex and performance-critical application with its own resource-management layer called SQLOS. The SQL Server team tried to avoid using the Windows operating system as much as possible, but the cost was a strong dependency with a low-level feature from the Windows kernel itself. Thus, they decided to redesign SQLOS and to merge it with the initial LibOS code from Drawbridge, producing the SQL Platform Abstraction Layer (SQLPAL), which is shown in the Figure 2.

Figure 2: SQL Server PAL architecture.

The host extension provides a bridge between the underlying platform (either Windows or Linux). Everything above the host extension is the common code running in Windows PE format, whereas Linux understands ELF format. As explained by Bob Dorr (Microsoft principal software engineer) this is not a big deal as long as you can abstract logic for the binary formats and provide ABI/API functionality [2]. You just need to run CPU assembly instructions that are the same on Linux as Windows.

However, system resources and latency-sensitive codes, such as disk I/O, have been simplified between SQL Server and underlying operation APIs, enabling SQL Server to make more efficient and direct calls on the host extension (reducing code translations). Currently, the merging of LibOS and SQLOS is still in progress, and I expect to see a reduction in the legacy layer over the time. At the time of writing, some core-based engine feature limitations remain, and only the SQL Server integration service is supported on Linux, as shown in Figure 3. Other BI modules are currently out of scope.

Figure 3: SQL Server services supported by the new PAL architecture.

SQL Server and Linux Distributions

You can install SQL Server on several different Linux distributions. The main prerequisite is the distribution must include systemd [3].

However, if you really want to stay safe, you might want to rely on one of the three distributions officially supported by Microsoft:

  • RHEL 7.3+
  • SELS v12 SP2+
  • Ubuntu 16.04+

In my case, CentOS is definitely my favorite distribution for testing SQL Server on Linux, and I have never faced any compatibility issues so far. The official filesystems supported by SQL Server are ext4 and XFS, as well as NFS 4.2+ if you plan to use network storage. There are also some minimal resource requirements for SQL Server on Linux. You'll need at least 2 CPU cores (x64 only) and 2GB of RAM. The last point may likely draw your attention, because Linux requires more memory than Windows. (Linux requires 2GB, and even 3.25GB with the first CTP versions of SQL Server 2017, whereas Windows only needs 1GB.) This difference in the memory requirements is due to the current PAL architecture, which includes some Windows binaries and services that the SQL Server team will try to eliminate over time. As a first step, the required memory has already been reduced by 38 percent from the old CTP versions to the current one.

Installing SQL Server on Linux

Microsoft made the choice of package-oriented installation for SQL Server that provides several advantages. First, you may use your preferred package manager and their related repository with rpm, dpkg, yum, apt-get, or zypper commands to download and install SQL Server components. You also have the choice of the installation method (offline or online). Applying cumulative updates (remember that service packs no longer exist) remains an easy task using package manager commands. But this installation architecture has also introduced some challenges, and the first main side effect is likely named instances that are no longer supported on Linux on the same host; the official workaround is to rely on Docker containers. For an online installation, you will need to configure two repos for the SQL Server binaries and the SQL Server command-line tools. On my Linux distribution (CentOS), my configured repos are as shown in Listing 1.

Listing 1

Configured Repos

01 $ sudo yum repolist | grep microsoft
02 packages-microsoft-com-mssql-server-2017 packages-microsoft-com-mssql-ser    30
03 packages-microsoft-com-prod              packages-microsoft-com-prod        165

Another advantage of package-oriented installation is that the installation itself benefits from a modular approach. You could choose to install only the mssql-server.x86_64 package for the relational database engine or to add to the installation list the mssql-server-ha.x86_64 package, if you plan to use high-availability features like SQL Server AlwaysOn FCIs or SQL Server AlwaysOn availability groups. In other words, you don't have to install all SQL Server packages if you don't need them. A quick search of mssql-server packages on my Linux server shows 5 packages that are self-explanatory (Listing 2).

Listing 2

Packages

01 $ yum search mssql-server
02 Loaded plugins: fastestmirror
03 Determining fastest mirrors
04  * base: centos.quelquesmots.fr
05  * extras: mirrors.atosworldline.com
06  * updates: centos.crazyfrogs.org
07 ============================================ N/S matched: mssql-server =============================================
08 mssql-server.x86_64 : Microsoft SQL Server Relational Database Engine
09 mssql-server-agent.x86_64 : Microsoft SQL Server Agent
10 mssql-server-fts.x86_64 : Microsoft SQL Server Full Text Search
11 mssql-server-ha.x86_64 : High Availability support for Microsoft SQL Server Relational Database Engine
12 mssql-server-is.x86_64 : Microsoft SQL Server Integration Services

To install the SQL Server relational database engine on Linux, first download the package and then install it on the system, along with the package dependencies bizp2, libsss, and gdb (see Listing 3).

Listing 3

Installing SQL Server

01 $ yum install mssql-server -y
02 ?
03 Running transaction test
04 Transaction test succeeded
05 Running transaction
06   Installing : bzip2-1.0.6-13.el7.x86_64                                                                        1/4
07   Installing : libsss_nss_idmap-1.16.0-19.el7.x86_64                                                            2/4
08   Installing : gdb-7.6.1-110.el7.x86_64                                                                         3/4
09   Installing : mssql-server-14.0.3026.27-2.x86_64                                                               4/4
10 +--------------------------------------------------------------+
11 Please run 'sudo /opt/mssql/bin/mssql-conf setup'
12 to complete the setup of Microsoft SQL Server
13 +--------------------------------------------------------------+
14 SQL Server needs to be restarted in order to apply this setting. Please run
15 'systemctl restart mssql-server.service'.
16   Verifying  : mssql-server-14.0.3026.27-2.x86_64                                                               1/4
17   Verifying  : gdb-7.6.1-110.el7.x86_64                                                                         2/4
18   Verifying  : libsss_nss_idmap-1.16.0-19.el7.x86_64                                                            3/4
19   Verifying  : bzip2-1.0.6-13.el7.x86_64                                                                        4/4
20
21 Installed:
22   mssql-server.x86_64 0:14.0.3026.27-2
23
24 Dependency Installed:
25   bzip2.x86_64 0:1.0.6-13.el7       gdb.x86_64 0:7.6.1-110.el7       libsss_nss_idmap.x86_64 0:1.16.0-19.el7
26
27 Complete!

The installation process will install two main directories and the corresponding files:

  • /opt/mssql (for SQL Server binaries and configuration tools)
  • /var/opt/mssql (for SQL Server database files, logs, and secrets)

The next step is configuring SQL Server settings through the mssql-conf utility (counterpart of the configuration manager on Windows). To run the configuration tool, use:

$ sudo /opt/mssql/bin/mssql-conf setup

This tool can be used either in interactive mode or in addition to preconfigured environment variables [4], providing another way to install SQL Server with automated installation tools such as Ansible.

After configuring the SQL Server instance, check the SQL Server service state. Because SQL Server relies on systemd, you can use the systemctl command, as shown in Listing 4.

Listing 4

Checking the Service State

01 $ sudo systemctl status mssql-server
02 ? mssql-server.service - Microsoft SQL Server Database Engine
03    Loaded: loaded (/usr/lib/systemd/system/mssql-server.service; enabled; vendor preset: disabled)
04    Active: active (running) since Wed 2018-06-06 16:55:41 CEST; 5min ago
05      Docs: https://docs.microsoft.com/en-us/sql/linux
06  Main PID: 3082 (sqlservr)
07    CGroup: /system.slice/mssql-server.service
08            ??3082 /opt/mssql/bin/sqlservr
09            ??3103 /opt/mssql/bin/sqlservr

As Listing 4 shows, my SQL Server instance is up, with the main PID 3082. To connect to the SQL Server instance, you need to install a client tool. You now have the choice of using cross-platform tools, including an old-generation tool (C/C++ based) or a new-generation tool (mainly based on Python). Several of these tools are open source and available on GitHub. The choice of a tool depends on different factors, such as the target operating system and what kind of use you expect. At the time of writing, some of the tools are intended to replace old tools, but they are still limited enough to prevent full adoption. But things are changing very quickly nowadays, and whatever I tell you now could be out of date soon. The good news is you will be able to continue using existing client tools, although the new tools will offer some advantages once the full range of features are implemented. Figure 4 shows a list of command-line interface (CLI) tools you could use to manage SQL Server on different platforms.

Figure 4: A summary of actual CLI tools to connect to SQL Server.

Both sqlcmd and bcp are part of the basic client tools, which you can install with a single command. Note that there are no OLEDB drivers on Linux, and they have been replaced by ODBC:

$ sudo yum install mssql-tools unixODBC-devel -y

After installing all the packages, make sqlcmd/bcp accessible from the Bash shell for both login sessions and interactive/non-login sessions (Listing 5).

Listing 5

Making SQL Server Accessible from Bash

01 $ echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bash_profile
02 $ echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
03 $ source ~/.bashrc
04
05 $ sqlcmd -S . -Usa -PXxXxXx
06 1> select @@version
07 2> go
08
09 -------------------------------------------------------------------
10
11 Microsoft SQL Server 2017 (RTM-CU7) (KB4229789) - 14.0.3026.27 (X64)
12         May 10 2018 12:38:11
13         Copyright (C) 2017 Microsoft Corporation
14         Developer Edition (64-bit) on Linux (CentOS Linux 7 (Core))

The basic installation is now complete, but a lot of work remains to make it production-ready. Concerning the configuration of the SQL Server instance, most of the existing best practices are still valuable on Linux. However, at the system level, the story is a little bit different. See Microsoft's recommendations for configuring SQL Server on Linux [5].

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy ADMIN Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

comments powered by Disqus