« Previous 1 2 3 Next »
PostgreSQL 9.5 – What's new; what's better?
An Eye for Details
Sampling
One focus of the work on PostgreSQL 9.5 was to improve the way it handles very large tables. One contribution is the TABLESAMPLE
feature, which is known from the SQL standard and a number of commercial databases. It lets programmers simply read a random selection of lines from a table. Therefore, queries against very large tables can be substantially accelerated if approximate results are sufficient.
For example, you could use the approximate average price of all articles on the system:
SELECT avg(preis) FROM article TABLESAMPLE SYSTEM(1);
The statement only reads 1% of the lines in the article
table and uses the SYSTEM
random sampling method. This method reads randomly selected blocks of the table and omits all others. Alternatively, you could use the BERNOULLI
method, which is slower because it needs to read the entire table – although the random selections are superior. In typical PostgreSQL style, you can define your own random sampling methods using extensions.
Indexing
A new BRIN (Block Range Index) indexing method is designed for very large tables. A BRIN stores the minimum and maximum values for each block range. When queried with a range query (e.g., x < 10
), it only needs to consider those blocks whose boundary values include the sought-after range. Although this is not as efficient as the well-known B-tree, which points directly to the matching rows, because BRIN only points to a block range, a BRIN is much smaller than a legacy B-tree index.
For example, a table with an int4
column populated with the numbers 1 through 100 million is about 3.5GB. The B-tree index for this would take about 2.1GB, whereas a BRIN only occupies 100KB. The benefit of BRINs is particularly felt in tables that are far larger than the amount of available RAM, and where even the B-tree indexes would be too large for RAM.
BRIN is best suited to data that are naturally stored in ascending order (e.g., timestamps in a continuous process or automatically generated identification numbers). In this case, the BRIN can be used in a query such as time > 'yesterday'
; it skips large parts of the table because only very few contiguous blocks contain the matching data. However, BRIN is pretty much useless for randomly distributed values because more or less all the block ranges contain matching values, and an index scan is pretty much the same as a sequential scan.
Row-Level Security
The system of access permissions in SQL only defines access at the table or column level because these are the units envisaged for this purpose by the Data Definition Language (DDL). In some applications, managing access to individual rows of data is interesting and can be done manually, as is so often the case in PostgreSQL, using special views and triggers.
Now in PostgreSQL 9.5, you have a simple, built-in solution. Listing 9 demonstrates a typical use case, wherein each row stores the user who created the row. Reading the table then only shows the rows that belong to the user in question. This means that every member of staff can manage the data for which they are responsible, but the programmer does not have to worry about giving users undesirable access to data.
Listing 9
Row-Level Access Controls
> CREATE TABLE orders ( id int, product text, number int, employee text ); > ALTER TABLE orders ENABLE ROW LEVEL SECURITY; > CREATE POLICY p1 ON orders FOR ALL TO PUBLIC USING (employee = current_user); > GRANT ALL ON TABLE orders TO PUBLIC; > SET SESSION AUTHORIZATION m1; > INSERT INTO orders VALUES (1, 'Screw', 10, 'm1'); > INSERT INTO orders VALUES (2, 'Nail', 5, 'm2'); -- <Error: not allowed> > SET SESSION AUTHORIZATION m2; > INSERT INTO orders VALUES (3, 'Hammer', 1, current_user); > SET SESSION AUTHORIZATION m1; > SELECT * FROM orders; +----+----------+--------+-------------+ | id | product | number | employee | +----+----------+--------+---------- --+ | 1 | Screw | 10 | m1 | +----+----------+--------+-------------+ > SET SESSION AUTHORIZATION postgres; > SELECT * FROM orders; +----+----------+--------+-------------+ | id | product | number | employee | +----+----------+--------+-------------+ | 1 | Screw | 10 | m1 | | 3 | Hammer | 1 | m2 | +----+----------+--------+-------------+
Data to which access is not granted are practically invisible. Row-level security thus differs from legacy privilege management systems that throw an error with missing access privileges. With this system, you can, for example, simply implement a multitenancy system, which means an application can be provided to multiple, separate customers who use the same table internally – without running the risk of one customer seeing another customer's data.
Previously, you had to use separate tables to achieve this, but it could easily have caused problems if you had many thousands of tables, and the only alternative would have been an implementation based on troublesome and error-prone triggers and views.
« Previous 1 2 3 Next »
Buy this article as PDF
(incl. VAT)