OrientDB document and graph database

Linked Worlds

Relationship Questions

Typical relationship questions take forms such as "Who knows this person?" or "What does this activity depend on?" The statement in line 6 (Listing 2) finds all characters and books linked to #11:0 by an edge. The desired edge types and directions can also be specified explicitly and even chained into a path description.

The statement in line 5 follows all links that point to #11:2 against their direction (in) and then follows the Appear links from the objects it found. In this way, you can find all books featuring characters who know Sam Jr. One real-world application for such a statement is to display items purchased in the past by a customer currently reviewing a product at a mail-order website.
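The two-step path described above can be mimicked outside the database with a few lines of Python. This is only a minimal sketch of the idea, not OrientDB code: the edge list, the Knows edge type, and the record IDs other than #11:0 and #11:2 are hypothetical, and the helper names out and in_ merely echo the OrientDB functions.

```python
from collections import namedtuple

# Hypothetical edge list (source, edge type, target); the data is invented
# purely for illustration and does not come from the article's database.
Edge = namedtuple("Edge", "src label dst")

EDGES = [
    Edge("#11:0", "Knows", "#11:2"),
    Edge("#11:1", "Knows", "#11:2"),
    Edge("#11:0", "Appear", "#12:0"),
    Edge("#11:1", "Appear", "#12:1"),
    Edge("#11:2", "Appear", "#12:1"),
]

def out(nodes, label=None):
    """Follow edges in their direction, optionally only those of one type."""
    return {e.dst for e in EDGES if e.src in nodes and label in (None, e.label)}

def in_(nodes, label=None):
    """Follow edges against their direction, optionally only those of one type."""
    return {e.src for e in EDGES if e.dst in nodes and label in (None, e.label)}

# Step 1: everything that points at #11:2; step 2: the Appear targets of those objects.
books = out(in_({"#11:2"}), "Appear")
print(sorted(books))  # ['#12:0', '#12:1']
```

Chaining the two set operations corresponds to the path description in the query: each step takes the result of the previous one as its starting set.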

When using in(), out(), and both(), the paths to be considered are part of the request, so the result only ever covers the part of the graph that can be reached from the start object along those paths. Recursive execution is just as important for retrieving all linked nodes and edges. OrientDB handles this with the traverse statement, which requires information about the edges to follow and about the conditions on the traversed nodes and edges.

Thanks to the any() keyword, the query in line 8 follows all edges; it therefore returns the entire graph reachable from the root object #11:0, including the root itself. For more control, a while condition can be stipulated; it is checked at each transition from node to edge or from edge to node. In the simplest case, this restricts the search depth. Additionally, you can access the attributes of the currently viewed nodes and edges and use them for classification or for limiting validity.

In the example, the relationships have a type and a start time (from); however, it is also possible to map resource allocations for projects or the allocation of components to specific production batches. Line 7 identifies only the members of a family. To this end, the while condition checks during traversal whether the edges represent a family relationship or whether the nodes are of type Person.
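The role of the while condition can be illustrated with a small in-memory traversal. The following is a minimal sketch under stated assumptions: the graph, the names, and the edge types are hypothetical, the keep callback stands in for the while condition, and the breadth-first order is a choice made for this example and says nothing about how OrientDB walks the graph internally.

```python
from collections import deque

# Hypothetical graph: node -> list of (edge type, neighbour).
GRAPH = {
    "anna": [("family", "ben"), ("colleague", "cara")],
    "ben": [("family", "dan")],
    "cara": [("colleague", "eve")],
    "dan": [],
    "eve": [],
}

def traverse(start, keep=lambda depth, edge_type, node: True):
    """Breadth-first walk; `keep` is evaluated at every transition,
    much like a while condition, and prunes the walk when it returns False."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        for edge_type, nxt in GRAPH[node]:
            if nxt in seen or not keep(depth + 1, edge_type, nxt):
                continue
            seen.add(nxt)
            order.append(nxt)
            queue.append((nxt, depth + 1))
    return order

everything = traverse("anna")
one_hop = traverse("anna", keep=lambda depth, edge_type, node: depth <= 1)
family = traverse("anna", keep=lambda depth, edge_type, node: edge_type == "family")

print(everything)  # ['anna', 'ben', 'cara', 'dan', 'eve']
print(one_hop)     # ['anna', 'ben', 'cara']
print(family)      # ['anna', 'ben', 'dan']
```

The same walk yields the whole graph, a depth-limited neighborhood, or only the family members, depending solely on the condition passed in.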

As with a relational database, it is possible to nest statements. Lines 8-10 show this with a combination of select and traverse statements. In doing so, traverse determines the desired objects, and select acts as a filter to output only the names of the books found. Such combinations allow efficient queries, but this elegant query language tempts users to grab too many objects in the beginning and then to limit the objects to the actual search scope later with another select statement.

Querying in this manner can have disastrous consequences for large graphs; for the traverse statement in particular, it makes sense to reduce the number of objects traversed by limiting the edges or tweaking the while condition. In this way you reduce run time by a good third, even for this small example. The benefits would be even greater with larger graphs, of course; otherwise, the usual methods of optimization with OrientDB help (e.g., indexing frequently requested attributes and using the profile command to investigate queries).
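Why restricting the edges pays off can be seen by counting the objects a traversal touches. The sketch below is a toy model with a hypothetical five-node graph in which only one character appears in a book, so both strategies return the same result; it makes no claim about OrientDB's actual run times.

```python
# Hypothetical data: node -> (class name, [(edge type, neighbour), ...]).
GRAPH = {
    "c0": ("Person", [("Knows", "c1"), ("Appear", "b0")]),
    "c1": ("Person", [("Knows", "c2"), ("Knows", "c3")]),
    "c2": ("Person", [("Knows", "c3")]),
    "c3": ("Person", []),
    "b0": ("Book", []),
}

def traverse(start, edge_types=None):
    """Depth-first walk; returns the nodes visited, in visiting order.
    If edge_types is given, only edges of those types are followed."""
    visited, stack = [], [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.append(node)
        for edge_type, nxt in GRAPH[node][1]:
            if edge_types is None or edge_type in edge_types:
                stack.append(nxt)
    return visited

def books(nodes):
    """Late filter: keep only the Book objects from a traversal result."""
    return [n for n in nodes if GRAPH[n][0] == "Book"]

broad = traverse("c0")                          # walk everything, filter afterward
narrow = traverse("c0", edge_types={"Appear"})  # restrict the edges while walking

print(books(broad), books(narrow))  # ['b0'] ['b0']
print(len(broad), len(narrow))      # 5 2
```

Both variants find the same book, but the restricted walk touches two objects instead of five; on a large graph, that gap grows with every branch that never needed to be followed.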

As the few examples here demonstrate, OrientDB allows both a compact description of the data model and, with the advanced select and the new traverse statements, elegant querying of object relationships. Relational databases require more effort: A separate table would be needed for the list or map attributes, and traversing is also much more complex. Recursive queries are not portable across databases, are hard to read, and are usually slow.

Interfaces

A choice of several interfaces is available for your own OrientDB applications. The first candidate is Java, the language in which the database itself is implemented. Using the client API, you can address the server from any program. Doing so also reveals the fast performance of OrientDB: Creating a tree structure with 8,000 nodes and edges each in a transaction takes less than five seconds on a small web server (Core i3, 4GB RAM), and recursive execution is done in one-third of a second.

Accessing OrientDB is not a problem beyond the Java world, either; native drivers are available for scripting languages such as Python, PHP, and Perl; the C and C# drivers open up access to these language families. The OrientDB Server also provides an HTTP interface. With GET and POST requests, the server can be used much like the console. Listing 3 contains a sequence of URLs the user can use to connect to the server, request information about the server and the Person class, and then search for objects. The results are returned in JSON format; updates go to the server in the same format.

Listing 3

HTTP Interface

http://localhost:2480/connect/discworld
http://localhost:2480/database/discworld
http://localhost:2480/class/discworld/Person
http://localhost:2480/query/discworld/sql/select from Person where last='Vimes'
http://localhost:2480/disconnect
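The query URL in Listing 3 contains spaces and quotes, which a browser encodes silently but a program usually has to encode itself. The following minimal sketch builds such a URL with Python's standard library; the helper name query_url is hypothetical, and the host, port, database, and path layout are simply taken from Listing 3.

```python
from urllib.parse import quote

BASE = "http://localhost:2480"  # host and port as used in Listing 3

def query_url(database, sql, base=BASE):
    """Build a query URL in the layout of Listing 3,
    percent-encoding the database name and the SQL text."""
    return "/".join(
        [base, "query", quote(database, safe=""), "sql", quote(sql, safe="")]
    )

url = query_url("discworld", "select from Person where last='Vimes'")
print(url)
# http://localhost:2480/query/discworld/sql/select%20from%20Person%20where%20last%3D%27Vimes%27
```

Against a running server, the resulting string could be passed to urllib.request.urlopen; any credentials the server requires would have to be supplied separately.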

The HTTP interface makes OrientDB usable from any programming language, including directly from the JavaScript code of a web page. Furthermore, thanks to the JavaScript interpreter in Java, JavaScript can also be used to program database functions, which then run directly in the database server.

Conclusions

In this article I only show the rudimentary capabilities of OrientDB, but even these examples demonstrate the flexible data model and the elegant approach to querying. The initial hurdles are low because the query language is largely based on the well-known SQL. Other features of OrientDB, such as transactional security, access rights management, distributed databases, and direct data transfer from relational databases, can only be mentioned here. Databases with billions of records and gigabytes of disk space are in productive use.

Thanks to its liberal Apache license, OrientDB can also be used for commercial applications. The company behind the development, Orient Technologies, offers commercial support when needed. In addition to further development of the database, the money is also used for the freely available documentation. In just two years, OrientDB has moved from being almost totally useless to being quite useful.

The Author

Carsten Zerbst manages a team of specialists working in the CAD and PDM environment with customers from the automotive, aviation, aerospace, and shipbuilding industries. He advises customers on process questions and creates software solutions for design, integration, and data exchange.
