Photo by Charles Forerunner on Unsplash

Where does job output go?

Data Depot

Article from ADMIN 78/2023
By
Where does your job data go? The answer is fairly straightforward, but I add some color with a little high-level background on what resource managers do and expand the question to include a discussion of where data "should" or "could" go.

The second of the top three storage questions [1] a friend sent me asks, "How do you know where data is located after a job is finished?" This is an excellent question that HPC users who use a resource manager (job scheduler) should contemplate. It is straightforward to answer, but it also opens a broader, perhaps philosophical, question: Where "should" or "could" your data be located when running a job (application)?

To answer the question with a little background, I'll start with the idea of a "job." Assume you run a job with a resource manager such as Slurm. You create a script that runs your job – this script is generically referred to as a "job script" – and submit the job script to the resource manager with a simple command, creating a "job." The job is then added to the job queue controlled by the resource manager. Your job script can define the resources you need; set up the environment; execute commands, including defining environment variables; execute the application(s); and so on. When the job finishes or the time allowed is exceeded, the job stops and releases the resources.
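To make the idea of a job script concrete, here is a minimal sketch in Python that assembles a small Slurm batch script and, optionally, submits it. The job name, node count, time limit, the `OMP_NUM_THREADS` setting, and the application `./my_app` are all placeholders I made up for illustration; the `#SBATCH` options shown (`--job-name`, `--nodes`, `--time`, `--output`) are standard `sbatch` options, but check your site's documentation for what it expects.

```python
import subprocess


def build_job_script(job_name, nodes, time_limit, command):
    """Return the text of a minimal Slurm batch script (illustrative only)."""
    lines = [
        "#!/bin/bash",
        # Resource requests read by the resource manager
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --time={time_limit}",
        # Output file pattern: %x expands to the job name, %j to the job ID
        "#SBATCH --output=%x-%j.out",
        "",
        "# Set up the environment (placeholder value)",
        "export OMP_NUM_THREADS=4",
        "",
        "# Execute the application (hypothetical command)",
        command,
    ]
    return "\n".join(lines) + "\n"


def submit(script_path):
    """Submit a script with sbatch; works only on a system with Slurm installed."""
    result = subprocess.run(
        ["sbatch", str(script_path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


script = build_job_script(
    "demo", nodes=2, time_limit="00:30:00", command="srun ./my_app"
)
print(script)
```

Saving the printed text to a file and passing that file to `submit()` (or running `sbatch` on it by hand) is what turns the job script into a job in the queue.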

As resources change in the system (e.g., nodes become available), the resource manager checks the resource requirements of the queued jobs, along with any internal rules that have been defined about job priorities, and determines which job to run next. Often, the rule is simply to run the jobs in the order they were submitted: first in, first out (FIFO).
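The FIFO rule can be sketched with a toy model. This is not how Slurm or any real resource manager is implemented; it is a simplified illustration in which each job asks only for a number of nodes, and the function name, job names, and node counts are invented for the example.

```python
from collections import deque


def fifo_schedule(free_nodes, queue):
    """Start jobs in submission order until the job at the head no longer fits.

    queue holds (job_name, nodes_needed) tuples, oldest submission first.
    Returns the names of the jobs started and the number of nodes still free.
    """
    started = []
    while queue and queue[0][1] <= free_nodes:
        name, needed = queue.popleft()
        free_nodes -= needed
        started.append(name)
    return started, free_nodes


# Three jobs submitted in this order, with four nodes currently free
pending = deque([("job_a", 2), ("job_b", 4), ("job_c", 1)])

started, free = fifo_schedule(4, pending)
print(started, free, list(pending))
# prints: ['job_a'] 2 [('job_b', 4), ('job_c', 1)]
```

Note that in this strict FIFO model `job_c` waits behind `job_b` even though it would fit on the two free nodes. Real resource managers typically layer priority rules and techniques such as backfill scheduling on top of simple ordering to avoid leaving nodes idle like this.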

When you submit your job script to the "queue," creating the job, the resource manager holds the details of the job script and a few other items, such as details of the submit command that was used. After creating the job, you don't have to stay logged in to the system. The resource manager runs the job (job script) on your behalf when the

...