torque-roll-userguide

TORQUE ROLL DOCUMENTATION

Author: Roy Dragseth, roy.dragseth@uit.no

Introduction

The torque-roll provides a batch system for the Rocks Cluster Distribution.

The batch system consists of the Torque resource manager and the Maui scheduler, which together provide an alternative to the Sun Grid Engine (SGE) that comes as the default batch system with Rocks. The torque-roll will not work on a system that has an active sge-roll installed. The best solution is to reinstall your frontend with the torque-roll instead of the sge-roll.

Roll basics

Support

The Rocks mailing list is the preferred place to get help and discuss the torque-roll, as there are a lot of people on the list with hands-on experience using the torque-roll on Rocks. Before posting questions to the list you should search the list archives for the terms pbs or torque, as the answer to your problem might already be there.

Installation

It is assumed that you know how to install a Rocks roll on a frontend; see the main Rocks documentation for an introduction to installing a Rocks cluster. You can either burn the roll iso to a CD or install from a central server; both methods are equivalent.

User guide

When the Rocks frontend is installed with the torque-roll it will have a functioning batch system, but you will not be able to run any jobs until you have installed some compute nodes. As you detect and install new compute nodes with insert-ethers they will automatically be added to the node list and start receiving jobs as soon as they are up and running.

Running jobs

The normal way of using a batch system is to submit jobs as scripts that get executed on the compute nodes. A job script can be written in any shell (bash, csh, zsh), python, perl or any other language that supports the # comment character, but the most common choice is sh or csh syntax. A job script is a regular script with some special comments that are meaningful to the batch system. In torque all lines beginning with #PBS are interpreted by the batch system. You submit the job with the qsub command:

qsub runscript.sh

A serial job

It is useful to give info about the expected walltime and the number of cpus the job needs. Here is how runscript.sh could look for a single-cpu job:

#!/bin/sh
#PBS -lwalltime=1:00:00
#PBS -lnodes=1

./do-my-work

This script asks for 1 hour of runtime and will run on one cpu. The job will terminate when the script exits, or will be terminated by the batch system if it passes the 1 hour runtime limit. The #PBS directives can also be given as command-line arguments to qsub:

qsub -lnodes=1,walltime=1:00:00 runscript.sh

Command-line arguments take precedence over runscript directives. Note that #PBS must be given exactly like this as the first characters on the line, with no extra #s or spaces. All #PBS directives must come before any shell statements or they will be ignored by the batch system.

When the job is finished you will get back two files, with the standard output and standard error of the job, in the same directory you submitted the job from. See man qsub.
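If you want more control over these files, standard qsub options let you name the job and choose where stdout and stderr go; a minimal sketch using the -N, -o and -e directives:

#!/bin/sh
# Name the job and send stdout/stderr to fixed filenames instead of
# the default <scriptname>.o<jobid> / <scriptname>.e<jobid>.
#PBS -N my_job
#PBS -o my_job.out
#PBS -e my_job.err
#PBS -lwalltime=1:00:00
#PBS -lnodes=1

./do-my-work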

A parallel job

If you have a parallel application using MPI you can run parallel jobs within the batch system. Let us take a look at the following script:

#!/bin/sh
#PBS -lwalltime=1:00:00
#PBS -lnodes=10
#PBS -lpmem=2gb
#PBS -N parallel_simulation

mpirun ./do-my-work

Note: this runscript will probably not work in its current form, as different MPI implementations need different commands to start the application; see below.

The runscript above is a parallel job that asks for 10 cpus and 2 gigabytes of memory per cpu; the scheduler will make sure these resources are available before the job can start. The runscript is run on the first node in the nodelist assigned to the job, and mpirun takes care of launching the parallel program named do-my-work on all of the cpus assigned to the job, possibly spread over several compute nodes. If you ask for more resources than can possibly be available on a single node, the job will either be rejected at submit time or will never start.
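If your compute nodes have fewer than 10 cpus each, you can spread the job explicitly over several nodes with the ppn (processors per node) syntax; a sketch asking for the same 10 cpus as 5 nodes with 2 cpus each (the same caveat about mpirun applies):

#!/bin/sh
# 5 nodes with 2 cpus each gives 10 cpus in total
#PBS -lwalltime=1:00:00
#PBS -lnodes=5:ppn=2
#PBS -lpmem=2gb

mpirun ./do-my-work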

Different kinds of MPI libraries

Since quite a few implementations of the MPI libraries exist, both free and commercial, it is not possible to cover every way to start an MPI application in this document. The focus will be on the ones that ship with Rocks: OpenMPI and MPICH2.

OpenMPI

Rocks comes with its own build of OpenMPI installed in /opt/openmpi/. This is the system-wide default and is used by the mpicc/mpif90 compilers in the default path. Although OpenMPI has support for the torque tm-interface (tm = taskmanager), it is not compiled into the library shipped with Rocks (the reason is that the OpenMPI build process needs access to libtm from torque to enable the interface). The best workaround is to recompile OpenMPI on a system with torque installed. The mpirun command can then talk directly to the batch system to get the nodelist and start the parallel application using the torque daemon already running on the nodes. Job startup times for large parallel applications are significantly shorter using the tm-interface than using ssh to start the application on all nodes. If you recompile OpenMPI you can use the above runscript example as-is.
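As a rough sketch, the rebuild amounts to pointing OpenMPI's configure script at the torque installation; the version number below is a placeholder, and --with-tm is the standard OpenMPI configure option for enabling the tm-interface:

# On a machine that already has torque installed in /opt/torque:
tar xzf openmpi-X.Y.Z.tar.gz
cd openmpi-X.Y.Z
./configure --prefix=/opt/openmpi --with-tm=/opt/torque
make all install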

If, however, you for some reason do not rebuild the OpenMPI library, you can use a workaround provided with the torque-roll. The torque-roll contains a python wrapper script named pbsdsh-wrapper that makes pbsdsh behave like ssh. pbsdsh can run arbitrary commands under the taskmanager on remote nodes participating in the job.

All that is needed is to set up a few environment variables for OpenMPI:

#!/bin/sh
#PBS -lwalltime=1:00:00
#PBS -lnodes=10
#PBS -lpmem=2gb
#PBS -N parallel_simulation

cd $PBS_O_WORKDIR

. /opt/torque/etc/openmpi-setup.sh

mpirun ./do-my-work

The openmpi-setup.sh script takes care of setting a few environment variables to make mpirun use the pbsdsh-wrapper to start the application. The runscript itself can be found in /var/www/html/roll-documentation/torque/runscript.sh on the frontend.
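For reference, the net effect of sourcing the setup script is roughly to point OpenMPI's remote launch agent at the wrapper. This is only a sketch; the wrapper path and the MCA parameter name are assumptions and may differ between torque-roll and OpenMPI versions:

# Roughly what openmpi-setup.sh arranges (path and variable name assumed):
export OMPI_MCA_plm_rsh_agent=/opt/torque/bin/pbsdsh-wrapper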

MPICH2

The basic Rocks installation also contains MPICH2. This library has a different startup mechanism than OpenMPI. MPICH2 is installed in /opt/mpich2/gnu/ and has its own mpif90/mpicc wrappers. The torque-roll provides the mpiexec job launcher, which gives tight binding to the taskmanager. mpiexec is a stand-alone product installed in /opt/mpiexec/ and must not be confused with mpiexec from OpenMPI. The safest way to use it is to give the explicit path in the runscript:

#!/bin/sh
#PBS -lwalltime=1:00:00
#PBS -lnodes=10
#PBS -lpmem=2gb
#PBS -N parallel_simulation

cd $PBS_O_WORKDIR

/opt/mpiexec/bin/mpiexec ./do-my-work

mpiexec can start applications using several other MPI implementations like INTEL MPI and MVAPICH2.

For more info see the links in the Included software section.

Inspecting the jobs in the queue

There are several commands that will give you detailed information about the jobs in the batch system.

Command   Task                  Useful flags
showq     List jobs in queue    -r -- only running jobs
                                -i -- only idle jobs
                                -b -- only blocked jobs
                                -u username -- this user only
qstat     List jobs in queue    -f jobid -- list details
                                -n -- list nodes assigned to job

While showq and qstat do the same task, the output is quite different; showq has the nice feature of sorting the jobs with respect to time to completion, which makes it easy to see when resources will become available.

Administrator guide

In its default configuration the batch system is set up as a FIFO system, but it is possible to change this to accommodate almost any scheduling policy. Maui can schedule on cpus, walltime, memory, disk size, network topology and more. See the maui and torque documentation for a full in-depth understanding of how to tune the batch system.

Setting node properties.

Node properties provide the possibility to flag nodes as having special features. As clusters tend to grow inhomogeneous over time, it is useful to have a way to group nodes with similar features. Node properties are only text strings and their names do not need to have any logical resemblance to what they actually describe, but a user will have a better understanding of what a node with the "fast" property is than one with an "xyz" property.

Pre torque-roll 5.3

As the command rocks sync config would overwrite the torque node list, the only way to make node properties persistent used to be to turn off automatic updates of the node list by editing /etc/torque-roll.conf. This method still works for torque-roll v5.3 and upwards.

Torque-roll 5.3 and onwards

As of torque-roll v5.3 and up node properties can be set using the rocks concept of node attributes with the rocks command line tool. This is best illustrated by an example:

# rocks set host attr compute-0-0  torque_properties fast
# rocks set host attr compute-0-1  torque_properties slow
# rocks report pbsnodes | sh

This method will make the node properties sticky and automatic node list updates will still work.

The node properties will now appear in the node info and users can now submit jobs to only run on either fast or slow nodes:

$ pbsnodes compute-0-0
$ pbsnodes compute-0-1
$ qsub -lnodes=1:fast runscript.sh
$ qsub -lnodes=1:slow runscript.sh

If no flag on the qsub command is given then scheduling will be done as if the node properties were not set.

Each node can have more than one property. Names are separated by commas, for instance:

# rocks set host attr compute-0-0  torque_properties fast,highmem

Useful scheduling parameters

Some answers to frequently asked questions on the mailing list.

Maui vs torque

Torque is the resource manager, its task is to collect info about the state of the compute nodes and jobs. Maui is the scheduler, its task is to decide when and where to run the jobs submitted to torque.

Most things can be achieved by modifying /opt/maui/maui.cfg. Maui needs a restart after changing the config file:

service maui restart

Advice: if you can achieve the same thing by changing either torque or maui, use maui. Restarting maui is a rather lightweight operation and seldom causes problems for live systems. Restarting pbs_server can leave the system unsettled for a few minutes, as pbs_server needs to contact all the pbs_moms to get back in sync.

Needed job info

To make informed decisions about how to prioritize jobs and which nodes to start them on, the maui scheduler needs information about the jobs. The minimum requirement is the number of cpus and the walltime. Information about the memory requirements of the job is also useful. For instance:

#PBS -lwalltime=HH:MM:SS
#PBS -lnodes=10:ppn=8
#PBS -lpmem=1gb

Memory handling on linux

On linux, torque/maui supports two memory specification types: (p)mem and (p)vmem.

  • pmem is not enforced, it is used only as information to the scheduler.
  • pvmem is enforced, procs that exceed the limit will be terminated. The pbs_mom daemon limits vmem size by setting the equivalent of ulimit -v on the processes it controls.

It is currently not possible to limit the amount of physical memory a process can allocate on a linux system; one can only limit the amount of virtual memory. Virtual memory is physical memory + swap. See man pbs_resources_linux for details.
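For example, to have the batch system actually terminate a job whose processes grow beyond 2 gigabytes of virtual memory, request pvmem instead of (or in addition to) pmem; a minimal sketch:

#!/bin/sh
# pvmem is enforced: processes exceeding 2 GB of virtual memory are killed.
#PBS -lwalltime=1:00:00
#PBS -lnodes=1
#PBS -lpvmem=2gb

./do-my-work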

Tuning the batch system

Torque is installed in /opt/torque. qmgr is the torque management command.

Friendly advice: backup your working config before modifying the setup:

# qmgr -c "print server" > /tmp/pbsconfig.txt

Roll back to escape from a messed up system:

# qterm; pbs_server -t create
# qmgr < /tmp/pbsconfig.txt

This will bring you back to where you started. Remark: this will wipe the whole queue setup and all currently queued and running jobs will be lost!

The default batch configuration from the torque-roll is saved in /opt/torque/pbs.default. Do this to get back the original setup that came with the torque-roll:

# qterm; pbs_server -t create
# qmgr < /opt/torque/pbs.default

Prioritizing short jobs

Often it is useful to give shorter jobs higher priority. It is recommended to use the XFACTOR feature in maui rather than torque queues with different priorities:

XFACTORWEIGHT 1000

XFACTOR is defined as:

XFACTOR=(walltime+queuetime)/walltime

XFACTOR increases faster for shorter walltimes, thus giving higher priority to short jobs. This depends on users giving reasonable walltime limits.
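For example, a 30-minute job that has waited one hour in the queue has XFACTOR = (0.5 + 1)/0.5 = 3, while a 10-hour job with the same wait has only (10 + 1)/10 = 1.1, so the short job climbs in priority much faster.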

Prioritizing large jobs (maui)

In a cluster with a diverse mix of jobs it is often desirable to prioritize the large jobs and make the smaller ones fill in the gaps:

CPUWEIGHT 1000
MEMWEIGHT 100

This should be combined with fairshare to avoid starving users falling outside this prioritization.

Fairshare (maui)

Also known as "keeping all users equally unhappy".

Fairshare can be applied on several levels: users, groups and so on.

Set a threshold:

USERCFG[DEFAULT] FSTARGET=10
FSWEIGHT 100

Users who have used more than 10% of the system will get reduced priority, and vice versa.
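The same mechanism works at the group level; a sketch for maui.cfg, using a hypothetical group name:

GROUPCFG[DEFAULT] FSTARGET=20
GROUPCFG[chemistry] FSTARGET=40
FSWEIGHT 100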

Adjusting your policy

You can play with the weights to fine-tune your scheduling policies:

XFACTORWEIGHT 100
FSWEIGHT 1000
RESWEIGHT 10
CPUWEIGHT 1000
MEMWEIGHT 100

Analyze the prioritization with diagnose -p

Job node distribution

The default is MINRESOURCE: jobs run on the nodes which leave the least unused resources.

Spread or pack? To control this explicitly, use priority-based node allocation:

NODEALLOCATIONPOLICY PRIORITY

Select the most busy nodes first:

NODECFG[DEFAULT] PRIORITYF=JOBCOUNT

Select the least busy nodes first:

NODECFG[DEFAULT] PRIORITYF=-1.0*JOBCOUNT

Node access policy

The default access policy is SHARED. You can choose to limit this to SINGLEJOB or SINGLEUSER, for instance:

NODEACCESSPOLICY SINGLEUSER

Single-user access prevents users from stepping on each other's toes while still allowing good utilization for serial jobs.

Throttling policies

Sometimes one needs to prevent a single user from taking over the system:

MAXPROC, MAXPE, MAXPS, MAXJOB, MAXIJOB

All can be set for all or individual users and groups:

USERCFG[DEFAULT], USERCFG[UserA] etc.
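For example, to cap every user at 64 running processors and 10 running jobs while giving one user a higher limit, one could put something like this in maui.cfg (the numbers are arbitrary examples):

USERCFG[DEFAULT] MAXPROC=64 MAXJOB=10
USERCFG[UserA]   MAXPROC=128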

Debugging and analyzing

There are lots of tools:

pbsnodes            -- node status
qstat -f            -- all details of a job
diagnose -n         -- node status from maui
diagnose -p         -- job priority calculation
showres -n          -- job reservations per node
showstart           -- obvious
checkjob/checknode  -- also pretty obvious

Example: express queue

Goal: support development and job script testing, but prevent misuse.

Basic philosophy:

  • Create a separate queue
  • Give it the highest priority
  • Throttle it so it is barely usable

Create the queue with qmgr:

create queue express
set queue express queue_type = Execution
set queue express resources_max.walltime = 08:00:00
set queue express resources_default.nodes = 1:ppn=8
set queue express resources_default.walltime = 08:00:00
set queue express enabled = True
set queue express started = True

Increase the priority and limit the usage:

CLASSWEIGHT             1000
CLASSCFG[express] PRIORITY=1000 MAXIJOB=1  MAXJOBPERUSER=1 QLIST=express QDEF=express
QOSCFG[express] FLAGS=IGNUSER

This will allow users to test job scripts and run interactive jobs with good turnaround by submitting to the express queue, qsub -q express ........ At the same time misuse is prevented since only 1 running job is allowed per user.
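A typical interactive test session could then be started like this; qsub -I gives you a shell on the allocated node once the job starts:

$ qsub -I -q express -lnodes=1,walltime=1:00:00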

Appendix.

Building the roll from source

This is only relevant if you want to change something in how the torque-roll is built. The default build should cover most needs.

Clone the repository into the rocks build tree on a frontend:

cd /opt/rocks/share/devel/roll/src/
hg clone http://devsrc.cc.uit.no/hg/torque/

Building is a three-step process:

cd torque/src/torque
make rpm
cd ../..
rpm -i RPMS/x86_64/torque*.rpm
make roll

You should now have a torque iso file that you can install on a frontend.

The torque rpm build depends on readline-devel and tclx-devel rpms being installed.
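On a standard Rocks frontend these can be installed with yum before building:

# yum install readline-devel tclx-devel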

A complete job session.

A hands-on session, including compiling the program and running it through the queue.

Log in and prepare the source:

[royd@hpc2 ~]$ cp /opt/mpi-tests/src/mpi-verify.c .
[royd@hpc2 ~]$ mpicc mpi-verify.c -o mpi-verify.openmpi.x

We have a runscript ready with the correct setup for OpenMPI:

[royd@hpc2 ~]$ cat run-openmpi.sh
#!/bin/sh
#PBS -lnodes=2:ppn=2,walltime=1000

# list the name of the nodes participating in the job. pbsdsh can run
# any command in parallel
pbsdsh uname -n

. /opt/torque/etc/openmpi-setup.sh

mpirun mpi-verify.openmpi.x

date

Submit the job with qsub, it will print the jobid upon successful submission:

[royd@hpc2 ~]$ qsub run-openmpi.sh
15.hpc2.cc.uit.no

List the jobs in the queue; as you can see, the job has already started:

[royd@hpc2 ~]$ showq
ACTIVE JOBS--------------------
JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME

15                     royd    Running     4    00:16:40  Tue Jan 26 10:11:32

     1 Active Job        4 of    6 Processors Active (66.67%)
                         2 of    3 Nodes Active      (66.67%)

IDLE JOBS----------------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME


0 Idle Jobs

BLOCKED JOBS----------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME


Total Jobs: 1   Active Jobs: 1   Idle Jobs: 0   Blocked Jobs: 0

You can also use qstat to view the jobs in the queue:

[royd@hpc2 ~]$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
15.hpc2                   run-openmpi.sh   royd                   0 R default

When the job finishes you will get two files back in the directory you submitted the job from, one with the stdout and one with the stderr of the job. These are very useful for debugging job scripts:

[royd@hpc2 ~]$ ls
mpi-verify.c  mpi-verify.openmpi.x  run-openmpi.sh  run-openmpi.sh.e15  run-openmpi.sh.o15
[royd@hpc2 ~]$ cat run-openmpi.sh.e15
Process 0 on compute-0-2.local
Process 1 on compute-0-2.local
Process 2 on compute-0-1.local
Process 3 on compute-0-1.local
[royd@hpc2 ~]$ cat run-openmpi.sh.o15
compute-0-2.local
compute-0-2.local
compute-0-1.local
compute-0-1.local
Tue Jan 26 10:11:33 CET 2010
[royd@hpc2 ~]$

Now, try this yourself...
