Monday, September 22, 2014

qmgr commands

Run-time node changes
TORQUE can dynamically add nodes with the qmgr command. For example, the following command will add node node003:
> qmgr -c "create node node003"
The above command appends the following line to the $TORQUE_HOME/server_priv/nodes file:
node003
Nodes can also be removed with a similar command:
> qmgr -c "delete node node003"
Typically, an administrator will want to change the state of a node instead of remove it (for details, see Changing node state).
It is highly recommended that you restart pbs_server after making any node changes; alternatively, you can edit the nodes file manually and then restart pbs_server.
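For example, to take node003 out of service without removing it, you can mark it offline and later clear that state with pbsnodes (a sketch of the common approach; adjust the node name to your own):
> pbsnodes -o node003
> pbsnodes -c node003
The first command marks the node offline so no new jobs are scheduled on it; the second clears the offline state once the node is ready again.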

Numerous pbs_server errors in /var/log/messages

The problem:

On our supercomputer's management node we receive numerous errors such as:
pbs_server: LOG_ERROR::is_request, bad attempt to connect from 10.10.0.254:1023 
(address not trusted - check entry in server_priv/nodes)
And nearly every minute they are followed by this one:
last message repeated 16 times
where the repeat count varies from time to time.
The mentioned address, 10.10.0.254, is one of the management node's addresses. According to "netstat -pa | grep 1023", port 1023 belongs to pbs_mom.
It turns out that the management node tries to connect to itself several times per minute and fails. The advice in the error text doesn't help much; as far as I understand, the management node should not be in the "nodes" file.
Could anybody suggest how to solve this problem?

Answer:
Your management node is not defined as a node in PBS. Open up qmgr and run "create node [hostname without brackets]". The other option is to stop pbs_mom, since you probably don't want to run compute jobs on your head node.

# service pbs_mom stop
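If you go that route on a SysV-init system (as in the example above), you may also want to keep pbs_mom from starting again at boot:

# chkconfig pbs_mom off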


Saturday, September 20, 2014

command tutorial on vi, the editor

Cursor movement

  • h - move left
  • j - move down
  • k - move up
  • l - move right
  • w - jump by start of words (punctuation considered words)
  • W - jump by words (spaces separate words)
  • e - jump to end of words (punctuation considered words)
  • E - jump to end of words (no punctuation)
  • b - jump backward by words (punctuation considered words)
  • B - jump backward by words (no punctuation)
  • 0 - (zero) start of line
  • ^ - first non-blank character of line
  • $ - end of line
  • G - Go To command (prefix with number - 5G goes to line 5)
Note: Prefix a cursor movement command with a number to repeat it. For example, 4j moves down 4 lines.

Insert Mode - Inserting/Appending text

  • i - start insert mode at cursor
  • I - insert at the beginning of the line
  • a - append after the cursor
  • A - append at the end of the line
  • o - open (append) blank line below current line (no need to press return)
  • O - open blank line above current line
  • ea - append at end of word
  • Esc - exit insert mode

Editing

  • r - replace a single character (does not use insert mode)
  • J - join line below to the current one
  • cc - change (replace) an entire line
  • cw - change (replace) to the end of word
  • c$ - change (replace) to the end of line
  • s - delete character at cursor and substitute text
  • S - delete line at cursor and substitute text (same as cc)
  • xp - transpose two letters (delete and paste, technically)
  • u - undo
  • . - repeat last command

Marking text (visual mode)

  • v - start visual mode, mark lines, then do command (such as y-yank)
  • V - start Linewise visual mode
  • o - move to other end of marked area
  • Ctrl+v - start visual block mode
  • O - move to Other corner of block
  • aw - mark a word
  • ab - a () block (with parentheses)
  • aB - a {} block (with braces)
  • ib - inner () block
  • iB - inner {} block
  • Esc - exit visual mode

Visual commands

  • > - shift right
  • < - shift left
  • y - yank (copy) marked text
  • d - delete marked text
  • ~ - switch case
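For example, to indent the current line and the two below it:
V     - start linewise visual mode
jj    - extend the selection two lines down
>     - shift the selected lines right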

Cut and Paste

  • yy - yank (copy) a line
  • 2yy - yank 2 lines
  • yw - yank word
  • y$ - yank to end of line
  • p - put (paste) the clipboard after cursor
  • P - put (paste) before cursor
  • dd - delete (cut) a line
  • dw - delete (cut) the current word
  • x - delete (cut) current character
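These combine naturally; for example, to move the current line down by one:
dd    - cut the current line
p     - put it back below the line the cursor now rests on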

Exiting

  • :w - write (save) the file, but don't exit
  • :wq - write (save) and quit
  • :q - quit (fails if anything has changed)
  • :q! - quit and throw away changes

Search/Replace

  • /pattern - search for pattern
  • ?pattern - search backward for pattern
  • n - repeat search in same direction
  • N - repeat search in opposite direction
  • :%s/old/new/g - replace all old with new throughout file
  • :%s/old/new/gc - replace all old with new throughout file with confirmations
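A substitution can also be limited to a range of lines; for example, to replace old with new only on lines 10 through 20:
:10,20s/old/new/g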

Working with multiple files

  • :e filename - Edit a file in a new buffer
  • :bnext (or :bn) - go to next buffer
  • :bprev (or :bp) - go to previous buffer
  • :bd - delete a buffer (close a file)
  • :sp filename - Open a file in a new buffer and split window
  • Ctrl+w s - split window horizontally
  • Ctrl+w w - switch between windows
  • Ctrl+w q - quit a window
  • Ctrl+w v - split window vertically
See also: another good vim commands cheat sheet and a vi introduction using the "cheat sheet" method.

Friday, September 19, 2014

GlusterFS Setup Tutorial

GlusterFS Setup


Many websites and projects experiencing rapid growth face distributed data storage challenges. GlusterFS is a unified, multi-protocol, scale-out filesystem capable of serving petabytes of data at high speed, turning commodity hardware into a high-performance, scalable storage solution.

In this tutorial, we will review a basic replication setup between two nodes that provides instant synchronization of a specific directory, including its contents and permission changes. If any of the terms used are unfamiliar, please consult the GlusterFS documentation.

Disk Partitioning
For a replication setup, GlusterFS requires an identical disk partition present on each node. We will use apollo and chronos as nodes, with one GlusterFS volume and brick replicated across the nodes.

From my experience, the most important part of a GlusterFS setup is planning your disk partitioning ahead of time. If you have a proper layout, it will be very easy to create a designated GlusterFS volume group and logical volume on each node.

As an example, we will use the following partitioning, present on both nodes:

# df -ah /mnt/gvol0
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_gluster-lv_gvol0
                       10G  151M  9.2G   2% /mnt/gvol0
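If you still need to create such a layout, a minimal LVM sketch could look like the following (assuming a spare disk at /dev/sdb, a hypothetical device name, and an ext4 filesystem; adjust the size and filesystem type to your environment):
# pvcreate /dev/sdb
# vgcreate vg_gluster /dev/sdb
# lvcreate -L 10G -n lv_gvol0 vg_gluster
# mkfs.ext4 /dev/vg_gluster/lv_gvol0
# install -d /mnt/gvol0
# echo '/dev/vg_gluster/lv_gvol0 /mnt/gvol0 ext4 defaults 0 0' >> /etc/fstab
# mount -a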
The GlusterFS volume naming conventions are:
  • /<base directory>/<volume name>/<brick name>/brick
For SELinux compatibility, we will use /mnt as the base directory, gvol0 as the volume name, and brick0 as the brick name. Create the volume path on each node:
# ls -lah /mnt/gvol0
total 28K
drwxr-xr-x. 4 root root 4.0K Sep  9 18:52 .
drwxr-xr-x. 3 root root 4.0K Sep  9 18:52 ..
drwx------. 2 root root  16K Sep  9 17:33 lost+found
# install -d -m 0755 /mnt/gvol0/brick0/brick
We are done with the initial setup; let's proceed to the GlusterFS configuration.

GlusterFS Setup
Install GlusterFS on each node by running the following commands:
# yum --enablerepo=axivo install glusterfs-server
# service rpcbind start
Enterprise Linux 6:
# chkconfig glusterd on
# service glusterd start
Enterprise Linux 7:
# systemctl enable glusterd.service
# systemctl start glusterd.service
Before probing the nodes, open the required firewall ports. GlusterFS uses the following ports:
  • 111 (tcp and udp) - rpcbind
  • 2049 (tcp) - nfs
  • 24007 (tcp) - server daemon
  • 38465:38469 (tcp) - nfs related services
  • 49152 (tcp) - brick
We used the following iptables configuration file on each node:
# cat /etc/sysconfig/iptables-glusterfs
-A INPUT -m state --state NEW -m tcp -p tcp -s 192.168.1.0/24 --dport 111         -j ACCEPT
-A INPUT -m state --state NEW -m udp -p udp -s 192.168.1.0/24 --dport 111         -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp -s 192.168.1.0/24 --dport 2049        -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp -s 192.168.1.0/24 --dport 24007       -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp -s 192.168.1.0/24 --dport 38465:38469 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp -s 192.168.1.0/24 --dport 49152       -j ACCEPT
If you use firewalld, add the following rules on each node:
# firewall-cmd --permanent --zone=public --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port port="111"         protocol="tcp" accept'
# firewall-cmd --permanent --zone=public --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port port="111"         protocol="udp" accept'
# firewall-cmd --permanent --zone=public --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port port="2049"        protocol="tcp" accept'
# firewall-cmd --permanent --zone=public --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port port="24007"       protocol="tcp" accept'
# firewall-cmd --permanent --zone=public --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port port="38465-38469" protocol="tcp" accept'
# firewall-cmd --permanent --zone=public --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port port="49152"       protocol="tcp" accept'
You will need to add a port for each additional brick. Since we use only one brick, 49152 is sufficient.
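For example, if you later add a second brick it will normally listen on the next port, 49153, so the matching rule would follow the same pattern:
-A INPUT -m state --state NEW -m tcp -p tcp -s 192.168.1.0/24 --dport 49153       -j ACCEPT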
Make sure each node name resolves properly in your DNS setup and probe each node:
[root@apollo ~]# gluster peer probe chronos
peer probe: success.
[root@chronos ~]# gluster peer probe apollo
peer probe: success.
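Optionally, confirm that each peer is connected before creating the volume:
[root@apollo ~]# gluster peer status
The output should list chronos as a connected peer in the cluster.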
On apollo only, run the following command to create the replication volume:
[root@apollo ~]# gluster volume create gvol0 replica 2 {apollo,chronos}:/mnt/gvol0/brick0/brick
volume create: gvol0: success: please start the volume to access data
Breaking down the above command, we told GlusterFS to create a replica volume and keep a copy of the data on at least 2 bricks at any given time. Since we only have two bricks, this means each server will house a copy of the data. Lastly, we specify which nodes and bricks to use.

Verify the volume information and start the volume:
[root@apollo ~]# gluster volume info
Volume Name: gvol0
Type: Replicate
Volume ID: 2b9c2607-9569-48c3-9138-08fb5d8a213f
Status: Created
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: apollo:/mnt/gvol0/brick0/brick
Brick2: chronos:/mnt/gvol0/brick0/brick

[root@apollo ~]# gluster volume start gvol0
volume start: gvol0: success
We are done with the server setup; let's proceed to the GlusterFS replication.

Replication Setup
We will use /var/www/html as the replicated directory across the two nodes. Please make sure the directory does not contain any files. Once the directory is mounted as a GlusterFS filesystem, any files previously present in it will no longer be accessible.

Execute the following commands to mount /var/www/html as a GlusterFS filesystem:
[root@apollo ~]# install -d -m 0755 /var/www/html
[root@apollo ~]# cat >> /etc/fstab << EOF
apollo:/gvol0     /var/www/html    glusterfs    defaults    0 0
EOF
[root@apollo ~]# mount -a
[root@apollo ~]# mount -l -t fuse.glusterfs
apollo:/gvol0 on /var/www/html type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)

[root@chronos ~]# install -d -m 0755 /var/www/html
[root@chronos ~]# cat >> /etc/fstab << EOF
chronos:/gvol0    /var/www/html    glusterfs    defaults    0 0
EOF
[root@chronos ~]# mount -a
[root@chronos ~]# mount -l -t fuse.glusterfs
chronos:/gvol0 on /var/www/html type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
List the GlusterFS pool to check for any node connectivity issues:
[root@apollo ~]# gluster pool list
UUID                                    Hostname        State
5cb470ae-3c88-46fa-8cb9-2dd29d26e104    chronos         Connected
188975c7-6d69-472a-b421-641286551d28    localhost       Connected
To test GlusterFS replication, we installed Nginx on both nodes, created a file on apollo, and then changed its contents and permissions from either node:
[root@apollo ~]# ls -lah /var/www/html
total 8K
drwxr-xr-x. 3 root  root 4.0K Sep  9 19:38 .
drwxr-xr-x. 3 root  root 4.0K Sep  9 19:38 ..
[root@apollo ~]# yum --enablerepo=axivo install nginx
[root@chronos ~]# yum --enablerepo=axivo install nginx
[root@apollo ~]# ls -lah /var/www/html
total 12K
drwxr-xr-x. 3 root root 4.0K Sep  9 20:01 .
drwxr-xr-x. 3 root root 4.0K Sep  9 19:38 ..
-rw-r--r--. 1 root root  535 Oct 11  2009 404.html
-rw-r--r--. 1 root root  543 Oct 11  2009 50x.html
-rw-r--r--. 1 root root  198 May  6  2006 favicon.ico
-rw-r--r--. 1 root root  528 Oct 11  2009 index.html
-rw-r--r--. 1 root root  377 May  6  2006 nginx.gif
[root@apollo ~]# vi /var/www/html/info.php
[root@chronos ~]# chown nginx /var/www/html/info.php
[root@apollo ~]# cat /var/www/html/info.php
<?php
phpinfo();
The file was instantly replicated on chronos, with identical content and permissions on both nodes:
[root@apollo ~]# ls -lah /var/www/html
total 13K
drwxr-xr-x. 3 root  root 4.0K Sep  9 20:06 .
drwxr-xr-x. 3 root  root 4.0K Sep  9 19:38 ..
-rw-r--r--. 1 root  root  535 Oct 11  2009 404.html
-rw-r--r--. 1 root  root  543 Oct 11  2009 50x.html
-rw-r--r--. 1 root  root  198 May  6  2006 favicon.ico
-rw-r--r--. 1 root  root  528 Oct 11  2009 index.html
-rw-r--r--. 1 nginx root   18 Sep  9 20:06 info.php
-rw-r--r--. 1 root  root  377 May  6  2006 nginx.gif
[root@chronos ~]# cat /var/www/html/info.php
<?php
phpinfo();
You are currently running a high-performance, scalable replication system.

Troubleshooting
The logs are the best place to start your troubleshooting; examine any errors in the /var/log/glusterfs/var-www-html.log file. Please don't turn off your firewall just because you think it is blocking your setup; the firewall adds an important security layer and should never be disabled. Instead, study the logs and find the source of the problem.
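For example, you can follow the client mount log while reproducing the problem:
# tail -f /var/log/glusterfs/var-www-html.log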

An easy way to determine which ports GlusterFS uses is to run a volume status check:
# gluster volume status
Status of volume: gvol0
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick apollo:/mnt/gvol0/brick0/brick                    49152   Y       1220
Brick chronos:/mnt/gvol0/brick0/brick                   49152   Y       1363
NFS Server on localhost                                 2049    Y       1230
Self-heal Daemon on localhost                           N/A     Y       1235
NFS Server on chronos                                   2049    Y       1371
Self-heal Daemon on chronos                             N/A     Y       1375

Task Status of Volume gvol0
------------------------------------------------------------------------------
There are no active volume tasks
You can also run netstat -tulpn | grep gluster to further examine the ports in use.

This tutorial covered only a small part of GlusterFS's capabilities. Rushing through it without a proper understanding, or without reading the documentation, will likely end in failure. Once you understand how GlusterFS works, you are welcome to ask any related questions in our support forums.

Tuesday, September 16, 2014