2014-12-16

RHEL 6.4 kernel 2.6.32-358.23.2, Mellanox OFED 2.1-1.0.6, and Lustre client 2.5.0

I am planning some upgrades for the cluster that I manage. As part of the updates, it would be good to have MVAPICH2 with GDR (GPU-Direct RDMA -- yes, that's an acronym of an acronym). MVAPICH2-GDR, which is provided only as binary RPMs, only supports Mellanox OFED 2.1.

Now, our cluster runs RHEL6.4, but with most non-kernel and non-glibc packages updated to whatever is in RHEL6.5. The plan is to update everything to whatever is in RHEL6.6, except for the kernel, leaving that at 2.6.32-358.23.2 which is the last RHEL6.4 kernel update. The reason for staying with that version of the kernel is because of Lustre.

We have a Terascala Lustre filesystem appliance. The latest release of TeraOS uses Lustre 2.5.0. Upgrading the server is pretty straightforward, according to the Terascala engineers. Updating the client is a bit trickier. Currently, the Lustre support matrix says that Lustre 2.5.0 is supported only on RHEL6.4.

The plan of attack is this:

  1. Update a base node with all RHEL packages, leaving the kernel at 2.6.32-358.23.2
  2. Upgrade Mellanox OFED from 1.9 to 2.1
  3. Build lustre-client-2.5.0 and upgrade the Lustre client packages

Updating the base node is straightforward. Just use "yum update", after commenting out the exclusions in /etc/yum.conf. If you had updated the <tt>redhat-release-server-6server<tt> package, which defines which RHEL release you have, you can downgrade it. (See RHEL Knowledgebase, subscription required.) First, install the last (as of 2014-12-15) RHEL6.4 kernel, and then do the downgrade:
# yum install kernel-2.6.32-358.23.2.el6
# reboot
# yum downgrade redhat-release-server-6server

Check with <tt>cat /etc/redhat-release</tt>.

Next, install Mellanox OFED 2.1-1.0.6. You can install it directly using the provided installation script, or if you are paranoid like me, you can use the provided script to build RPMs against the exact kernel update you have installed.

Get the tarball directly from Mellanox. Extract, and make new RPMs:
# tar xf MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64.tgz
# cd MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64
# ./mlnx_add_kernel_support.sh -m .
...
# cp /tmp/MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64-ext.tgz .
# tar xf MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64-ext.tgz
# cd MLNX_OFED_LINUX-2.1-1.0.6-rhel6.4-x86_64-ext
# ./mlnxofedinstall
# reboot

Strictly speaking, the reboot is unnecessary: you can stop and restart a couple of services and the new OFED will load.

Next, for Lustre. Get the SRPM from Intel (who bought WhamCloud). You will notice that it is for kernel 2.6.32-358.18.1. Not mentioned is the fact that by default, it uses the generic OFED that RedHat rolls into its distribution. To use the Mellanox OFED, a slightly different installation method must be used.

# rpm -Ivh lustre-client-2.5.0-2.6.32_358.18.1.el6.x86_64.src.rpm
# cd ~/rpmbuild/SOURCES
# cp lustre-2.5.0.tar.gz ~/tmp
# cd ~/tmp
# tar xf lustre-2.5.0.tar.gz
# cd lustre-2.5.0
# ./configure --disable-server --with-o2ib=/usr/src/ofa_kernel/default
# make rpms
# cd ~/rpmbuild/RPMS/x86_64
# yum install lustre-client-2.5.0-2.6.32_358.23.2.el6.x86_64.x86_64.rpm \
lustre-client-modules-2.5.0-2.6.32_358.23.2.el6.x86_64.x86_64.rpm \
lustre-client-tests-2.5.0-2.6.32_358.23.2.el6.x86_64.x86_64.rpm \
lustre-iokit-2.5.0-2.6.32_358.23.2.el6.x86_64.x86_64.rpm
To make the lustre module load at boot, I have a kludge: to /etc/init.d/netfs right after the line
STRING=$"Checking network-atttached filesystems"
add
modprobe lustre
Reboot, and then check:
# lsmod | grep lustre
lustre                921744  0
lov                   516461  1 lustre
mdc                   199005  1 lustre
ptlrpc               1295397  6 mgc,lustre,lov,osc,mdc,fid
obdclass             1128062  41 mgc,lustre,lov,osc,mdc,fid,ptlrpc
lnet                  343705  4 lustre,ko2iblnd,ptlrpc,obdclass
lvfs                   16582  8 mgc,lustre,lov,osc,mdc,fid,ptlrpc,obdclass
libcfs                491320  11 mgc,lustre,lov,osc,mdc,fid,ko2iblnd,ptlrpc,obdclass,lnet,lvfs


2014-11-06

Mounting a HTC One on Ubuntu 14.04 Trusty; Re-flashing the ROM

Since I unlocked my HTC One (M7), I do not have a voicemail app since the phone was tied to a specific provider. Now, I'm trying to figure out how to re-flash the device to generic.

The first thing was to try to connect it to my Linux machine. However, upon plugging the phone into the USB port, an error appeared: "Unable to find matching udev device".

Apparently, this is fairly common and happens with other devices. The error boils down to a bug in Ubuntu's default Media Transfer Protocol (MTP) library. The fix is to install a later version.

First, add a new repository:
$ sudo add-apt-repository ppa:webupd8team/unstable
$ sudo apt-get update
Then, install the mtpfs package.

UPDATE: HTC has a free bootloader unlocking utility. You just have to sign up for their free developer program. Then, follow the instructions here: http://www.htcdev.com/bootloader/unlock-instructions

The one thing they do not mention is that you have to have root privileges for the fastboot utility to work, so:
$ sudo ./fastboot oem get_identifier_token
Then, they email you a file for unlocking. Next, re-flash the phone with a vendor-appropriate ROM from here: http://www.htcdev.com/devcenter/downloads

For AT&T (my phone's original ROM) and T-Mobile (my current provider), there is no binary image for flashing. I didn't bother looking to see if you could compile the source. Instead, I went with the Android Ice Cold Project (AICP). I discovered it via this step-by-step article. They also have a download of the full Google Apps.

2014-10-15

Another SSL vulnerability - The POODLE Attack

From the Mozilla Security Blog:
SSL version 3.0 is no longer secure. Browsers and websites need to turn off SSLv3 and use more modern security protocols as soon as possible, in order to avoid compromising users’ private information.
Under RHEL 6.5 with Apache httpd, edit /etc/httpd/conf.d/ssl.conf and make sure the protocol line disables both SSLv2 and SSLv3:
SSLProtocol all -SSLv2 -SSLv3
or you can just specify TLS only:
SSLProtocol +TLSv1 +TLSv1.1 +TLSv1.2
 Ars Technica has a good explanation.

Scott Helme has a good run down on how to fix this issue, for various servers and browsers.

2014-09-01

Python's with statement

Old habits die hard. I learned a long time ago (Python 1.x) this pattern for opening and operating on files:

    try:    
        f = open("filename.txt", "ro")
        try:
            for l in f:
                print l
        finally:
            f.close()
    except IOError as e:
        print "I/O error({0}): {1}".format(e.errno, e.strerror)

Since Python 2.6, the with statement does this automatically:

    with open("filename.txt", "ro") as f:
        for l in f:
            print l

The with statement works with some other classes, too.

PS Blogger really needs a code block style.

2014-07-24

Ganglia

Word to the wise: do not enable the multiplecpu multicpu module. It doesn't get disabled even if you append ".disabled" to the file name. Now, I have 265 CPU metrics.

2014-07-02

Limiting logins under SSSD

Under SSSD, you can pretty easily limit logins to specific users or groups. The syntax is different from that of /etc/security/access.conf, and is actually easier. Red Hat has some documentation (may require login). There is also a man page for sssd.conf(5).

Under the your domain, add some lines to configure "simple" access control:
[domain/default]  
access_provider = simple 
simple_allow_users = topbanana 
simple_allow_groups = bunchofbananas,wheel