Featured Post

Linux daemon using Python daemon with PID file and logging

The python-daemon package ( PyPI listing , Pagure repo ) is very useful. However, I feel it has suffered a bit from sparse documentation, an...

2021-03-01

Building NCBI NGS tools (NGS SDK, VDB, SRA Tools) on RHEL 8

I had some trouble building SRA Tools from source on RHEL8. After short message thread on GitHub, I went back and tried again, this time setting the “--relative-build-out-dir” option on configure for all components of the NCBI NGS suite. That fixed it.

My full write-up of the build process is a GitHub gist.

2021-02-03

Newly discovered malware called Kobalos targeting HPC

 From Ars Technica:

High-performance computer networks, some belonging to the world’s most prominent organizations, are under attack by a newly discovered backdoor that gives hackers the ability to remotely execute commands of their choice, researchers said on Tuesday.

Kobalos, as researchers from security firm Eset have named the malware, is a backdoor that runs on Linux, FreeBSD, and Solaris, and code artifacts suggest it may have once run on AIX and the ancient Windows 3.11 and Windows 95 platforms. The backdoor was released into the wild no later than 2019, and the group behind it was active throughout last year.


2021-01-28

MediaWiki with PostgreSQL using Buildah and Podman on RHEL7

NOTE I started working on this some months ago, and then had to stop working on it due to other stuff coming up. So, this is an incomplete example. Most of it is tested, and should work, but I suggest working on this in a throwaway VM as an instructional example. I am posting as is mainly for my own reference. I may update it along the way as I get time to work through it in more detail.

This is a “port” of the Examining container performance on RHEL 8 with PCP and pmda-podman example at the Red Hat Blog to RHEL7. Except that the focus here will be more on getting PostgreSQL, Apache, and MediaWiki running, rather than the performance analysis.

Performance Co-Pilot (PCP) for podman does not seem to provide a podman monitoring feature, so we will not be doing that part of the example.

The Red Hat example uses RHEL8, and there are enough differences with RHEL7 that the Red Hat example cannot be used directly.

We also update to MediaWiki 1.34.2 since the 1.32 series is no longer supported.

Red Hat has a podman command line reference. (It's part of their RHEL8 documentation.) For an overview of Podman and Buildah, this post at the Red Hat Developers Blog is good.

What we will run:
  • RHEL 7.8
  • PostgreSQL 9.2.24-4.el7_8
  • Apache 2.4 (via Red Hat Software Collections)
  • PHP 7.3 (required by MediaWiki; via Red Hat Software Collections)
  • MediaWiki 1.34.2
There will be two containers:
  • one with PostgreSQL
  • another with Apache, PHP, and MediaWiki
NOTE do not use tmux on your host machine to work through this example since we will need to use tmux in one of the containers. But if you know how to handle nested tmux sessions, go for it.

CAUTION there are official PostgreSQL container images from Red Hat. They should be already set up such that the kluges below (modifying the postgresql-setup script, and PostgreSQL config files) should not be necessary. See this one for PostgreSQL 10 on RHEL8. Do "podman search postgresql" to see what is available.

In the following, the prompts will indicate which machine or container we are on: the host machine will have a prompt "[root@host ~]#" The containers will have some arbitrary string of hexadecimal digits as the hostname. However, for clarity, this example will use the container names, instead.

OUTLINE

  • Build two local images with buildah: one for PostgreSQL, one for Apache + PHP-FPM +  MediaWiki
  • Run containers using local images
  • Cleanup

BEFORE WE BEGIN

Here is a quick list of some of the commands that will be run in order, from getting an image and creating a container, showing all containers, removing the container, and removing the image:
  • container=$( buildah from image_url )
  • buildah containers
  • buildah rm $container
  • buildah rmi image_id

BUILD CONTAINERS

First of all, install buildah to manage container images, and podman to run them.

[root@host ~]# yum install buildah podman

Login to the Red Hat container registry -- you must have an existing Red Hat account:

[root@host ~]# buildah login registry.redhat.io

Logging in to the container registry allows us to download base images which our local images will be based on.

Our containers will use the RHEL7 image registry.access.redhat.com/rhel7 as a starting point.

PostgreSQL

We create a container based on the rhel7 image. Then, copy the repo file from the host to the image, and install postgresql-server (plus a few other packages).

[root@host ~]# container=$(buildah from registry.access.redhat.com/rhel7)
[root@host ~]# echo $container
rhel7-working-container
[root@host ~]# buildah copy $container /etc/yum.repos.d/redhat.repo \
    /etc/yum.repos.d/redhat.repo
1f302312276b6f60ca1189181159d8c8eba378d3ff76a6aff651220c8f8250f2


Run a shell in the container to install PostgreSQL and some other packages:
 
[root@host ~]# buildah run $container /bin/bash
[root@psql /]# yum -y install postgresql-server tmux psmisc nc vim
...  
Complete!
[root@psql /]# yum -y update
Loaded plugins: ovl, product-id, search-disabled-repos, subscription-manager
No packages marked for update
[root@psql /]# yum clean all
Loaded plugins: ovl, product-id, search-disabled-repos, subscription-manager
Cleaning repos: rhel-7-server-extras-rpms rhel-7-server-optional-rpms rhel-7-server-rpms rhel-server-rhscl-7-rpms


Next, modify the postgresql-setup2 script because the container will not be using systemd. In general, systemd cannot run in containers.

[root@psql /]# cp /usr/bin/postgresql-setup \
    /usr/bin/postgresql-setup2

Edit /usr/bin/postgresql-setup2: Comment out (or delete) lines 111-113 which define the PGDATA variable. In its place, add this at line:

PGDATA=/var/lib/pgsql/data

This defines the location of the PostgreSQL config and data files.

Next, comment out (or delete) lines 119-121 which define the PGPORT variable. Replace it with this at line 122:

PGPORT=5432

This defines the port number that PostgreSQL will respond on.

Then, as the "postgres" user, do the PostgreSQL setup:

[root@psql /]# su - postgres
-bash-4.2$ /usr/bin/postgresql-setup2 initdb  
Initializing database ... OK
-bash-4.2$ exit

Fix up the PostgreSQL server config: modify the authentication method, and the network addresses on which to listen:

[root@psql /]# sed -i 's/^host/#host/' /var/lib/pgsql/data/pg_hba.conf
[root@psql /]# echo "host all all all md5" >> /var/lib/pgsql/data/pg_hba.conf
[root@psql /]# echo "listen_addresses = '*'" >> /var/lib/pgsql/data/postgresql.conf
[root@psql /]# exit    # exit container

On the host, configure the PostgreSQL container to run postmaster as the postgres user on startup:

[root@host ~]# buildah config --cmd "su - postgres -c \
        '/usr/bin/postmaster -D /var/lib/pgsql/data'" $container

Commit the image to the local repository, as “localhost/postgres-test”:

[root@host ~]# buildah commit $container localhost/postgres-test
Getting image source signatures
Copying blob cacea99e9a8c skipped: already exists
Copying blob f15a9d9f7ab3 skipped: already exists
Copying blob d3e8e97ad524 done
Copying config 7614d3233c done
Writing manifest to image destination
Storing signatures
7614d3233c71651cfba0ba4aa149424dd349db55bee18cf762aef7b37e691a31

See a list of images -- the one just created should appear:

[root@host ~]# buildah images
REPOSITORY                         TAG      IMAGE ID       CREATED              SIZE
localhost/postgres-test            latest   8d75ec494b55   About a minute ago   340 MB
registry.access.redhat.com/rhel7   latest   1a9b6d0a58f8   6 weeks ago          215 MB

Run the newly-created container detached, i.e. in the background:

[root@host ~]# podman run -p 5432:5432 --name psql \
    --hostname psql --detach postgres-test
...outputs container id...

Check that it is running:
 
[root@host ~]# podman ps
CONTAINER ID  IMAGE                           COMMAND               CREATED        STATUS            PORTS                   NAMES
8651efee175f  localhost/postgres-test:latest  su - postgres -c ...  4 seconds ago  Up 4 seconds ago  0.0.0.0:5432->5432/tcp  psql

“Login” to the running psql container and set up PostgreSQL account and db for the wiki:

[root@host ~]# podman exec --interactive --tty psql bash
[root@psql ~]# su - postgres
[postgres@psql ~]$ createuser -S -D -R -P -E wikiuser # remember the password you use here
[postgres@psql ~]$ createdb -O wikiuser wikidb
[postgres@psql ~]$ exit # exit user postgres
[root@psql ~]# exit # exit container

Now, in the host system, connect to the running container’s PostgreSQL server, and set up the database for the wiki. The PostgreSQL server is a container running on localhost 127.0.0.1 and the host’s port 5432 is mapped to the container. Remember the db name (wikidb), the db user name (wikiuser), and the password that you use.

[root@host ~]# psql -h 127.0.0.1 -W wikidb wikiuser
Password for user wikiuser: 
psql (9.2.24)
Type "help" for help.

wikidb=> 

That is all for the PostgreSQL setup.

Apache HTTPD, PHP, and MediaWiki

Next, make another container for Apache + PHP + MediaWiki.  This runs httpd and php-fpm on the same container. It should also be possible to run php-fpm on a separate container.

[root@host ~]# container=$( buildah from \
        registry.access.redhat.com/rhel7)
[root@host ~]# echo $container
rhel7-working-container-1

MediaWiki requires PHP >= 7.2.9. However, it is NOT compatible with PHP 7.4.0 to 7.4.2 due to an upstream issue.

Because we need PHP 7, we could get it from EPEL.  You can copy the epel.repo file just as you did with the redhat.repo file in the PostgreSQL container, above.

Alternatively, install from Red Hat Software Collections. This makes things a little more complicated than using EPEL, but not terribly so. Some guidance here. To do this, we also need to use httpd24 from the Software Collections.

[root@host ~]# buildah copy $container /etc/yum.repos.d/redhat.repo \
    /etc/yum.repos.d/redhat.repo
1f302312276b6f60ca1189181159d8c8eba378d3ff76a6aff651220c8f8250f2

Do this if you want to use EPEL:

[root@host ~]# buildah copy $container /etc/yum.repos.d/epel.repo \
    /etc/yum.repos.d/epel.repo
15a7fc2ebe4c5260256294d2c890bc1ccb5f8097b1a25aa0c38f9b996fa5fc5b

Run a bash inside the container, and install Apache, PHP, MediaWiki; httpd24 is the Apache httpd 2.4 from the Software Collections:

[root@host ~]# buildah run $container -- /usr/bin/bash
[root@apache /]# yum install -y  wget less procps-ng lsof psmisc \
    tmux openssl httpd24 httpd24-httpd httpd24-mod_ssl

Install PHP-7.3 from Software Collections: 

[root@apache /]# yum install -y rh-php73 rh-php73-php \
    rh-php73-php-gd rh-php73-php-gmp rh-php73-php-intl \
    rh-php73-php-mbstring rh-php73-php-pgsql rh-php73-php-opcache \
         rh-php73-php-fpm

Check PHP version:

[root@apache tmp]# scl enable rh-php73 /bin/bash
[root@apache tmp]# which php
/opt/rh/rh-php73/root/usr/bin/php
[root@apache tmp]# php --version
PHP 7.3.11 (cli) (built: Oct 31 2019 08:30:29) ( NTS )
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.3.11, Copyright (c) 1998-2018 Zend Technologies
    with Zend OPcache v7.3.11, Copyright (c) 1999-2018, by Zend Technologies


Update the tzdata package to address a possible bug:

[root@apache tmp]# yum update -y tzdata

Download and install MediaWiki into /opt/rh/httpd24/root/var/www/html/testwiki:

[root@apache tmp]# wget https://releases.wikimedia.org/mediawiki/1.34/mediawiki-1.34.2.tar.gz
[root@apache tmp]# cd /opt/rh/httpd24/root/var/www/html
[root@apache tmp]# tar xvf /tmp/mediawiki-1.34.2.tar.gz
[root@apache tmp]# mv mediawiki-1.34.2 testwiki
[root@apache tmp]# exit  # exits the rh-php73 environment
[root@apache tmp]# exit  # exits the container

Commit the container image as apache-test:

[root@host ~]# buildah commit $container localhost/apache-test

Here, will break from the Red Hat Blog example. That example runs httpd and php-fpm in the foreground. Here, we will run them in the background.

But first, SSL setup. As with the PostgreSQL service, systemctl cannot be used. Usually, the first time systemctl starts up Apache, it will also generate SSL certs. We need to do this manually. Enter appropriate information when prompted:

[root@host ~]# buildah run $container -- /usr/bin/bash
[root@apache ~]# openssl req -new -newkey rsa:4096 > new.cert.csr
[root@apache ~]# openssl rsa -in privkey.pem -out new.cert.key
[root@apache ~]# openssl x509 -in new.cert.csr -out /etc/pki/tls/certs/localhost.crt \
-req -signkey new.cert.key -days 730
[root@apache ~]# cp new.cert.key /etc/pki/tls/private/localhost.key

[root@apache ~]# openssl req -new -newkey rsa:4096 > new.cert.csr
Generating a 4096 bit RSA private key
.............................++
......................................................................................................................................................................................................................................................++
writing new private key to 'privkey.pem'
Enter PEM pass phrase: ***
Verifying - Enter PEM pass phrase: ***
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [XX]:US
State or Province Name (full name) []:California
Locality Name (eg, city) [Default City]:Riverside
Organization Name (eg, company) [Default Company Ltd]:ACME Corp.
Organizational Unit Name (eg, section) []:IT
Common Name (eg, your name or your server's hostname) []:myservername
Email Address []:web@acmecorp.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
[root@apache /]# openssl rsa -in privkey.pem -out new.cert.key
Enter pass phrase for privkey.pem:
writing RSA key
[root@apache /]# openssl x509 -in new.cert.csr -out /etc/pki/tls/certs/localhost.crt \
> -req -signkey new.cert.key -days 730
Signature ok
subject=/C=US/ST=Pennsylvania/L=Philadelphia/O=Drexel University/OU=URCF/CN=urcfstora-apache/emailAddress=dwc62@drexel.edu
Getting Private key
[root@apache /]# cp new.cert.key /etc/pki/tls/private/localhost.key
cp: overwrite '/etc/pki/tls/private/localhost.key'? y
[root@apache /]# exit

Commit changes to the image:

[root@host /]# buildah commit $container localhost/apache-test

Next, start up httpd without daemonizing, and php-fpm (FastCGI Process Manager). Run a shell on apache-test, mapping the http and https ports. And, in that shell, use tmux to manage the two terminal sessions, one for each process.

[root@host /]# podman run -p 80:80 -p 443:443 -it --name apache --hostname apache apache-test /usr/bin/bash
[root@apache /]# tmux
[root@apache /]# scl enable httpd24 /bin/bash
[root@apache /]# which httpd
/opt/rh/httpd24/root/usr/sbin/httpd
[root@apache /]# httpd -DFOREGROUND
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 10.88.0.9. Set the 'ServerName' directive globally to suppress this message

Make a note of that IP address 10.88.0.9. (We may or may not need this.)

Create a new terminal window to deal with php-fpm: type "Ctrl-b" then "c". Then, run php-fpm

[root@apache /]# scl enable rh-php73 /bin/bash
[root@apache /]# mkdir /run/php-fpm
[root@apache /]# php-fpm --nodaemonize
[22-Jun-2020 20:08:10] NOTICE: fpm is running, pid 120
[22-Jun-2020 20:08:10] NOTICE: ready to handle connections
[22-Jun-2020 20:08:10] NOTICE: systemd monitor interval set to 10000ms


XXX

To configure an entrypoint which runs more than one executable, we can to write a wrapper script. In our case, we need to run httpd and php-fpm. (Docker example of a wrapper script.)  Note that this is not the recommended way of doing things, which would be to have separate containers for httpd and php-fpm. 

/opt/rh/httpd24/root/usr/sbin/httpd

/opt/rh/rh-php73/root/usr/sbin/php-fpm

Configure entrypoints to run httpd24 and php-fpm:

[root@urcfstora tmp]# buildah config --entrypoint '["scl enable httpd24 /usr/sbin/httpd -DFOREGROUND", "scl enable rh-php73 php-fpm --nodaemonize"]' $container

Run: 

[root@urcfstora tmp]# podman run -p 80:80 -p 443:443 --name apache --hostname apache --detach  apache-test 
708bece1d46225f8628a680516df00ce66921673df4e1cc50f2953053f04af70
[root@urcfstora tmp]# podman ps -a
CONTAINER ID  IMAGE                           COMMAND               CREATED        STATUS                    PORTS                   NAMES
708bece1d462  localhost/apache-test:latest    /bin/bash             4 seconds ago  Exited (0) 3 seconds ago  0.0.0.0:80->80/tcp      apache
8651efee175f  localhost/postgres-test:latest  su - postgres -c ...  2 weeks ago    Up 2 weeks ago            0.0.0.0:5432->5432/tcp  psql


Open a new terminal on the host machine to examine the running containers:


[root@host ~]# podman ps
CONTAINER ID  IMAGE                           COMMAND               CREATED            STATUS                PORTS                   NAMES
e1b1270c4745  localhost/apache-test:latest    /usr/bin/bash         25 minutes ago     Up 25 minutes ago     0.0.0.0:80->80/tcp      apache
8651efee175f  localhost/postgres-test:latest  su - postgres -c ...  About an hour ago  Up About an hour ago  0.0.0.0:5432->5432/tcp  psql

Try to connect to the web server (unencrypted). Launch a web browser on another machine (your PC, or something not the host machine), and connect to the host machine (ignoring the self-signed certificate errors):

    https://host.acmecorp.com

podman will have automatically opened ports in the firewall.

For the MediaWiki container to connect to the PostgreSQL container, the PostgreSQL container's IP address needs to be known. Find it by doing:

[root@host]# podman inspect psql | egrep "10\."
            "Gateway": "10.88.0.1",
            "IPAddress": "10.88.0.8",

So, the "psql" container's IP is 10.88.0.8. We will need this address for the Mediawiki setup in the next step. Also, leave the port (5432) the same.

Now, fire up a web browser on your host, and browse itself. The httpd running in the container will respond, since we ran it mapping the appropriate http/https ports to the host ports:

     https://host.acemcorp.com/testwiki/

Follow the prompts to setup the wiki. Recall the wiki db name, db user name, and the password set up above.

At the end of that, you will be able to download the LocalSettings.php file, which you will then copy to the "apache" container.



Next, we mount the "apache" container, and copy MediaWiki's LocalSettings.php file to it:

[root@host]# apachemnt=$(podman mount apache)
[root@host]# cp /location/of/LocalSettings.php $apachemnt/opt/rh/httpd24/root/var/www/html/testwiki

Then, in your browser, click on that "enter your wiki" link. You should see something like this:


Test that you can create a new article:



PERFORMANCE CO-PILOT
Unfortunately, RHEL7 does not seem to provide PCP for Podman. 

DELETE CONTAINERS
When containers are not running, they may be deleted. First, get their container IDs, and then delete them:

[root@host ~]# podman ps --all
CONTAINER ID  IMAGE                           COMMAND               CREATED            STATUS                PORTS                   NAMES
b67d98b97ebd  localhost/apache-test:latest    /usr/bin/bash         About an hour ago  Up About an hour ago  0.0.0.0:80->80/tcp      apache
8651efee175f  localhost/postgres-test:latest  su - postgres -c ...  3 days ago         Up 3 days ago         0.0.0.0:5432->5432/tcp  psql
[root@host ~]# podman rm CONTAINER_ID

If you don't want the images that you built to hang around in your local storage, you can remove them. The "-f" option will also remove containers which use those images. (Use "buildah images" to see what images are in local storage.)

[root@host ~]# buildah images
REPOSITORY                         TAG      IMAGE ID       CREATED       SIZE
localhost/apache-test              latest   c4f284291b58   3 days ago    1.14 GB
localhost/postgres-test            latest   8d75ec494b55   3 days ago    340 MB
registry.access.redhat.com/rhel7   latest   1a9b6d0a58f8   6 weeks ago   215 MB

[root@host ~]# buildah rmi -f IMAGE_ID




2020-11-24

New Cerebras wafer-scale single server outperforms Joule supercomputer

From HPC Wire:

Cerebras Systems, a pioneer in high performance artificial intelligence (AI) compute, today announced record-breaking performance on a scientific compute workload. In collaboration with the Department of Energy’s National Energy Technology Laboratory (NETL), Cerebras demonstrated its CS-1 delivering speeds beyond what either CPUs or GPUs are currently able to achieve. Specifically, the CS-1 was 200 times faster than the Joule Supercomputer on the key workload of Computational Fluid Dynamics (CFD).

While Cerebras’s CS-1 system is billed as an AI-focused machine, it outdid the Joule Supercomputer (number 82 in the TOP500) on a non-AI workload. While the Joule cost 10’s of millions of dollars, occupies dozens of racks, and consumes 450 kW of power, the CS-1 fits in only one-third of a rack.

Cerebras has a good write-up on their blog. Gory detail in the preprint: arXiv:2010.03660 [cs.DC].

2020-11-18

OpenLDAP local root access to OLC cn=config database

If, like me, you converted your OpenLDAP server installation from slapd.conf to OLC (On-Line Configuration), aka cn=config, you may find that local root privileges to modify your config are not configured; i.e. doing the following will fail:

ldapmodify -Y EXTERNAL -H ldapi:/// -f some_changes.ldif

This is because the olcRootDN for the cn=config database is probably not set up right. Mine looked something like:

dn: olcDatabase={0}config,cn=config

objectClass: olcDatabaseConfig

olcDatabase: {0}config

olcAccess: {0}to *  by * none

olcRootDN: cn=root,dc=example,dc=com

 

It may or may not also have an olcRootPW (root password) set.

You can query you you appear to be to the LDAP server by using ldapwhoami and specifying the SASL mechanism (-Y) and the LDAP URI (-H):

#  ldapwhoami -Y EXTERNAL -H ldapi:///

SASL/EXTERNAL authentication started

SASL username: gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth

SASL SSF: 0

dn:gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth

In this case, the EXTERNAL mechanism is the Linux IPC (Inter-Process Communication), which gets the UID and GID of the client process. This is communicated via the domain socket transport (ldapi:).

The fix is straightforward. First, create a file to replace the olcRootDN field:

# replace_olcrootdn.ldif

dn: olcDatabase={0}config,cn=config

replace: olcRootDN

olcRootDN: gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth

If you have an olcRootPW field, add another operation to delete: it. Then, apply the changes:

# ldapmodify -D cn=root,dc=example,dc=cluster -w somepassword -H ldapi:/// -f replace_olcrootdn.ldif

And that should do it. From now on, you should be able to modify the OLC with “-Y EXTERNAL -H ldapi:///” if you are root. 

This post is expanded from this answer at Server Fault.

2020-11-06

Creating a systemd-based service for FlexLM lmgrd license manager daemon

This is a short post on how to set up a systemd-based service to run FlexLM lmgrd. It is based on what others have written, and primarily based on Schrödinger, Inc.’s instructions.

The full write-up with the various files is at GitHub: https://github.com/prehensilecode/flexlm-systemd-service

Outline of the steps:

  1. Create an unprivileged system user named flexlm to run the service
  2. Create an lmgrd.service service definition file to go into /etc/systemd/system
  3. Create a directory tree for the executables (lmgrd, lmutil, vendor daemons) and license files
  4. Create a /var/log subdirectory
And that is it.

At my site, this is currently running on a RHEL 8.1 server, serving an Intel Parallel Studio XE license.

2020-09-14

Switching to SSSD (from nslcd) in Bright Cluster Manager 9

In Bright Cluster Manager 9.0, cluster nodes still use nslcd for LDAP authentication. Since we have sssd working in Bright CM 6 (by necessity due to an issue with Univa Grid Engine and nslcd; see previous posts), we might as well change things over to sssd on Bright CM 9, too. The cluster now runs RHEL8.

First, we disable the nslcd service on all nodes. It was a little non-obvious how to do this since trying to remove it in the device services did nothing: the service just kept coming back enabled. I.e. do “remove nslcd ; commit” and then “list” and nslcd just reappears.

Examining that service in the device view showed that it “belongs to a role,” but it is not listed in any role, nor in the category of that node.

[foocluster]% category use compute-cat

[foocluster->category[compute-cat]]% services

[foocluster->category[compute-cat]->services]% list

Service (key)            Monitored  Autostart

------------------------ ---------- ----------

It turns out that nslcd is part of a hidden role which is not visible to the user. So, you have to write a loop to disable nslcd on each node. Within cmsh:

[foocluster]% device

[foocluster]% foreach -v -n node001..node099 (services; use nslcd; set monitored no ; set autostart no)

[foocluster]% commit

To modify the node image, I modify the image on one node, and then do “grabimage -w” in cmsh on the head node.

You will need to install these packages:

  • openldap-clients
  • sssd
  • sssd-ldap 
  • openssl-perl

Next, sssd setup. This may depend on your installation. The installation here uses the LDAP server set up by Bright CM, which uses SSL for encryption with both server and client certificates. (All self-signed with a dummy CA in the usual way.) The following /etc/sssd/sssd.conf shows only the non-empty sections. Your configuration may need to be different depending on your environment. 

[domain/default]

id_provider = ldap

autofs_provider = ldap

auth_provider = ldap

chpass_provider = ldap

ldap_uri = ldaps://fooserver.cm.cluster

ldap_search_base = dc=cm,dc=cluster

ldap_id_use_start_tls = False

ldap_tls_reqcert = demand

ldap_tls_cacertdir = /cm/local/apps/openldap/etc/certs

cache_credentials = True

enumerate = False

entry_cache_timeout = 600

ldap_network_timeout = 3

ldap_connection_expire_timeout = 60


[sssd]

config_file_version = 2

services = nss, pam

domains = default


[nss]

homedir_substring = /home

Then, 

# chown root:root /etc/sssd/sssd.conf 

# chmod 600 /etc/sssd/sssd.conf

I did not have to change /etc/openldap/ldap.conf

The next step is to switch to using sssd for authentication. But first, stop and disable the nslcd service: 

# systemctl stop nslcd

# systemctl disable nslcd

The old authconfig-tui utility is gone. The new one is authselect: you will have to force it to overwrite existing authentication configurations.

# authselect select sssd --force

There are other options to authselect, e.g. “with-mkhomedir”. See authselect(8) and authselect-profiles(5) for details. Other options may also require other packages to be installed.

Then, start and enable the sssd service. Check that user ID info can be retrieved:

# id someuser

Back on the head node, do “grabimage -w”. 

Then, modify the node category to add the sssd service, setting it to autostart and to be monitored.