7. What are PXE, DHCP, TFTP and NFS, and why does my linuxRT IOC need them?

LinuxRT is installed on our system using the Preboot Execution Environment (PXE) method of network booting.

We enable the PXE/network-booting method in the BIOS.

To use PXE, a boot server must allow our client system to:
(a) Request an IP address (via DHCP)
(b) Download a kernel (via TFTP)

With both of these services in place any system which supports PXE/network-booting
should be able to gain an IP address, fetch a kernel, and boot without an installed operating system.

PXE uses three distinct network protocols that map to three server processes to perform the installation.
In our case, some of the processes run on lcls-dev1 (LCLSDEV daemon).

Both DHCP and TFTP services run on the LCLSDEV host 'dhcp3', maintained by SCCS.

(a) Dynamic Host Configuration Protocol (DHCP)

PXE uses DHCP to deliver initial network configuration options to client nodes.
The DHCP server supplies the PXE boot plug-in with:
(i) IP address
(ii) TFTP server address
(iii) Name of the stage 1 boot-loader image to download and execute.

As the supplied PXE installation environments are non-interactive and will unconditionally reinstall a client machine,
we have the client associate its MAC address with a specific OS installation before starting the PXE boot.

In our case, the configuration information includes, in addition to the IP/MAC address, a hostname and a pointer to the master startup script in afs for our IOC.
It also has an optional root-path variable pointing to the afs area that hosts the boot image served via TFTP.
This can be over-ridden as will be seen later.

When the Linux server is rebooted or power-cycled, PXE will attempt the network booting method first
and as a first step it will contact the DHCP server to retrieve the network configuration information.

Hence, every new linuxRT ioc (host) needs to be added to the DHCP server configuration file in afs.

This file is in /afs/slac/service/dhcp-pxe/dhcpd.conf

Note that the DHCP service running on dhcp3 is intended only for booting embedded devices like our linuxRT servers that are connected to the LCLSDEV and SSRL subnets.

The MAC addresses for such devices must be registered in CANDO and assigned fixed IP addresses.

The IP/MAC address of the primary ethernet interface that will fetch the linuxRT boot image is defined here.
To add a new host to the DHCP configuration, contact Thuy.

After a new ioc is added to dhcpd.conf, the DHCP service must be restarted.

To restart DHCP Server

From a Unix command line:

remctl dhcp3 dhcp check

remctl dhcp3 dhcp restart

For help:

remctl dhcp3 dhcp help

Currently only Thuy, Ernest and a couple of others have permissions to perform this restart on dhcp3.

Here is an example, ioc-b34-bd32:

host ioc-b34-bd32 {
    # SuperMicro (INTELx86)
    #
    hardware ethernet 00:25:90:D1:95:1E;
    fixed-address 134.79.218.190;
    option host-name "ioc-b34-bd32";
    if ( substring( option vendor-class-identifier, 0, 5 ) = "udhcp" ) {
        filename "/afs/slac/g/lcls/epics/iocCommon/ioc-b34-bd32/startup.cmd";
        option root-path "afsnfs1:/afs/slac:/afs/slac";
    }
}
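
A new entry follows the same pattern as the one above, so a draft can be generated and handed to the DHCP maintainer for review. The sketch below is illustrative only: the function name make_dhcp_host_entry, the hostname ioc-b34-xx01, and the MAC and IP values are made up, and the startup.cmd path and root-path are simply copied from the ioc-b34-bd32 example on the assumption that a new IOC follows the same layout. It only prints text and does not touch dhcpd.conf.

```shell
#!/bin/sh
# Hypothetical helper: print a dhcpd.conf host stanza modelled on the
# ioc-b34-bd32 example. Arguments: hostname, MAC address, fixed IP address.
make_dhcp_host_entry() {
    host=$1
    mac=$2
    ip=$3
    # The filename and root-path lines assume the same afs layout as the
    # existing example; adjust them if your IOC differs.
    cat <<EOF
host $host {
    hardware ethernet $mac;
    fixed-address $ip;
    option host-name "$host";
    if ( substring( option vendor-class-identifier, 0, 5 ) = "udhcp" ) {
        filename "/afs/slac/g/lcls/epics/iocCommon/$host/startup.cmd";
        option root-path "afsnfs1:/afs/slac:/afs/slac";
    }
}
EOF
}

# Example call with made-up values (not a real host).
make_dhcp_host_entry ioc-b34-xx01 00:11:22:33:44:55 192.0.2.10
```

The real MAC address still has to be registered in CANDO with a fixed IP address before the entry is of any use.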

To find out more about how our linux server boots up linuxRT, click on the following link:

 How does the Linux Server boot up linuxRT?

(b) Trivial File Transfer Protocol (TFTP)

PXE uses TFTP, a simple UDP-based protocol for delivering files over a network.
PXE delivers kernels and initial bootstrap software to client nodes using TFTP.

On the server dhcp3, which runs both the TFTP and DHCP services, /tftpboot points to an afs area in LCLSDEV:

 /tftpboot -> /afs/slac.stanford.edu/service/dhcp-pxe/tftpboot/

The iocs retrieve the linuxRT boot image from the TFTP server from the following location:

/afs/slac/g/lcls/tftpboot/linuxRT/boot

This location holds several linuxRT-x86 boot images.
They were custom-built by T. Straumann for the various Linux servers/IPCs that we currently have set up to boot with the linuxRT OS.

Of these images, '3.14.12-rt9' is the latest, and it has built-in support for the
Broadcom ethernet chipsets used in our dev Dell PowerEdge servers.

(c) Network File System (NFS)

The NFS service is used by the installation kernel to read all of the packages necessary for the installation process.

NFS services run on the LCLSDEV hosts afsnfs1 and afsnfs2, maintained by SCCS.

This service makes the boot directory available to all linuxRT targets that boot as diskless clients.

All clients have read-only permissions to this directory.

The linuxRT iocs need additional NFS mount points where they can write their data.

surrey04b is an NFS appliance used by the iocs for data; the iocs have both read and write permissions to the data directory ($IOC_DATA).

In LCLSDEV, we must obtain permissions for the iocs to write to the $IOC_DATA directory.

Fill out an online form provided by SCCS to obtain permissions for your ioc to write to this directory:

https://www-rt.slac.stanford.edu/rt3/SelfService/Forms/IocNfs.html  


For a more up-to-date and complete overview of the linuxRT system in the ICD development area, refer to Alisha's block diagram below:

linuxRT System Overview in LCLSDEV subnet

8. How do I start my IOC? Where is my ioc's startup.cmd?

There are a few scripts that automate this process.

To begin with, there is the 'ipxe.ini' script in the tftp boot area /afs/slac/g/lcls/tftpboot/linuxRT/boot that PXE will run.

This is where the version of (linuxRT) kernel to run is specified as follows:

set vers 3.2.13-121108

This version number can be over-ridden by a chained, host-specific pxe init script to load an image different from the above:

chain ${hostname}.ipxe ||

For example, we have defined a script specifically for our ioc, ioc-b34-bd32.ipxe, which loads the latest linuxRT image:

set vers 3.14.12-rt9

This is also the place to over-ride the 'root-path' option specified in the DHCP configuration file dhcpd.conf.

For example, I may decide to over-ride the afsnfs1 server and instead choose to get my boot image from afsnfs2 server:

set extra-args ROOTPATH=afsnfs2:/afs/slac:/afs/slac BOOTFILE=/afs/slac/g/lcls/epics/iocCommon/ioc-b34-bd32/startup.cmd

A few more extra arguments are specified in ioc-b34-bd32.ipxe. Leave them as they are.
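
Because extra-args ends up on the kernel command line, a booted IOC can be asked which ROOTPATH and BOOTFILE it actually received. The helper below is a minimal sketch and not part of the SLAC scripts: get_bootarg is a hypothetical name, and it assumes the arguments appear as plain KEY=value words, as they do in the 'set extra-args' line above. With no second argument it reads /proc/cmdline, where Linux exposes the command line it was booted with.

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY from a kernel command line.
# Usage: get_bootarg KEY [cmdline-string]
# Returns 0 and prints the value if KEY=value is present, 1 otherwise.
get_bootarg() {
    key=$1
    # Fall back to the running kernel's command line if none is given.
    cmdline=${2-$(cat /proc/cmdline)}
    for word in $cmdline; do
        case $word in
            "$key"=*)
                # Strip everything up to and including the first '='.
                printf '%s\n' "${word#*=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Example using the arguments shown in this FAQ instead of /proc/cmdline.
args='debug idle=halt ROOTPATH=afsnfs2:/afs/slac:/afs/slac BOOTFILE=/afs/slac/g/lcls/epics/iocCommon/ioc-b34-bd32/startup.cmd'
get_bootarg ROOTPATH "$args"   # prints afsnfs2:/afs/slac:/afs/slac
```

On a running IOC, 'get_bootarg BOOTFILE' with no second argument would report the startup script the kernel was told to use.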

The 'ipxe.ini' script loads the linuxRT kernel via the TFTP protocol:

kernel --name linux tftp://${next-server}/linuxRT/boot/${vers}/bzImage && initrd tftp://${next-server}/linuxRT/boot/${vers}/rootfs.ext2 || shell
imgargs linux debug idle=halt root=/dev/ram0 console=ttyS0,115200 BOOTIF_MAC=${net0/mac:hex} ${extra-args} || boot || shell
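
To see which files a given 'vers' setting resolves to, the substitution that iPXE performs on the kernel and initrd lines can be mimicked in a shell. This is only an illustration of the variable expansion, not something iPXE or the boot server runs: linuxrt_boot_urls is a hypothetical name, and tftp.example.org is a placeholder for the next-server value that the client receives via DHCP.

```shell
#!/bin/sh
# Illustrative only: build the two TFTP URLs that the ipxe.ini lines above
# request, given a next-server address and a linuxRT version string.
linuxrt_boot_urls() {
    next_server=$1
    vers=$2
    base="tftp://$next_server/linuxRT/boot/$vers"
    printf '%s\n' "$base/bzImage" "$base/rootfs.ext2"
}

# Placeholder server name; the version is the one set in ioc-b34-bd32.ipxe.
linuxrt_boot_urls tftp.example.org 3.14.12-rt9
# prints:
#   tftp://tftp.example.org/linuxRT/boot/3.14.12-rt9/bzImage
#   tftp://tftp.example.org/linuxRT/boot/3.14.12-rt9/rootfs.ext2
```

Changing 'vers' in a host-specific .ipxe script therefore only changes the directory under linuxRT/boot from which bzImage and rootfs.ext2 are fetched.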

After the linuxRT boot image is downloaded to the target and linuxRT starts to run, additional nfs mounts are performed.

The afs-to-nfs translator service makes the directory structure available to all clients that have mounted this nfs space.
The NFS file servers for LCLSDEV are afsnfs1 and afsnfs2.

One of the arguments to the kernel process is the location of the BOOTFILE that does the mounting.

The 'filename' argument (which can be over-ridden by the BOOTFILE argument for linuxRT) is as follows:

"/afs/slac/g/lcls/epics/iocCommon/ioc-b34-bd32/startup.cmd"

This script is similar to and modelled after the RTEMS startup.cmd.

When linuxRT loads and starts, the kernel process runs as the "root" user.
Hence it has permission to set up the nfs mounts, which is done by the following line in startup.cmd:

/afs/slac/g/lcls/epics/iocCommon/All/Dev/linuxRT_nfs.cmd

Additional NFS Mount Points for linuxRT pertaining to the ioc data directory $IOC_DATA are mounted as well.

The next line in the startup.cmd file loads the linuxRT kernel modules.

This, too, can be done only by the "root" user:

/afs/slac/g/lcls/epics/iocCommon/ioc-b34-bd32/kernel-modules.cmd

More on kernel modules under question (10).

Next we must start the caRepeater process:

(http://www.aps.anl.gov/epics/base/R3-14/12-docs/CAref.html#Repeater)

"When several client processes (virtual iocs) run on the same host it is not possible for all of them to directly receive a copy of the server beacon messages when the beacon messages are sent to unicast addresses, or when legacy IP kernels are still in use. To avoid confusion over these restrictions a special UDP server, the CA Repeater, is automatically spawned by the CA client library when it is not found to be running. This program listens for server beacons sent to the UDP port specified in the EPICS_CA_REPEATER_PORT parameter and fans any beacons received out to any CA client program running on the same host that have registered themselves with the CA Repeater. If the CA Repeater is not already running on a workstation, then the "caRepeater" program must be in your path before using the CA client library for the first time."

So we add the following lines to startup.cmd to start a caRepeater for all EPICS VIOCs that may be hosted by this CPU:

export EPICS_CA_REPEATER_PORT=5067

su laci -c /afs/slac/g/lcls/epics/R3-14-12-4_1-0/base/base-R3-14-12-4_1-0/bin/linuxRT-x86/caRepeater
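
The quoted text notes that the CA client library spawns a repeater when none is found running, so an explicit start like the one above could be guarded by a check. The following is a sketch under the assumption that /proc/<pid>/comm is available on the target (Linux 2.6.33 and later); is_running is a hypothetical helper and is not taken from the SLAC startup scripts.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any process has the given command name.
# Reads /proc directly, so it does not depend on pgrep or ps being installed.
# Note: the kernel truncates names in /proc/<pid>/comm to 15 characters.
is_running() {
    name=$1
    for f in /proc/[0-9]*/comm; do
        [ -r "$f" ] || continue
        # A process may exit between the glob and the read; skip it if so.
        read -r comm 2>/dev/null < "$f" || continue
        [ "$comm" = "$name" ] && return 0
    done
    return 1
}

if is_running caRepeater; then
    echo "caRepeater is already running"
else
    echo "caRepeater is not running"
    # In startup.cmd this is where the 'su laci -c .../caRepeater' line would go.
fi
```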

Finally, it is possible to automatically start up one or more EPICS IOCs right here and detach them using the linux screen program.

We can start another shell such that a user called "laci" can start the IOC process instead of the "root" user:

su laci -c /afs/slac/g/lcls/epics/iocCommon/ioc-b34-bd32/startup-epics-bd32.cmd

More on startup-epics-bd32.cmd under Question (13).

Under linuxRT, the few processes that need the real-time scheduler and kernel memory-locking features can be run specifically with those features.

The screen process does not need or have RT priority.

The _MAIN_ thread in the EPICS application process that is started does not have RT priority either.

But other threads created by the _MAIN_ thread may need RT priority.
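
One generic way to see which threads of a process actually run with a real-time policy is to read the scheduling fields that Linux exposes under /proc. This sketch is not taken from Kukhee's notes; list_thread_prio is a hypothetical name, and the field positions (40 for rt_priority, 41 for policy) follow the proc(5) manual page.

```shell
#!/bin/sh
# Hypothetical helper: for each thread of PID, print
#   TID  POLICY  RT_PRIORITY  THREAD_NAME
list_thread_prio() {
    pid=$1
    for task in /proc/"$pid"/task/*; do
        [ -r "$task/stat" ] || continue
        stat=$(cat "$task/stat" 2>/dev/null) || continue
        # Drop "tid (name) " so a thread name containing spaces cannot
        # shift the remaining fields.
        rest=${stat##*\) }
        # After the cut, $1 is stat field 3 (state), so field 40
        # (rt_priority) is $38 and field 41 (policy) is $39.
        set -- $rest
        [ $# -ge 39 ] || continue
        shift 37
        rtprio=$1
        policy=$2
        case $policy in
            0) pname=SCHED_OTHER ;;
            1) pname=SCHED_FIFO ;;
            2) pname=SCHED_RR ;;
            *) pname="policy=$policy" ;;
        esac
        printf '%s %s %s %s\n' "${task##*/}" "$pname" "$rtprio" \
            "$(cat "$task/comm" 2>/dev/null)"
    done
}

# Example: inspect the threads of the current shell.
list_thread_prio $$
```

Threads reported as SCHED_FIFO or SCHED_RR with a non-zero priority are the ones running under the real-time scheduler; to inspect an IOC, pass the PID of its EPICS process instead of $$.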

Kukhee provides some information on how to find out which processes are running with real-time priorities:

Command to look up thread priorities

...

3)  Add a note to cram IOC application before CVSing, and provide a link for the cramming.

Rev 1.2: Shantha Condamoor        Date: 10-Apr-2015:  

1) added link to Alisha's linuxRT System Overview Block Diagram at end of FAQ 7.
----------------------------------------------------------------------

...