...

For data files checked in with the package, however, the general problem is where to put them and how to find them at run time. Probably the best practice is to use the package's data subdirectory. This directory is intended for application data shipped with the package, so you may want to make a subdirectory under it for test files. The advantage of the data directory is that it is wired into the release: if you know your package name, you can use an environment variable to find the data directory. For example, if we add the directories

  • MyPkg/data/testdata

...

and then create the file

  • MyPkg/data/testdata/mytestfile.txt

that starts with the string "my", then we can write a unit test that uses a psana utility to find the file and check that it starts with "my". The psana utility is the Python class AppDataPath in the AppUtils package (there is a C++ version there as well). The unit test would look like

Code Block
languagepython
    def testMyFile(self):
        import os
        from AppUtils.AppDataPath import AppDataPath
        # path of the test file relative to the release data directory
        testFileDataRelPath = os.path.join('MyPkg', 'testdata', 'mytestfile.txt')
        # look the file up; the assert below relies on path() being empty when nothing is found
        testFilePath = AppDataPath(testFileDataRelPath).path()
        assert len(testFilePath) > 0, "test file (relative to release data dir): %s not found." % testFileDataRelPath
        with open(testFilePath, 'r') as testFile:
            fileText = testFile.read()
        self.assertTrue(fileText.startswith("my"),
                        msg="Test file=%s doesn't start with my" % testFilePath)

It may be worth understanding the mechanism by which AppDataPath works. At run time, SconsTools will create two directories:

  • unitTestTutorial/data               # release data dir
  • unitTestTutorial/data/MyPkg    # soft link to unitTestTutorial/MyPkg/data

Moreover, when sit_setup is run, it sets the environment variable SIT_DATA. SIT_DATA is a colon-separated list of paths: the first is the absolute path to unitTestTutorial/data, and the second is the absolute path to the data directory of the base release. AppDataPath goes through these paths in order and returns the first match. Note that scons test only runs tests for packages that are part of the working release; it does not run tests for packages that are only part of the base release. Given this, there is no reason for a unit test to search the base release, but doing so should normally do no harm either. The one case where harm could arise is a developer modifying a test that is checked into an existing package in the base release. If the developer renames the test file in the working/test release but does not update the unit test code to use the new name, AppDataPath will still find the old test file in the base release data directory.
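
To make the search order concrete, here is a minimal sketch of a first-match lookup over a colon-separated list of data directories. It is only an illustration under stated assumptions: find_in_data_path and build_demo_release are hypothetical helpers written for this page, the baseRelease directory is made up, and none of this is the AppDataPath source. The demo builds the directory layout described above in a temporary directory, including the soft link.

```python
import os
import tempfile


def find_in_data_path(rel_path, sit_data):
    """Return the first existing <data dir>/rel_path, or '' if there is none.

    Hypothetical stand-in for the lookup described in the text; this is
    not the AppDataPath implementation.
    """
    for data_dir in sit_data.split(':'):
        candidate = os.path.join(data_dir, rel_path)
        if data_dir and os.path.exists(candidate):
            return candidate
    return ''


def build_demo_release(root):
    """Create a throwaway release layout and return a SIT_DATA-style string."""
    pkg_data = os.path.join(root, 'unitTestTutorial', 'MyPkg', 'data')
    release_data = os.path.join(root, 'unitTestTutorial', 'data')
    base_data = os.path.join(root, 'baseRelease', 'data')  # made-up base release
    os.makedirs(os.path.join(pkg_data, 'testdata'))
    os.makedirs(release_data)
    os.makedirs(base_data)
    with open(os.path.join(pkg_data, 'testdata', 'mytestfile.txt'), 'w') as f:
        f.write('my test data\n')
    # soft link: unitTestTutorial/data/MyPkg -> unitTestTutorial/MyPkg/data
    os.symlink(pkg_data, os.path.join(release_data, 'MyPkg'))
    # working release data dir first, base release data dir second
    return release_data + ':' + base_data


demo_root = tempfile.mkdtemp()
demo_sit_data = build_demo_release(demo_root)
rel_path = os.path.join('MyPkg', 'testdata', 'mytestfile.txt')
found = find_in_data_path(rel_path, demo_sit_data)
missing = find_in_data_path(os.path.join('MyPkg', 'testdata', 'nosuchfile.txt'),
                            demo_sit_data)
```

Here found points into the working release data directory (through the soft link), and missing is the empty string because neither directory in the list contains that file.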

External Test Data Location

A directory for test data has been set up here:

/reg/g/psdm/data_test

that was created expressly for the purpose of storing test data for the analysis releases. Presently it holds xtc files and some calibration constant files. We do not want to copy entire xtc files from the experiments into this location, as they are too big; we need to select the parts of an xtc file that are necessary for testing. The current organization of the data_test directory is

...

This section covers different methods to make small xtc test files. Presently the largest xtc test file in data_test is about 1 GB, which is bigger than it needs to be. I think we should be able to keep test files down to 20-100 MB; smaller files mean faster unit tests as well.
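
One way to keep an eye on the size target is to scan a directory tree for xtc files above a limit. The following is a small sketch, not an existing psana_test utility: oversized_files is a hypothetical helper, and the 100 MB default simply mirrors the upper end of the range suggested above.

```python
import os


def oversized_files(top_dir, limit_mb=100, suffix='.xtc'):
    """Return (size_in_mb, path) pairs, largest first, for files over limit_mb.

    Hypothetical helper for illustration; only files whose names end with
    `suffix` are considered.
    """
    limit_bytes = limit_mb * 1024 * 1024
    found = []
    for dir_path, _dir_names, file_names in os.walk(top_dir):
        for name in file_names:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dir_path, name)
            size = os.path.getsize(path)
            if size > limit_bytes:
                found.append((size / (1024.0 * 1024.0), path))
    return sorted(found, reverse=True)
```

Calling oversized_files('/reg/g/psdm/data_test') would then list the test files that are candidates for trimming.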

Using psana_test

Psana_test includes a library of Python code with a function to copy out a few datagrams from each xtc file for a run. An example of use is

Code Block
languagepython
import psana_test.psanaTestLib as ptl
ptl.copyToMultiTestDir('cxie9214',63,1,2,'/reg/g/psdm/data_test/multifile/test_012_cxie9214')

For experiment cxie9214, run 63, this takes 1 calib cycle and copies the first 2 events from that calib cycle (for each stream) into xtc files with the same names in the given directory. In addition, an 'index' subdirectory is made and index files are written there.
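
After such a copy it can be useful to confirm what ended up in the destination directory. The sketch below only lists files with the standard library; summarize_test_dir is a hypothetical helper, not part of psana_test, and it makes no assumption about how the index files are named.

```python
import os


def summarize_test_dir(test_dir):
    """Return (xtc_files, index_files) found in test_dir and test_dir/index.

    Hypothetical helper for illustration; index_files is empty when there
    is no 'index' subdirectory.
    """
    xtc_files = sorted(name for name in os.listdir(test_dir)
                       if name.endswith('.xtc'))
    index_dir = os.path.join(test_dir, 'index')
    if os.path.isdir(index_dir):
        index_files = sorted(os.listdir(index_dir))
    else:
        index_files = []
    return xtc_files, index_files
```

For the example above, one would call summarize_test_dir('/reg/g/psdm/data_test/multifile/test_012_cxie9214') and expect one xtc file per stream plus the contents of the index subdirectory.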

Using xtclinedump

Before that function was written, I would do things by hand. For the testing that I have done, I typically want to run psana on a few datagrams in an xtc file to test how it parses a new type or handles some damaged data. Suppose we don't have unit tests to see how psana handles Epix100aConfig version 1 and EpixElement version 2, and we have identified an experiment xtc file that contains these types, namely

/reg/d/psdm/xcs/xcsi0314/xtc/e524-r0213-s03-c00.xtc

...