Quick Start

In general, there are two ways to run the uPSP processing applications:

  • Processing one test condition: Useful for initial checkout of raw data from a wind tunnel test setup. The user manually runs each step in the pipeline.

  • Batch processing multiple test conditions: Useful after initial checkout. A framework auto-generates scripts that will invoke each pipeline step for each wind tunnel test condition. The user controls the pipeline via top-level “launcher” scripts.

Producing surface pressure time history for one test condition

The following instructions allow a user to process a single test condition with the uPSP software. The steps use bracketed shorthand like <name> to refer to the paths of input files on disk (for more details concerning file formats, see File Formats).

  1. Acquire the input files referenced by bracketed shorthand in the steps below: <video-file>, <tgts-file>, <grid-file>, <wtd-file>, <cal-file> (located in <cal-dir>), <steady-file>, and <paint-file>.

  2. Create the processing configuration files referenced in the steps below: the external calibration configuration file <excal-cfg-file> (used in step 4) and the psp_process input file <inp-file> (used in step 5).

  3. Run upsp-extract-frames to extract the first frame from the camera video file as a PNG.

    #!/bin/bash
    upsp-extract-frames \
      -input=<video-file> \
      -output=first-frame.png \
      -start=1 \
      -count=1
    

    Run upsp-extract-frames -h for more usage details.

    The output file first-frame.00001.png contains the first frame from the input <video-file>, scaled to 8-bit depth.
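
    As an optional sanity check, you can confirm the frame was written and inspect its reported type and dimensions; the snippet below is a minimal sketch that assumes the standard ls and file utilities are available.

    #!/bin/bash
    # Optional sanity check (assumes the standard `ls` and `file` utilities).
    # Confirms the extracted frame exists and reports its size, type, and
    # dimensions.
    ls -lh first-frame.00001.png
    file first-frame.00001.png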

  4. Run upsp-external-calibration to compute the external camera calibration relative to the model position in the first frame (this step accounts for “wind-on” deflection of the model position that is not accounted for in the time-averaged model position reported by the wind tunnel data systems):

    #!/bin/bash
    upsp-external-calibration \
      --tgts <tgts-file> \
      --grd <grid-file> \
      --wtd <wtd-file> \
      --cfg <excal-cfg-file> \
      --cal_dir <cal-dir> \
      --out_dir . \
      --img first-frame.00001.png
    

    <cal-dir> refers to the name of the directory containing <cal-file>. Run upsp-external-calibration -h for more usage details.

    The output file cam01-to-model.json contains the external calibration parameters that will be fed to psp_process in the next step.
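
    As an optional check before the next step, you can verify that the calibration file was written and parses as valid JSON; the sketch below assumes a Python interpreter is available (Python is also used in the post-processing example later in this guide).

    #!/bin/bash
    # Optional check (assumes a Python interpreter is available): verify the
    # external calibration output exists and is well-formed JSON.
    python -m json.tool cam01-to-model.json > /dev/null \
      && echo "cam01-to-model.json is valid JSON" \
      || echo "cam01-to-model.json is missing or malformed"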

  5. Run psp_process to produce a (usually quite large) time history of the surface pressure at each point on the model grid. This can take significant time and memory if run on a personal computer; we recommend instead that the application be run in parallel on a compute cluster. The application is best run so that one MPI rank runs on each compute node, and then thread-level parallelism is leveraged by each MPI rank to scale across cores on a given node.

    An example PBS job script for running psp_process on the Pleiades cluster at the NASA Advanced Supercomputing (NAS) Division is shown below. For more details about PBS syntax and uPSP-specific rules of thumb for sizing the job allocation, see Guidelines for setting NAS PBS job parameters.

    #!/bin/bash
    #PBS -q normal
    #PBS -l select=40:model=ivy
    #PBS -l walltime=00:20:00
    export MPI_DSM_DISTRIBUTE=off
    export OMP_STACKSIZE=250M
    export OMP_NUM_THREADS=16
    source /usr/local/lib/global.profile
    module purge
    module load mpi-hpe/mpt.2.25
    mpiexec psp_process \
      -input_file=<inp-file> \
      -h5_out=<out-dir>/output.h5 \
      -paint_cal=<paint-file> \
      -steady_p3d=<steady-file>
    

    <out-dir> refers to the value of the @output/dir variable specified in the <inp-file>. Run psp_process -h for more usage details.

    The output pressure_transpose file contains the surface pressure time history for each node on the model grid (see Pressure-time history raw binary outputs). Several diagnostic images are also written so the user can verify that the external calibration and (optional) fiducial patches align well with the position of the model in the first video frame.
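
    A quick way to confirm the run completed is to check that the outputs referenced in this guide (output.h5, the X coordinate dump, and the pressure_transpose flat file) exist under <out-dir>; the snippet below is an optional sketch.

    #!/bin/bash
    # Optional check that the psp_process outputs referenced in this guide exist.
    # Replace <out-dir> with the @output/dir value from the <inp-file>.
    output_dir=<out-dir>
    for f in output.h5 X; do
      [ -e "$output_dir/$f" ] && echo "found: $output_dir/$f" || echo "MISSING: $output_dir/$f"
    done
    find "$output_dir" -type f -name 'pressure_transpose' -exec ls -lh {} \;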

  6. (Optional) post-processing steps

    • Run add_field to add the pressure_transpose data into the HDF5 file produced by psp_process. For some of its command-line arguments, add_field must be provided with the number of vertices in the 3D model and the number of frames that were processed. The example Bash script below shows how to obtain these values by inspecting files output by psp_process. <out-dir> should be replaced with the same directory as in the previous step (the directory containing outputs from psp_process).

      #!/bin/bash
      
      # Set output_dir to the folder containing outputs from `psp_process`
      # in previous step.
      output_dir=<out-dir>
      trans_h5_file=$output_dir/output.h5
      
      # Inspect the 'X' file, which is a flat binary dump of the
      # X-coordinates of the input wind tunnel model grid vertices.
      # The number of coordinates in the file gives the size of the model.
      data_item_size=4 # coordinates stored as 4-byte floats
      model_size="$(expr $(stat --printf="%s" $output_dir/X) '/' $data_item_size)"
      trans_flat_file="$(find $output_dir -type f -name 'pressure_transpose')"
      trans_flat_file_size="$(stat --printf="%s" $trans_flat_file)"
      trans_flat_file_number_frames="$(expr $trans_flat_file_size '/' '(' $model_size '*' $data_item_size ')')"
      
      echo ">>> number of model nodes: $model_size"
      echo ">>> data item size: $data_item_size"
      echo ">>> time history flat file: '$trans_flat_file'"
      echo ">>> time history flat file size: $trans_flat_file_size"
      echo ">>> time history flat file number of frames: $trans_flat_file_number_frames"
      echo ">>> adding flat-file data to '$trans_h5_file'"
      
      ex="add_field $trans_h5_file frames $trans_flat_file $trans_flat_file_number_frames"
      echo ">>> running: '$ex'"
      t_start="$(date +%s.%N)"
      echo ">>> started at: $(date)"
      $ex
      t_end="$(date +%s.%N)"
      echo ">>> elapsed time: $(python -c "print('%4.1f' % ($t_end - $t_start))") seconds"
      
      echo ">>> run-add-field DONE."
      

Batch processing multiple test conditions

The following instructions allow a user to batch process one or more test conditions from a wind tunnel test with the uPSP software.

Batch processing is configured by upsp-make-processing-tree, a tool that auto-generates a file tree and associated command-line scripts that the user can then run to execute each step in the uPSP pipeline for one or more datapoints. The configuration process is illustrated in Fig. 2 and consists of the following steps:

  1. The user locates raw data files from a wind tunnel test on disk

  2. The user prepares four JavaScript Object Notation (JSON) configuration files:

    • A datapoint index, listing the path to each raw input file for each datapoint (a purely illustrative sketch follows this list)

      • This often consists of writing test-specific scripts/tools to locate and parse each input file on disk

    • A processing parameters file, containing parameter settings for each step in the pipeline

    • A PBS job parameters file, containing PBS scheduler settings (group ID, reservation wall time, number of nodes, etc.)

    • A plotting parameters file, containing parameters for plotting steps in the pipeline

  3. The user runs upsp-make-processing-tree and provides it with each configuration file. The script will autogenerate a file tree on disk to store all artifacts for batch processing
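
Purely as an illustration of the kind of content a datapoint index holds, the sketch below writes a hypothetical two-datapoint index. The field names and layout are assumptions for illustration only, not the actual schema; consult the upsp-make-processing-tree documentation and its -h output for the real format.

    #!/bin/bash
    # HYPOTHETICAL illustration only: the real datapoint-index schema is defined
    # by upsp-make-processing-tree and test-specific tooling; the field names
    # below are assumptions, not the actual format.
    printf '%s\n' \
      '{' \
      '  "3001": { "video-file": "/path/to/<video-file>", "wtd-file": "/path/to/<wtd-file>" },' \
      '  "3002": { "video-file": "/path/to/<video-file>", "wtd-file": "/path/to/<wtd-file>" }' \
      '}' > datapoint-index.json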

Once the processing tree is generated and saved to disk, the user can navigate to the 03_launchers subfolder and trigger each step in the pipeline as follows:

  1. Each step in the pipeline is launched using a script named step+<step-name>[+<subtask-name>].

    • They should be run in the order given here (some steps use outputs from previous steps):

      1. step+extract-first-frame: extract the first frame from each camera video file.

      2. step+external-calibration: run the wind-on, first-frame external calibration for each camera.

      3. step+psp_process+psp-process: run psp_process (image-to-grid projection and calibration to units of pressure).

      4. step+psp_process+add-field: post-process psp_process outputs; add largest pressure-time history dataset into the HDF5 output file.

    • Each step launcher script can be invoked as follows:

      • ./<step-launcher-script> <datapoint-id-1> <datapoint-id-2> ... to process a specific subset of datapoints (if no datapoint IDs are given, all datapoints are processed).

      • ./qsub-step <step-launcher-script> <datapoint-id-1> <datapoint-id-2> ... to launch the step on the cluster as one or more jobs (uses qsub). The jobs can then be monitored using qstat -u $USER. The job reservations are configured using the parameters supplied in the PBS job parameters JSON file. A combined example is shown after this list.

  2. Once complete, data products for each datapoint will be available under 04_products/00_data/<step-name>/<datapoint-id>.
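
For example, a full pass over two datapoints might look like the following (the datapoint IDs 3001 and 3002 and the <processing-tree-root> path are placeholders). Each step depends on outputs of the previous one, so wait for each set of jobs to finish (monitor with qstat -u $USER) before launching the next step.

    #!/bin/bash
    # Example launcher sequence; datapoint IDs (3001, 3002) and the processing
    # tree root are placeholders. Wait for each step's jobs to complete before
    # launching the next step, since later steps consume earlier outputs.
    cd <processing-tree-root>/03_launchers
    ./qsub-step step+extract-first-frame 3001 3002
    ./qsub-step step+external-calibration 3001 3002
    ./qsub-step step+psp_process+psp-process 3001 3002
    ./qsub-step step+psp_process+add-field 3001 3002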

The JSON file format was chosen for batch processing configuration files due to its ubiquitous usage in industry and broad cross-platform/cross-language support. Users should be familiar with plain-text editing of JSON files and can reference the official JSON syntax at json.org.
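
Because a malformed configuration file is a common source of errors, one simple pre-check is to validate each JSON file before running upsp-make-processing-tree; the sketch below assumes a Python interpreter is available and uses placeholder file names.

    #!/bin/bash
    # Optional pre-check: confirm each batch-processing configuration file is
    # valid JSON (assumes a Python interpreter; file names are placeholders).
    for f in datapoint-index.json processing-params.json pbs-params.json plotting-params.json; do
      python -m json.tool "$f" > /dev/null && echo "OK: $f" || echo "INVALID: $f"
    done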


Fig. 2 uPSP NAS batch processing flowchart.

Guidelines for setting NAS PBS job parameters

For complete details and tutorials, see the HECC wiki, “Running Jobs with PBS”.

Specific to the current implementation of the uPSP processing code, the following provides the rationale behind practical “rules of thumb” for scaling the size of the PBS job to your input data size. Some trial and error may be required to refine these parameters after an initial best guess.

Given the following variable definitions:

  • NR: Number of MPI ranks

  • NC: Number of cameras

  • FR: Camera resolution (pixels)

  • NM: Number of 3D model grid nodes

  • NT: Number of time samples

Then, in short, the rule of thumb is to ensure each MPI rank has access to at least MT = MC + M1 + M2 bytes of local memory, where

  • MC = O(KC × NT × NC × FR / NR) accounts for storage of camera frames

  • M1 = O(K1 × NT × NM / NR) accounts for storage of the 3D-projected data (pixel counts)

  • M2 = O(K2 × NT × NM / NR) accounts for storage of the calibrated, 3D-projected data (physical pressure units)

and KC=2, K1=8, and K2=8 are reasonable constant “fudge factors” accounting for variability in camera bit depth and intermediate storage of elements of the solution in memory.
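
As a minimal sketch (not part of the uPSP tools), the rule of thumb can be evaluated with Bash integer arithmetic; the variable names mirror the definitions above, and the example values are taken from the NASA case described next.

    #!/bin/bash
    # Sketch of the memory rule of thumb. The example values below are drawn
    # from the NASA case described in the text; substitute your own test sizes.
    NT=60000       # number of time samples (frames per camera)
    NC=4           # number of cameras
    FR=1000000     # camera resolution (pixels per frame)
    NM=1000000     # number of 3D model grid nodes
    NR=40          # number of MPI ranks (one per compute node)
    KC=2; K1=8; K2=8   # constant "fudge factors"

    MC=$(( KC * NT * NC * FR / NR ))  # camera frame storage per rank
    M1=$(( K1 * NT * NM / NR ))       # 3D-projected data (pixel counts) per rank
    M2=$(( K2 * NT * NM / NR ))       # calibrated pressure data per rank
    MT=$(( MC + M1 + M2 ))            # total local memory needed per rank

    echo "approximate bytes of local memory needed per MPI rank: $MT"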

A practical application of this rule of thumb is as follows:

  • At NASA, for one example test condition, we collected 60K frames from each of 4 cameras, at approximately 1 MP resolution per camera

  • The wind tunnel 3D model had approximately 1M vertices

  • So:

    • MC ≈ 2 × 60K × 4 × 1M / NR ≈ 5E11 / NR

    • M1 ≈ 8 × 60K × 1M / NR ≈ 5E11 / NR

    • M2 ≈ 8 × 60K × 1M / NR ≈ 5E11 / NR

    • MT ≈ 1.5E12 / NR (bytes)

  • We can use 40 MPI ranks, one per compute node, so the memory requirement per compute node is approximately 1.5E12 / 40 = 3.75E10 bytes, or 37.5GB. The NAS Ivy nodes each have 64GB of memory available, so we can fit our job into a PBS session with 40 Ivy nodes and 40 MPI ranks. From practical experience, we know this job takes less than 10 minutes of wall clock time, so we can use the following PBS directive to allocate the job:

    #PBS -l select=40:model=ivy
    #PBS -l walltime=00:10:00
    

Rationale for the rule of thumb is based on complexity analysis of the current algorithm implementation, described in more detail in Software Design.