Version History

Version | Date | Description of Changes
v0.0 | 4/??/24 | Initial draft work by Marcio and Jeremy
v0.1 | 5/2/24 | Applied changes from last meeting. Made note of "interface" GitHub organizations, removed lcls-daq from the organization list, noted down additional discussion points about team setup.
v0.2 | 5/3/24 | Added section about READMEs and GitHub Pages. Added note that the "Other aspects to consider" section mostly notes capabilities that we probably want to apply in a more localized way (we don't want to enforce a certain branching workflow for the whole lab!).
v0.3 | 5/3/24 | Added section on security, for discussion in next meeting.

Definitions

Term | Description
Organization | A location on GitHub where many repositories and teams can be stored. Translates into a URL when browsing or cloning a repository.
Team | A group of users having common access permissions to one or more repositories. Can be hierarchical.
Working Copy | A clone of a Git repository that you can edit and compile.
Repository | A location where Git history and code are stored. These are added as "remotes" on a local working copy.
GHE | GitHub Enterprise
HLA | High-Level Applications
Fork | A copy of a repository from one organization to another. For example, github.com/slac-epics/asyn would be a fork of github.com/epics-modules/asyn. GitHub keeps track of forks so the upstream code has the link clearly visible.
Upstream | Repository that is the original basis of a fork.
NDA | Non-disclosure Agreement

GitHub Organizations

GitHub Enterprise (GHE) can support multiple organizations, each having any number of repositories and any number of teams, which provide different roles with distinct privileges within those repositories. We would like to maintain a relatively small number of GHE organizations at SLAC, and offer some guidelines for naming those organizations, teams and other related aspects of GHE.

slaclab

slaclab is the organization that is the default starting point for all repositories maintained by SLAC. When thinking about the right place to create a new repository, the first answer is slaclab.

URL: https://github.com/slaclab

User access to this organization will probably be handled by SSO with the GitHub Enterprise account, meaning that if you already have a SLAC account, your access should be allowed automatically. Individuals leaving SLAC will have their accounts removed from GHE teams and will lose access to the SLAC repositories that were formerly available to them. Certain employees maintain a working status at SLAC after their departure, with their login accounts remaining active. Their status within specific GHE teams would be decided by the remaining team members based on the individual's working agreement.

Questions to be answered:

  1. What happens when a person leaves SLAC? Remove from Teams? Relinquish administrative privileges to repositories? (Doug: added suggestions above) Should this entire section be moved up, since it potentially applies to all GHE orgs and teams?
  2. What happens if the person leaving still contributes with code? (Doug: added suggestions above)
  3. How do we address contributions from people outside SLAC?

slac-sandbox

slac-sandbox is an organization for hosting short-lived repositories, such as prototypes, code practice, and proofs of concept. A repository in slac-sandbox can be transferred to slaclab later if that makes sense.

We discourage people from creating sandbox repos related to work at SLAC in their personal GitHub accounts because the repositories could have value for SLAC in the future, even after the author leaves SLAC.

Creating new organizations

To keep the number of organizations small, new ones will be created according to the following guidelines:

  • When simple repository names (gateway, bsa, base) collide and cannot be changed, for example because the repo is forked from an external open-source repo with an existing name.
  • When managing 100+ related repositories within slaclab gets burdensome.
  • When collaborating with external developers on multi-repo projects. For example, the EPICS archiver appliance.
  • When private repositories need to be self-managed with limited visibility to organization administrators, such as projects requiring a Non-Disclosure Agreement (NDA).

Examples of organizations outside slaclab

This is just a sample of organizations that exist at the time of writing. This list is not intended to track all recognized SLAC organizations and will not be updated in the future.

  • slac-epics

    • https://github.com/slac-epics
    • This is a separate organization due to the high number of related repos with a greater chance of encountering name collisions. 
    •  Examples:
      • EPICS base
      • EPICS modules
      • EPICS IOCs
      • PyDM and EDM screens
      • EPICS HLAs
      • Matlab code that uses EPICS
      • EDM
      • ALH
      • eco_tools
      • Striptool
      • Gateway
      • eget services
  • archiver-appliance

    • https://github.com/archiver-appliance
    • This organization is separate because it has a few correlated repositories that get strong collaboration from people outside SLAC. It has become a product that wouldn't fit inside slaclab as external people would get confused among the thousands of repos.

GitHub Enterprise umbrella

Organizations at SLAC that are accepted to the GitHub Enterprise umbrella will have:

  • Integration with SSO login, managed by SLAC IT (TO BE CONFIRMED).
  • Unlimited CI/CD jobs.
  • Security monitored and maintained by SLAC/Stanford IT.
  • Accountability from GitHub with priority on problem solving in case of technical problems.
  • Shared Teams among all organizations.

Concerns when using GitHub Enterprise:

  • Less flexibility when defining new organizations as this will follow general standards.
  • Less flexibility when providing repository access, as this will be managed by SLAC IT's security standards.

We understand that repositories under NDA may need to be outside GitHub Enterprise, and maybe outside GitHub altogether, because organization administrators must not be able to see their contents. Projects with the DoD are an example.

Note that if you decide to go outside GitHub Enterprise, you lose all the advantages and will be on your own regarding administration of the organization. SLAC IT may still need to be involved regarding, for example, CI/CD jobs with SLAC-hosted runners. There's a risk that IT may deny authorization and recommend that the repository be transferred to an organization in GitHub Enterprise.

FOR THE NEXT MEETING:

What it means if you split off into your own org within the enterprise instance:  

  • You will need to self-manage org-level settings, such as security (there is also an impact to SLAC IT)  
  • If the codebase is critical to SLAC, SLAC will need to create a mirror or fork of the codebase to preserve our security.  

I didn't understand these arguments.

Repository names

GitHub doesn't provide a hierarchical structure for repository names, which means there is a possibility of name collisions when working in an organization with thousands of repositories. This document will not require specific naming rules, but offers guidelines for consistent naming that everyone can follow.

  • First of all, consider whether the name you want to give a repository is too generic. Examples: base, main, test, network, gateway, support, etc. Even if one of these names happens to be available, that doesn't make it a good choice: it will sit among hundreds of others and gives no clue about what is inside the repository.
  • Consider prepending a standard word that your department agrees on to give the repository name more context. Examples:
    • ioc- or iocapp- for EPICS IOC applications.
    • For current Git packages in AFS grouped by directories, the directory name could be prepended, like in timing/bsa being converted to timing-bsa.
    • ops- for code from operators and physicists, like in Matlab scripts.
    • pydm- or edm- for repositories holding displays, like in lclshome.
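The naming guidelines above can be sketched as a small helper. This is only an illustration: the function names, the list of generic words, and the example prefix/name pair ("ioc" and "llrf") are assumptions made for the sketch, not an agreed SLAC standard.

```python
# Hypothetical helper illustrating the naming guidelines.
# GENERIC_NAMES and all function names are assumptions for this sketch.
GENERIC_NAMES = {"base", "main", "test", "network", "gateway", "support"}


def is_too_generic(name: str) -> bool:
    """Return True if the bare name gives no clue about the repository's content."""
    return name.strip().lower() in GENERIC_NAMES


def with_prefix(prefix: str, name: str) -> str:
    """Prepend a department-agreed prefix such as 'ioc' or 'pydm'."""
    return f"{prefix.strip('-').lower()}-{name.strip('-').lower()}"


def from_afs_path(path: str) -> str:
    """Convert an AFS package path like 'timing/bsa' into 'timing-bsa'."""
    parts = [p for p in path.strip("/").split("/") if p]
    return "-".join(parts).lower()


if __name__ == "__main__":
    print(is_too_generic("gateway"))     # True
    print(from_afs_path("timing/bsa"))   # timing-bsa
    print(with_prefix("ioc-", "llrf"))   # ioc-llrf
```

A check like this could run before a repository is created, but it only flags candidates; whether a name is adequate remains a judgment call for the owning team.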

GitHub Teams

This document provides only a guideline for defining Teams in GitHub. How to organize Teams is not a requirement.

GitHub supports a hierarchical structure for teams, so repositories can be associated and aligned with a team hierarchy. The only requirement is that every repository must have a responsible Team. Repositories with a single individual as admin are only allowed in the slac-sandbox organization.

  1. Hierarchical team configuration to mimic what we have at SLAC. Using the embedded group in TID as an example:

    [Figure: TID Team Grouping]

    Naming could be abbreviated by Directorate-Department, for example, excluding the division if this makes sense. Avoid department acronyms though, as they may repeat inside different directorates. Another idea for naming hierarchical teams:

    1. Prepend "org-" before the team name to make it clear this is an organizational structure.
  2. Subject matter. Examples:
    • Timing or accelerator-timing
    • EPICS
    • RTEMS
    • Beckhoff-core (maybe eventually PLC core or controls engineering core)
    • AIML-core  
    • Xray-optics (this is a mechanical engineering discipline)  
    • LSS-core
  3. Project Teams. Examples:
    • Form 1: Go by Work Breakdown Structure (in the case of 413.3b):
      • E.g. mecu-controls, mecu-laser-rrl-controls (Matter in Extreme Conditions Upgrade, rep-rated laser)
    • Form 2: Project title:
      • l2si-core
      • lcls2
      • he


The TID team grouping shown above is a very high-level example, only meant to show the hierarchy. All models can coexist because each repository can be assigned to multiple teams and individuals.

GitHub Topics

As GitHub doesn't allow the distribution of repositories in a hierarchy like file systems do, one way to ease the search is by the use of Topics. Topics are like labels that can be set in each repository. A repository can have multiple Topics.

Once this is set, if you are interested in LLRF, for example, you would search by the LLRF Topic and see only the repositories related to that Topic.

Topics span all organizations on GitHub. For example, the rtems topic page is https://github.com/topics/rtems. slaclab is one organization that shows up in the results, but there are others.

At the moment, GitHub allows searching for a Topic either within a single organization or across all of GitHub. There's no way to restrict a search to a group of organizations. To improve search results, we suggest prepending "slac-" to all our Topics, like slac-timing, slac-atca, slac-llrf, etc. This way, a broad search on GitHub returns only repositories from SLAC-related organizations.
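As a rough sketch of the "slac-" convention, the helper below normalizes a Topic name and builds a GitHub search URL using the `topic:` and `org:` search qualifiers. The function names are hypothetical, and the normalization rules (lowercase, spaces replaced by hyphens) are an assumption about how we would want Topics written.

```python
from typing import Optional
from urllib.parse import urlencode


def slac_topic(name: str) -> str:
    """Normalize a topic name and make sure it carries the 'slac-' prefix."""
    topic = name.strip().lower().replace(" ", "-")
    return topic if topic.startswith("slac-") else f"slac-{topic}"


def topic_search_url(topic: str, org: Optional[str] = None) -> str:
    """Build a repository search URL for a topic, optionally limited to one org."""
    query = f"topic:{topic}"
    if org:
        query += f" org:{org}"
    return "https://github.com/search?" + urlencode(
        {"q": query, "type": "repositories"}
    )


if __name__ == "__main__":
    # Search across all of GitHub; the prefix keeps results SLAC-specific.
    print(topic_search_url(slac_topic("LLRF")))
    # Same topic, restricted to a single organization.
    print(topic_search_url(slac_topic("LLRF"), org="slaclab"))
```

Without the prefix, the first search would return every LLRF-tagged repository on GitHub; with it, the broad search stands in for the missing "group of organizations" filter.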

GitHub Issues

Currently we have 2 ticket systems in use for software development/bug tracking: CATER and Jira. GitHub brings its own ticket system called Issues.

CATER won't go away for a long time. So, what do we do regarding Jira and GitHub issues? The use cases could be:

  1. CATER is kept as it is used today. No changes.
  2. GitHub issues for SLAC-maintained repositories that have external collaborators.
  3. GitHub issues for tickets clearly related to one single repository.
  4. Jira for tickets that cross multiple repositories or that are unrelated to work for any repository. Could GitHub Projects be used for this, instead?

Do we want to keep track of tickets in 3 different tools?

NOTE (5/1/24): Jerry K. indicated that EED is looking to move away from Jira. Other groups that have a heavier dependence on Jira may not want to move away.
Overall, this seems like a department-specific decision rather than one that can be made for the entire lab.

GitHub Pages and READMEs

If we end up creating multiple organizations for SLAC-related projects, how do we keep track of things? A potential solution is to use GitHub organization READMEs and GitHub Pages for organization-level documentation.

We could create an organization-level README on slaclab that contains links to other SLAC orgs and some basic information. We could also create a documentation page for the entire slaclab org that contains links to and information about relevant GitHub organizations and projects.

The pcdshub organization is a good example of this type of setup:

GitHub Pages simply publishes HTML. The pages can either be written manually or generated with a software package like Sphinx or Hugo.
In the above example, pcdshub.github.io is using Sphinx as the documentation generator and reStructuredText for the source files.

Licenses

In TID we've been following SLAC legal's request to add a specific LICENSE file to each repository's top directory, plus a disclaimer text in all .c, .cpp, .h, .hpp, .py, .vhd, etc. files. There's a Python script that does this automatically: https://github.com/slaclab/surf/blob/pre-release/scripts/apply_slac_license.py. As this comes from SLAC legal, I believe that this would be extended to all code available in SLAC's GitHub organizations.

The problem arises for external code that we fork into our repos. It is very common that the forked code has its own license that we can't modify. The TID directors' guidance in this case is that the repository must be made private.

I believe that we need to talk with SLAC legal again to verify more use cases.

NOTE (5/1/24): SLAC legal is probably concerned about licensed code from other sources (e.g. vxWorks), not open source software. It is probably fine to keep open source projects we fork public.

Security

Some thoughts about security:

  • Some slaclab admins do not work at SLAC anymore
    • Effort is required to keep this list up to date
  • Some slaclab admins do not have 2FA enabled
    • Enforce 2FA for everyone in slaclab, probably!
  • Fine-grained personal access tokens should be used instead of classic tokens
    • Classic tokens allow access to all repos for a user. If a classic token is compromised for a slaclab user, a threat actor could read from and write to any repository that user can access.
      • GitHub does a good job detecting compromised tokens and accounts, but still.
  • Use deploy keys for CI/CD work.
    • Alternative to personal access tokens for CI/CD. More secure because a deploy key is added per-repo and scoped only to that repo.
      • Doesn't work for private submodules...
  • Require signed commits?
    • Branch protections/rulesets allow you to require signed commits for matching refs (e.g. to prevent compromised maintainer accounts from pushing malicious code/releases).
      • This just raises the bar for an attacker: they would need both an auth token or SSH private key and the GPG private key to push a malicious release.

Training

All staff who need to write to repositories under the GitHub Enterprise umbrella will attend a training session on the general standards and recommended practices described in this document. This training is still under development. Should it be in the staff STA?

Other aspects to consider

This section covers workflow-specific guidelines for the usage of GitHub Enterprise. These are not intended to be applied lab-wide! Instead, these serve as examples of controls that may be applied on a per-department/group/division basis.

Repository naming

Should we standardize repository naming or let each team define names freely? Use cases:

If we have slac-epics and slac-epics-apps:

  • No prefixes for modules, base, or other applications.
  • Prefix IOC applications with ioc-?

Standard rulesets

Should all of SLAC follow the same workflow, with standard names for branches and standard rules for using each branch? What if different departments have conflicting requirements?

Settings > Rules > Rulesets

  • Mirror TID-ID workflows
  • Match branches:
    • main
    • pre-release
    • Release tags (in the format [0-9]+.[0-9]+.[0-9]+)
      • NOTE: GitHub uses fnmatch, not regexp for this matching.
  • Allow bypass:
    • Maintainers and repository owners
  • Restrict creations
  • Restrict updates
  • Restrict deletions
  • Require a pull request before merging
    • Require at least 1 approval before merge
  • Require status checks to pass
    • Only if CI is enabled
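The fnmatch note above matters in practice: a pattern written as a regular expression silently means something else under glob matching. The sketch below uses Python's fnmatch module as a stand-in to illustrate the difference. This is an approximation only, since GitHub's matcher may differ in details, and the glob pattern shown is one possible substitute, not a vetted ruleset value.

```python
from fnmatch import fnmatchcase

# Reads like a regex, but under glob rules "+" and "." are literal characters
# and "[0-9]" matches exactly one digit.
REGEX_STYLE = "[0-9]+.[0-9]+.[0-9]+"

# Glob approximation of "digits.digits.digits". "*" matches any characters,
# so this is looser than the regex it replaces.
GLOB_STYLE = "[0-9]*.[0-9]*.[0-9]*"


def matches(tag: str, pattern: str) -> bool:
    """Case-sensitive glob match of a tag name against a pattern."""
    return fnmatchcase(tag, pattern)


if __name__ == "__main__":
    print(matches("1.2.3", REGEX_STYLE))      # False: "+" is taken literally
    print(matches("1+.2+.3+", REGEX_STYLE))   # True: only this odd form matches
    print(matches("1.2.3", GLOB_STYLE))       # True
    print(matches("10.20.30", GLOB_STYLE))    # True
    print(matches("1.2.3-rc1", GLOB_STYLE))   # True: globs cannot say "digits only"
    print(matches("v1.2.3", GLOB_STYLE))      # False: must start with a digit
```

Because a glob cannot express "one or more digits", a ruleset built this way would also cover tags such as 1.2.3-rc1; whether that is acceptable depends on the tagging scheme a group adopts.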

Future work

The work outlined here is outside of the scope of this document and should probably be done on a per directorate/department/group basis.

Establish Standards for Locally Developed C/C++ Projects

The ipmiComm and ek9000 modules could be used as the reference implementations for these things.
These two modules should implement most or all of the recommended CI checks and whatnot, and adhere to the standards we define.

  • GitHub Actions for CI. Verify that everything compiles to avoid broken code being merged.
  • Set -Werror, minimize ignored warnings
    • Define standard set of enabled/disabled warnings for C/C++ projects
  • Some level of automated testing
    • Testing should be run with a runtime analyzer too
      • asan
      • valgrind
    • Testing may not always be possible due to the nature of EPICS
  • Static code analysis with clang-tidy
    • Define standard set of clang-tidy checks
  • Code formatting with clang-format
    • Define standard clang-format configuration (ideally one that closely matches our current, most common style)
  • pre-commit hook for code formatting
    • Only if using code formatting.
    • Mostly concerned about Python projects here, clang-format can be finicky with pre-commit.

Establish Standards for Documentation

Document Contribution Workflow on GitHub Pages

Standard Labels

  • Labels can be applied to issues and pull requests so that they can be filtered.
  • Repositories may have custom labels, but you can also define them organization-wide.
  • Depending on the workflow for certain software components, different labels may be needed.

Commit Naming Guidelines

  • Commit naming guidelines may be helpful to enforce consistent and detailed change descriptions.
  • ECS enforces their own commit guidelines. Detailed here: https://pcdshub.github.io/development.html#commit-guidelines.
  • Now that slac-epics contains some software that may be from ECS (e.g. eco_tools), should we enforce a controls-wide commit naming scheme?
    • Maybe just describe the model used by a particular software package in its README?