Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Migrated to Confluence 4.0

Descriptor based analysis is a powerful tool for understanding the trends across various catalysts. In general, the rate of a reaction over a given catalyst is a function of many parameters - reaction energies, activation barriers, thermodynamic conditions, etc. The high dimensionality of this problem makes it very difficult and expensive to solve completely and even a full solution would not give much insight into the rational design of new catalysts. The descriptor based approach seeks to determine a few "descriptors" upon which the other parameters are dependent. By doing this it is possible to reduce the dimensionality of the problem - preferably to 1 or 2 descriptors - thus greatly reducing computational efforts and simultaneously increasing the understanding of trends in catalysis. 

The "kineticsmkm" Python module seeks to standardize and automate many of the mathematical routines necessary to move from "descriptor space" to reaction rates. The module is designed to be both flexible and powerful. A "reaction model" can be fully defined by a configuration file, thus no new programming is necessary to change the complexity or assumptions of a model. Furthermore, various steps in the process of moving from descriptors to reaction rates have been abstracted into separate Python classes, making it easy to change the methods used or add new functionality. The downside of this approach is that it makes the code more complex. The purpose of this guide is to explain the general structure of the code, as well as its specific functions and how to begin using it. 

Useful Definitions: (Note symbols may not appear in Safari/IE)

  • descriptor - variable used to describe reaction kinetics at a high level. Most commonly these are the binding energies of atomic constituents in the adsorbates (e.g. carbon/nitrogen/oxygen adsorption energies) or other intermediates. However, in the general sense other variables such as electronic structure parameters (e.g. d-band center) or thermodynamic parameters (e.g. temperature/pressure) could also be descriptors.
  • descriptor space ( D ) - the space spanned by the descriptors. Usually ℝ2.
  • parameter space ( ? P ) - the space spanned by the full reaction parameters for the reaction model. Usually 2n where where n is the number of elementary steps (one reaction energy and one reaction barrier per elementary step).
  • reaction rates ( r? ) - an n-vector of rates corresponding to each of the n elementary reactions.
  • descriptor map - a map of some variable (rates,coverages,etc.) as a function of descriptor space.
  • reaction model - a set of elementary steps and conditions which define the physics of the kinetic system, along with the mathematical methods and assumptions used to move from "descriptor space" to reaction rates.
  • model definition setup file - a file used to define the reaction model
  • input file - a file used to store other data which is likely common to many reaction models (e.g. energetics data)

Code Structure: 

  • The interface to the code is handled through the the ReactionModel class class. This class acts as a messenger class which broadcasts all its attributes to the other classes used in the kinetics module. The class also handles common functions such as printing reactions/adsorbates or comparing/reversing elementary steps. The attributes of of ReactionModel are are automatically synchronized with the parser, scaler, solver, and mapper so it is useful to think of of ReactionModel as as a "toolbox" where all the necessary information and common functions are stored. 

The The Parser class class serves to extend the "model definitionsetup file" file by reading in various quantities from an "input file". Technically the use of a parser is optional, but in practice it is extremely helpful for reading in common data such as adsorption energies or vibrational frequencies rather than re-typing them for every reaction model.  

The process of creating a "descriptor map" is abstracted into three general processes, which are handled by the following classes within the kinetics module:  

Scaler: Projects Projects descriptor space into parameter space: ?? ? ?  DP

Solver: Maps Maps parameter space into reaction rates: ? ? Pr

Mapper: Moves Moves through descriptor space. This becomes important for practical reasons since it is often necessary to use a solution at one point in descriptor space as an initial guess for a nearby point.  

The The ThermoCorrections class class is responsible for applying thermodynamic corrections to the electronic energies which are used as direct inputs. This includes the contributions of entropy/enthalpy and zero-point energy due to temperature, pressure, or other thermodynamic variables.

There are also a variety of analysis classes which allow for automated analysis and visualization of the reaction model. These include the RateAnalysis which creates "volcano plot" (descriptor maps of the rate), CoverageAnalysis which creates descriptor maps of the coverages of various intermediates, ScalingAnalysis which can give a visual representation of how well the scaler projects descriptor space to parameter space, and several other useful tools.

Using the code: 

the VectorMap and MatrixMap classes which create plots of outputs as a function of descriptor space. The VectorMap class is designed for outputs in the form of vectors (rates, coverages, etc.) while the MatrixMap is designed for outputs in the form of matrices (rate control, sensitivity analyses, etc.). The analysis folder also contains MechanismAnalysis which creates free energy diagrams, and ScalingAnalysis which can give a visual representation of how well the scaler projects descriptor space to parameter space.

Using the code:

Some examples can be found in the "demos" folder, and these should explain the syntax necessary and serve as a good starting point. The currently implemented features are also Some examples can be found in the "demos" folder, and these should explain the syntax necessary and serve as a good starting point. The currently implemented features are also briefly described below in order to allow a better understanding of the demos and creating original reaction model definitionssetup files.  

Using the kinetics module to conduct a descriptor analysis requires (at least) 2 files: the "model definitionsetup file" which defines the reaction model, as well as another script to initialize the ReactionModel class and conduct analyses. The "model definitionsetup file" is generally a static file (i.e. it is not a program) while the submission script will be an actual python program. Model definition Setup files typically end in ".mkm" for micro-kinetic model (although this is not required) while submission scripts end in .py since they are just python scripts. In addition it is very useful to also have an "input file" which contains the raw data about the energetics of the reaction model. An example of such how to create an input file is provided in "published_data.txt" in the demos folder. based on a table-like format is given in the 1 - Generating an Input File tutorial.

Each class described in the the Code Structure section section will require some specialized parameters. Some of these parameters are common to all variants of these classes, while others are specific to certain implementations. The possible inputs for the currently implemented variants of each class are listed below. Required attributes are are underlined.  

  • ReactionModel:
      unmigrated-wiki-markup
    • +rxn_expressions\*+&nbsp;\- These expressions determine the elementary reaction, and are the most important part of the model. They must be defined unless the expressions - These expressions determine the elementary reaction, and are the most important part of the model. They must be defined unless the elementary_rxns, adsorbate_names, and transition_state_names , and gas_names are all explicitly defined since the rxn_expressions are parsed into these 3 attributes. It is much easier to just define rxn_expressions, although it is important to note the syntax. There must be spaces between all elements of each expression (i.e. C*+O\* is not okay, but C\* + O\* is), and species ending with \ _g are gasses by default. Adsorbed species may end with * or \ _x where * designates adsorption at the "s" site (by default), while \ _x designates adsorption at the "x" site (note that "x" may be any letter except "g", and that X* and X_s are equivalent). Transition-states should include a \ -, and reactions with a transition-state are specified by 'IS <-> TS \ -> FS' while reactions without a transition-state are defined as 'IS \ -> FS' (where IS,TS,FS are expressions for the Initial/Transition/Final State). When the model initializes it checks the expressions for mass/site balances, and if it finds that they are not balanced it prints a warning but does not raise an exception. \[list of strings\] Wiki Markupit will raise an exception. [list of strings]. Instead of specifying rxn_expressions the following attributes may instead be defined:
      • elementary_rxns - list version of
      elementary_rxns - list version of
      • rxn_expressions.
      • These
      • will
      • be
      • automatically
      • populated
      • if
      • rxn_expressions
      • are
      • defined.
      \[
      • [list
      • of
      • lists
      • of
      • lists
      \
      • ]
      Wiki Markup
      • adsorbate_names
      • -
      • list
      • of
      • adsorbate
      • names
      • included
      • in
      • the
      • analysis.
      • Automatically
      • populated
      • if
      • rxn_expressions
      • are
      • defined.
      \
      • [list
      • of
      • strings
      \
      • ]
      Wiki Markup
      • transition_state_names\
      *
      • -
      • list
      • of
      • transition-state
      • names
      • included
      • in
      • the
      • analysis.
      • Automatically
      • populated
      • if
      • rxn_expressions
      • are
      • defined.
      \
      • [list
      • of
      • strings
      \
      • ]
    • Wiki Markup
      +surface_names+&nbsp;\- list of surface names to be included in the analysis. \[list of strings\]
    • Wiki Markup
      +gas_names+&nbsp;\- list of gas names included in the analysis. \[list of strings\]
    • Wiki Markup
      +gas_pressures+&nbsp;\- list of gas pressures to be used for the analysis. May not be defined if ThermodynamicScaler is being used with pressure as a descriptor and gas_ratios is defined. \[list of numbers in bar\]
    • Wiki Markup
      gas_ratios - list of gas ratios to be used for the analysis. Only necessary if ThermodynamicScaler is being used to determine the overall pressure. \[list of dimensionless numbers\]
    • Wiki Markup
      +temperature+&nbsp;\- temperature used for the analysis. May not be defined if ThermodynamicScaler is being used with temperature as a descriptor. \[number in Kelvin\]
    • Wiki Markup
      +site_definitions+&nbsp;\- definitions for all sites used in rxn_expressions. Note that * defaults to 's' so 's' should always be defined. The values of the dictionary can be strings corresponding to the "site" field in the input file, or a list of strings if a site definition corresponds to multiple values of the "site" field. \[dictionary of string:(string OR list of strings)\]
    • Wiki Markup
      +site_totals+&nbsp;\- dictionary of total which each site must sum to. \[dictionary of string:(floatable string OR dimensionless number)\]
    • Wiki Markup
      +descriptor_names+&nbsp;\- names of variables to be used as descriptors. \[list of strings\]
    • Wiki Markup
      parser - name of class to use for solver. Defaults to KineticsParser. \[string\]
    • Wiki Markup
      mapper - name of class to use as a mapper. Defaults to MinResidMapper. \[string\]
    • Wiki Markup
      scaler - name of class to use for scaler. Defaults to GeneralizedLinearScaler. \[string\]
    • Wiki Markup
      solver - name of class to use for solver. Defaults to SteadyStateSolver. \[string\]
    • Wiki Markup
      thermodynamics - name of class to use for thermodynamic corrections. Defaults to ThermoCorrections. \[string\]
    • Wiki Markup
      coverage_map_file - file to save coverage map data. \[filepath string\]
    • Wiki Markup
      rate_map_file - file to save rate map data. \[filepath string\]
  • Parser:
    • input_file - file where input data is stored. File must be in the correct format for the parser used.
  • KineticsParser:
    • Wiki Markup
      estimate_frequencies - attempt to estimate frequencies when they are not listed in the input_file. First try to find adsorbates on different sites, then by using the frequencies of atomic species. If no frequencies are found for transition-states, use the combined frequencies of the dissociated state. Defaults to False. \[boolean\]
    • Wiki Markup
      adsorbate_definitions - dictionary which allows adsorbates to have a different name in the reaction model than in the input file e.g. \{'CH3O':'H3CO'\} if methoxy was called 'CH3O' in the reaction model but 'H3CO' in the input file. Defaults to \{\}. \[dictionary of string:string\]
  • Scaler:
    • Wiki Markup
      gas_thermo_mode - Approximation used for obtaining gas-phase free energy corrections. Defaults to ideal_gas. Other possibilities are: shomate_gas (use Shomate equation), zero_point_gas (zero-point corrections only), fixed_entropy_gas (include zero-point and assume entropy is 0.002 eV/K) , frozen_gas (no corrections), frozen_zero_point_gas (no zero-point and entropy is 0.002 eV/K). \[string\]
    • Wiki Markup
      adsorbate_thermo_mode - Approximation used for obtaining adsorbate free energy corrections. Defaults to harmonic_adsorbate (use statistical mechanics+vibrational frequencies). Other possibilities are: zero_point_adsorbate (zero-point corrections only), frozen_gas (no corrections). \[string\]
    • transition_state_scaling_parameters - Used if transition-state scaling is used. Many published values are hard-coded, and parameters can often be input directly so it is usually not necessary. Read scalers/_init_.py for syntax.
  • GeneralizedLinearScaler:
    • Wiki Markup
      descriptor_dict - dictionary of known descriptor values for various surfaces. Usually populated by the parser. \[dictionary of string:list of numbers\] where list of numbers is the length of descriptor_names and is ordered to correspond to the order in "descriptor_names".
    • Wiki Markup
      parameter_dict - dictionary of known parameter values for various adsorbates/reactions. Should be either adsorption energies (if parameter_mode=adsorption) or reaction energies/barriers (if parameter_mode=reaction). This is normally populated to be the adsoprtion energies by the parser. \[dictionary of string:list of numbers\] where list of numbers is the length of surface_names and is ordered to corespond to the order in "surface_names"; if the value for some surface is not known the entry should be '-' in order to "hold its place".
    • Wiki Markup
      parameter_mode - scale to adsorption energies (adsorption) or directly to reaction energies/barriers (reaction). Note that to use reaction a custom parser is needed to make the "parameter_dict" contain reaction energies instead of adsorption energies. Defaults to adsorption. \[string\]
    • Wiki Markup
      default_constraints - list of "constraints" on scaling coefficients to be used when constraints are not explicitly specified. This should be a list of length m+1 where m is the number of descriptors (the extra constraint is on the intercept). Constraints can be single numbers (e.g. 0) in order to constrain the coefficient to that value, colon separated numbers (e.g. 0:1) to constrain the coefficient to that range, '+' to constrain to positive, '-' to constrain to negative, or None to use no constraints. Defaults to \['+','+',None\]. \[list of numbers/strings\]
    • Wiki Markup
      scaling_constraint_dict - dictionary of constraint lists (see default_constraints) for adsorbates which do not use the "default_constraints". Can also specify transition-state scaling by using 'TS(A+B):\[slope,intercept\]' where A and B are species in the initial/final state and slope, intercept are the transition-state scaling parameters. See the "methanation" demo for an example and other syntaxes: \[dictionary of string:(list of strings OR string)\]
    • Wiki Markup
      use_input_energies - if the descriptor points happen to be in "descriptor_dict" then use the exact energies instead of the scaled energies. This usually doesn't matter in practice since it is unlikely that the descriptor map will land exactly on one of the points; it is mostly used for the RateAnalysis where the rate of the unscaled parameters is calculated and compared to the scaled rate. Defaults to False. \[boolean\]
  • Solver:
  • SteadyStateSolver:
    • Wiki Markup
      decimal_precision - number of decimals to explicitly store. Calculation will be slightly slower with larger numbers, but will become completely unstable below some threshhold. Defaults to 50. \[integer\]
    • Wiki Markup
      tolerance - all rates must be below this number before the system is considered to be at "steady state". Defaults to 1e-50. \[number\]
    • Wiki Markup
      max_rootfinding_iterations - maximum number of times to iterate the rootfinding algorithm (multi-dimensional Newtons method). Defaults to 50. \[integer\]
    • Wiki Markup
      internally_constrain_coverages - ensure that coverages are greater than 0 and sum to less than the site total within the rootfinding algorithm. Slightly slower, but more stable. Defaults to True. \[boolean\]
    • Wiki Markup
      residual_threshold - the residual must decrease by this proportion in order for the calculation to be considered "converging". Must be less than 1. Defaults to 0.5. \[number\]
  • Mapper:
  • MinResidMapper:
    • Wiki Markup
      search_directions - list of "directions" to search for existing solutions. Defaults to \[\[0,0\],\[0,1\],\[1,0\],\[0,-1\],\[-1,0\],\[-1,1\],\[1,1\],\[1,-1\],\[-1,-1\]\] which are the nearest points on the orthogonals and diagonals plus the current point. More directions increase the chances of findinga good solution, but slow the mapper down considerably. Note that the current point corresponds to an initial guess coverage provided by the solver (i.e. Boltzmann coverages) and should always be included unless some solutions are already known. \[list of lists of integers\]
    • Wiki Markup
      max_bisections - maximum number of time to bisect descriptor space when moving from one point to the next. Note that this is actuall the number of iterations per bisection so that a total of 2{^}max_bisections^&nbsp;points could be sampled between two points in descriptor space. Defaults to 3. \[integer\]
    • Wiki Markup
      descriptor_decimal_precision - number of decimals to include when comparing two points in descriptor space. Defaults to 2. \[integer\]
  • ThermoCorrections:
    • Wiki Markup
      thermodynamic_corrections - corrections to apply. Defaults to \['gas','adsorbate'\]. \[list of strings\]
    • Wiki Markup
      thermodynamic_variables - variables/attributes upon which thermo corrections depend. If these variables do not change the corrections will not be updated. Defaults to \['temperatures','gas_pressures'\]. \[list of strings\]
    • Wiki Markup
      frequency_dict - used for specifying vibrational frequencies of gasses/adsorbates. Usually populated by the parser. Defaults to {}. \[dictionary of string:list of numbers in eV\]
    • gas_param_dict - used for parameters in gas correction. Should be ideal gas parameters for ideal gas, or shomate parameters for shomate gas. Most common gasses are hard-coded, so it is usually not necessary. Read thermodynamics.py for syntax.
    • atoms_dict - used for ideal gas correction. Defaults to ASE structure database for common gasses so it is usually not necessary. Read thermodynamics.py for syntax.
      • gas_names - list of gas names included in the analysis. [list of strings]
    • surface_names - list of surface names to be included in the analysis. [list of strings]
    • species_definitions - This is a dictionary where all species-specific information is stored. The required information will vary depending on the scaler/thermo corrections/solver/mapper used, and the "parser" generally fills in most information. However, there are a few things which generally need to be supplied explicitly:
      • species_definitions[{site}]['site_names'] (where {site} is each site name in the model) - A list of "site names" which correspond to {site}. If the TableParser (default) is being used then the "site names" must also match the designations in the "site_name" column. For example, if you want the "s" site to correspond to the energetics of an adsorbate at a (211) site, and (211) sites are designated by '211' in the site_name column of the input_file, then this would be specified by: species_definitions['s'] = {'site_names':['211']}. Similarly, if you wanted the 't' site to correspond to 'fcc' or 'bridge' sites then you could specify: species_definitions['t'] = {'site_names':['fcc','bridge']}.
      • species_definitions[{site}]['total'] (where {site} is each site name in the model) - A number to which the total coverage of {site} must sum. For example, if you wanted to have a total coverage of 1 with 10% 's' sites and 90% 't' sites (with the same site definitions as above) you would specify: species_definitions['s'] = {'site_names':['211'],'total':0.1} and species_definitions['t'] = {'site_names':['fcc','bridge'],'total:0.9}.
      • species_definitions[{gas}]['pressure'] (where {gas} is each gas name in the model) - The pressure of each gas species in bar. For example, if you wanted a carbon monoxide pressure of 10 bar and hydrogen pressure of 20 bar you would specify: species_definitions['CO_g']['pressure'] = 10 and species_definitions['H2_g']['pressure'] = 20. Note that for some situations you may instead need to specify a 'concentration','approach_to_equilibrium', or some other key, but in almost every situation some method for obtaining the gas pressures must be specified for each gas in the model.
    • temperature - temperature used for the analysis. May not be defined if ThermodynamicScaler is being used with temperature as a descriptor. [number in Kelvin]
    • descriptor_names - names of variables to be used as descriptors. [list of strings]
    • descriptor_ranges - Used for mapping through descriptors space. Specify the limits of the descriptor values. Should be a list equal in length to the number of descriptors where each entry is a list of 2 floats (min and max for that descriptor). [list of lists of floats].
    • resolution - Used for mapping through descriptor space. Resolution used when discretizing over descriptor_range. [int]
    • parser - name of class to use for solver. Defaults to TableParser. [string]
    • mapper - name of class to use as a mapper. Defaults to MinResidMapper. [string]
    • scaler - name of class to use for scaler. Defaults to GeneralizedLinearScaler. [string]
    • solver - name of class to use for solver. Defaults to SteadyStateSolver. [string]
    • thermodynamics - name of class to use for thermodynamic corrections. Defaults to ThermoCorrections. [string]
    • data_file - file where large outputs will be saved as binary pickle files. Defaults to 'data.pkl' [filepath string]
    • numerical_representation - determines how to store numbers as binary. Can be 'mpmath' for multiple precision or 'numpy' for normal floats. Note that 'numpy' rarely works. Defaults to 'mpmath'. [string]
  • Parser:
    • input_file - file where input data is stored. File must be in the correct format for the parser used.
      See 1 - Generating an Input File for more information.
  • Scaler:
    • gas_thermo_mode - Approximation used for obtaining gas-phase free energy corrections. Defaults to ideal_gas. Other possibilities are: shomate_gas (use Shomate equation), zero_point_gas (zero-point corrections only), fixed_entropy_gas (include zero-point and assume entropy is 0.002 eV/K) , frozen_gas (no corrections), frozen_zero_point_gas (no zero-point and entropy is 0.002 eV/K). [string]
    • adsorbate_thermo_mode - Approximation used for obtaining adsorbate free energy corrections. Defaults to harmonic_adsorbate (use statistical mechanics+vibrational frequencies). Other possibilities are: zero_point_adsorbate (zero-point corrections only), frozen_gas (no corrections). [string]
    • transition_state_scaling_parameters - Used if transition-state scaling is used. Many published values are hard-coded, and parameters can often be input directly so it is usually not necessary. Read scalers/_init_.py for syntax.
  • Solver:
  • SteadyStateSolver:
    • decimal_precision - number of decimals to explicitly store. Calculation will be slightly slower with larger numbers, but will become completely unstable below some threshhold. Defaults to 50. [integer]
    • tolerance - all rates must be below this number before the system is considered to be at "steady state". Defaults to 1e-50. [number]
    • max_rootfinding_iterations - maximum number of times to iterate the rootfinding algorithm (multi-dimensional Newtons method). Defaults to 50. [integer]
    • internally_constrain_coverages - ensure that coverages are greater than 0 and sum to less than the site total within the rootfinding algorithm. Slightly slower, but more stable. Defaults to True. [boolean]
    • residual_threshold - the residual must decrease by this proportion in order for the calculation to be considered "converging". Must be less than 1. Defaults to 0.5. [number]
  • Mapper:
  • MinResidMapper:
    • search_directions - list of "directions" to search for existing solutions. Defaults to [[0,0],[0,1],[1,0],[0,-1],[-1,0],[-1,1],[1,1],[1,-1],[-1,-1]] which are the nearest points on the orthogonals and diagonals plus the current point. More directions increase the chances of findinga good solution, but slow the mapper down considerably. Note that the current point corresponds to an initial guess coverage provided by the solver (i.e. Boltzmann coverages) and should always be included unless some solutions are already known. [list of lists of integers]
    • max_bisections - maximum number of time to bisect descriptor space when moving from one point to the next. Note that this is actuall the number of iterations per bisection so that a total of 2max_bisections points could be sampled between two points in descriptor space. Defaults to 3. [integer]
    • descriptor_decimal_precision - number of decimals to include when comparing two points in descriptor space. Defaults to 2. [integer]
  • ThermoCorrections:
    • thermodynamic_corrections - corrections to apply. Defaults to ['gas','adsorbate']. [list of strings]
    • thermodynamic_variables - variables/attributes upon which thermo corrections depend. If these variables do not change the corrections will not be updated. Defaults to ['temperatures','gas_pressures']. [list of strings]
    • frequency_dict - used for specifying vibrational frequencies of gasses/adsorbates. Usually populated by the parser. Defaults to {}. [dictionary of string:list of numbers in eV]
    • ideal_gas_params - parameters used for ase.thermochemistry.IdealGasThermo. Defaults to mkm.data.ideal_gas_params. [dictionary of string:string/int]
    • fixed_entropy_dict - entropies to use in the static entropy approximation. Defaults to mkm.data.fixed_entropy_dict. [dictionary of string:float]
    • atoms_dict - dictionary of ASE atoms objects to use for ase.thermochemistry.IdealGasThermo. Defaults to ase.structure.molecule(gas_name). [dictionary of string:ase.atoms.Atoms]
    • force_recalculation - re-calculate thermodynamic corrections even if thermodynamic_variables do not change. Slows the code down considerably, but is useful for sensitivity analyses where thermodynamic variables might be perturbed by very small amounts. Defaults to False. [boolean]
  • Analysis:
  • MechanismAnalysis:
    • rxn_mechanisms - dictionary of lists of integers. Each integer corresponds to an elementary step. Elementary steps are indexed in the order that they are input with 1 being the first index. Negative integers are used to designate reverse reactions. [dictionary of string:list of integers
  • Analysis:
  • MechanismAnalysis:
    • Wiki Markuprxn_mechanisms - dictionary of lists of integers. Each integer corresponds to an elementary step. Elementary steps are indexed in the order that they are input with 1 being the first index. Negative integers are used to designate reverse reactions. \[dictionary of string:list of integers\]