...

  1. Splitting the ProcessInstance table into ProcessInstance and BatchProcessInstance
    1. This drastically reduces the size of the ProcessInstance table and prevents row migration
    2. Should we remove the full path from the logFile column and assume the log file will always be under the WorkingDirectory?
  2. Partition the ProcessInstance table by month on the CreatedDate column
    1. Partitioning requires inserting all rows into a new table.
    2. This may need to be changed to partition by Stream reference instead.
    3. This effectively involves copying the table, but it can be done in one SQL statement together with the previous performance goal (item 1).
    4. Partitioning by month seems to be the easiest way to eke out extra performance from current transactions
    5. Partitioning by month should also, in theory, boost performance of the web interface, specifically the task.jsp page, if we allow users to limit the processInstances stats to the last week or month(s). That option currently isn't enabled.
  3. Partition the Stream table by month on the CreatedDate column
    1. Should probably be done before ProcessInstance partitioning
    2. Not sure what the best strategy is.
    3. One strategy is to partition based on Task, or a reference to a root-level task possibly. This gets a little messy.
    4. A second strategy is to partition based on RootStream. The problem is that this requires processing the ancestor stream at table insertion.
      1. But maybe we could add the column to the current database, and start computing those values now. Then later, we can insert those into a new table with interval partitions
    5. Time-only partitioning requires all queries to specify a time range in order to limit search partitions
  4. Add a RootStream column to the Stream table and enable RootStream locking.
    1. Enables fast Stream Tree locking
    2. When a process instance acquires a lock, it will no longer lock on its parent node, which is a stream. From now on, it will lock on the Stream whose primary key is the RootStream of the PI's parent stream node. This prevents deadlocks, because each change to a process instance's status precludes modification of any other part of the tree.
    3. Enables the canceling of top-level streams. Without this, it's impossible to know if another database connection has a lock on a child node of a top level stream.
    4. To maintain full backward compatibility, the stream table will need to be modified to populate this column, which may take a while for all streams.
    5. Should speed up rolling back
  5. Dead branch isLatest decrementing
    1. Currently, Stream Tree >= Stream Tree where isLatest = 1 >= Stream Latest Tree
    2. Proposal: When a branch is declared dead, all child nodes would be decremented by 1
    3. This changes the relationship to ( Stream Tree where isLatest = 1 == Stream Latest Tree ), which eliminates the recursive queries needed to find the latest tree
    4. This requires more work when rolling back a stream, for instance, but may also be offset with the speed gained from a reduced query time.
    5. May greatly benefit status changes to a stream execution tree
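
As a rough illustration of the RootStream backfill discussed in item 4, the sketch below computes a root for every stream from parent references. It is an in-memory Python sketch, not the actual migration: the `parent_of` mapping (stream primary key to parent stream primary key, `None` for top-level streams) and the function name are hypothetical, and it assumes the parent references contain no cycles.

```python
from typing import Dict, List, Optional


def compute_root_streams(parent_of: Dict[int, Optional[int]]) -> Dict[int, int]:
    """Map every stream key to the key of its top-level (root) ancestor.

    Assumes parent_of has an entry for every stream and contains no cycles.
    """
    root_of: Dict[int, int] = {}
    for stream in parent_of:
        # Walk up until we hit a top-level stream or one already resolved.
        path: List[int] = []
        node = stream
        while node not in root_of and parent_of[node] is not None:
            path.append(node)
            node = parent_of[node]
        root = root_of.get(node, node)
        root_of[node] = root
        # Cache the result for every stream visited on the way up.
        for visited in path:
            root_of[visited] = root
    return root_of


# Hypothetical data: streams 1 and 5 are top-level; 2 and 3 are children
# of 1, 4 is a child of 3, and 6 is a child of 5.
parent_of = {1: None, 2: 1, 3: 1, 4: 3, 5: None, 6: 5}
root_of = compute_root_streams(parent_of)

# Process instances under streams 2 and 4 would lock the same Stream row.
assert root_of[2] == root_of[4] == 1
```

With the values stored in a column, a process instance under any stream in a tree would take its lock on the same root row, which is what makes the single-lock scheme in item 4 possible. Each stream is resolved once, so the backfill cost grows roughly linearly with the number of streams.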

...