What?

I'd like to split the ProcessInstance table to only have information common to both 'BATCH' and 'SCRIPT' processinstances.

This should significantly speed up row retrieval for a lot of operations where the other information is not needed (Especially because of the WorkingDir and LogFile columns), increase database performance

Things to do:

Make a BatchProcessInstance table with the following columns:

  • ProcessInstance
  • JobSite
  • JobID
  • SubmitDate (Maybe should just be in ProcessInstance even though Jython scriptlets will never use it)
  • CPUSecondsUsed
  • MemoryUsed
  • SwapUsed
  • ExecutionHost
  • WorkingDir
  • LogFile

This should nearly eliminate row migration in the ProcessInstance table which should really, really help index performance (especially with partitioning)
We can easily do a left outer join between the two tables to get the same thing as the processinstance table.

After that, things we should do:

  1. Simplify ProcessInstance class in org.srs.pipeline.server.sql to correspond to the reduced ProcessInstance table
  2. Create a new BatchProcessInstance class that corresponds to the join of ProcessInstance and BatchProcessInstance
  3. We need to deal with Row migration/chaining due updating the logfile directory.  Options are:
    • PCTFREE should be tuned
    • A temporary table to hold BatchProcessInstance data until we get LogDir, when we get that, insert it into BatchProcessInstance
    • ?

This site has some interesting information: http://www.akadia.com/services/ora_chained_rows.html