What?
I'd like to split the ProcessInstance table to only have information common to both 'BATCH' and 'SCRIPT' processinstances.
This should significantly speed up row retrieval for a lot of operations where the other information is not needed (Especially because of the WorkingDir and LogFile columns), increase database performance
Things to do:
Make a BatchProcessInstance table with the following columns:
- ProcessInstance
- JobSite
- JobID
- SubmitDate (Maybe should just be in ProcessInstance even though Jython scriptlets will never use it)
- CPUSecondsUsed
- MemoryUsed
- SwapUsed
- ExecutionHost
- WorkingDir
- LogFile
This should nearly eliminate row migration in the ProcessInstance table which should really, really help index performance (especially with partitioning)
We can easily do a left outer join between the two tables to get the same thing as the processinstance table.
After that, things we should do:
- Simplify ProcessInstance class in org.srs.pipeline.server.sql to correspond to the reduced ProcessInstance table
- Create a new BatchProcessInstance class that corresponds to the join of ProcessInstance and BatchProcessInstance
- We need to deal with Row migration/chaining due updating the logfile directory. Options are:
- PCTFREE should be tuned
- A temporary table to hold BatchProcessInstance data until we get LogDir, when we get that, insert it into BatchProcessInstance
- ?
This site has some interesting information: http://www.akadia.com/services/ora_chained_rows.html