...
- Data Catalog Versions
- DEV and TEST databases have been converted to the new "Versioned Dataset" tables and code.
- Tony has updated the Dataset Crawler to use the new tables and it is running in DEV
- Warren is making progress converting L1 to use versions.
- Dataset registration and querying work with the new code, at least in example tasks
- Jim Chiang is having problems with registration in his task
- I'm looking into this now
- Jim Chiang & Co were having trouble with Dataset querying yesterday, but Tony made a change that should have fixed this
- The new code is meant to be backward compatible with existing tasks
- This seems to be generally true except for Jim's task
- We would love to have more people exercising their tasks in DEV to confirm these claims
- Migrating the data in DEV (550k Datasets) took 22 minutes
- If this scales linearly, migrating the 11M Datasets in PROD will take roughly 7.5 hrs
- Would be really nice to clean up datasets that no longer exist, in order to speed up the migration
- Tony ran a special crawl on NFS files last night to identify those that no longer exist on disk.
- 91k files were identified.
- Not enough to make a big difference
- Would be nice to do this in XROOT, but an unresponsive node could give us false positives (false falses? or would that be 'positive falses'?)
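For reference, the linear-scaling estimate above can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation using the rounded figures from these notes (550k Datasets in 22 minutes for DEV, 11M Datasets in PROD, 91k missing NFS files); the function name `estimate_minutes` is made up for illustration and is not part of the Data Catalog code.

```python
# Back-of-the-envelope migration estimate, assuming time scales linearly
# with the number of Datasets. All figures are the rounded ones from the notes.

DEV_DATASETS = 550_000   # Datasets migrated in DEV
DEV_MINUTES = 22.0       # time that migration took


def estimate_minutes(n_datasets, ref_datasets=DEV_DATASETS, ref_minutes=DEV_MINUTES):
    """Scale the DEV migration time linearly to n_datasets (hypothetical helper)."""
    return n_datasets * ref_minutes / ref_datasets


prod_minutes = estimate_minutes(11_000_000)   # full PROD migration
saved_minutes = estimate_minutes(91_000)      # time saved by dropping the missing NFS files

print(f"PROD migration: about {prod_minutes / 60:.1f} hrs")
print(f"Saved by removing 91k datasets: about {saved_minutes:.1f} min")
```

With these rounded inputs the PROD estimate lands in the same ballpark as the figure quoted above, and removing the 91k missing files would save only a few minutes, which matches the conclusion that the cleanup alone is not enough to make a big difference.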
- Run Quality