  • Data Catalog Versions
    • DEV and TEST databases have been converted to the new "Versioned Dataset" tables and code.
    • Tony has updated the Dataset Crawler to use the new tables, and it is running in DEV.
    • Warren is making progress converting L1 to use versions.
    • Dataset registration and querying work with the new code, at least in example tasks
      • Jim Chiang is having problems with registration in his task
        • I'm looking into this now
      • Jim Chiang & Co were having trouble with Dataset querying yesterday, but Tony made a change that should have fixed this
        • Jim, can you confirm?
      • The new code is meant to be backward compatible with existing tasks
        • This seems to be generally true except for Jim's task
    • We would love to have more people exercising their tasks in DEV to confirm these claims
    • Migrating the data in DEV (550k Datasets) took 22 minutes
      • If this scales linearly, it will take about 7.3 hrs (440 minutes) to migrate the 11M Datasets in PROD
      • Would be really nice to clean up datasets that no longer exist, in order to speed the migration
        • Tony ran a special crawl on NFS files last night to identify those that no longer exist on disk.
          • 91k files were identified.
            • Not enough to make a big difference
          • Would be nice to do this in XROOT, but an unresponsive node could give us false positives (false falses? or would that be 'positive falses'?)
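
The migration-time estimate above can be checked with a quick calculation. This is only a sketch: it assumes the migration scales strictly linearly with the number of Datasets, as the notes suppose, and the function and variable names are made up for illustration.

```python
def estimate_minutes(n_datasets, ref_datasets=550_000, ref_minutes=22.0):
    """Linear extrapolation from the DEV timing (550k Datasets in 22 minutes)."""
    return n_datasets * ref_minutes / ref_datasets


PROD_DATASETS = 11_000_000   # Datasets in PROD, from the notes
STALE_NFS_FILES = 91_000     # files the special NFS crawl found missing on disk

full = estimate_minutes(PROD_DATASETS)
trimmed = estimate_minutes(PROD_DATASETS - STALE_NFS_FILES)

print(f"PROD migration: {full:.0f} min ({full / 60:.1f} h)")
print(f"Saved by dropping stale NFS entries: {full - trimmed:.1f} min")
```

Under that assumption, PROD comes out at about 440 minutes, a little over 7 hours. Dropping the 91k stale entries saves only a few minutes, which matches the remark that the cleanup is not enough to make a big difference.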
  • Run Quality