SLAC Document Management Meeting 5/24/2005

JO (Joseph Olszewski)
NG (Norman Graf)
JM (Jeremy McCormick)
WH (Wayne Heiser)
RM (Ruth McDunn)
SM (Stuart Marshall)
AC (Andrea Chan)
TM (Tom Markiewicz)

---

SM: [slides] Agenda
1) review work done so far by Brian, Andrea, Ruth in this area
2) summarize the known requirements / desirements
3) enumerate known solution space parameters
4) come up with names and expectations for membership in the subgroup
5) plan the membership for the group and course
TM: Started to make talk based on agenda -- files, emails, links, test servers, etc.
SM: RM and AC can summarize work thus far.
RM: Starting 1 year ago, looked at Sharepoint for policies for internal and other. Test server running with MS portal -- way of aggregating Sharepoint sites. Started to work with TM to come up with talk/calendaring system. Tried to force req. of ILC into Sharepoint, but didn't work very well. Knew it would take some programming time. Still some ppl @ SLAC who are interested in using Sharepoint. Relatively inexpensive because "bundled" with other MS infrastructure. Did put together a site for Policy Steering Committee. May or may not exist. Tried to create site for policy repository so not scattered around at various sites -- built on Sharepoint. Once things are finished can go into policy site. Only other software that I have looked at is Plone.
SM: Set up a site?
RM: Yes. Unix/Linux ppl interested in setting up on Linux @ SLAC.
SM: Have also set up a Plone site.
RM: Personal opinion is that any could work, but nothing will work "out of the box". Pay for proprietary programming (MS) or Python/Zope. Can do some customizations "out of the box" but doing what TM wants is above my ability.
SM: Which add-on packages?
RM: Installed calendaring system and a few others that came bundled with Windows installation. Didn't do that many add-ons. Did attempt to add many add-ons to Sharepoint.
SM: Only document management system saw for Plone is Railroad.
Plone intended for digital movies, images, etc. But capable of storing anything.
RM: Seems pretty powerful. (Plone)
RM: Livelink, Interwoven = bad.
TM: International need for this type of software. "Reinventing the wheel" in many places simultaneously. Come to grips with that.
SM: Nothing works "out of the box".
TM: Tool chosen should be configurable. Upper level management chooses what needs to be the same at top level and what should be customizable down the line. Can be hard to customize/configure.
RM: TM has specific ideas in mind. If don't have specific ideas, maybe some existing solutions may work "as is". So two different ways to approach. Call from Marty about calendaring. Merging various calendars into Outlook. But non-Outlook users want something not tied into it.
WH: The more content management varies from strict content management, the harder it is to customize existing solutions. Get spin-offs with different solutions. Most web design studios claim to have a CMS but each is different.
TM: A lot of tools -- list of tools tried at Kavli. CERN site. BaBar hypertext and meeting mgr. Karen Heidenreich has written a complete meeting system in ASP. Plone FNAL site but nothing about meetings or working groups. Labelled as Doc Mgt System. Email from Neil Calder and Judy Jackson saying Xeno Media to create ILC site. Commercial ASP products and built my own. Somehow "picking the right solution" is not the answer -- codify some solution over a geographical area.
SM: May want to fragment, because what is right for one group is not right for another.
RM: No such thing as perfect solution for everyone. Which one most easily adapted.
TM: Look for monolithic solution or "bunch of widgets" with Doc Mgt components added on. Having played with, found that usually just file system + database / metadata.
AC: Looking at it a few months ago, all kinds of homegrown solutions @ SLAC -- BaBar, LCLS. Tom Glanzman did his own. From SCS side, looked like supporting all kinds of solutions.
Need to come up with at least one "good thing" to support. If you can live with it, get good support, otherwise pay for support yourself. Migrating from Oracle to MySQL. Various Doc Mgt systems e.g. BaBar, (?) -- keep proliferating. Don't foresee being able to maintain as programmers come and go. CAD group needs own solution by department. Can't let anyone else into SQL database. Need wrapper around Sharepoint. 300 GB of data in 3 years. Richard willing to fund a central solution whether it is Sharepoint or something else.
Summed up -- two parts. Web collaboration tools -- LCLS, ILC, etc. If interests coincide with SLAC "stable population" then this would be great, because SLAC needs something. If can pull some of the project ppl in, this would be ideal, but have experimental needs that they need to respond to. Should have calendaring, discussion groups, webcasts, video conferencing and some other things. One request is to be x-platform. Interwoven dropped out. Livelink seemed most capable. Also started to look at OS software such as Plone. Different look-and-feel w/ Sharepoint, plus different interface between administrative functions on Linux and Windows. Content management. Looking at something that could take from web collaboration tool to something that expands into content management.
SM: Web content?
AC: Yes.
SM: Mailing lists?
AC: Yes, should be under same system. Can have different "protections".
RM: Sharepoint ok just for SLAC, but with international collaborations need way to have accounts w/o having to pay additional fees (?).
TM: Great for SLAC to have institutional tool for its needs. Might be impossible for international/national projects for non-SLAC projects to ever come to a conclusion. Maybe way to parse problem is that ILC, ESST will "fend for themselves". Maybe SLAC should concentrate on its own needs.
SM: Group of ppl all on site can much more easily come up with something but off-site can cause problems.
Maybe have a suite of solutions tailored for different needs. Can't blame ppl using SolidEdge, because they want a working solution immediately.
AC: But really have no choice.
SM: For LSST, there will be a lot of engineering documents. Planning to use Kinesio. DB for storing CAD drawings. Standalone solution for a fairly limited group of ppl.
TM: Maybe the requirement is the ability to "interface" to another DB on another server. New system interface to old system, e.g. SLAC address database. Should be able to link/export e.g. mechanical drawings, ppl.
SM: Provide read-only interfaces.
TM: If define fields for all documents, then can have consistent set attached. Other systems can then "suck it up" when needed.
NG: Do this up-front. GLAST found couldn't extract info from a commercial database.
TM: Want to "own" the data. Want to be able to move data from one DB to another.
WH: Depends on way the metadata is represented -- complex. Problem is that can't always get human-readable text dump. Various needs -- need daily info dumps? connection to live system? Institutional data can be difficult to get e.g. list of ppl. Setting up content database but having trouble finding list of names. Agreeing on metadata can be hard. In OpenText demo, can work with various categories. Until spend time creating the categories, it doesn't work.
SM: How many FTEs on this? Currently have X amount of FTEs on mailing lists, GLAST/BaBar systems, etc.
TM: Must be at least two ppl working full time on content management.
AC: BaBar and GLAST "pay themselves" so can customize/build for them.
AC: Web collab / CMS -- 1 FTE divided into both. Sounds reasonable?
SM: Doesn't sound like enough. In OS scheme, need to have enthusiastic programmer. Would guess that need 1 person who understands "big picture" to do technical stuff and assign other tasks to other ppl. Need to have other ppl available. No matter what is picked, then if not support...
TM: Everything is going to be different in the end. Want ppl to program in various languages. Unify knowledge base of contacts. "Produce ASP solution to the problem."
AC: Both looking for stable support via SCS and planning/design. Working with the Tony Johnsons, Karen Heidenreich, etc. Want to avoid straight copies/forks to system.
SM: Different technical work in different groups.
AC: SCS could provide solutions for "generic" users. Need caretakers of the software in various groups.
SM: Worried about 1 FTE. 1 FTE + others "on demand".
SM: [FNAL ILC site demo]
RM: Plone has calendaring but missing things e.g. no concept of standing meeting.
TM: Tried to set up similar things but doesn't have features out of the box. Arbitrary list of ILC presentations in returned list. Private/published/under construction status.
TM: Meeting and attach talks to the meeting.
SM: Plone with CDS.
TM: Email from Judy about need for collaboration software.
[CERN ILC site demo]
Calendar in CDS system.
SM: Separate product.
TM: At CERN, see all of LHC meetings in tree format.
[ILC CDS demo]
TM: Change query to unify results of search. Concept of speaker or author.
NG: Can export to calendar system using CDS: "export to personal scheduler". Export to Outlook calendar, save as .vcs. Elements in Outlook calendar are vcals.
RM: Install CDS?
SM: Tried install on Debian. Need to recompile parts of MySQL and PHP -- was getting close.
NG: One attraction of CDS is that FNAL/DESY/CERN already using it.
SM: Could maybe use for documents.
NG: Maybe use Plone with CDS.
TM: Let's say CDS does 80% of features. Want to read from MySQL database for additional features. Xeno Media put together Quantum Diaries system.
RM: Also did Light Sources.
TM: ex-FNAL programmers using content management and selling back. Could have ppl look at different solutions.
SM: Should have discussion about known solutions.
RM: Have hard time conceiving that a product serving TM's needs will serve WH's.
TM: Not so sure.
Everything a file with metadata. Anything that is mechanical drawing needs these attributes. Attributes by document type. Define in relational database. Common for everyone.
WH: Conceptually simple, yes, but solution can be hard.
SM: Never have a problem that DB can't handle.
WH: DB paradigm can break down with difficulty of problem. File system with DB can break down. Create metadata when create file. Labor intensive to manually create metadata. Automatically extract metadata. Interface with MS Office's standard file properties. Looking at all the different ingredients can make complex solution.
SM: Way beyond what is needed generically.
WH: May need full-time "librarian". Otherwise, might not get consistent results.
NG: Difference between archiving and maintaining history on documents. CDS allows to put up talks but suppose two versions? Does not handle it gracefully.
SM: Groups have different concepts.
NG: Put up Word doc but Linux users complain. Do conversion?
WH: Focusing more on ppl and less on products. So many solutions to look at and such a range of things to cover. Better sense of problem space. Experts in various areas: MySQL, Python, PHP, etc.
TM: Homebrew solution: figure out "gotchas".
[SLAC ILC site demo by TM]
Used Frontpage. Meetings have agendas. Talks available on fields. Could have uploaded talks to known place. All meetings database. 20 different working groups. Can search on individual groups. Or look at talks database ordered by date. Gives meeting where it was given. Can add new meeting. Track by standing meeting, working group, speaker, etc. Talk like a paper or a diagram. Tagged by meeting information. As long as RDBMS is set up correctly, then can always extract metadata later.
WH: Not so much whether can do or not but whether it is implemented. Plone not necessarily built on all Zope plugins. Pick among different features that already exist.
RM: Knew could develop meeting-type system. But get "feature creep" e.g. mail notice, non-SLAC ppl, etc.
NG: Probably end up with multiple solutions. Lots of different problems. Can't solve all problems for all constituencies. Smaller pieces solving various problems e.g. mailing lists vs. calendars.
SM: Hard to imagine one solution that solves all the problems. Because not best choices for individual areas. Doc Mgt not a closed group so should get input from all here. Need ppl willing to "do real work". Charged with finding ppl to do real work. When get around to deciding about work to do, ppl need to be picked. How get membership and get programmers?
NG: Above ~12 ppl in group, then "intractable". Do want to keep it open and let ppl volunteer. ppl don't like imposition.
AC: Suggest to include someone from SSRL and LCLS. Can try to find a LCLS person approved by mgmt. Martin George from SSRL.
SM: Statement from SCS on amount of support can deliver. Assumption that different groups would assign a person. If working for KIPAC, can do work on it. But need backend/engine/http server/DB supported. Anyone earmarked for this?
AC: Right now, Brian and myself interested in this. Already doing different things for different groups.
SM: Next meeting? Few more ppl.
TM: Each look at different things?
SM: Should try to explore options. Within group have agreed to look at Plone system. Found Typo3 today. Whole bunch of CMS/collaboration systems. Need someone to "adopt" and take care of it.
WH: Need someone who is enthusiastic and has time.
TM: Need to make sure ppl are hired for this area.
SM: Without willingness to assign 50 hours a week to the problem, it probably won't work. May take a lot more time initially. Design something "perfect" but then might not happen.
AC: Already spending efforts elsewhere and at dead ends. Need to find 1 or more solutions that work for most ppl.
TM: Zope?
JM: Highly configurable. Built on Zope/Python -> CMF -> Plone.
WH: Isn't a very layered approach. Need Systems Analyst. Layered approach is not happening at SLAC.
Limited access to basic infrastructure e.g. need to run SQL Server for Sharepoint but SCS can't do it. Need experts in various areas. Might not make sense to have developers at SLAC.
SM: Document management is main concern. Railroad seems like only solution. Upload certain document, be nice to get some automatic metadata. Place to put documents and have metadata. Can search metadata later on.
[LSST Knowledge Center / KnowledgeBase demo]
Users want to see only their content.
TM: OpenText does this.
SM: non-SLAC users need password-protected way to access.
TM: Maybe build the "frontend" that ppl want to see.
WH: Really discussing wishlist rather than requirements.
RM: Sharepoint ok on most recent version but x-browser.
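
Illustration of TM's "everything a file with metadata" point (attributes by document type, defined in a relational database, searchable later): a minimal sketch, assuming Python's standard sqlite3 module. The table names, attribute names, and file paths are hypothetical and were not discussed at the meeting; this is not a description of any system demoed above.

```python
import sqlite3

# Hypothetical required attributes per document type ("attributes by document type").
REQUIRED_ATTRIBUTES = {
    "talk": {"speaker", "meeting", "working_group"},
    "drawing": {"drawing_number", "revision"},
}


def open_catalog(db_path=":memory:"):
    """Open a catalog database and create the two tables if they are missing."""
    conn = sqlite3.connect(db_path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS document (
            id       INTEGER PRIMARY KEY,
            path     TEXT NOT NULL,      -- where the file lives on the file system
            doc_type TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS attribute (
            doc_id INTEGER NOT NULL REFERENCES document(id),
            name   TEXT NOT NULL,
            value  TEXT NOT NULL,
            PRIMARY KEY (doc_id, name)
        );
        """
    )
    return conn


def add_document(conn, path, doc_type, attrs):
    """Register a file with its metadata; reject it if required attributes are missing."""
    if doc_type not in REQUIRED_ATTRIBUTES:
        raise ValueError(f"unknown document type: {doc_type}")
    missing = REQUIRED_ATTRIBUTES[doc_type] - set(attrs)
    if missing:
        raise ValueError(f"missing attributes for {doc_type}: {sorted(missing)}")
    cur = conn.execute(
        "INSERT INTO document (path, doc_type) VALUES (?, ?)", (path, doc_type)
    )
    doc_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO attribute (doc_id, name, value) VALUES (?, ?, ?)",
        [(doc_id, name, value) for name, value in attrs.items()],
    )
    return doc_id


def find_by_attribute(conn, name, value):
    """Return the paths of all documents carrying the given attribute value."""
    rows = conn.execute(
        "SELECT d.path FROM document d "
        "JOIN attribute a ON a.doc_id = d.id "
        "WHERE a.name = ? AND a.value = ? ORDER BY d.path",
        (name, value),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    catalog = open_catalog()
    add_document(
        catalog,
        "/talks/example_talk.ppt",
        "talk",
        {"speaker": "A. Speaker", "meeting": "Example meeting", "working_group": "WG1"},
    )
    print(find_by_attribute(catalog, "working_group", "WG1"))
```

Storing attributes as name/value rows keeps the data in plain tables that can be dumped or moved to another DB, which is the "own the data" concern raised by TM and NG; the cost, as WH notes, is that someone still has to agree on the attribute names and fill them in consistently.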