Raghu's comments

  • AnHai: I agree. Raghu, please don't worry too much about appearance yet, because we are still working on it. There are font, color, and proportion problems all over, but we will clean them up soon. Making the Wiki private is an issue; let me talk with the folks here about it.
  • Newsletter problem: Take a look at the front page today (May 5) and you'll see a problem: the newsletter is way too long and distorts the front page. I suggest that we replace it with a "people in the news" section instead (and truncate that with a "more" link once it fills a predetermined number of lines).
  • IDEA: I can give you guys access to the dbworld login registry. This gives you a list of names that can be instantly added to dblife people (the email addresses should be kept private). We should do this so that the dblife site reflects the dbworld community on day 1 ...
  • IMPORTANT: We SHOULD REMOVE THE WIKI LINK ASAP as a cheap way to keep this page private. If everyone on the team bookmarks the page (which I've already done), then we can still access it, and (hopefully) no one else can, because they don't know about it ...
  • IMPORTANT: Think about every single category of currently collected information (at the level of both URLs and content extracted from a given page) and think about how to detect and handle failure. In particular, on day 1, there should be a robust way to automatically detect broken links in our various categories of collected URLs, and a way to automatically display all broken links together with their metadata, so that we can manually fix them periodically. Down the road, as soon as possible, it would be good to add mass-collab user-facing pop-ups asking for help in fixing these. Similarly, as people add new institutions and join dblife (see previous bullet), we need to add mass-collab pop-ups asking people to fill in missing contact info for new people, event and award URLs for the new institutions, etc.
  • people can contribute to gamma services
  • provide an HTML/XML template, especially for bib entries
  • search within a category and display search results by category (down the road)
  • provide as many mass-collab entry points as possible in a staged fashion, e.g., early on, make it possible to provide contact info, join dblife, and add institution URLs (and down the road, do more sophisticated things like fixing extracted info such as authorship, personalizing topic hierarchies, etc.)
  • add a LaTeX version of paper lists, so that people can import it directly into reports, CVs, etc. The higher-level point is that joining dblife should bring important benefits
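
A rough sketch of the day-1 broken-link check described above, in Python. Everything named here (UrlRecord, BrokenLink, the category and source fields, find_broken_links, format_report) is hypothetical and only illustrates the idea; it is not how dblife actually stores or checks URLs. The sketch treats a URL as broken if the server cannot be reached or answers with an HTTP status of 400 or above, and it keeps each URL's metadata next to the failure so the report can be fixed by hand.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional
import urllib.error
import urllib.request


@dataclass
class UrlRecord:
    url: str
    category: str  # hypothetical label, e.g. "event", "award", "homepage"
    source: str    # hypothetical field: where the URL was collected from


@dataclass
class BrokenLink:
    record: UrlRecord
    reason: str


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code, or None if the server could not be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None


def find_broken_links(
    records: Iterable[UrlRecord],
    fetch_status: Callable[[str], Optional[int]] = http_status,
) -> List[BrokenLink]:
    """Check every record and collect the ones that look broken."""
    broken = []
    for record in records:
        status = fetch_status(record.url)
        if status is None:
            broken.append(BrokenLink(record, "unreachable"))
        elif status >= 400:
            broken.append(BrokenLink(record, f"HTTP {status}"))
    return broken


def format_report(broken: Iterable[BrokenLink]) -> str:
    """One tab-separated line per broken link, with its metadata."""
    return "\n".join(
        f"{b.record.category}\t{b.record.url}\t{b.reason}\t(from {b.record.source})"
        for b in broken
    )
```

The fetch_status parameter is there so the check can be run against a fake fetcher in tests, or swapped for a politer crawler later. Some servers reject HEAD requests, so a real deployment would probably want a GET fallback before declaring a link broken.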

Brainstorm & To Do List

  • on annotated dbworld, provide an RSS feed so that someone can say "whenever this person posts, please alert me".
  • for every single paper, provide a short BibTeX entry that can be used in papers
  • try to expand DBLife: monitor for new URLs; given a new URL P, show it to users and ask: is this URL database related? should it be added to the system?
  • provide a feature for collaborative curation of the data, so that the developers or volunteers can help curate it (see, e.g., Peter Buneman's work)
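
A minimal sketch of the "short BibTeX entry per paper" idea, in Python. The Paper record, the cite-key scheme (first author's last name plus year), and the choice of @inproceedings are all assumptions made for illustration; the real entries would come from whatever fields dblife extracts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Paper:
    authors: List[str]  # "First Last" strings, assumed non-empty
    title: str
    venue: str
    year: int


def cite_key(paper: Paper) -> str:
    """Hypothetical key scheme: first author's last name plus year."""
    last_name = paper.authors[0].split()[-1]
    return f"{last_name}{paper.year}".lower()


def short_bibtex(paper: Paper) -> str:
    """Render a compact BibTeX entry with only the fields most papers need."""
    authors = " and ".join(paper.authors)
    return (
        f"@inproceedings{{{cite_key(paper)},\n"
        f"  author = {{{authors}}},\n"
        f"  title = {{{paper.title}}},\n"
        f"  booktitle = {{{paper.venue}}},\n"
        f"  year = {{{paper.year}}}\n"
        f"}}"
    )


# Made-up paper, for illustration only.
example = Paper(["Ada Example", "Bob Sample"], "Toy Title", "Example Conf", 2007)
print(short_bibtex(example))
```

Two papers by the same first author in the same year would collide under this key scheme, so a real version would need a disambiguating suffix.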

  • Several major things that we haven't done well:
    • newsletter
    • detect when a mention has changed
    • keyword search
    • disambiguation

System Architecture

  • crawl raw data / create crutches / create ER
    • how to do this efficiently, extend ER
  • services on top of ER
    • contextualized, alert, query, BibTeX, etc. services
  • provenance, justification, feedback
  • personalization: each user has their own account and is free to do certain things
  • mass collaboration
    • implicit feedback, explicit feedback, volunteer (ads)

