Connect the Dots
From LoTVWiki
Introduction
Connect-the-Dots (CtD) is a tool (or set of tools) that makes it easier to create associations between chunks of information, and to navigate that information in terms of those associations. CtD provides a partial shift in control from the creator of content to the users of that content.
This document describes what exactly we are trying to accomplish with Connect the Dots and how we hope to accomplish it.
One way to think about CtD is as the next step in the evolution of online content. First there were BBS sites; then we had Telnet and Gopher. Next came the first basic Web, built on HTTP and HTML, which was refined over time, adding multimedia, advertising, spam, and endless recursive porn links. Search engines stepped into the fray to help users find what they want, and meta-web efforts launched around the globe to try to make sense of the mess. Connect the Dots steps in and gives users more control, and the information web begins to really look like a web -- with users connecting the dots.
The Problem
There is a lot of information in the world, and a growing percentage of it is available online in the form of Internet web pages. From the very beginnings of the ‘net, developers have been working on ways to help people find what they are looking for amongst the vast sprawl of data. As it exists now, the web provides a means to navigate from one page of content to another via hyper-links. This is a good way to navigate within one local area of ‘net space, but it is difficult to travel very far in this one-step-at-a-time manner. Another difficulty is the rigid nature of these links -- a link is created by the content owner, and you are at their mercy as to its quality or utility.
Search engines (e.g. Google) have stepped in to address this limitation and give us the ability to look for pages that match our interests. They do this by automatically (in most cases) searching the Internet and trying to determine what each page is “about”. This is complicated by the limits of machine understanding, and by page storage mechanisms (e.g. databases and dynamic content) that may elude the search engines entirely.
For the last decade or so there has been an effort to encourage people to provide internal semantic content on their web pages, to make pages easier to understand and locate. However, this requires content owners to do more work than they do now, and it also trusts those content owners to be honest and do a good job of it.
Other developers (e.g. Del.icio.us) have done an excellent job of allowing users to provide their own semantic content for web pages, albeit in a very narrow sense, providing a database of one-word content “tags” associated with pages on the ‘net. Users can tag pages for their own use or for other people, share tags, and generally enrich the semantic content of the web pages they enjoy. However, searching or filtering web pages is a one-layer proposition (e.g. you can see all web pages for “high voltage” but you cannot then see the subset of pages “tesla” or “cockcroft-walton” beneath that). Tags also do not allow us to comment on the links between pages, or to describe the nature of the link between two pages.
On a different front, developers have created a solution to allow users to dive into the content and edit it (e.g. Wikis, such as MediaWiki, used behind Wikipedia), opening up content creation from a few “elite” users to anyone with a keyboard and a ‘net connection. Of course, this editability only applies to select content areas and users are still limited by the navigational system of hyper-links.
Information does not stand alone in a void, but exists in reference to related information. Also, while one information provider may be very good at some aspect of their information, they will not have (or, if they do, may not provide) all of the information someone may be looking for.
It seems that all of the efforts to date, all of the search engines, tag databases, and Wiki interfaces, strive towards a common goal: to give Internet users the ability to structure the information they find, to modify this information, and to extend it in various ways.
One area where nobody has broken through to the mainstream is filtering (e.g. the Slashdot moderation model), where the Internet user can choose not to see low-quality content, or content that fails some valuation. Users are sometimes stuck wading through page after page of junk in order to find the gem of information they desire.
Here are some explicit examples where enhanced linking, tagging, and filtering of information can come in handy:
- The US Federal Budget is a complex web of bills, measures, sponsors, and lobbyists, all working in a miasma of obfuscation. An information management tool to tie them all together could shine a bright light on the inner workings of government.
- Laws, bills, and many things legal are an impenetrable mass of raw data, which begins to make sense only when interpreted in light of case law and various legal precedents and principles. A structural and informational overlay on the legal system could make it easier to understand and navigate.
- Help systems and online tutorials could be organized by this information management tool, giving them structure and allowing the users of these systems to modify and extend them in ways that the original creators might not envision.
- Craftspeople and hobbyists of all stripes (e.g. Makers) have a number of excellent lists of project information, and can find all kinds of theoretical knowledge and material suppliers through various websites (e.g. Wikipedia, Instructables) and by using search engines. However, an information tool like the one described here could put an organizational overlay on it all, linking practical projects with theoretical discussions and with recommended suppliers, all filtered by level of difficulty, tools used, or general usability rating.
The Solution
While the Connect the Dots system is a new thing in and of itself, sometimes it is easier to describe new things in terms of the familiar. So let’s do that.
CtD is a database that can hold references to content (transclusions, in some terminologies, or simply URLs to pages) plus actual content. Where CtD holds actual content, that information is editable by the users of the system (e.g. like MediaWiki).
Associations are considered to be “real things” in CtD: users can create links between existing content pages, and those links are themselves stored as content in the database.
User identities are also a form of content. CtD allows users to associate content tags with web pages (e.g. like Del.icio.us), as well as to tag the links between web pages, and even users. Tags can be simple names (“HighVoltage”) or simple names that are better suited to associations (“Theory” and “Instructions”). Tags can also carry positive or negative valuations or ratings (“Sucks” -1, “Excellent” +1), which should be familiar to Slashdot users. Finally, tags could even attach name:value semantic fields to content, and in fact some semantic fields will be created behind the scenes (e.g. creation and editing users and timestamps).
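The pieces described so far (content, users, associations as first-class items, and tags with optional ratings or name:value fields) could be modeled roughly as follows. This is only a sketch under stated assumptions: the class names, field names, and the example URL are hypothetical and are not part of any existing CtD design.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tag:
    name: str                    # simple name, e.g. "HighVoltage" or "Theory"
    rating: int = 0              # valuation, e.g. +1 for "Excellent", -1 for "Sucks"
    value: Optional[str] = None  # set for name:value semantic fields


@dataclass
class Node:
    """Anything that can carry tags: content, a user, or an association."""
    node_id: str
    tags: List[Tag] = field(default_factory=list)


@dataclass
class Content(Node):
    url: Optional[str] = None    # reference to an external page
    body: Optional[str] = None   # content held (and editable) inside CtD


@dataclass
class User(Node):
    display_name: str = ""


@dataclass
class Association(Node):
    source_id: str = ""          # node the link starts from
    target_id: str = ""          # node the link points to


def score(node: Node) -> int:
    """Sum the ratings of all tags attached to a node."""
    return sum(tag.rating for tag in node.tags)


# Hypothetical example data.
theory = Content("page:theory", url="http://example.org/tesla-theory")
howto = Content("page:howto", body="Wind the secondary coil first.")
link = Association("assoc:1", source_id=theory.node_id, target_id=howto.node_id)
link.tags.append(Tag("Instructions"))            # a tag on the link itself
howto.tags.append(Tag("Excellent", rating=+1))   # a rating tag
howto.tags.append(Tag("skill", value="soldering"))  # a name:value field
print(score(howto))  # 1
```

Because an association is just another kind of node in this sketch, it can be tagged and rated in the same way as a page or a user.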
A tag with content attached looks a lot like a comment or kibitz. Or, commentary could be created in Wiki form as a separate page and tied to the comment’s target with an association.
CtD acts like a content management tool (e.g. Perforce, CVS), applying content control and versioning to its data.
When you browse the Internet using CtD, it looks exactly like your current web browser because it is your current web browser, at least until you decide to annotate the web (e.g. like Del.icio.us) or step back to view the connections. Then, in the Dots viewing mode, it looks a lot like a Flash or JavaScript application running in your web browser. Future versions might be stand-alone applications or variations on browser technology.
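As a rough illustration of what versioning could mean for content held in CtD, the sketch below keeps an append-only list of revisions, each recording its author and timestamp. It is far simpler than Perforce or CVS, and the names `Revision`, `VersionedPage`, `edit`, and `revert_to` are hypothetical, not a proposed API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Revision:
    number: int
    author: str
    text: str
    # Recorded behind the scenes, like the editing-user and timestamp fields.
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class VersionedPage:
    revisions: List[Revision] = field(default_factory=list)

    def edit(self, author: str, text: str) -> Revision:
        """Append a new revision; earlier revisions are never modified."""
        rev = Revision(len(self.revisions) + 1, author, text)
        self.revisions.append(rev)
        return rev

    def current(self) -> str:
        """Return the latest text, or an empty string for a new page."""
        return self.revisions[-1].text if self.revisions else ""

    def revert_to(self, number: int, author: str) -> Revision:
        """Restore an old revision by re-saving its text as a new revision."""
        if not 1 <= number <= len(self.revisions):
            raise IndexError(f"no revision {number}")
        return self.edit(author, self.revisions[number - 1].text)


# Hypothetical example: two edits, then a revert to the first.
page = VersionedPage()
page.edit("alice", "Use 30 AWG wire.")
page.edit("bob", "Use 10 AWG wire.")
page.revert_to(1, "carol")
print(page.current())        # Use 30 AWG wire.
print(len(page.revisions))   # 3
```

Reverting by appending, rather than deleting, keeps the full history visible, which matters when anyone with a keyboard can edit.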
CtD lets you search for information using tag-based step searches, and it also lets you filter content based on various criteria (not unlike Slashdot’s moderation system).
This last bit needs further illustration in an example. If you don’t want a project that requires soldering, for example, filter against “soldering” as a tag or “skill:soldering” as a semantic attribute. If you want to learn to solder, search for soldering tags or skills, or even for associations such as “HowTo” that point to “soldering”-tagged pages.
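The soldering example can be sketched in a few lines. This is a toy under stated assumptions: pages are represented as plain sets of tag strings, associations as (label, source, target) tuples, and the page names and helper functions `step_search`, `filter_out`, and `pointing_to` are invented for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

Pages = Dict[str, Set[str]]          # page name -> tags on that page
Assoc = Tuple[str, str, str]         # (label, source page, target page)


def step_search(pages: Pages, steps: Iterable[str]) -> List[str]:
    """Narrow the result set one tag at a time ("high voltage", then "tesla")."""
    remaining = dict(pages)
    for tag in steps:
        remaining = {name: tags for name, tags in remaining.items() if tag in tags}
    return sorted(remaining)


def filter_out(pages: Pages, unwanted: Iterable[str]) -> Pages:
    """Hide every page that carries any of the unwanted tags."""
    blocked = set(unwanted)
    return {name: tags for name, tags in pages.items() if not (tags & blocked)}


def pointing_to(assocs: Iterable[Assoc], label: str,
                pages: Pages, target_tag: str) -> List[Assoc]:
    """Find associations with a given label whose target page has target_tag."""
    return [a for a in assocs
            if a[0] == label and target_tag in pages.get(a[2], set())]


# Hypothetical example data.
PAGES: Pages = {
    "tesla-coil-build": {"high voltage", "tesla", "skill:soldering"},
    "cw-multiplier-theory": {"high voltage", "cockcroft-walton", "Theory"},
    "soldering-basics": {"soldering"},
}
ASSOCS = [("HowTo", "tesla-coil-build", "soldering-basics"),
          ("Theory", "tesla-coil-build", "cw-multiplier-theory")]

print(step_search(PAGES, ["high voltage", "tesla"]))
# ['tesla-coil-build']
print(sorted(filter_out(PAGES, ["soldering", "skill:soldering"])))
# ['cw-multiplier-theory']
print(pointing_to(ASSOCS, "HowTo", PAGES, "soldering"))
# [('HowTo', 'tesla-coil-build', 'soldering-basics')]
```

The step search is what a single layer of tags cannot offer: each step works on the results of the previous one, so “tesla” is searched only within “high voltage”.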
Internally, CtD is a database. That database could reside on any number of online servers, such as MakerBrain.com, or it could be a private database on a user’s machine.
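For the private-database case, a single-file store such as SQLite would be one possible fit. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table and column names are assumptions made for illustration, not the formal database structure that still has to be designed.

```python
import sqlite3

# Hypothetical schema: every item is a node; associations and tags refer to nodes.
SCHEMA = """
CREATE TABLE IF NOT EXISTS node (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL CHECK (kind IN ('content', 'user', 'association')),
    url  TEXT,
    body TEXT
);
CREATE TABLE IF NOT EXISTS association (
    node_id   INTEGER PRIMARY KEY REFERENCES node(id),
    source_id INTEGER NOT NULL REFERENCES node(id),
    target_id INTEGER NOT NULL REFERENCES node(id)
);
CREATE TABLE IF NOT EXISTS tag (
    node_id INTEGER NOT NULL REFERENCES node(id),
    name    TEXT NOT NULL,
    value   TEXT,
    rating  INTEGER NOT NULL DEFAULT 0
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store; pass a file path for a private on-disk database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_node(conn, kind, url=None, body=None) -> int:
    cur = conn.execute(
        "INSERT INTO node (kind, url, body) VALUES (?, ?, ?)", (kind, url, body))
    return cur.lastrowid


def link(conn, source_id: int, target_id: int) -> int:
    """Create an association, which is itself a node and so can be tagged."""
    node_id = add_node(conn, "association")
    conn.execute(
        "INSERT INTO association (node_id, source_id, target_id) VALUES (?, ?, ?)",
        (node_id, source_id, target_id))
    return node_id


def add_tag(conn, node_id: int, name: str, value=None, rating: int = 0) -> None:
    conn.execute(
        "INSERT INTO tag (node_id, name, value, rating) VALUES (?, ?, ?, ?)",
        (node_id, name, value, rating))


# Hypothetical example: tag the link between two pages.
conn = open_store()
theory = add_node(conn, "content", url="http://example.org/theory")
project = add_node(conn, "content", body="Build notes go here.")
edge = link(conn, theory, project)
add_tag(conn, edge, "Instructions")
rows = conn.execute(
    "SELECT n.kind FROM tag t JOIN node n ON n.id = t.node_id WHERE t.name = ?",
    ("Instructions",)).fetchall()
print(rows)  # [('association',)]
```

The same schema could in principle sit behind an online server; only the connection in `open_store` would change.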
Next Steps
While many pieces of CtD already exist in the world, and some of these pieces even exist in an open-source arena, creating this tool requires careful design work and skilled implementation; and these, in order to be reliably performed, require dedicated resources. And these take time and money.
1. Formalize the design of CtD, in terms of:
a) Database structure (which must be extensible, flexible, and have a published API in order to encourage third-party support).
b) Capabilities and functional requirements.
c) Initial user interface design for the Dots browser, the tagging and commentary system, and filter and search management.
d) Identity management.
2. Implement a bare-bones prototype with the necessary core concepts:
a) Content transclusion and at least limited sourcing
b) Tagging and Associations
c) Dots browser
3. Distribute the prototype for several test cases and usability studies.
4. Using feedback, re-design the nasty system that we were embarrassed to have sent out.
5. Re-implement it as a production system.
6. Send it out into the world in its various guises.
