Finding what you need

30 April 2002

Two comments from class members to get me started on today's subject:

The major step at finding a solution to a problem is knowing how to get started. (JM)

To be able to give a computer a problem and a stack of research articles and expect some solution seems to me an incredible direction for CS research. (TR)

...and one allusion from cyberpunk:
wetware /wet'weir/ /n./
[prob. from the novels of Rudy Rucker] 1. The human nervous system, as opposed to computer hardware or software. "Wetware has 7 plus or minus 2 temporary registers." 2. Human beings (programmers, operators, administrators) attached to a computer system, as opposed to the system's hardware or software. See liveware, meatware. [from The New Hacker's Dictionary --see www.dromo.com/fusionanomaly/wetware.html for one version]

What's the Question?

Maybe it's completely obvious, but I keep rediscovering that there's often more challenge to posing the right questions than to finding the answers once the questions are clear. In any case, while I'll mostly be talking about resources that can be useful in tracking down what others have written, I want to keep the focus clearly on (a) why we go looking for the work of others, and (b) what we should do with what we find.

It's obvious that ideas have pedigrees and histories... though sometimes (and perhaps often at the outer edges of technologies) it looks like ideas come 'out of nowhere'. Science is (and probably has to be) a collaborative activity, and is quintessentially a communication activity: people make their findings public in a number of ways, and for a number of purposes. That said, there are substantial sectors of what we might call 'practical science' (or maybe 'industry') which involve secrecy, and in which communication is circumscribed. Computer Science exhibits both patterns, probably more than any other science except perhaps that surrounding pharmaceuticals.

Anyway, there are several different information media that we need to deal with, each of which has peculiarities and conventions, advantages and limitations:

Books get taken for granted and even overlooked --very few of the books 'about' Computer Science are ever checked out of the library, and it's my impression that most of what we have is largely irrelevant to CS courses. Now WHY is this the case (or am I wrong)? Consider what Annie tells us about 4 topics mentioned by people in the class as candidates for our attention:

KW 'artificial intelligence' gets me 389 hits in an Annie search; a SUBJECT search gets me 342 entries, and 18 'related subjects'

KW 'neural networks' gets me 83 hits, a SUBJECT search gets 45 entries

KW 'expert systems' gets 88 hits; a SUBJECT search gets 70 entries and 3 'related subjects'

KW 'natural language processing' gets 22 hits and SUBJECT gets 11 --including an e-journal called Natural language engineering

Let's look at one of these: Bioinformatics : the machine learning approach (Baldi and Brunak, MIT Press, 2001)
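The gap between the KW and SUBJECT counts above is worth a moment. A keyword search generally matches words anywhere in a catalog record, while a subject search matches only the subject headings a cataloger has assigned. Here is a minimal sketch of that difference in Python. Everything in it is an assumption for illustration: the Record structure, the two search functions, and the three toy records are made up, and this is not how Annie itself is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    # Hypothetical, much-simplified catalog record (not Annie's real format).
    title: str
    subjects: list = field(default_factory=list)
    notes: str = ""


def keyword_search(records, phrase):
    """Return records where the phrase appears anywhere: title, subjects, or notes."""
    needle = phrase.lower()
    return [
        r for r in records
        if needle in " ".join([r.title, *r.subjects, r.notes]).lower()
    ]


def subject_search(records, phrase):
    """Return records that have an assigned subject heading beginning with the phrase."""
    needle = phrase.lower()
    return [
        r for r in records
        if any(s.lower().startswith(needle) for s in r.subjects)
    ]


# Invented records, for illustration only.
CATALOG = [
    Record("Introduction to neural computation",
           ["Neural networks (Computer science)"]),
    Record("Machine learning for biologists",
           ["Bioinformatics"],
           "Covers neural networks and hidden Markov models."),
    Record("A history of the abacus",
           ["Calculators -- History"]),
]

kw_hits = keyword_search(CATALOG, "neural networks")
subj_hits = subject_search(CATALOG, "neural networks")
print(len(kw_hits), "keyword hits;", len(subj_hits), "subject entries")
```

In this toy catalog the second record turns up only in the keyword search, because the phrase sits in its notes and not in its subject headings. That is one plausible reason a KW search returns more hits than a SUBJECT search, and also why some of the extra hits are only loosely 'about' the topic.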

Periodicals or journals are another somewhat vexing problem. Look at this listing of what we have in the library... and reflect on which of these you've ever looked at. Again, the question is why... and there are some reasonable answers. But if you DO want to get at the literature, how can you do it? How can you search for "articles about" a subject that you need to research? Some disciplines have standard indexes, the obvious place to start a search on a topic, but CompSci doesn't really --now, why is that? Partly it's a matter of the very broad range of CS, partly a matter of overlaps with, say, math, engineering, etc. But there ARE utilities that can be helpful:

Online... the Web has wrought all sorts of evolutionary changes, still not widely understood. Take a look at these:

Patents combine revealing enough to lay legal claim to originality for an idea, providing intellectual background, and concealing the fine details that might make it possible for a reader to copy the innovation. The intellectual background can be VERY useful.

Conferences are a significant arena for some disciplines. There are two FirstSearch databases that may be worthwhile sources: PapersFirst and Proceedings.

Annual Review-type publications are an important feature of many disciplines, but don't seem to have become established in CS. Why? We DO have Advances in Computers (QA76 .A24), now 55 volumes, and there is a Web page from Univ Trier that seems to offer some electronic access. The publication has recently been acquired by Elsevier...