Collaborative Semantic Analysis.
Nov. 8th, 2004 08:35 pm
I just had a random thought that I figured I should write down before I forget it. When _sps_ was here on Saturday, I mentioned to him a paper I had seen on the difficulty of storing and retrieving scientific papers that are relevant to a field of research. It has gotten so bad in the field of mathematics that it is now often easier to spend a year re-solving a tricky mathematical problem than it is to find an existing paper with the solution. There is a (woefully underfunded) institute that tries to produce a controlled-vocabulary description of the semantic elements in new papers, and record them. They keep falling further and further behind.
Anyway, _sps_ had some not-unreasonable ideas on how to encode useful indexes of these math papers so that relevant materials could be searched for. The big question is: how do you do the semantic analysis? For something like math, you need a human, and one who understands the math as well. Plus, it would help if they just happened to know of all of the other bits of math that the paper overlaps with, even if they are in other fields and use different nomenclature.
Anyway, it suddenly occurred to me that it might be possible (I'm not sure how) to design a mathematics-paper search engine and browser which had the express purpose of eliciting from a mathematician information about the nature of the paper being studied, and how closely its contents matched that mathematician's current work. This would be done, not by asking questions, but by allowing the mathematician to categorize his searches by project, and by paying attention to how long he spent studying various sections of the paper. As well, if we provided various renaming and renomenclaturing systems, we might get further information by observing the transformations that were performed on the paper.
In the end, I would hope the gathered data from a large number of mathematicians could be used to build a fuzzy index of any given paper, and to let us build a map of which things seemed to be close to each other in a semantic space. I don't know, ultimately, how well such a system would work, but I think it would be worth giving it a try.
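To make the "fuzzy index" idea a bit more concrete, here is a rough sketch of one way it might work. Everything in it is an assumption on my part: the event format (project, paper, seconds spent reading), the function names, and the choice of cosine similarity as the measure of closeness are all made up for illustration, not taken from any existing system.

```python
from collections import defaultdict
from math import sqrt


def build_profiles(events):
    """Sum reading time per paper, broken down by project.

    `events` is a hypothetical log of (project, paper, seconds) records.
    The result maps paper -> {project: total seconds}.
    """
    profiles = defaultdict(lambda: defaultdict(float))
    for project, paper, seconds in events:
        profiles[paper][project] += seconds
    return profiles


def cosine(a, b):
    """Cosine similarity between two sparse {project: seconds} vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(profiles, paper):
    """Rank every other paper by similarity to `paper`, closest first."""
    scored = [
        (cosine(profiles[paper], profiles[other]), other)
        for other in list(profiles)
        if other != paper
    ]
    return sorted(scored, reverse=True)


# Made-up sample data: two papers read for the same project, one for another.
events = [
    ("proj_a", "paper1", 600),
    ("proj_a", "paper2", 300),
    ("proj_b", "paper3", 500),
]
profiles = build_profiles(events)
ranking = nearest(profiles, "paper1")
```

With this toy data, the two papers read under the same project come out as neighbours and the third does not. A real system would presumably need far more signal than raw reading time (per-section dwell times, the renaming transformations, and so on), so treat this only as a picture of the general shape.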
Re: Interesting problem of taxonomy
Date: 2004-11-09 09:22 am (UTC)