Hi, I am an information scientist at UC Data Archive & Technical Assistance (UC DATA) within the Institute for the Study of Societal Issues (ISSI), University of California, Berkeley.
Recent stuff:
In July 2009 I co-chaired (with Jussi Karlgren of the Swedish Institute of Computer Science and Noriko Kando of the National Institute of Informatics, Tokyo) the SIGIR 2009 workshop Information Access in a Multilingual World, held in Boston; the workshop website is here. My workshop paper “Romanization – An Untapped Resource for Out-of-Vocabulary Machine Translation for Cross-Language Information Retrieval” may be found here.
In May 2009 I presented on the topic of “Combining Statistics and Text for a View of Irish Cultural Heritage” at the IASSIST 2009 conference in Tampere, Finland. The presentation can be found here. A paper on the topic is being written.
In fall 2008 I was a visiting scholar at the Institute of Information Science and Speech Technology at the University of Hildesheim in northern Germany. My hosts were Professors Thomas Mandl and Christa Womser-Hacker. I consulted with graduate students and gave lectures on “History As Events In Time And Space: Biography In Context”, “Multi-genre Search Using Common Geography”, and “How can you use a Japanese-English Technical Lexicon for Phonetic Matching?” (the last of these was presented on Oct 7, 2008 at the German “Lernen, Wissen & Adaptivität” conference at Würzburg University).
During the summer of 2007 I was a visiting researcher in Tokyo at the National Institute of Informatics working on the creation of a bilingual Japanese-English technical lexicon (dictionary) from the NTCIR scientific text collections, NTCIR-1 and NTCIR-2. My final presentation may be found here. A paper on the project was presented at the Language Resources and Evaluation Conference (LREC 2008) in Morocco in late May 2008. The lexicon can be found here.
Somewhat older stuff:
Statistical and Bibliographic resources
I am co-PI on a grant, “Context and Relationships: Ireland and Irish Studies”, from the National Endowment for the Humanities and the Institute of Museum and Library Services (Oct 2007-Sept 2010). The UC Berkeley press release about the project can be found here (15 November 2007, UC Berkeley NewsCenter).
Current grant home pages:
Even older, but still neat stuff:
1. Geotemporal Search of Russian and Hindi document collections (uses TimeMap interface), writeup found here.
2. California Latino Demographic Data Book, 2004
4. Language distribution of the Web