1
|
- Chu-Ren Huang (Academia Sinica), Feng-ju Lo (Yuan Ze University), Ru-Yng
Chang (Academia Sinica), Sueming Chang (Academia Sinica)
|
2
|
- Language and Knowledge
- Ontology
- WordNet
- The Academia Sinica Bilingual Ontological Wordnet
- BOW and Tang Knowledgebase
|
3
|
- 語言的功能為何?
- What is the ultimate function of Language?
- To communicate: to send and receive information (and misinformation)
- 溝通:傳送與接收訊息
|
4
|
- Language encodes information, and decodes knowledge (= acquired
information)
- 溝通包括了訊息的傳遞與知識的接收
- Language mediates between what we know (word knowledge) and what is there
to be known (world knowledge)
- 語言是個人知識與天下知識間的媒介
- A knowledge-based lexicalist view
|
5
|
- By adopting a lexicalist approach, we assume:
- 1. That a person’s lexicon contains a set of conventionalized terms
which are conceptual atoms to that person.
- 2. That knowledge and conceptual structure are lexically accessed.
|
6
|
- A speaker’s word knowledge is the comprehensive interface to his/her world
knowledge
- → The lexical knowledge in Tang texts offers a window onto the knowledge
of Tang civilization
|
7
|
- Ontology 本體論 (in philosophy, the original meaning)
- A theory/description of the nature of existence, which predicts what
types of things exist.
- Ontology 知識本體
- (In computer science, the meaning in use)
- A document defining all (conceptual) terms in a system by describing all
their relations.
- An ontology usually contains a taxonomy of elements, as well as some
inference rules.
|
8
|
- Ontological Variations
- Sources: culture, domain, environment, ethnicity, media, science,
society…
- Instantiation and Representation: Shared linguistic use; i.e.
Sub-language and Sub-lexicon
- → An ontology of Tang civilization will shed light on the knowledge
system of Tang.
|
9
|
- SUMO Atoms
- Concepts: around 1000
- note that concepts are not necessarily linguistically realized
- Relations (ISA): See SUMO Graph
- Axioms: for inference
|
10
|
- Entity
  - Physical
    - Object
      - SelfConnectedObject
      - Region
      - Collection
      - Agent
    - Process
  - Abstract
    - SetOrClass
    - Relation
    - Quantity
    - Attribute
    - Proposition
    - Graph
    - GraphElement
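
The node list above can be read as an ISA hierarchy. The following is a minimal sketch of that reading, assuming the parent-child nesting of the published SUMO top level (the slide text itself does not preserve the indentation). The names `PARENT`, `ancestors`, and `is_a` are illustrative and not part of SUMO.

```python
# Child -> parent links for the top-level nodes listed on this slide.
# The nesting is an assumption taken from the published SUMO top level.
PARENT = {
    "Physical": "Entity",
    "Abstract": "Entity",
    "Object": "Physical",
    "Process": "Physical",
    "SelfConnectedObject": "Object",
    "Region": "Object",
    "Collection": "Object",
    "Agent": "Object",
    "SetOrClass": "Abstract",
    "Relation": "Abstract",
    "Quantity": "Abstract",
    "Attribute": "Abstract",
    "Proposition": "Abstract",
    "Graph": "Abstract",
    "GraphElement": "Abstract",
}


def ancestors(node):
    """Return the chain of ISA parents from `node` up to the root."""
    chain = []
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def is_a(node, candidate):
    """True if `node` is `candidate` or inherits from it through ISA links."""
    return node == candidate or candidate in ancestors(node)


print(ancestors("Agent"))
print(is_a("Graph", "Abstract"))
```

Storing only child-to-parent links keeps the taxonomy compact; an ISA query is then a walk up to the root.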
|
11
|
- LingOnto: The conceptual structure that all speakers of a language
implicitly adopt
- Each major language has a LingOnto with more expert users than any
constructed ontology
- A linguistic ontology can be successfully constructed from a
(sub-)language with good language technology (LT)
- Linguistic ontologies can be used to signal and detect language
variations and changes
- (The linguistic anchoring project of Taiwan’s NDAP)
|
12
|
- WordNet 1.7 詞彙網路 (1990)
- www.cogsci.princeton.edu/~wn/
- Monolingual: English
- SUMO: Suggested Upper Merged Ontology
- http://ontology.teknowledge.com
- Open resource created under an initiative from IEEE Standard Upper
Ontology Working Group
|
13
|
- Wordnet Atoms
- Formal Atoms: Lemmas
- Content Atoms: Senses
- Relation Atoms:
- Lexical Semantic Relations (LSR)
|
14
|
- Concept-driven: All words sharing the same sense form a synset 同義詞集
- Each synset is an instance of linguistic conceptualization
- (Note that a word is a unique lemma-sense pair)
- Relation-based: A language wordnet is the network formed by
instantiating all LSRs between synset pairs
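
The two organizing principles can be illustrated with a toy data structure. This is only a sketch under stated assumptions: the lemmas, sense identifiers, and the single relation edge are invented for illustration, and `build_synsets` and `related` are hypothetical helper names, not WordNet API calls.

```python
from collections import defaultdict

# Each word is a (lemma, sense id) pair. The data is invented for illustration.
WORDS = [
    ("car", "v1"),
    ("automobile", "v1"),
    ("vehicle", "v2"),
    ("bank", "b1"),  # same lemma, two senses: two distinct words
    ("bank", "b2"),
]

# Lexical semantic relations as (source synset, relation, target synset).
LSR = [("v1", "hypernym", "v2")]


def build_synsets(words):
    """Group lemmas by sense: all words sharing a sense form one synset."""
    synsets = defaultdict(set)
    for lemma, sense in words:
        synsets[sense].add(lemma)
    return dict(synsets)


def related(sense, relation, edges=LSR):
    """Return the synsets reachable from `sense` through `relation`."""
    return [target for source, rel, target in edges
            if source == sense and rel == relation]


synsets = build_synsets(WORDS)
print(synsets["v1"])
print(related("v1", "hypernym"))
```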
|
15
|
- Synset: Lexicon-driven concept identification
- LSR: Lexically entailed knowledge inference
|
16
|
- antonymy 反義關係
- hypernymy 上位關係
- hyponymy 下位關係
- holonymy 整體-部份關係
- meronymy 部份-整體關係
- metonymy 轉指關係
- near-synonymy 近義關係
- synonymy 同義關係
- troponymy 方式關係
|
17
|
- Sinica BOW
- http://BOW.sinica.edu.tw/
- Towards a linguistic infrastructure for knowledge representation and
knowledge engineering
- English-Chinese translation equivalence database
|
18
|
- English-Chinese Translation Equivalents Database
  - Includes all WordNet entries
  - Manually checked, with up to 3 Chinese translations for each English entry
- SUMO as upper ontology
  - Chinese-English bilingual ontology nodes
  - Lexical-conceptual links
- Domain tags (under construction)
|
20
|
- To give each linguistic form a rigorous conceptual location
- To clarify the relation between conceptual classification and linguistic
instantiation
- To facilitate genuine cross-lingual access to knowledge
- To provide the basic infrastructure for constructing bilingual ontologies
for different domains
|
21
|
- Over 1,000 conceptual nodes (from SUMO, in English and Chinese)
- Over 100,000 English synsets (each containing one or more lemmas, from
WordNet)
- Chinese translation equivalents for each lemma in WordNet
- Links from each lexical lemma to a conceptual node in the ontology
(English and Chinese)
|
22
|
- Sense-based English-Chinese translation equivalency
- From English Word-Sense to Ontology and Inference
- From Chinese Word to Ontology and Inference
- From Word-Sense to Domain (to be completed)
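
The access paths listed above can be pictured as lookups over one sense-keyed table. The sketch below is an assumption about the general shape only: the sense identifiers, translations, and node names are illustrative, and the functions are hypothetical, not the Sinica BOW interface.

```python
# Hypothetical miniature of a sense-keyed bilingual table. Sense ids,
# translations and node names are illustrative, not records from Sinica BOW.
ENTRIES = {
    "tiger#1": {"lemma": "tiger", "zh": ["虎", "老虎"], "node": "Feline"},
    "wolf#1": {"lemma": "wolf", "zh": ["狼"], "node": "Canine"},
}


def translations(sense_id):
    """Sense-based English-to-Chinese translation equivalents."""
    return list(ENTRIES[sense_id]["zh"])


def node_for_english_sense(sense_id):
    """From an English word sense to its ontology node."""
    return ENTRIES[sense_id]["node"]


def nodes_for_chinese_word(word):
    """From a Chinese word to every ontology node it is linked to."""
    return sorted({entry["node"] for entry in ENTRIES.values()
                   if word in entry["zh"]})


print(translations("tiger#1"))
print(nodes_for_chinese_word("狼"))
```

Keying the table by word sense rather than by lemma is what lets one English lemma map to different Chinese equivalents and different ontology nodes.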
|
23
|
- The Shakespearean-Garden Approach
- In a Shakespearean garden, all Shakespearean plants (i.e. the collection
of all plants that were referred to in Shakespeare) are grown together.
- It replicates the botanical knowledge and the experience of flora in
Shakespearean England
- We propose to collect domain lexicons from an archive and to ‘grow’ them
into an ontology.
|
24
|
- The archive of the Three Hundred Tang Poems (Tang 300)
- All lexical items are tokenized and categorized
- Three special sets of lexemes are collected to represent knowledge of
three different domains
|
25
|
- Translations into English are obtained through Sinica BOW and other
references
- Direct links to SUMO conceptual nodes are made through BOW
- A sub-ontology is built from these links to attest to the fauna,
flora, and artifacts of Tang
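
One way to picture how linked terms are grown into a sub-ontology: keep only the nodes that have attested terms, plus their ancestors. This is a sketch under assumptions, not the procedure actually used for Tang 300; the parent links, node names, and `grow_subontology` are hypothetical.

```python
# Hypothetical child -> parent links; node names are illustrative.
PARENT = {
    "Mammal": "Animal",
    "Bird": "Animal",
    "Feline": "Mammal",
    "Canine": "Mammal",
}

# Hypothetical term -> node links, as they might come out of the linking step.
LINKS = {"虎": "Feline", "狼": "Canine", "犬": "Canine"}


def grow_subontology(links, parent):
    """Keep each linked node and its ancestors; unattested nodes are dropped."""
    nodes = {}
    for term, node in links.items():
        nodes.setdefault(node, []).append(term)
        while node in parent:
            node = parent[node]
            nodes.setdefault(node, [])
    return nodes


sub = grow_subontology(LINKS, PARENT)
for name, terms in sub.items():
    print(name, terms)
```

Nodes with no attested term and no attested descendant ("Bird" in this toy data) do not appear in the result, so gaps in the archive show up as gaps in the sub-ontology.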
|
26
|
- 53 animals are referred to in Tang 300. 23 of them are birds.
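- 水棲哺乳動物 (aquatic mammals): *no marine mammals
- 有蹄哺乳動物 (hoofed mammals): 馬 horse, 牛 ox, 羊 sheep, 駱駝 camel, 斑騅 piebald horse, 鹿 deer
- 有袋類 (marsupials): *no marsupials
- 肉食性動物 (carnivores): 羆 brown bear, 熊 bear
- 犬科動物 (canines): 狼 wolf, 豺 dhole, 犬 dog
- 貓科動物 (felines): 貙 lynx, 貔, 虎 tiger
- 囓齒動物 (rodents): 鼯 flying squirrel, 蝙蝠 bat
- 靈長類 (primates)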
|
27
|
- 動物類 Animals in Tang 300
- Following the SUMO/BOW structure
- http://bow.sinica.edu.tw/ont/ts300_ont.html
|
28
|
- Link a term directly to the ontology node if its use in the Tang poem
fits the sense explanation in WordNet
- If a term refers strictly to the material etc. and not to the animal,
it is not put in the animal ontology, e.g. 玳瑁樑 (sea turtle shell, but
see the radical and context).
- However, when a term refers to a non-animal meaning by using knowledge
of the animal, then the concept exists, and the term is added to the
ontology.
- E.g. 雙鯉: a letter, but the term refers to the form of the carp
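
The three linking rules can be restated as a small decision function. This is an illustrative sketch only: the two boolean judgments are assumed to be supplied by a human annotator, and the function name and return labels are hypothetical.

```python
def link_decision(fits_animal_sense, draws_on_animal_knowledge):
    """Decide how a term from the poems is treated in the animal ontology.

    Both arguments are annotator judgments (an assumption of this sketch):
    - fits_animal_sense: the use in the poem fits the WordNet animal sense
    - draws_on_animal_knowledge: a non-animal use that still relies on
      knowledge of the animal (for example its shape)
    """
    if fits_animal_sense:
        return "link"     # rule 1: link directly to the ontology node
    if draws_on_animal_knowledge:
        return "add"      # rule 3: the concept exists, so add the term
    return "exclude"      # rule 2: material only, keep out of the ontology


print(link_decision(True, False))    # e.g. 虎 used for the animal itself
print(link_decision(False, False))   # e.g. 玳瑁樑, the shell as material
print(link_decision(False, True))    # e.g. 雙鯉, a letter in the form of a carp
```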
|
29
|
- No marsupials: they are found mainly in Australia and became known only
much later
- No marine mammals: the activities of Tang civilization stayed mainly on land
- Large number of birds among vertebrates, and the dominance of insects 昆蟲
among invertebrates 無脊椎動物
- Tang civilization’s fascination with flying
- [Birds fly. And insects are the invertebrates that have wings.]
|
30
|
- An Online Ontology for Tang 300
- http://bow.sinica.edu.tw/ont/ts300_ont.html
|
39
|
- This presentation is dedicated to celebrating the formation of the new
Institute of Linguistics, Academia Sinica (formerly a preparatory office)
- Comments are welcome
- churen@sinica.edu.tw
|