A tentative Chinese Dependency Treebank
We are building a tentative Chinese Dependency Treebank to verify the coverage
and capability of a pure dependency grammar in describing contemporary Chinese.
The small treebank will also serve as an experimental resource for syntactic
study and for applications in computational linguistics.
- Current size: 711 sentences, 20034 tokens, POS tagset 24 tags, dependency label tagset 53/34 tags.
- Linguistic format: pure dependency structure.
- License: only for internal use.
- Format:
  - plain text with TAB as separator
  - TIGER-XML
  - Malt-XML
  - MS Access database
- Encoding: GB-2312, Unicode
- Contact:
  - Prof. Dr. LIU Haitao, Applied Linguistics Department, Communication University of China, CN-100024, Beijing, P.R. China.
  - Email: byliuhaitao at cuc dot edu dot cn
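
To illustrate the pure dependency structure and the TAB-separated plain-text format listed above, here is a minimal Python sketch. The column order (index, word form, POS tag, head index, dependency label), the sample sentence, and the tag and label names are all assumptions made for illustration; they are not the treebank's actual layout or tagsets. The sketch reads one sentence and checks that it forms a tree: exactly one root, every head pointing to an existing token, and no cycles.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    idx: int    # 1-based position in the sentence
    form: str   # word form
    pos: str    # part-of-speech tag (hypothetical tag name)
    head: int   # index of the governing token, 0 marks the root
    label: str  # dependency label (hypothetical label name)


def parse_sentence(text: str) -> List[Token]:
    """Parse one sentence from TAB-separated lines (assumed column order)."""
    tokens = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        idx, form, pos, head, label = line.split("\t")
        tokens.append(Token(int(idx), form, pos, int(head), label))
    return tokens


def is_dependency_tree(tokens: List[Token]) -> bool:
    """Return True if the tokens have one root, valid heads, and no cycles."""
    heads = {t.idx: t.head for t in tokens}
    if sum(1 for h in heads.values() if h == 0) != 1:
        return False
    for start in heads:
        seen = set()
        node = start
        # Follow head links upward until the artificial root (0) is reached.
        while node != 0:
            if node in seen or node not in heads:
                return False  # cycle, or head index that does not exist
            seen.add(node)
            node = heads[node]
    return True


# Hypothetical three-token sentence; tags and labels are invented.
SAMPLE = "1\t我\tr\t2\tSUBJ\n2\t爱\tv\t0\tROOT\n3\t北京\tns\t2\tOBJ\n"

tokens = parse_sentence(SAMPLE)
print(is_dependency_tree(tokens))
```

In this representation every token carries exactly one head index and one label, and there are no phrase-level nodes, which is what distinguishes a pure dependency structure from a constituency-based annotation.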