Introduction

Attention

Please use CKIPNLP for structured data types and pipeline drivers.

Contributors

Requirements

Attention

For Python 2 users, please use PyCkip 0.4.2 instead.

CKIPWS (Optional)

CKIPParser (Optional)

  • CKIP Parser Linux version 20190506+ (20190725+ recommended)

Installation

Let <ckipws-linux-root> denote the root path of the CKIPWS Linux version, and <ckipparser-linux-root> the root path of the CKIPParser Linux version.

Install Using Pip

pip install --upgrade ckip-classic
pip install --no-deps --force-reinstall --upgrade ckip-classic \
   --install-option='--ws' \
   --install-option='--ws-dir=<ckipws-linux-root>' \
   --install-option='--parser' \
   --install-option='--parser-dir=<ckipparser-linux-root>'

Omit the ws/parser options if CKIPWS/CKIPParser is not available.

Installation Options

Option                                  Detail                            Default Value
--[no-]ws                               Enable/disable CKIPWS.            False
--[no-]parser                           Enable/disable CKIPParser.        False
--ws-dir=<ws-dir>                       CKIPWS root directory.
--ws-lib-dir=<ws-lib-dir>               CKIPWS libraries directory.       <ws-dir>/lib
--ws-share-dir=<ws-share-dir>           CKIPWS share directory.           <ws-dir>
--parser-dir=<parser-dir>               CKIPParser root directory.
--parser-lib-dir=<parser-lib-dir>       CKIPParser libraries directory.   <parser-dir>/lib
--parser-share-dir=<parser-share-dir>   CKIPParser share directory.       <parser-dir>
--data2-dir=<data2-dir>                 “Data2” directory.                <ws-share-dir>/Data2
--rule-dir=<rule-dir>                   “Rule” directory.                 <parser-share-dir>/Rule
--rdb-dir=<rdb-dir>                     “RDB” directory.                  <parser-share-dir>/RDB
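
The default values above depend on one another (for example, the “Data2” directory follows the CKIPWS share directory, which in turn follows the CKIPWS root). The sketch below only illustrates that cascade; resolve_install_dirs and DEFAULTS are hypothetical names and are not part of the ckip-classic installer.

```python
import posixpath

# (option, option it derives from, sub-directory), copied from the table above.
# Hypothetical helper data; the real installer may resolve paths differently.
DEFAULTS = [
    ('ws-lib-dir', 'ws-dir', 'lib'),
    ('ws-share-dir', 'ws-dir', ''),
    ('parser-lib-dir', 'parser-dir', 'lib'),
    ('parser-share-dir', 'parser-dir', ''),
    ('data2-dir', 'ws-share-dir', 'Data2'),
    ('rule-dir', 'parser-share-dir', 'Rule'),
    ('rdb-dir', 'parser-share-dir', 'RDB'),
]


def resolve_install_dirs(ws_dir=None, parser_dir=None, overrides=None):
    """Return a dict of directories, applying explicit overrides first."""
    overrides = overrides or {}
    dirs = {'ws-dir': ws_dir, 'parser-dir': parser_dir}
    for name, base, suffix in DEFAULTS:
        if name in overrides:
            dirs[name] = overrides[name]       # explicitly given option wins
        elif dirs[base] is None:
            dirs[name] = None                  # base directory was not given
        elif suffix:
            dirs[name] = posixpath.join(dirs[base], suffix)
        else:
            dirs[name] = dirs[base]
    return dirs


for option, path in resolve_install_dirs('/opt/ckipws', '/opt/ckipparser').items():
    print(option, '=', path)
```

Because the share directories are resolved before the directories that depend on them, overriding ws-share-dir in this sketch also moves the default “Data2” directory.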

Usage

See http://ckip-classic.readthedocs.io/ for API details.

CKIPWS

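import ckip_classic.ws
print(ckip_classic.__name__, ckip_classic.__version__)

# Create the word segmenter (see the FAQ: it can only be instantiated once).
ws = ckip_classic.ws.CkipWs(logger=False)

# Segment a single sentence, then a list of sentences.
print(ws('中文字喔'))
for line in ws.apply_list(['中文字喔', '啊哈哈哈']):
    print(line)

# Segment a file: results are written to ofile, and the uw output to uwfile.
ws.apply_file(ifile='sample/sample.txt', ofile='output/sample.tag', uwfile='output/sample.uw')
with open('output/sample.tag') as fin:
    print(fin.read())
with open('output/sample.uw') as fin:
    print(fin.read())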

CKIPParser

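import ckip_classic.parser
print(ckip_classic.__name__, ckip_classic.__version__)

# Create the parser (see the FAQ: it can only be instantiated once).
ps = ckip_classic.parser.CkipParser(logger=False)

# Parse a single sentence, then a list of sentences.
print(ps('中文字喔'))
for line in ps.apply_list(['中文字喔', '啊哈哈哈']):
    print(line)

# Parse a file: results are written to ofile.
ps.apply_file(ifile='sample/sample.txt', ofile='output/sample.tree')
with open('output/sample.tree') as fin:
    print(fin.read())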
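
Since CKIPWS and CKIPParser are both optional, a script may want to check which of the two modules can be imported before using them. The following is a minimal sketch; try_import is a hypothetical helper, and it assumes that a module built without its backend (or with a missing shared library) fails with ImportError, as in the FAQ below.

```python
import importlib


def try_import(module_name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Covers a missing package as well as a missing shared library.
        return None


ws_module = try_import('ckip_classic.ws')
parser_module = try_import('ckip_classic.parser')
print('CKIPWS available:', ws_module is not None)
print('CKIPParser available:', parser_module is not None)
```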

FAQ

Danger

Because of the underlying C implementation, both CkipWs and CkipParser can only be instantiated once.
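
One way to respect this restriction is to create each driver through a small cache, so that repeated requests return the same object. This is only a sketch: get_once is a hypothetical helper, not part of ckip_classic, and it guards against accidental re-creation within a single Python process only.

```python
import threading

_instances = {}
_lock = threading.Lock()


def get_once(name, factory):
    """Return the object cached under `name`; call `factory` only on first use."""
    with _lock:
        if name not in _instances:
            _instances[name] = factory()
        return _instances[name]


# Intended usage (requires ckip_classic installed with the ws option):
#   import ckip_classic.ws
#   ws = get_once('ws', lambda: ckip_classic.ws.CkipWs(logger=False))
```

Every later call such as get_once('ws', ...) then returns the already constructed object instead of building a second one.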


Warning

CKIPParser fails if the input text contains special characters such as ()+-:|&#. One may replace these characters with their full-width counterparts:

text = (
    text
    .replace('(', '（')
    .replace(')', '）')
    .replace('+', '＋')
    .replace('-', '－')
    .replace(':', '：')
    .replace('|', '｜')
    .replace('&', '＆')  # for tree draw
    .replace('#', '＃')  # for tree draw
)
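
The same substitution can be written with a single translation table. This sketch assumes the intended replacements are the full-width forms of the characters listed above, which sit exactly 0xFEE0 code points after their ASCII counterparts; escape_special_chars is a hypothetical helper name.

```python
# Characters that CKIPParser cannot handle, as listed in the warning above.
SPECIAL_CHARS = '()+-:|&#'

# Map each character to its full-width form (ASCII code point + 0xFEE0).
FULLWIDTH_TABLE = str.maketrans({c: chr(ord(c) + 0xFEE0) for c in SPECIAL_CHARS})


def escape_special_chars(text):
    """Replace the special characters with full-width equivalents in one pass."""
    return text.translate(FULLWIDTH_TABLE)


print(escape_special_chars('A(B)+C-D:E|F&G#H'))
```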

Tip

The CKIPWS throws “what(): locale::facet::_S_create_c_locale name not valid”. What should I do?

Install locale data.

apt-get install locales-all

Tip

The CKIPParser throws “ImportError: libCKIPParser.so: cannot open shared object file: No such file or directory”. What should I do?

Add the following line to ~/.bashrc:

export LD_LIBRARY_PATH=<ckipparser-linux-root>/lib:$LD_LIBRARY_PATH
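
To check whether the setting took effect, one can look for the library in the directories listed in LD_LIBRARY_PATH. The sketch below uses a hypothetical helper, find_shared_lib, and inspects only that variable; the dynamic loader also searches other locations, so a None result here is a hint rather than proof.

```python
import os


def find_shared_lib(name, search_path=None):
    """Return the first directory in `search_path` that contains `name`, else None."""
    if search_path is None:
        search_path = os.environ.get('LD_LIBRARY_PATH', '')
    for directory in search_path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, name)):
            return directory
    return None


print(find_shared_lib('libCKIPParser.so'))
```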

License

CC BY-NC-SA 4.0

Copyright (c) 2018-2020 CKIP Lab under the CC BY-NC-SA 4.0 License.