Common Lisp, Japanese, and Character Encodings
LISPMEMO
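onjo has already done what I was trying to do!
If it could be installed via asdf-install, it would probably make a lot of people happy.
I was trying to write charguess.lisp so that libcharguess could be used through uffi, but the attempt stalled. For now I'll leave the wreckage here. Maybe someone will pick it up... or, well, probably not.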
(require :uffi)

(defpackage charguess
  (:use common-lisp uffi))
(in-package :charguess)

;; int CharGuessInit(void);
;; const char* GuessChardet(const char *str);
;; int CharGuessDone(void);

(uffi:load-foreign-library "/usr/local/lib/libcharguess.so")

(uffi:def-function ("CharGuessInit" %charguess-init)
    ()
  :returning :int)

;;(uffi:def-function ("GuessChardet" %guess-chardet) ((str :cstring)) :returning :cstring)
(uffi:def-function ("GuessChardet" %guess-chardet)
    ((str (* :unsigned-int)))
  :returning :cstring)

(uffi:def-function ("CharGuessDone" %charguess-done)
    ()
  :returning :int)

(defun guess (from-vector)
  (declare (type (vector (unsigned-byte 8)) from-vector))
  (%charguess-init)
  (let* ((len (length from-vector))
         (inbuffer (uffi:allocate-foreign-object :unsigned-byte (+ 2 len)))
         (in-ptr (uffi:allocate-foreign-object :unsigned-int)))
    (loop for i from 0 below len
          do (setf (uffi:deref-array inbuffer :unsigned-byte i)
                   (aref from-vector i)))
    (setf (uffi:deref-array inbuffer :unsigned-byte len) 10)
    (setf (uffi:deref-array inbuffer :unsigned-byte (1+ len)) 0)
    (setf (uffi:deref-pointer in-ptr :unsigned-int)
          (uffi:pointer-address inbuffer))
    (prog1 (%guess-chardet in-ptr)
      (%charguess-done)
      (uffi:free-foreign-object inbuffer)
      (uffi:free-foreign-object in-ptr))))

(guess (sb-ext:string-to-octets "abc"))
(guess (sb-ext:string-to-octets "日本語"))
I used iconv.lisp and libcharguess-ruby as references. This is what I was trying to imitate, but...
static VALUE
cg_s_guess(VALUE klass, VALUE str)
{
    const char *ptr;
    int ret;

    Check_Type(str, T_STRING);
    ret = CharGuessInit();
    ptr = GuessChardet((const char *)RSTRING(str)->ptr);
    ret = CharGuessDone();
    return ptr ? rb_str_new2(ptr) : Qnil;
}
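
For comparison, the buffer that the Lisp guess function builds (the input octets followed by a newline byte, 10, and a terminating 0) can be written in plain C roughly as below. This is only a sketch: make_guess_buffer is a hypothetical helper name, not part of libcharguess, and nothing here has been tested against the library. One visible difference between the two listings is that the Ruby binding passes the string's byte pointer straight to GuessChardet, while the Lisp code passes in-ptr, a cell that holds the buffer's address. That may be one reason the Lisp version stalled, but it is a guess.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libcharguess): copy len octets into a
   fresh buffer and append '\n' (10) and a NUL terminator, mirroring what
   the Lisp guess function stores in inbuffer.
   Returns NULL if allocation fails; the caller frees the result. */
char *make_guess_buffer(const unsigned char *octets, size_t len)
{
    char *buf = malloc(len + 2);
    if (buf == NULL)
        return NULL;
    if (len > 0)
        memcpy(buf, octets, len);   /* the raw input bytes */
    buf[len] = '\n';                /* same as storing 10 at index len */
    buf[len + 1] = '\0';            /* same as storing 0 at index len + 1 */
    return buf;
}
```

Assuming libcharguess is linked, the result would presumably be handed over as-is, e.g. GuessChardet(buf) between CharGuessInit() and CharGuessDone(), and then released with free(buf).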