Drkcore

22092014 Python

Chapter 8 of 実践コンピュータビジョン (Programming Computer Vision with Python) with scikit-learn

I tried kNN, Naive Bayes, SVM, and (Random Forest) with scikit-learn. The content below was exported from IPython Notebook as reST, converted to markdown_strict with pandoc, and pasted into the blog.

Plotting helper function and dataset creation

import pickle

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

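def myplot_2D_boundary(X, y):
    """Plot the decision regions of the global classifier clf together with X.

    Relies on the module-level names clf, cmap_light and cmap_bold.
    """
    # Build a grid that covers the data with a margin of 1 on each side.
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02),
                         np.arange(y_min, y_max, 0.02))

    # Predict a class for every grid point, then reshape back to the grid.
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)

    # Draw the regions first, then the data points on top.
    plt.figure()
    plt.pcolormesh(xx, yy, Z, cmap=cmap_light)
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold)
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())

    plt.show()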

def load_points(path):
    """Load class_1, class_2 and labels from a pickle file and stack the points."""
    # Pickle files should be opened in binary mode.
    with open(path, 'rb') as f:
        class_1 = pickle.load(f)
        class_2 = pickle.load(f)
        labels = pickle.load(f)
    return np.r_[class_1, class_2], labels

# Training sets
X_normal, y_normal = load_points('points_normal.pkl')
X_ring, y_ring = load_points('points_ring.pkl')

# Test sets
X_normal_test, y_normal_test = load_points('points_normal_test.pkl')
X_ring_test, y_ring_test = load_points('points_ring_test.pkl')

cmap_light = ListedColormap(['#FFAAAA', '#AAFFAA'])
cmap_bold = ListedColormap(['#FF0000', '#00FF00'])
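
The pickle files loaded above are not included in this post. As a rough sketch of how such a file could be produced, assuming each file holds three consecutive pickle dumps (class_1, class_2, labels) as the loading code expects, something like the following would do. The names make_ring_points and save_points, the file name points_ring_demo.pkl, the sample count, the radius, and the +1/-1 label convention are all assumptions for illustration.

```python
import os
import pickle
import tempfile

import numpy as np


def make_ring_points(n=200, radius=5.0, seed=0):
    """Return (class_1, class_2, labels): a central blob and a surrounding ring."""
    rng = np.random.RandomState(seed)
    # Class 1: Gaussian blob around the origin.
    class_1 = 0.6 * rng.randn(n, 2)
    # Class 2: points at a noisy radius, spread over all angles.
    r = 0.8 * rng.randn(n) + radius
    angle = 2 * np.pi * rng.rand(n)
    class_2 = np.c_[r * np.cos(angle), r * np.sin(angle)]
    # Labels: +1 for class 1, -1 for class 2 (an assumed convention).
    labels = np.hstack((np.ones(n), -np.ones(n)))
    return class_1, class_2, labels


def save_points(path, class_1, class_2, labels):
    """Write three consecutive pickle dumps, matching the three loads above."""
    with open(path, 'wb') as f:
        pickle.dump(class_1, f)
        pickle.dump(class_2, f)
        pickle.dump(labels, f)


# Hypothetical output file, written to a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), 'points_ring_demo.pkl')
save_points(demo_path, *make_ring_points())

# Read it back in the same order it was written.
with open(demo_path, 'rb') as f:
    class_1 = pickle.load(f)
    class_2 = pickle.load(f)
    labels = pickle.load(f)

X_demo = np.r_[class_1, class_2]
y_demo = labels
print(X_demo.shape, y_demo.shape)
```

Generating a second file with a different seed would give a matching test set.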

kNN

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn import neighbors

clf = neighbors.KNeighborsClassifier(n_neighbors=3)
clf.fit(X_normal, y_normal)

myplot_2D_boundary(X_normal,y_normal)

clf = neighbors.KNeighborsClassifier(n_neighbors=3)
clf.fit(X_ring, y_ring)

myplot_2D_boundary(X_ring,y_ring)
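
The kNN cells only plot the boundary. To count mislabeled test points for kNN in the same way as the other classifiers, a minimal sketch could look like this. It uses synthetic stand-in data rather than the pickle files, and the helper two_blobs, the blob positions, and the sample sizes are assumptions.

```python
import numpy as np
from sklearn import neighbors

rng = np.random.RandomState(0)


def two_blobs(n):
    """Two Gaussian blobs in 2D with labels +1 / -1 (synthetic stand-in data)."""
    class_1 = rng.randn(n, 2)
    class_2 = rng.randn(n, 2) + np.array([5.0, 5.0])
    return np.r_[class_1, class_2], np.hstack((np.ones(n), -np.ones(n)))


# Separate draws for training and testing.
X_train, y_train = two_blobs(200)
X_test, y_test = two_blobs(200)

knn = neighbors.KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
pred = knn.predict(X_test)

# Count the test points whose predicted label differs from the true one.
n_wrong = int((y_test != pred).sum())
print("Number of mislabeled points out of a total %d points : %d"
      % (y_test.shape[0], n_wrong))
```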

Naive Bayes

from sklearn.naive_bayes import GaussianNB

clf = GaussianNB()
clf.fit(X_normal, y_normal)
labels_pred = clf.predict(X_normal_test)

print("Number of mislabeled points out of a total %d points : %d"
      % (y_normal_test.shape[0], (y_normal_test != labels_pred).sum()))

myplot_2D_boundary(X_normal,y_normal)

clf = GaussianNB()
clf.fit(X_ring, y_ring)
labels_pred = clf.predict(X_ring_test)

print("Number of mislabeled points out of a total %d points : %d"
      % (y_ring_test.shape[0], (y_ring_test != labels_pred).sum()))

myplot_2D_boundary(X_ring,y_ring)

SVM

from sklearn import svm
clf = svm.SVC()
clf.fit(X_normal, y_normal)

labels_pred = clf.predict(X_normal_test)

print("Number of mislabeled points out of a total %d points : %d"
      % (y_normal_test.shape[0], (y_normal_test != labels_pred).sum()))

myplot_2D_boundary(X_normal, y_normal)

clf = svm.SVC()
clf.fit(X_ring, y_ring)
labels_pred = clf.predict(X_ring_test)

print("Number of mislabeled points out of a total %d points : %d"
      % (y_ring_test.shape[0], (y_ring_test != labels_pred).sum()))

myplot_2D_boundary(X_ring,y_ring)

Random Forest

from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier(n_estimators=10)

clf.fit(X_normal, y_normal)
labels_pred = clf.predict(X_normal_test)

print("Number of mislabeled points out of a total %d points : %d"
      % (y_normal_test.shape[0], (y_normal_test != labels_pred).sum()))

myplot_2D_boundary(X_normal,y_normal)

from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier(n_estimators=10)

clf.fit(X_ring, y_ring)
labels_pred = clf.predict(X_ring_test)

print("Number of mislabeled points out of a total %d points : %d"
      % (y_ring_test.shape[0], (y_ring_test != labels_pred).sum()))

myplot_2D_boundary(X_ring,y_ring)
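
Since all four classifiers share the same fit/predict interface, the repeated cells above can be folded into one loop. The following is only a sketch on synthetic ring-shaped data (the helper ring_data and its parameters are assumptions, and random_state is fixed just to make the run repeatable), so the counts it prints say nothing about the datasets used in this post.

```python
import numpy as np
from sklearn import neighbors, svm
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB


def ring_data(n, rng):
    """Synthetic stand-in: a central blob (+1) surrounded by a noisy ring (-1)."""
    class_1 = 0.6 * rng.randn(n, 2)
    r = 0.8 * rng.randn(n) + 5.0
    angle = 2 * np.pi * rng.rand(n)
    class_2 = np.c_[r * np.cos(angle), r * np.sin(angle)]
    return np.r_[class_1, class_2], np.hstack((np.ones(n), -np.ones(n)))


rng = np.random.RandomState(0)
X_train, y_train = ring_data(200, rng)
X_test, y_test = ring_data(200, rng)

# The same four classifiers as in this post, with the same settings.
classifiers = [
    ("kNN", neighbors.KNeighborsClassifier(n_neighbors=3)),
    ("Naive Bayes", GaussianNB()),
    ("SVM", svm.SVC()),
    ("Random Forest", RandomForestClassifier(n_estimators=10, random_state=0)),
]

mislabeled = {}
for name, model in classifiers:
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    mislabeled[name] = int((y_test != pred).sum())
    print("%-13s mislabeled %d of %d" % (name, mislabeled[name], len(y_test)))
```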

22092014 Python

I attended the Chapter 8 session of the 実践コンピュータビジョン reading group

I have been a bit busy lately, so this was my first time at the 実践コンピュータビジョン reading group, and yet I ended up giving a presentation.

実践コンピュータビジョン Chapter 8 from Kazufumi Ohkawa

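Apart from typing out the book's code, my main contribution was pushing IPython Notebook and scikit-learn, lol. We also talked about deep learning, and I came away thinking I should study deep learning a bit more seriously.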

The after-party was at a muscle-themed izakaya.

Dumbbells and the like greet you as you walk in

Pasta in the style of shio-butter ramen (a half-size single serving, lol)

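It was fun to take part again after a long while. The organizers have passed the baton along smoothly, new people keep joining at a comfortable pace, and it struck me as a good reading group that has kept going for a long time. Counting it up, it will be five years before long.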

I hear the next session will be held in Shimada.

22092014 Python

Installing scientific Python packages on RHEL6

I was recently given an RHEL6 machine, but its Python is 2.6, so I installed 2.7 along with the packages below. These are my notes.

numpy
scipy
scikit-learn
matplotlib
ggplot
ipython (ipython notebook)

Installing Python

First install the development environment from rpm packages

yum groupinstall "Development tools"
yum install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel

Then install Python from source
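
After building from source, it is worth confirming that the optional standard-library modules backed by the -devel packages above were actually compiled in. A minimal sketch, run with the newly built interpreter. The selection of modules and the helper name check_optional_modules are my own choices for illustration.

```python
import importlib

# Standard-library modules that need the corresponding -devel package at build
# time (zlib-devel, bzip2-devel, openssl-devel, ncurses-devel, sqlite-devel,
# readline-devel). This subset is chosen for illustration.
OPTIONAL_MODULES = ["zlib", "bz2", "ssl", "curses", "sqlite3", "readline"]


def check_optional_modules(names=OPTIONAL_MODULES):
    """Return {module name: True if it imports, False otherwise}."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


status = check_optional_modules()
for name in sorted(status):
    print("%-10s %s" % (name, "ok" if status[name] else "MISSING"))
```

If a module shows up as MISSING, installing the matching -devel package and rebuilding Python is the usual remedy.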

Installing pip

wget https://bootstrap.pypa.io/get-pip.py
python get-pip.py

Installing blas, lapack

scipy needs lapack (and a numpy built against it), but yum install blas,lapack did not work well, so I installed them from source.

wget http://www.netlib.org/lapack/lapack.tgz
tar xzfv lapack.tgz
cd lapack-*/
cp INSTALL/make.inc.gfortran make.inc

Edit the options in make.inc: add the -fPIC option, and on a 64-bit machine also add the -m64 option

FORTRAN  = gfortran
OPTS     = -O2 -frecursive -fPIC -m64
DRVOPTS  = $(OPTS)
NOOPT    = -O0 -frecursive -fPIC -m64

After editing, run make

make blaslib; make lapacklib

Put the resulting *.a files in a suitable directory, set the environment variables, and add them to .bashrc or /etc/profile

export BLAS=/[path]/[to]/librefblas.a
export LAPACK=/[path]/[to]/liblapack.a

Installing numpy, scipy

pip install numpy

Once it is installed, check whether blas and lapack are being used by running import numpy and then numpy.show_config().
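
The check described above can be run as a short script. The small linear solve at the end is an extra I added as a sanity check that linear algebra works at all. It does not by itself prove that the external LAPACK is in use, so the show_config() output is still the thing to read.

```python
import numpy as np

# Print the build configuration, including the BLAS/LAPACK sections.
np.show_config()

# Sanity check: solve the 2x2 system A x = b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)
print(x)
```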

If that looks OK, install scipy.

pip install scipy

Installing matplotlib

libpng is required, so install it with yum. Also, RHEL6 ships freetype 2.3, but matplotlib 1.4.0 works with it anyway, so edit the setup file and compile.

yum install libpng libpng-devel

Download the 1.4.0 source

tar xvfz matplotlib-1.4.0.tar.gz

In setupext.py, change the check that requires freetype 2.4 or later so that it accepts 2.3

python setup.py install

Installing scikit-learn, pandas, ggplot, ipython

Just install them

pip install scikit-learn
pip install pandas
pip install patsy
pip install ggplot
pip install pyzmq
pip install jinja2
pip install tornado
pip install ipython
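
To confirm that everything ended up importable, a small script can try each package and report its version. This is a sketch. The helper name installed_versions is hypothetical, and the pip-name to module-name mapping is written out by hand.

```python
import importlib

# (pip package name, importable module name)
PACKAGES = [
    ("numpy", "numpy"),
    ("scipy", "scipy"),
    ("scikit-learn", "sklearn"),
    ("matplotlib", "matplotlib"),
    ("pandas", "pandas"),
    ("patsy", "patsy"),
    ("ggplot", "ggplot"),
    ("pyzmq", "zmq"),
    ("jinja2", "jinja2"),
    ("tornado", "tornado"),
    ("ipython", "IPython"),
]


def installed_versions(packages=PACKAGES):
    """Return {pip name: version string, or None if the import fails}."""
    versions = {}
    for pip_name, module_name in packages:
        try:
            module = importlib.import_module(module_name)
            # Not every package defines __version__, so fall back to "unknown".
            versions[pip_name] = getattr(module, "__version__", "unknown")
        except Exception:
            # Catch broadly: a broken install can fail with more than ImportError.
            versions[pip_name] = None
    return versions


versions = installed_versions()
for pip_name, _ in PACKAGES:
    print("%-13s %s" % (pip_name, versions[pip_name] or "NOT INSTALLED"))
```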

17092014 life shogi