Decision Tree & Graphviz, using pip and conda

  • Visit this site for the original code.
  • Start your Jupyter notebook
    • (1) Start "Anaconda Navigator", then select Jupyter notebook from its menu.
    • (2) Or start Jupyter notebook directly by typing "jupyter notebook" in your terminal.
  • Open a new Jupyter notebook with the "python3" kernel in your browser.
  • Write (or copy and paste) the following Python commands into the notebook.
    • Put each section into its own chunk (cell).
  • We will probably run into a series of errors, but let's solve them one by one.
# The following code is borrowed from https://qiita.com/Hawaii/items/53efe3e96b1171ebc7db
# pandas for Data structure "dataframe"
# numpy for matrix-like data structure
# sklearn for machine learning
import pandas as pd
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_graphviz
# To draw the decision tree as an image and display it,
# the non-Python application "Graphviz" is required.
# The Python package "graphviz" connects the non-Python application "Graphviz" to the Python environment
import graphviz
# Python package that assists with Graphviz functionality
import pydotplus
# Displays the image in the Jupyter notebook
from IPython.display import Image
# In-memory text buffer that will receive the DOT-format output.
# The original code used "from sklearn.externals.six import StringIO",
# but recent scikit-learn versions no longer provide sklearn.externals.six;
# the standard-library StringIO works instead.
from io import StringIO
data = pd.DataFrame({
        "buy(y)":[True,True,True,True,True,True,True,False,False,False,False,False,False],
        "high":[4, 5, 3, 1, 6, 3, 4, 1, 2, 1, 1,1,3],
        "size":[30, 45, 32, 20, 35, 40, 38, 20, 18, 20, 22,24,25],
        "autolock":[1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0,1,0]
    })
y = data.loc[:,["buy(y)"]]
X = data.loc[:,["high", "size","autolock"]]
clf = DecisionTreeClassifier()
clf = clf.fit(X, y)
# Mac
dot_data = StringIO()  # buffer that will hold the DOT-format description of the tree
export_graphviz(clf, out_file=dot_data,  
                     feature_names=["high", "size","autolock"], 
                     class_names=["False","True"],  
                     filled=True, rounded=True,  
                     special_characters=True) 
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
#graph.progs = {'dot': u"C:\\tools\\Anaconda3\\Library\\bin\\dot.bat"}  # added line, needed only on Windows

Image(graph.create_png())
# Windows
dot_data = StringIO()  # buffer that will hold the DOT-format description of the tree
export_graphviz(clf, out_file=dot_data,  
                     feature_names=["high", "size","autolock"], 
                     class_names=["False","True"],  
                     filled=True, rounded=True,  
                     special_characters=True) 
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
graph.progs = {'dot': u"C:\\tools\\Anaconda3\\Library\\bin\\dot.bat"}  # Windows
#C:\tools\Anaconda3\Library\bin\dot.bat
# graph.progs = {'dot': u"C:\\Users\\ryamada\\Anaconda3\\Library\\bin\\dot.bat"} 
# worked in another Windows environment.
Image(graph.create_png())

[Figure: the decision tree image rendered by Graphviz]

print(graph.to_string())
  • pip
    • Open the regular terminal
pip install --upgrade pip
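    • If "import xxxx" raises an error, install the missing package. The pip name can differ from the import name: for example, "sklearn" is installed as "scikit-learn".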
pip install xxxx
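
Before running the notebook, it can save time to check which of the imports will fail. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the `REQUIRED` list are my own choices for illustration, not part of the original code.

```python
import importlib.util

# Import names used in this notebook (these can differ from the pip package names).
REQUIRED = ["pandas", "numpy", "sklearn", "graphviz", "pydotplus", "IPython"]


def missing_packages(names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Run it in the same Jupyter kernel as the notebook, because the terminal's Python and the kernel's Python are not always the same environment.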
  • conda
    • Graphviz is a graph visualization application that is independent of Python.

graphviz.org

    • Graphviz can be downloaded and installed on your local PC, either Mac or Windows.
    • Mac: in my environment, python3 in the Jupyter notebook was able to call the installed Graphviz without further settings.
    • Windows: in my environment, python3 in the Jupyter notebook failed to call the installed Graphviz (probably a PATH problem). Therefore, a line of Python code was added to specify the location of "dot.bat", the command file for Graphviz's "dot" functionality, based on the instructions here

niwakomablog.com.

      • On Windows, the "dot.bat" file that Python calls to make Graphviz generate the image was created by the command "conda install graphviz".
conda install graphviz
      • On my Windows PC, that command installed Graphviz under the Anaconda folder: "C:\\tools\\Anaconda3\\Library\\bin\\dot.bat"
      • Judging from a web search, there are several ways to install Graphviz for use from Python, and quite a few people report errors and trouble when connecting the two.
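
To see whether the PATH problem described above applies to your machine, you can ask Python where it finds the "dot" command. This is a rough sketch under my own assumptions: the helper `find_dot` is hypothetical, and the candidate path is simply the location from my Windows PC, which will probably differ on yours.

```python
import os
import shutil


def find_dot(candidates=()):
    """Return the path of Graphviz's "dot" command, or None if it is not found.

    PATH is searched first; `candidates` are extra locations tried afterwards.
    """
    found = shutil.which("dot")
    if found:
        return found
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# Example candidate: the location on my Windows PC (an assumption for other machines).
dot_path = find_dot([r"C:\tools\Anaconda3\Library\bin\dot.bat"])
print("dot found at:", dot_path)
```

If a path is printed but pydotplus still cannot find "dot" by itself, that path is what goes into `graph.progs = {'dot': ...}` in the Windows chunk. If `None` is printed, Graphviz is probably not installed where Python can see it.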
  • If nothing works (Python cannot call Graphviz, or Graphviz does not run on your local PC at all), print the DOT-format string that Graphviz accepts and copy and paste it into Graphviz Online, a site that runs Graphviz in the browser.

dreampuf.github.io

print(graph.to_string())
digraph Tree {
node [color="black", fontname=helvetica, shape=box, style="filled, rounded"];
edge [fontname=helvetica];
0 [fillcolor="#399de524", label=<size &le; 27.5<br/>gini = 0.497<br/>samples = 13<br/>value = [6, 7]<br/>class = True>];
1 [fillcolor="#e58139d4", label=<autolock &le; 0.5<br/>gini = 0.245<br/>samples = 7<br/>value = [6, 1]<br/>class = False>];
0 -> 1  [headlabel="True", labelangle=45, labeldistance="2.5"];
2 [fillcolor="#e58139ff", label=<gini = 0.0<br/>samples = 5<br/>value = [5, 0]<br/>class = False>];
1 -> 2;
3 [fillcolor="#e5813900", label=<size &le; 22.0<br/>gini = 0.5<br/>samples = 2<br/>value = [1, 1]<br/>class = False>];
1 -> 3;
4 [fillcolor="#399de5ff", label=<gini = 0.0<br/>samples = 1<br/>value = [0, 1]<br/>class = True>];
3 -> 4;
5 [fillcolor="#e58139ff", label=<gini = 0.0<br/>samples = 1<br/>value = [1, 0]<br/>class = False>];
3 -> 5;
6 [fillcolor="#399de5ff", label=<gini = 0.0<br/>samples = 6<br/>value = [0, 6]<br/>class = True>];
0 -> 6  [headlabel="False", labelangle="-45", labeldistance="2.5"];
}
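
If pydotplus itself is the part that fails, the DOT string can also be obtained from scikit-learn alone. The sketch below is not from the original notebook: it assumes only pandas and scikit-learn are installed, repeats the toy data so that it runs on its own, and fixes `random_state` (my addition) so that repeated runs give the same tree. `export_graphviz` returns the DOT text as a string when `out_file=None`, and `export_text` prints a plain-text view of the tree that needs no Graphviz at all.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_graphviz, export_text

# Same toy data as in the notebook above.
data = pd.DataFrame({
    "buy(y)": [True, True, True, True, True, True, True,
               False, False, False, False, False, False],
    "high": [4, 5, 3, 1, 6, 3, 4, 1, 2, 1, 1, 1, 3],
    "size": [30, 45, 32, 20, 35, 40, 38, 20, 18, 20, 22, 24, 25],
    "autolock": [1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0],
})
features = ["high", "size", "autolock"]
X = data[features]
y = data["buy(y)"]

# random_state is fixed only for reproducibility (not set in the original notebook).
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# With out_file=None, export_graphviz returns the DOT text instead of writing it.
dot_text = export_graphviz(clf, out_file=None,
                           feature_names=features,
                           class_names=["False", "True"],
                           filled=True, rounded=True,
                           special_characters=True)
print(dot_text)  # this string can be pasted into Graphviz Online

# Text-only rendering of the same tree; no Graphviz required.
print(export_text(clf, feature_names=features))
```

The node order and colors in the printed DOT text may differ slightly from the pydotplus output shown above, but the tree it describes should be usable in the same way.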
  • Regular terminal & Anaconda terminal
  • You can get the Jupyter notebook file containing the Python commands above here:

github.com