Drkcore

I really wish O'Reilly would publish a chemoinformatics book too

Covering only bioinformatics feels like half the picture. For the cover animal, I'd like a turtle, as a nod to the Kekulé structure.

バイオインフォマティクスのためのPerl入門
水島洋
O'Reilly Japan / ¥5,040

実践バイオインフォマティクス -ゲノム研究のためのコンピュータスキル-
Cynthia Gibas,Per Jambeck
O'Reilly Japan / ¥4,410

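Given today's CPUs and memory, doing quantum chemistry in a lightweight scripting language seems entirely viable to me. PyQuante, for one, is seriously impressive.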

02122009 chemoinformatics

MCSTree

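For a set of compounds (something like a slightly extended Markush set of close neighbors), I want to build a recursive tree of maximum common subgraph (MCS) subsets rooted at a suitable scaffold. I thought about it all afternoon and ended up taking the problem home with me.

Given a suitable scaffold, I still don't see what algorithm would grow the branches so that each split is as informative as possible.

I suspect a heuristic algorithm that combines clique detection with MCS and branches as it goes would produce a plausible-looking network.

Is there a paper on this somewhere?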
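One way to make the branching criterion concrete, as a minimal sketch and not a worked-out method: assume each compound has already been reduced to the set of substituent fragments it carries beyond the root scaffold. The fragment labels and the helpers `split_entropy`, `best_fragment`, and `build_tree` are hypothetical, and no real MCS or clique detection is performed. At each node the sketch peels off the fragment whose presence/absence split has the highest entropy, then recurses into that subset.

```python
import math
from collections import Counter


def split_entropy(n_with, n_total):
    """Shannon entropy (in bits) of a presence/absence split."""
    if n_with == 0 or n_with == n_total:
        return 0.0
    p = n_with / n_total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def best_fragment(compounds):
    """Return the fragment that splits the set most evenly, or None."""
    counts = Counter(f for frags in compounds.values() for f in frags)
    n = len(compounds)
    # Ties are broken by fragment name so the result is deterministic.
    best = max(counts, key=lambda f: (split_entropy(counts[f], n), f), default=None)
    if best is None or split_entropy(counts[best], n) == 0.0:
        return None
    return best


def build_tree(compounds, scaffold="root"):
    """Recursively branch on the most informative fragment."""
    node = {"scaffold": scaffold, "children": []}
    remaining = dict(compounds)
    while True:
        frag = best_fragment(remaining)
        if frag is None:
            break
        # Compounds carrying the fragment move into a child node.
        grown = {k: v - {frag} for k, v in remaining.items() if frag in v}
        remaining = {k: v for k, v in remaining.items() if frag not in v}
        node["children"].append(build_tree(grown, scaffold + "+" + frag))
    # Whatever could not be split further stays at this node.
    node["members"] = sorted(remaining)
    return node


# Hypothetical input: compound ID -> fragments beyond the root scaffold.
example = {"c1": {"F"}, "c2": {"F", "OMe"}, "c3": {"Cl"}, "c4": set()}
tree = build_tree(example)
print([child["scaffold"] for child in tree["children"]])
```

Entropy of a binary split is only one possible reading of "most informative"; a real version would derive the candidate fragments from MCS results instead of taking them as given.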

17092009 chemoinformatics DMPK

Drug-like Properties: Concepts, Structure Design and Methods: from ADME to Toxicity Optimization (Part 5)

Part 5 is a collection of chapters that don't fit into the earlier parts but cover things a chemist should know.

Drug-like Properties: Concepts, Structure Design and Methods: from ADME to Toxicity Optimization
Edward Kerns,Li Di
Academic Press / ¥9,177
In stock.

Diagnosing and Improving Pharmacokinetic Performance
Prodrugs
Effects of Properties on Biological Assays
Formulation

That completes my read-through. It was a good book.

16092009 chemoinformatics DMPK

Drug-like Properties: Concepts, Structure Design and Methods: from ADME to Toxicity Optimization (Part 4)

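Part 4 covers methods. When you use prediction models, run QC on high-throughput assay systems such as HTS, or manage the resulting data in an informatics system, you get nowhere without a reasonable understanding of the experimental systems, so I already had a fairly good grasp of most of this material, assays included.

The toxicity assays toward the end covered a lot I didn't know, so they were useful.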

Drug-like Properties: Concepts, Structure Design and Methods: from ADME to Toxicity Optimization
Edward Kerns,Li Di
Academic Press / ¥9,177
In stock.

Strategies for Integrating Drug-Like Properties Into Drug Discovery
Methods for Profiling Drug-like Properties: General Concepts
Lipophilicity Methods
pKa Methods
Solubility Methods
Permeability Methods
Transporter Methods
Blood-Brain Barrier Methods
Metabolic Stability Methods
Plasma Stability Methods
Solution Stability Methods
CYP Inhibition Methods
Plasma Protein Binding Methods
hERG Methods
Toxicity Methods
Integrity and Purity Methods

15092009 chemoinformatics DMPK

Drug-like Properties: Concepts, Structure Design and Methods: from ADME to Toxicity Optimization (Part 3)

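Part 3 covers DMPK and toxicity. Rather than taking a QSPR angle, it leans a bit more toward the chemist: what scaffold modifications to try against each of these issues.

As a result, it is somewhat heavy on rules of thumb. From a chemistry standpoint, I think it could include a bit more on choosing appropriate modifications with the help of quantum chemical calculations. On the other hand, computational chemists on the informatics side tend to drift toward nothing but parameters and filtering and lose sight of delivering real compounds, so a book like this is an easy and worthwhile read.

In the end, medicinal chemistry is about what to make, not how to make it. Seen that way, whether you start from the analytical side or from synthetic-feasibility constraints, I suspect you end up in much the same place.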

Drug-like Properties: Concepts, Structure Design and Methods: from ADME to Toxicity Optimization
Edward Kerns,Li Di
Academic Press / ¥9,177
In stock.

Transporters
Blood-Brain Barrier
Metabolic Stability
Plasma Stability
Solution Stability
Plasma Protein Binding
Cytochrome P450 Inhibition
hERG Blocking
Toxicity
Integrity and Purity
Pharmacokinetics
Lead-like Compounds
Strategies for Integrating Drug-Like Properties Into Drug Discovery

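As for this Part, I can follow the material that is close to in vitro work, but I can't picture the parts that are closer to the living organism. I felt that unless I study more toxicology and anatomy, I won't be able to see the whole picture.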

08092009 chemoinformatics DMPK

Drug-like Properties: Concepts, Structure Design and Methods: from ADME to Toxicity Optimization (Part 2)

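Part 2 is about physicochemical properties. This falls within QSPR territory, so it was fairly easy to read. Most of it was review, but it has a strong chemistry flavor, so knowing it should make conversations with chemists easier.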

Drug-like Properties: Concepts, Structure Design and Methods: from ADME to Toxicity Optimization
Edward Kerns,Li Di
Academic Press / ¥9,048
In stock.

Rules for Rapid Property Profiling From Structure
Lipophilicity
pKa
Solubility
Permeability

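Chapter 7, on solubility, is longer than the other chapters. It doesn't cover the state of the art in prediction models, but working such models into synthesis planning effectively seems important these days.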

03092009 chemoinformatics Python

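Pattern-based fingerprints

Substructure-derived fingerprints for measuring the similarity of chemical structures come in two kinds: those in which each bit corresponds to a specific pattern, and those in which it does not.

The latter are generated dynamically with hash functions and the like, so they use bit density efficiently, but they often cause trouble because the bits can't be interpreted.

There was a thread about interpreting the former, so I wrote something up.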
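To illustrate the difference, here is a minimal sketch in Python. It assumes, purely for illustration, that a "pattern" is a plain substring of a SMILES string; a real implementation would match SMARTS patterns with a cheminformatics toolkit. The `PATTERNS` table and all function names are hypothetical.

```python
import zlib

# Hypothetical pattern table: bit index -> (name, SMILES substring).
PATTERNS = [
    ("carboxylic acid", "C(=O)O"),
    ("nitro", "[N+](=O)[O-]"),
    ("aromatic carbon", "c"),
    ("chlorine", "Cl"),
]


def pattern_fingerprint(smiles):
    """One bit per pattern, so every set bit has a name."""
    return [1 if sub in smiles else 0 for _, sub in PATTERNS]


def explain(bits):
    """Translate set bits back into pattern names."""
    return [PATTERNS[i][0] for i, b in enumerate(bits) if b]


def hashed_fingerprint(smiles, nbits=64, max_len=3):
    """Hash every substring of up to max_len characters to a bit position.

    The bits are dense but carry no name: several fragments may collide
    on the same position, so a set bit cannot be traced back to one pattern.
    """
    bits = [0] * nbits
    for i in range(len(smiles)):
        for n in range(1, max_len + 1):
            frag = smiles[i:i + n]
            if len(frag) == n:
                bits[zlib.crc32(frag.encode()) % nbits] = 1
    return bits


def tanimoto(a, b):
    """Tanimoto similarity between two equal-length bit lists."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 0.0


nitrobenzene = "c1ccccc1[N+](=O)[O-]"
benzoic_acid = "c1ccccc1C(=O)O"

fp_nitro = pattern_fingerprint(nitrobenzene)
fp_acid = pattern_fingerprint(benzoic_acid)
print(explain(fp_nitro))            # names of the patterns that matched
print(tanimoto(fp_nitro, fp_acid))  # shared bits / bits set in either
print(sum(hashed_fingerprint(nitrobenzene)), "hashed bits set, none of them named")
```

With the pattern table, a similarity score can be explained by listing the shared patterns; the hashed variant offers no such lookup.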

ケモインフォマティックス―予測と設計のための化学情報学
J.Gasteiger,T.Engel
Maruzen / ¥18,900
In stock.

02092009 chemoinformatics R

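Building an SVM prediction model with Open Babel fingerprints

I updated the chemoinformatics cookbook for the first time in a while.

So let's build a prediction model with these fingerprints.

The Supporting Information of "Benchmark Data Set for in Silico Prediction of Ames Mutagenicity" includes an Ames test data set, so I use it to prepare a CSV.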

babel -ismi ci900161g_si_001/smiles_cas_N6512.smi -ofpt ci900161g.fpt -xh -xfFP2

This produces the fpt file, which I then convert with the script from the cookbook.

f2b.py ci900161g.fpt > fingerprint.txt
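
The conversion script itself is not reproduced in this post. As a rough sketch of what such a step might look like, assuming each record in the fpt file is a header line starting with `>` followed by lines of whitespace-separated hexadecimal words, the function below expands the words into a string of 0/1 characters. The record layout, the word and bit order, and the name `fpt_to_bits` are all assumptions; this is not the cookbook's f2b.py.

```python
def fpt_to_bits(lines):
    """Yield (id, bitstring) pairs from fpt-style text lines."""
    name, words = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                yield name, "".join(words)
            tokens = line[1:].split()
            # Assumption: the compound ID is the first token after '>'.
            name, words = (tokens[0] if tokens else ""), []
        else:
            # Assumption: each token is one 32-bit word in hex,
            # written most significant bit first.
            words.extend(format(int(w, 16), "032b") for w in line.split())
    if name is not None:
        yield name, "".join(words)


# Synthetic two-word record for illustration, not real Open Babel output.
sample = [
    ">example-id",
    "00000001 80000000",
]
records = list(fpt_to_bits(sample))
for name, bits in records:
    print(name, bits)
```

The output has one line per compound, an ID followed by a bit string, which is the shape the CSV step below expects from fingerprint.txt.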

To turn the fingerprints into a CSV, I used the quick-and-dirty script below.

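# fconv.py: join Ames labels with the 1024-bit fingerprints and print CSV
smiles_file = "ci900161g_si_001/smiles_cas_N6512.smi"
fingerprints_file = "fingerprint.txt"

# Map the 0/1 activity flag in the .smi file to a class label
labels = {"0": "negative", "1": "positive"}

# Each .smi line holds: SMILES, compound ID, activity flag
act = {}
with open(smiles_file) as f:
    for line in f:
        smi, cid, num = line.split()
        act[cid] = labels[num]

# Header row: ID, bit1..bit1024, act
print(",".join(["ID"] + ["bit%d" % i for i in range(1, 1025)] + ["act"]))

# Each fingerprint line holds: compound ID, then a string of 0/1 characters
with open(fingerprints_file) as f:
    for line in f:
        cid, fp = line.split()
        print(",".join(['"%s"' % cid] + list(fp) + ['"%s"' % act[cid]]))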

Run it:

python fconv.py > ci900161g.csv

Then analyze it in R.

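library(kernlab)  # provides ksvm

# act must be a factor for classification (R >= 4.0 no longer converts strings by default)
ames <- read.csv("/Users/kzfm/ci900161g.csv", stringsAsFactors = TRUE)
set.seed(50)
# Random 2,500-compound training set; the remaining 4,012 compounds form the test set
tr.num <- sample(6512,2500)
ames.train <- ames[tr.num,]
ames.test <- ames[-tr.num,]
# Column 1 is the ID and column 1026 is the activity label
ames.svm <- ksvm(act ~.,data=ames.train[,-1])
ames.pre <- predict(ames.svm, ames.test[,c(-1,-1026)])
# Confusion table (rows: observed, columns: predicted) and overall accuracy
ames.tab <- table(ames.test[,1026],ames.pre)
sum(diag(ames.tab))/sum(ames.tab)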

Results:

>     sum(diag(ames.tab))/sum(ames.tab)
[1] 0.7771685
> ames.tab
          ames.pre
           negative positive
  negative     1424      425
  positive      469     1694

Not bad, I suppose. I suspect the accuracy could be pushed a bit higher with some thought about the choice of fingerprint and kernel.
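
As a quick cross-check of the numbers above, the short Python sketch below recomputes the accuracy from the confusion table and adds sensitivity and specificity, which the R session did not report. The variable names are mine; the counts are copied from the table.

```python
# Counts copied from the confusion table (rows: observed, columns: predicted).
tn, fp = 1424, 425    # observed negative: predicted negative / positive
fn, tp = 469, 1694    # observed positive: predicted negative / positive

total = tn + fp + fn + tp
accuracy = (tn + tp) / total
sensitivity = tp / (tp + fn)   # fraction of observed positives predicted positive
specificity = tn / (tn + fp)   # fraction of observed negatives predicted negative

print(total)                   # 4012, i.e. 6512 compounds minus the 2500 used for training
print(round(accuracy, 7))      # 0.7771685, matching the R output
print(round(sensitivity, 3), round(specificity, 3))
```

The two per-class rates come out close to each other, so the model is not obviously biased toward either class on this split.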

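References

[連載]フリーソフトによるデータ解析・マイニング第 31 回. Rとカーネル法・サポートベクターマシン. (Serial: Data analysis and mining with free software, part 31: R, kernel methods, and support vector machines.)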

マシンラーニング (Rで学ぶデータサイエンス 6)
辻谷将明,竹澤邦夫
Kyoritsu Shuppan / ¥3,675
Usually ships within 1 to 2 months.

01092009 chemoinformatics

Drug-like Properties: Concepts, Structure Design and Methods: from ADME to Toxicity Optimization

I read Part 1.

Drug-like Properties: Concepts, Structure Design and Methods: from ADME to Toxicity Optimization
Edward Kerns,Li Di
Academic Press / ¥9,048
In stock.

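This has the makings of a classic. Seen from DMPK or in silico work it feels a little shallow, but for a book aimed at medicinal chemists this level of coverage may be about right (though I only know a handful of chemists who actually understand what's in it).

It is about which parameters to watch in vivo while doing synthesis, from lead optimization through to around the preclinical stage, so computational people might do well to read Ekins alongside it.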

Computer Applications in Pharmaceutical Research and Development (Wiley Series in Drug Discovery and Development)

Wiley-Interscience / ¥14,819
Usually ships within 1 to 3 weeks.

19082009 chemoinformatics macbook

BKChem

It's handy when I just want to sketch a structure quickly, so I installed it on my MacBook.

bkchem

BKChem
