@zhengyuhong 2015-06-02T11:41:52.000000Z 850 words, 1277 views

toolkit

Python linux machine learning data mining toolkit


machine learning tools

scikit-learn
scikit-learn on pypi.python.org
numpy
numpy on pypi
scipy
scipy on pypi
matplotlib
Orange
Orange reference: most of the algorithms listed there can also be found in scikit-learn, and scikit-learn is more complete, so learning scikit-learn alone is enough.

NoSQL database

mongodb
Python API for mongodb
SSDB
redis
Python API for redis
levelDB
leveldb python API
memcached
Python API for memcached

Text Mining

Jieba
"Jieba" Chinese text segmentation: built to be the best Python Chinese word segmentation component
Jieba on pypi
NLTK
Natural Language Toolkit
NLTK on pypi

    # If a LookupError is raised, download the corresponding data package
    import nltk
    nltk.download()
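The LookupError note above can also be handled in code instead of through the interactive downloader. The following is a minimal sketch, not the only way to do it: `ensure_resource` is a hypothetical helper name, and the `find` and `download` parameters are illustrative hooks that default to `nltk.data.find` and `nltk.download`. The resource path and package name passed to it depend on which NLTK component raised the error.

```python
def ensure_resource(path, package, find=None, download=None):
    """Download `package` only when `path` is missing from the local NLTK data.

    `find` and `download` are hypothetical hooks for illustration; when they
    are omitted, nltk.data.find and nltk.download are used.
    Returns True if a download was triggered, False otherwise.
    """
    if find is None or download is None:
        import nltk  # imported lazily, only when the real NLTK calls are needed
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(path)  # raises LookupError when the resource is not installed
        return False
    except LookupError:
        download(package)  # fetch just the missing package
        return True
```

For example, `ensure_resource("tokenizers/punkt", "punkt")` would fetch the Punkt tokenizer data only if it is not already installed; the exact names are assumptions and should be taken from the LookupError message.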

Pattern.web
Pattern on pypi
The pattern.web module has tools for online data mining
The reference documentation is not very thorough or complete.

gensim
Topic modeling for humans
PLSA in Python
PLSA in C++
Topic Models C++
This is a C++ implementation of topic models with variational inference.
It includes LDA, supervised-LDA, HDP, supervised HDP, online HDP, and online SHDP.

crawler

scrapy
A high-level Python Screen Scraping framework
scrapy on pypi
