GitHub - Top 10 Python Libraries for Data Science

Top 10 Python Data Science Libraries by GitHub Contributors, Commits and Size (size of the circle)

1. pandas (Contributors – 1328, Commits – 18162, Stars – 16890)
“pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python.”
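As a rough illustration of what "relational" or "labeled" data handling looks like in practice, here is a minimal sketch. The column names and values are made up for the example and do not come from any real dataset.

```python
import pandas as pd

# Hypothetical sample data, invented purely for illustration.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Lima", "Lima"],
        "year": [2020, 2021, 2020, 2021],
        "sales": [10.0, 12.0, 7.0, 9.0],
    }
)

# Label-based filtering: keep the rows whose "city" column equals "Lima".
lima = df[df["city"] == "Lima"]

# Relational-style aggregation: total sales per city (similar to SQL GROUP BY).
totals = df.groupby("city")["sales"].sum()

print(totals)
```

Columns are addressed by name rather than by position, which is the main thing the "labeled" part of the description refers to.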
2. Matplotlib (Contributors – 771, Commits – 27937, Stars – 8224)
“Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shell (à la MATLAB or Mathematica), web application servers, and various graphical user interface toolkits.”
3. NumPy (Contributors – 708, Commits – 19241, Stars – 8666)
“NumPy is the fundamental package needed for scientific computing with Python. It provides a powerful N-dimensional array object, sophisticated (broadcasting) functions, tools for integrating C/C++ and Fortran code and useful linear algebra, Fourier transform, and random number capabilities.”
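The N-dimensional array, broadcasting, and linear algebra pieces mentioned above can be sketched in a few lines. The arrays here are arbitrary values chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

# A 2x3 array: [[0, 1, 2], [3, 4, 5]].
a = np.arange(6).reshape(2, 3)

# Broadcasting: the 1-D row is added to every row of the 2-D array
# without writing an explicit loop.
row = np.array([10, 20, 30])
shifted = a + row

# Linear algebra: solve the small system m @ x = b for x.
m = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([2.0, 8.0])
x = np.linalg.solve(m, b)

print(shifted)
print(x)
```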
4. SciPy (Contributors – 670, Commits – 20080, Stars – 5096)
“SciPy (pronounced "Sigh Pie") is open-source software for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more.”
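To give a feel for two of the modules listed (optimization and integration), here is a small sketch. The function `f` is a hypothetical example chosen because its minimum and integral are easy to check by hand.

```python
from scipy import integrate, optimize


def f(x):
    # Hypothetical test function: a parabola with its minimum at x = 2.
    return (x - 2.0) ** 2 + 1.0


# Optimization: find the x that minimizes f (scalar, unconstrained).
result = optimize.minimize_scalar(f)

# Integration: numerically integrate f over [0, 3].
# quad returns the estimate and an estimate of the absolute error.
area, abs_err = integrate.quad(f, 0.0, 3.0)

print(result.x, area)
```

By hand, the minimum sits at x = 2 and the integral over [0, 3] is 6, so the printed values should be close to those.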
5. Bokeh (Contributors – 325, Commits – 17365, Stars – 8439)
“Bokeh is an interactive visualization library for Python that enables beautiful and meaningful visual presentation of data in modern web browsers. With Bokeh, you can quickly and easily create interactive plots, dashboards, and data applications.”
6. Gensim (Contributors – 299, Commits – 3676, Stars – 8107)
“Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Target audience is the natural language processing (NLP) and information retrieval (IR) community.”
7. Scrapy (Contributors – 295, Commits – 6802, Stars – 30014)
“Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.”
8. StatsModels (Contributors – 164, Commits – 10896, Stars – 3383)
“Statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics and estimation and inference for statistical models.”
9. plotly.py (Contributors – 62, Commits – 3291, Stars – 4218)
“plotly.py is an interactive, open-source, and browser-based graphing library for Python. Built on top of plotly.js, plotly.py is a high-level, declarative charting library. plotly.js ships with over 30 chart types, including scientific charts, 3D graphs, statistical charts, SVG maps, financial charts, and more.”
10. pydot (Contributors – 12, Commits – 169, Stars – 267)
“pydot is an interface to Graphviz, can parse and dump into the DOT language used by Graphviz and is written in pure Python.”
