SymPy 1.1 has been released

Have you ever used SymPy? As the authors say: “SymPy is a Python library for symbolic mathematics. It aims to become a full-featured computer algebra system (CAS) while keeping the code as simple as possible in order to be comprehensible and easily extensible. SymPy is written entirely in Python.”

You can install the new release simply with:

pip install -U sympy

The authors promise it will soon be available via conda as well.
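Once installed, you can check that symbolic manipulation works with a minimal sketch like this (a toy expression of my own, not from the release notes):

```python
import sympy as sp

# Define a symbolic variable and an expression
x = sp.symbols('x')
expr = x**2 + 2*x + 1

# Differentiate, integrate and factor symbolically
derivative = sp.diff(expr, x)      # 2*x + 2
antideriv = sp.integrate(expr, x)  # x**3/3 + x**2 + x
factored = sp.factor(expr)         # (x + 1)**2

print(derivative, antideriv, factored, sep='\n')
```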

What is new (among others)?

1. Many improvements to code generation, including addition of
tensorflow (to lambdify), C++, llvm JIT, and Rust language support, as
well as the beginning of AST classes.

2. Several bug fixes for floating point numbers using higher than the
default precision.

3. Improvements to the group theory module.

4. Implementation of Singularity Functions to solve Beam Bending
problems.

5. Improvements to the mechanics module.
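The lambdify improvements mentioned in point 1 are about turning symbolic expressions into fast numeric functions. A minimal sketch (using the default backend, so no TensorFlow or NumPy required):

```python
import sympy as sp

x = sp.symbols('x')
expr = sp.sin(x)**2 + sp.cos(x)**2  # symbolically equal to 1

# lambdify compiles the expression into a plain Python function;
# strings such as 'numpy' or 'tensorflow' can be passed to select
# an alternative numeric backend
f = sp.lambdify(x, expr)

print(f(0.5))  # approximately 1.0, up to floating-point error
```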

As the main author (Aaron Meurer) says: “A total of 184 people contributed to this release. Of these, 143 people contributed to SymPy for the first time for this release.” Maybe you will be next?

Multiple other projects also use SymPy; to name a few, there is Cadabra, a computer algebra system for (quantum) field theory; the LaTeX Expression project, for typesetting algebraic expressions; and yt, for analyzing and visualizing volumetric data.


You can find the official SymPy page here: http://www.sympy.org/en/index.html

It is also freely available on GitHub: https://github.com/sympy/sympy.github.com

Detailed information about the release (authors, deprecations, etc.) can be found here: https://github.com/sympy/sympy/wiki/Release-Notes-for-1.1


Why use Jupyter Notebook for data analyses?

I think that everyone interested in data science and data analysis comes across Jupyter Notebook at some point during their education or internet searches. Jupyter Notebook is an application that lets you create (and share) documents containing code (in various programming languages), explanations (text) and visualizations. It is super useful when you want to show your workflow and prepare a how-to for future analyses, for yourself or your team.

I use Jupyter Notebook with Python 3, but you can use it with various other programming languages if you prefer. Python has a very broad selection of libraries for statistical analysis, data visualization and machine learning.

With Jupyter Notebook you can show every step of your data transformations, displaying, e.g., pandas’ DataFrames in a really nice form:

[Screenshot: a pandas DataFrame rendered as a table in a notebook]
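For instance, a cell like the following (a made-up toy frame, not the data from the screenshot) is rendered by the notebook as a nicely formatted HTML table:

```python
import pandas as pd

# A toy DataFrame; in a notebook, the last expression in a cell
# is rendered automatically as a styled HTML table
df = pd.DataFrame({
    'city': ['Warsaw', 'Krakow', 'Gdansk'],
    'population_m': [1.75, 0.77, 0.46],
})

df.describe()  # summary statistics, also rendered as a table
```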

Moreover, you can include plots together with the code used to create them, so you can easily reproduce them for other data:

[Screenshot: a plot displayed inline below its code cell]
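A minimal plotting cell might look like this (a sketch with made-up data; in a notebook with inline plotting enabled, the figure appears right under the cell):

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so this sketch runs anywhere
import matplotlib.pyplot as plt

xs = list(range(10))
ys = [v ** 2 for v in xs]

# Build the figure; in a notebook it would render inline
fig, ax = plt.subplots()
ax.plot(xs, ys, marker='o')
ax.set_xlabel('x')
ax.set_ylabel('x squared')
fig.savefig('squares.png')  # unnecessary in a notebook, kept for the script form
```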

Just to mention it, a super useful feature of Jupyter Notebook is:

%matplotlib inline

which makes your plots appear as soon as you execute a cell, without calling

plt.show()

I hope you can see the indisputable perks of using Jupyter Notebook; I strongly encourage you to try it out.

If you’re into Jupyter Notebook, there is a conference this August in New York called JupyterCon (https://conferences.oreilly.com/jupyter/jup-ny).

There are a lot of interesting projects around Jupyter Notebook, like JupyterHub (https://jupyterhub.readthedocs.io/en/latest/), which allows a Jupyter server to be used by multiple users, or nteract (https://nteract.io/), which turns Jupyter Notebooks into a desktop application so it’s even easier to use.

Data Science in business – a first-day employee’s perspective

This is my one and only opportunity to write down what my thoughts about data science were after the first couple of days in a new office. I decided to start a new job: my first in data analysis and my first at a software house. It was a huge difference from day one.

First of all, I had no idea what anyone was talking about. All those abbreviations and office slang are a bit overwhelming at first, but you get used to it and understand more and more every day. Even so, after a couple of days it is still not enough. But hey, during the first weeks you’re allowed to ask stupid questions, so use that.

Second of all, your technology of choice is not necessarily the one used in the office. Even if you discussed technologies during the interview and were asked to bring your own stack to the new office, you will also have to use the technologies they use. That is not a bad thing at all: it gives you an opportunity to expand your knowledge and perspective. It is easier to understand the data the way they work with it, and once you understand it, you can carry on with your own tools.

The third thing is co-operation. You will not be alone with the data: others need your results, engineers change the way the data looks, and people want to learn how to do some analyses on their own. Strong co-operation is crucial in data analysis.

Fourth, you need to understand the goal of your existence in the company. You don’t analyse the data just to analyse it; there is a company strategy behind it, and you have to keep that in mind all the time.

Last, but not least, and connected to all of the above: the company lives its own life, and what you do today may not be so necessary tomorrow. Sometimes you have to deal with unfinished projects. Just go with the flow (and the company’s development).

Examples of Data Scientists’ Portfolios

While searching for employment as a Data Scientist, it is important to show your skills with a well-prepared portfolio, just as developers show their GitHub accounts to demonstrate their programming skills. Your portfolio should show how you use the skills stated in your resume. I think the most important thing is to tell a story about your chosen data: show what you can do with openly available data, how insightful you are when it comes to asking questions based on the data, and whether you can present the results clearly (but also aesthetically and beautifully).

Let’s look at some examples of how projects can be presented:

1. Projects – the scientific way

http://timdettmers.com/data-science-portfolio/

I like this portfolio because I am a scientist (though it’s probably not ideal for recruiters). Each project is described with an abstract, methods, and results with discussion (and an accompanying figure). It’s quite simple, with no graphical fireworks, but it’s clear.

2. Projects – a more advertising way

http://binnie869.github.io/

Each project is represented by a title, a short comment and an image that links to the GitHub project (code).

3. Analyses – tools used

http://gemelli.spacescience.org/~hahnjm/data_science/data_science.html

As the author states, it shows activities rather than actual projects. Each activity is represented by a graph and a short description of the statistical method or tool used. It certainly shows the author’s skills.

4. Projects – a heavily advertising way

http://davidventuri.com/portfolio

The author certainly knows how to make a nice website ;). Again, projects are represented by a title, a short comment (here the caption also lists the technologies used) and an image that links to an extended project description or a website presenting the results.

5. Projects – storytelling

http://dsal1951-portfolio-v1.businesscatalyst.com/portfolio.html

I really like this portfolio because it is a true storytelling portfolio. When you open a project, a story about the data and the various approaches to analysing it is presented.

As you can see, each Data Scientist has a different way of showing their expertise. Which is best? Hard to say; it depends on what you want to do with data science and what kind of company you want to work in.