Can the graphical representation of data influence how the audience understands it? No doubt. Over the last couple of years there has been a heated discussion about data being manipulated, knowingly or simply through ignorance, by various graphical inaccuracies.
When you prepare a graphical representation of your data, whether it is statistics drawn from big data, the popularity of your app or the speed improvement of a module you're working on, you need to take the audience of your graphic into account. Ask yourself what a reader of your graph will understand without your explanation: is it what you wanted them to take from the graph (and I assume that this is the truth, not what you would like the truth to be)? According to specialists in the field, the most common ways a graphical representation of data misleads are showing too much data, showing too little data, or distorting the data. There are many examples of such misleading graphics, especially in journalism. It is just as important, however, to keep graphics truthful in data science, which is a steadily growing branch of IT.
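To make the "distorting data" case concrete, here is a minimal sketch of one well-known distortion: a bar chart whose value axis does not start at zero. The function name `drawn_height_ratio` and the two values are my own, invented purely for illustration; they do not come from the course or from the reading below.

```python
def drawn_height_ratio(a, b, baseline=0.0):
    """Ratio of the drawn heights of two bars when the value axis starts at `baseline`."""
    if baseline >= min(a, b):
        raise ValueError("baseline must lie below both values")
    return (b - baseline) / (a - baseline)


# Hypothetical measurements, for illustration only
old, new = 98.0, 100.0

true_ratio = drawn_height_ratio(old, new)                       # axis starts at 0
truncated_ratio = drawn_height_ratio(old, new, baseline=97.0)   # axis starts at 97

print(f"axis from 0:  second bar is {true_ratio:.2f}x the first")
print(f"axis from 97: second bar is {truncated_ratio:.2f}x the first")
```

With these numbers a difference of about 2% is drawn as a bar three times as tall, which is exactly the kind of impression a reader takes away without any explanation.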
A graphical representation of data created with a programming language is built from scratch, and you can control practically every element of the graph, depending on how well you know the libraries. You should therefore work on your skills so that you really are the designer of your graphs and do not rely on accidents that may lead to misleading graphics (the so-called ignorance case). In my experience, Python and R are perfect for complete control of your graphics. Data scientists and bioinformaticians use these programming languages widely, and many open-source libraries are available.
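As a small illustration of what "being the designer" can look like in Python, the sketch below draws the same two bars twice with matplotlib and sets the value axis explicitly each time instead of leaving it to chance. I am assuming matplotlib is installed; the labels, values and output file name are hypothetical and chosen only for this example.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical values, for illustration only
labels = ["old module", "new module"]
runtimes_s = [98.0, 100.0]

fig, (ax_truncated, ax_zero) = plt.subplots(1, 2, figsize=(8, 3))

# Left panel: the value axis is cut off just below the data
ax_truncated.bar(labels, runtimes_s)
ax_truncated.set_ylim(97, 100.5)
ax_truncated.set_title("axis starts at 97")
ax_truncated.set_ylabel("runtime (s)")

# Right panel: the value axis is explicitly anchored at zero
ax_zero.bar(labels, runtimes_s)
ax_zero.set_ylim(bottom=0)
ax_zero.set_title("axis starts at 0")
ax_zero.set_ylabel("runtime (s)")

fig.tight_layout()
fig.savefig("baseline_comparison.png")
```

Both panels show the same numbers; only the axis limits differ, and that choice is a single line of code that you either make deliberately or leave to an accident.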
I was inspired to write this post by a course I'm taking on Coursera: Applied Plotting, Charting & Data Representation in Python, part of Applied Data Science with Python (week 1).
If you’re interested, here is some additional reading:
Cairo, A. (2015). Graphics lies, misleading visuals. In New Challenges for Data Design (pp. 103-116). Springer London.