/lib

Learn, Imagine, Build
Geoff Messier's Projects & Ideas

Development Environment

All of the tools we use in my group are open source and can be downloaded and used for free. If you have the time, consider joining an open source development community to give back to the amazing array of tools that are available. It makes the data science community better and is excellent experience when it comes time to apply for a job.

Operating System

My team tends to use either macOS or Linux. These are far more stable for scientific computing than Windows (IMHO). If you’re using Linux, Ubuntu is the best choice simply because it’s so common and most software packages are tested on it. Note: Be sure you’re using a native installation rather than running Linux inside a virtual machine. A virtual machine installation will not give you the performance or stability that you will need.

Development Environment

The first step in installing the libraries and tools mentioned here is to install Miniconda on your computer. This will install Python and pip, the installer you will use for the other packages.

Once Miniconda is installed, you should install:

  1. JupyterLab
  2. pandas
  3. NumPy
  4. scikit-learn
  5. Matplotlib

In all cases, google the installation procedure for each of these libraries/tools and use the pip method when available.
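Once everything is installed, a quick check that Python can actually find each package can save some debugging later. The script below is only a minimal sketch: the `REQUIRED` table and the `missing_packages` helper are names made up for this example, and the import names (for instance `sklearn` for scikit-learn and `jupyterlab` for JupyterLab) are assumed to be the usual ones for these packages.

```python
import importlib.util
import sys

# Display name -> import name for the tools listed above.
# Assumption: these are the standard import names (scikit-learn is
# imported as "sklearn", JupyterLab as "jupyterlab").
REQUIRED = {
    "JupyterLab": "jupyterlab",
    "pandas": "pandas",
    "NumPy": "numpy",
    "scikit-learn": "sklearn",
    "Matplotlib": "matplotlib",
}


def missing_packages(required=REQUIRED):
    """Return the display names of packages Python cannot locate."""
    return [
        name
        for name, module in required.items()
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_packages()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All packages found.")
```

Run it with the Python that Miniconda installed. If a package shows up as not installed, the most likely cause is that pip was run from a different Python installation than the one running the script.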

GitHub

All code developed by my research group is managed using GitHub. This is good software development practice and allows members of our team to collaborate on a common code base. Prospective software development and data science employers will often ask candidates if they have a GitHub portfolio of the code they worked on during their studies.

You will need to know how to:

A good resource is the Pro Git book. Concentrate on Chapters 1-3 and the relevant commands from Appendix A3. The reference guide is also handy.

You will need to install:

LaTeX

An important part of research is writing about your results. For your thesis and all technical papers, you will be using LaTeX. I write LaTeX directly in a text editor and compile it on the command line. However, most students prefer Overleaf, and the Overleaf website also has some good tutorials. For drawing diagrams, Inkscape is a good choice.

Be sure to also check out my page on effective technical writing.