DC Demo

Self-learning course

Site Updated On: December 08, 2022
For More Info Email: rsginfo@soton.ac.uk
General Information

Requirements: Participants must have access to a computer running macOS, Linux, or Windows (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should also have a few specific software packages installed (listed below).

Accessibility: We are dedicated to providing a positive and accessible learning environment for all. Please get in touch if you require any accommodations or if there is anything we can do to make this workshop more accessible to you.

Contact: Please email rsginfo@soton.ac.uk for more information.


Surveys

Please be sure to complete these surveys before and after attempting the material.

When a survey asks for the date, enter the date you started the materials.

Pre-workshop Survey

Post-workshop Survey


Schedule

Best Practices in Data Organisation Using Spreadsheets

Good data organisation is the foundation of any research project. Most researchers have data in spreadsheets, so it is the place that many research projects start. We often organise data in spreadsheets in the ways that we as humans want to work with data. However, in order to use tools that make computation and data analysis more efficient, reusable and reproducible, such as programming languages like R or Python, we need to structure our data in a particular way so that computers can "understand" and "make use of" the data. Since this is where most research projects start, this is where we want to start too!

Data Cleaning with OpenRefine

Before you can analyse data, you need to clean it. Data cleaning identifies errors and corrects formatting to create consistent data. This step must be taken with extreme care and attention, because without clean data the results of an analysis may be false and non-reproducible. OpenRefine is a powerful, free and open-source tool for working with messy data: cleaning it and transforming it from one format into another. This lesson will teach you to use OpenRefine to clean and format data effectively and to automatically track any changes that you make. Many people comment that this tool saves them literally months of work that they would otherwise spend making these edits by hand.

Managing Academic Software Development

Short(ish) lesson description

Automating Tasks with the Unix Shell

The Bash shell has been around longer than many of its users have been alive. It has survived so long because it's a power tool that allows people to do complex things with just a few keystrokes. More importantly, it helps them combine existing programs in new ways and automate repetitive tasks so that they don't have to type the same things over and over again. Use of the shell is fundamental to using a wide range of other powerful tools and computing resources (including "high-performance computing" supercomputers, like IRIDIS at the University of Southampton). These lessons will start you on a path towards using these resources effectively.

Version Control with Git

Automated version control is a process for tracking changes to files and folders within a 'repository'. This workshop will teach you how to use the Git version control system to manage your code: tracking changes to files, inspecting and merging multiple different changes, and collaborating with other Git users through the GitHub online repository store.

Building Programs with Python

The best way to learn how to program is to do something useful, so this introduction to Python is built around a common scientific task: data analysis. Our real goal isn’t just to teach you Python, but to teach you the basic concepts that all programming depends on. We use Python in our lessons because: 1. we have to use *something* for examples; 2. it’s free, well-documented, and runs almost everywhere; 3. it has a large (and growing) user base among scientists; and 4. experience shows that it’s easier for novices to pick up than most other languages. But the two most important things are to use whatever language your colleagues are using, so that you can share your work with them easily, and to use that language well.


Setup

To participate in this workshop, you will need access to software as described below. In addition, you will need an up-to-date web browser.

The instructions for all the software can be found on the setup page.