Requirements: Participants must have access to a computer running macOS, Linux, or Windows (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should have a few specific software packages installed (listed below).
Accessibility:
We are dedicated to providing a positive and accessible learning environment for all. Please get in touch if you require any accommodations or if there is anything we can do to make this workshop more accessible to you.
Contact: Please email rsginfo@soton.ac.uk for more information.
Please be sure to complete these surveys before and after attempting the material.
Please input the date as the date you started the materials.
Good data organisation is the foundation of any research project. Most researchers have data in spreadsheets, so it is the place that many research projects start. We often organise data in spreadsheets in the ways that we as humans want to work with data. However, in order to use tools that make computation and data analysis more efficient, reusable and reproducible, such as programming languages like R or Python, we need to structure our data in a particular way so that computers can "understand" and "make use of" the data. Since this is where most research projects start, this is where we want to start too!
Before you can analyze data you need to clean it. Data cleaning identifies errors and corrects formatting to create consistent data. This step must be taken with extreme care and attention because without clean data the results of analysis may be false and non-reproducible. OpenRefine is a powerful free and open source tool for working with messy data: cleaning it and transforming it from one format into another. This lesson will teach you to use OpenRefine to clean and format data effectively and automatically track any changes that you make. Many people comment that this tool saves them literally months of work trying to make these edits by hand.
This course is designed to introduce academics to project management in a light and flexible way, providing basic guidance on breaking a project into tasks to be prioritised and tracked. It also covers software sustainability, encouraging best practices such as clear coding and issue management, as well as the use of DOIs and releases to enable easy citation.
The Bash shell has been around longer than many of its users have been alive. It has survived so long because it's a power tool that allows people to do complex things with just a few keystrokes. More importantly, it helps them combine existing programs in new ways and automate repetitive tasks so that they don't have to type the same things over and over again. Use of the shell is fundamental to using a wide range of other powerful tools and computing resources (including "high-performance computing" supercomputers, like IRIDIS at the University of Southampton). These lessons will start you on a path towards using these resources effectively.
Automated version control is a process for tracking changes to files and folders within a 'repository'. This workshop will teach you how to use the Git version control system to manage your code: tracking changes to files, inspecting and merging multiple different changes, and collaborating with other Git users through the GitHub online repository store.
The best way to learn how to program is to do something useful, so this introduction to Python is built around a common scientific task: data analysis. Our real goal isn’t just to teach you Python, but to teach you the basic concepts that all programming depends on. We use Python in our lessons because: 1. we have to use *something* for examples; 2. it’s free, well-documented, and runs almost everywhere; 3. it has a large (and growing) user base among scientists; and 4. experience shows that it’s easier for novices to pick up than most other languages. But the two most important things are to use whatever language your colleagues are using, so that you can share your work with them easily, and to use that language well.
This is an introduction to R designed for participants with no programming experience. These lessons will be taught over three days. They start with some basic information about R syntax and the RStudio interface, then move through how to import CSV files, the structure of data frames, how to deal with factors, how to add and remove rows and columns, how to calculate summary statistics from a data frame, and a brief introduction to plotting.
To participate in this workshop, you will need access to software as described below. In addition, you will need an up-to-date web browser.
The instructions for all the software can be found on the setup page.