Friday, October 2, 2009
As somebody who naturally loves to organize, this session was close to my heart. Oddly enough, I didn't really do a whole lot of organizing for my Masters research (I guess it was 'simple' enough that I didn't need to), but I'm really excited to use some of this advice as I start my PhD. One of the first things I'm going to (finally) do, after thinking about it a lot, is set up an SVN server on my own webserver.
Why Organization Matters
You will do a lot of stuff in 5-7 years, and you'll forget a lot of it. Why waste time recreating work you've already done by being disorganized? (Your advisor doesn't teach you this kind of thing!)
The panelists share the following mistakes they have made:
- Not commenting code.
- Not taking notes during meetings.
- Not keeping track of papers (also known as messy piles on your desk).
- Not using source control systems.
- Not writing down research ideas.
Questions to ask yourself when deciding how to organize:
- Do I work alone or with collaborators?
- Do I work on multiple machines that require synchronization?
- Do I have limited amounts of storage?
- Do I need to keep paper records or record data off my computer?
- Is my work backed up?
Index cards, loose leaf paper, or notebooks are good for temporary notes and drawings, but are easy to lose, not portable, and not searchable. A research blog might be a good place to process ideas and search them later, and it lets group members follow your work and make comments, but it makes it difficult to organize ideas. You can keep weekly notes in Google Docs, using coloured highlighting to track what is done and what is not; however, this often produces very large documents. A Google Site takes this a step further, allowing multiple pages that can be used to track progress, share with group members, and so on.
Audience suggestions: Webspiration: Online Visual Thinking. For math notes, some use TeX and SVN. Delicious is used to remember websites visited and Diigo is a web highlighter and sticky note tool. MS OneNote is also popular.
Keeping Papers Organized
Keep track of author, title, etc., but also notes about key points and criticisms. Even if you've only skimmed a paper, make a note of it. When choosing tools, look for the ability to make citations and bibliographies for papers, take notes, and link to the paper (PDF).
I've blogged before about the tools available on Windows, and another mentioned here is Pybliographer. I also hadn't included EndNote in my list since it's not free.
Pro tip from audience: As soon as you read a paper, get the FULL citation information. It's amazing how hard it can be to find later when you only note the title. Always put every document you've read in your organizing software.
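To make the "full citation plus notes plus link to the PDF" idea concrete, here is a minimal sketch of what one paper record could look like if you rolled your own. This is not how any of the tools mentioned above work; the `Paper` class, its field names, the `search_notes` helper, and the sample entry are all made up for illustration, and the BibTeX output assumes a conference paper.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    """One paper you have read or skimmed (hypothetical record layout)."""
    key: str                 # citation key you will use in your writing
    authors: list            # full author names, in order
    title: str
    venue: str
    year: int
    pdf_path: str = ""       # where your copy of the PDF lives
    notes: list = field(default_factory=list)  # key points and criticisms

    def to_bibtex(self):
        """Format the citation as a BibTeX @inproceedings entry."""
        fields = {
            "author": " and ".join(self.authors),
            "title": self.title,
            "booktitle": self.venue,
            "year": str(self.year),
        }
        body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
        return f"@inproceedings{{{self.key},\n{body}\n}}"


def search_notes(papers, word):
    """Return the keys of papers whose notes mention `word` (case-insensitive)."""
    word = word.lower()
    return [p.key for p in papers if any(word in n.lower() for n in p.notes)]


if __name__ == "__main__":
    # Placeholder entry, not a real paper.
    paper = Paper(
        key="author2009example",
        authors=["A. Author", "B. Author"],
        title="An Example Title",
        venue="Example Workshop",
        year=2009,
        pdf_path="papers/author2009example.pdf",
        notes=["Skimmed only; evaluation section looks weak."],
    )
    print(paper.to_bibtex())
    print(search_notes([paper], "skimmed"))
```

The point is only that the citation, your notes, and the PDF location live in one searchable place; a real reference manager will do this far better.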
Keeping Experiments Organized
At stake: sanity, time, and reputation. When you turn out to be wrong about "never using that code again," you will waste a lot of time if you didn't bother keeping everything organized.
Organize your file system by project and experiment. Make your code modular by separating code for preprocessing data, running the method, summarizing results, and creating figures/tables. When something goes wrong, make it so you can re-run only the part that went bad. Make your experiments reproducible: store random seeds and input parameters, and know what versions of libraries (etc.) are used.
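Here is a rough sketch of the reproducibility advice above: each run writes its parameters, seed, and version information next to its result, so it can be repeated later. Everything in it is an assumption for illustration, including the function names, the `run.json` file name, the one-directory-per-experiment layout, and the toy `run_experiment` that stands in for a real method.

```python
import json
import platform
import random
import sys
from importlib import metadata
from pathlib import Path


def library_versions(names):
    """Look up installed versions of the named packages (None if not installed)."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def run_experiment(params, seed):
    """Toy stand-in for 'running the method': returns a pseudo-random score."""
    rng = random.Random(seed)  # dedicated RNG, so the seed determines the result
    return params["scale"] * rng.random()


def record_run(out_dir, params, seed, libraries=()):
    """Run one experiment and save what is needed to repeat it."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "params": params,
        "seed": seed,
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
        "libraries": library_versions(libraries),
        "result": run_experiment(params, seed),
    }
    (out_dir / "run.json").write_text(json.dumps(record, indent=2, sort_keys=True))
    return record


def rerun(out_dir):
    """Repeat a saved run from its record and return the new result."""
    record = json.loads((Path(out_dir) / "run.json").read_text())
    return run_experiment(record["params"], record["seed"])


if __name__ == "__main__":
    saved = record_run("projects/demo/exp01", {"scale": 2.0}, seed=42)
    # The repeated run should reproduce the saved result exactly.
    print(rerun("projects/demo/exp01") == saved["result"])
```

Keeping the record in the experiment's own directory fits the "organize by project and experiment" advice, and because running the method is a separate function from saving the record, one stage can be re-run without touching the others.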
- Use good programming practices.
- Handle errors.
- Code unit tests.
- Use an IDE which integrates with debuggers and revision control.
- Use a good LaTeX editor.
- Use revision control and/or track changes (especially with multiple authors!).
- Keep track of what version of a paper has been submitted where.
- Start early, and remember that writing can help organize your thoughts.