Having written code for guessing visit numbers from dates in the Immunity dataset, I ran it and found several records with significant error. Spent a long time inspecting each one in detail, and found lots of data-entry errors (wrong year, wrong month, transposed digits, miskeyed subject IDs, etc.). In one case, the visit number was off by one in the self-report data, which only became apparent when the immunity data was introduced and an extra "halfway-between" visit was present. Was able to correct about 30 of these errors confidently; in the end, only about 10 out of 700 records still had an error greater than 10 days.
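For reference, a minimal sketch of the kind of check described above, not the actual code used. It assumes each subject has a known date for every visit number (for example, taken from the self-report data), guesses the visit as the one with the nearest date, and flags records whose date is more than 10 days from that visit. The names `guess_visit`, `flag_suspects`, and the shape of `schedule` are hypothetical.

```python
from datetime import date


def guess_visit(record_date, known_visits):
    """Guess the visit number for a record from its date.

    known_visits maps visit number -> date of that visit (assumed layout).
    Returns (visit_number, error_days), where error_days is the absolute
    distance in days between the record date and the nearest known visit.
    """
    best = min(known_visits,
               key=lambda v: abs((record_date - known_visits[v]).days))
    return best, abs((record_date - known_visits[best]).days)


def flag_suspects(records, schedule, tolerance_days=10):
    """Return records whose nearest visit is more than tolerance_days away.

    records is an iterable of (subject_id, record_date) pairs.
    schedule maps subject_id -> {visit_number: date} (assumed layout).
    A subject ID missing from the schedule is flagged with visit and error
    set to None, since that often indicates a miskeyed ID.
    """
    suspects = []
    for subject_id, record_date in records:
        visits = schedule.get(subject_id)
        if not visits:
            suspects.append((subject_id, record_date, None, None))
            continue
        visit, error_days = guess_visit(record_date, visits)
        if error_days > tolerance_days:
            suspects.append((subject_id, record_date, visit, error_days))
    return suspects
```

The flagged records still need manual inspection: a large error only says that the date and the subject's visit history disagree, not which of the two is wrong.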
Wrote code to merge new datasets into the full database; used it to merge newly-cleaned immunity data into the full database.
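A rough sketch of what such a merge could look like, assuming (purely for illustration) that both the full database and the new dataset are dictionaries keyed by `(subject_id, visit_number)`, each holding a dictionary of fields. The real storage format is not described here, and `merge_dataset` is a hypothetical name.

```python
def merge_dataset(database, new_records):
    """Merge new_records into a copy of database.

    Both arguments map (subject_id, visit_number) -> {field: value}
    (assumed layout). New keys and new fields are added. If a field already
    exists with a different value, the existing value is kept and the
    disagreement is reported so it can be checked by hand.
    Returns (merged, conflicts).
    """
    # Copy each row so the original database is left untouched.
    merged = {key: dict(fields) for key, fields in database.items()}
    conflicts = []
    for key, fields in new_records.items():
        row = merged.setdefault(key, {})
        for name, value in fields.items():
            if name in row and row[name] != value:
                conflicts.append((key, name, row[name], value))
            else:
                row[name] = value
    return merged, conflicts
```

Reporting conflicts instead of silently overwriting seems the safer choice, given how many data-entry errors turned up during cleaning.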
Next step: write an export-to-CSV routine.
Posted by Kyle Simek