Our lab is committed to transparent and reproducible research. This page describes our default process for conducting and reporting quantitative research. All lab members are invited to improve the documentation of our current process and to suggest ways to improve the process itself.

3.1 Overview

Each project is maintained in a self-contained directory (folder) using a standard template for organization. This contains folders for data, code scripts, output, and the manuscript. We use GitHub for version control and collaboration. We write manuscripts using R Markdown, typically knitting to Microsoft Word to enable sharing and revisions with non-programmer colleagues. R Markdown documents can include code chunks in R and in Python. Upon publication, we create a DOI-indexed copy of our GitHub repository, which includes all analytic code and data except for sensitive data that must be kept confidential.

R is a free, open-source programming language commonly used in statistics, data science, and quantitative research. RStudio is the most popular integrated development environment (IDE) for use with R. We use R and RStudio for much of our work: to analyze data, program models, and write reproducible manuscripts. Our lab website and lab manual were both created in RStudio as well.

3.3 RStudio project & folder organization

RStudio can also be used for Python, if you install Python and point RStudio to it. RStudio projects are the recommended method for keeping all data, scripts, and output for a project in a single place. An RStudio project is linked to a specific working directory. In our lab, we have a template we use for each project, with a set file structure. Typically, one project = one academic paper (that way, when we go to publish, we can easily publish the accompanying project directory). The generic project directory template is not on GitHub because, if we added it, the .gitignore file and the ignored folders/files would not show up. Ask Alton to share our generic project directory with you. The directories in our generic project are:

1_data: folder containing all raw data, before any code has been applied, as well as tables of model parameters that are estimated from the literature or other sources. Often, we will include sub-directories within data with information on the data sets (e.g., data dictionaries or keys). Note: We generally do not upload private line-level data on individual patients to GitHub or publish these data. If you are using an encrypted laptop or an MCHI desktop computer, you can, in most cases, keep line-level data in a sub-directory on that computer named “private”. The .gitignore file dictates that the private directory will be neither version controlled using Git nor uploaded to GitHub. This is to comply with data use agreements and avoid improper disclosure of private data. If another team member who has permission wants to work on the project on their own computer, they can first clone the project from GitHub, but the clone will not include the “private” directory. They will then need to transfer the data via an approved secure method and create the private directory on their own machine before the data will load.

2_scripts: folder for storing all scripts. The preferred naming convention for your scripts is to start with a 2-digit number and underscore, followed by a brief description of what the script does. Often, we have an additional script that contains helper functions which may be shared across scripts. For example, the files in your scripts folder may look like:

[Figure: example contents of the 2_scripts folder]

Every script should be written such that it can run successfully in a new R session with no variables in the environment. At the beginning of the script, any necessary packages should be loaded, any necessary data should be read in from file (e.g., from the 1_data or 3_intermediate folders), and any scripts containing helper functions should be run (e.g., using source("./2_scripts/00_helper-functions.R")).

3_intermediate: this folder can be used if you have intermediate output that is primarily meant as an input to another script or function, rather than to be analyzed or graphed. An example is an RDS file of a model or data structure that is generated in one script and then analyzed in another. Note there is also a private sub-directory in the intermediate folder. This is for any data table or object that contains line-level data that cannot be shared publicly. For example, you may process the raw data in the 1_data/private folder and then put a clean version in the 3_intermediate/private folder.
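The folder and script conventions described on this page can be sketched in R as follows. This is a minimal illustration, not the lab's actual template: it builds a throwaway copy of the three named folders in a temporary directory, and the project name, the file names `01_clean-data.R`, `raw.csv`, and `clean.rds`, the toy data, the `add_one()` helper, and the `.gitignore` patterns are all assumptions made up for the example.

```r
# Sketch only: build a disposable copy of the layout in a temp directory.
proj <- file.path(tempdir(), "example-project")  # hypothetical project name
dirs <- c("1_data/private", "2_scripts", "3_intermediate/private")
for (d in dirs) {
  dir.create(file.path(proj, d), recursive = TRUE, showWarnings = FALSE)
}

# Assumed .gitignore entries; the real template's patterns may differ.
writeLines(c("1_data/private/", "3_intermediate/private/"),
           file.path(proj, ".gitignore"))

# A helper script shared across the numbered scripts (hypothetical helper).
writeLines("add_one <- function(x) x + 1",
           file.path(proj, "2_scripts", "00_helper-functions.R"))

# Made-up rows standing in for a private raw data file.
raw <- data.frame(id = 1:3, value = c(10, NA, 30))
write.csv(raw, file.path(proj, "1_data", "private", "raw.csv"),
          row.names = FALSE)

# ---- Top of a script such as 2_scripts/01_clean-data.R (hypothetical) ----
old_wd <- setwd(proj)  # an RStudio project sets the working directory for you
library(stats)         # 1. load packages (stand-in for what your script needs)
source("./2_scripts/00_helper-functions.R")     # 2. run helper scripts
raw_in <- read.csv("./1_data/private/raw.csv")  # 3. read data from file

# Clean the data, then save it as intermediate output for a later script.
clean <- raw_in[!is.na(raw_in$value), ]
clean$value_plus_one <- add_one(clean$value)
saveRDS(clean, "./3_intermediate/private/clean.rds")

# ---- A later script would start by reading the intermediate file ----
clean_in <- readRDS("./3_intermediate/private/clean.rds")
setwd(old_wd)          # restore the original working directory
```

Because the script reads everything it needs from file and sources its helpers explicitly, it does not depend on objects left over from an earlier session, which is the property the "new R session" rule is meant to guarantee.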