Docker has transformed how R developers build, deploy, and share data science applications, Shiny dashboards, and analytical workflows. With R’s growing adoption in enterprise environments and the rise of containerized data science, mastering Docker for R development is essential for modern statisticians, data scientists, and R developers.
Whether you’re containerizing Shiny applications, Plumber APIs, R Markdown reports, or machine learning models, following Docker best practices can dramatically improve reproducibility, deployment reliability, and collaboration across teams.
Why Docker Matters for R Development
R’s notorious “dependency hell” and environment reproducibility problems have long frustrated developers. Docker addresses both by providing consistent, isolated environments, so your R applications run identically across development, testing, and production.
Key benefits of using Docker with R:
- Reproducible environments eliminating “works on my machine” issues
- Isolated R package dependencies preventing version conflicts
- Simplified deployment of Shiny apps and R APIs
- Enhanced collaboration with standardized development environments
- Version control for R environments alongside your code
- Scalable deployment of R applications in production
1. Choose the Right R Base Image for Your Project
The Rocker Project provides excellent, maintained R base images optimized for different use cases. Selecting the right base image is crucial for performance, security, and development efficiency.
Recommended R Base Images
For data science and analysis:
# Tidyverse with RStudio - full data science stack
FROM rocker/tidyverse:4.3.1
# Verse - includes tidyverse + publishing tools (LaTeX, pandoc)
FROM rocker/verse:4.3.1
# Geospatial - includes spatial analysis packages
FROM rocker/geospatial:4.3.1
For production applications:
# Minimal R installation for production
FROM rocker/r-base:4.3.1
# Version-pinned R on Ubuntu LTS (the base for the other versioned Rocker images)
FROM rocker/r-ver:4.3.1
For Shiny applications:
# Pre-configured Shiny Server
FROM rocker/shiny:4.3.1
# Shiny with tidyverse
FROM rocker/shiny-verse:4.3.1
Rocker Image Comparison
| Image | Size | Use Case | Included Packages | Best For |
|---|---|---|---|---|
| r-base | ~200MB | Minimal R | Base R only | Production APIs, minimal apps |
| tidyverse | ~1.2GB | Data analysis | Tidyverse, devtools | Data science development |
| verse | ~2GB | Publishing | Tidyverse + LaTeX, pandoc | R Markdown, reports |
| shiny | ~800MB | Web apps | Shiny Server + essentials | Shiny applications |
| geospatial | ~3GB | Spatial analysis | Spatial packages + GDAL | GIS and mapping |
Best Practice: Start with rocker/tidyverse:4.3.1 for most data science projects, use rocker/shiny:4.3.1 for Shiny apps, and rocker/r-base:4.3.1 for minimal production deployments.
2. Optimize R Package Installation with Smart Layering
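R package installation can be time-consuming. Docker caches each instruction as a layer and rebuilds only from the first changed layer onward, so ordering instructions from least to most frequently changed keeps rebuilds fast.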
FROM rocker/tidyverse:4.3.1
# Install system dependencies first (rarely changes)
RUN apt-get update && apt-get install -y \
libcurl4-openssl-dev \
libssl-dev \
libxml2-dev \
libpq-dev \
&& rm -rf /var/lib/apt/lists/*
# Install R packages in order of stability
# 1. System packages (rarely change)
RUN install2.r --error \
RPostgres \
httr \
jsonlite \
&& rm -rf /tmp/downloaded_packages/
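# 2. Analysis packages (change occasionally)
# Set the project directory first so renv restores into /app/renv/library
WORKDIR /app
RUN R -e "install.packages('renv', repos = 'https://cloud.r-project.org')"
COPY renv.lock .
RUN R -e "renv::restore()"
# 3. Copy application code last (changes frequently)
COPY . .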
EXPOSE 3838
CMD ["R", "-e", "shiny::runApp('/app', host='0.0.0.0', port=3838)"]
Using renv for Reproducible Package Management
FROM rocker/tidyverse:4.3.1
WORKDIR /app
# Copy renv files first for better caching
COPY renv.lock renv.lock
COPY .Rprofile .Rprofile
COPY renv/activate.R renv/activate.R
# renv 0.17+ stores settings in renv/settings.json (older projects use renv/settings.dcf)
COPY renv/settings.json renv/settings.json
# Install renv and restore packages
RUN R -e "install.packages('renv', repos = c(CRAN = 'https://cloud.r-project.org'))"
RUN R -e "renv::restore()"
# Copy application code
COPY . .
CMD ["Rscript", "app.R"]
3. Implement Multi-Stage Builds for Production R Applications
Multi-stage builds help create lean production images by separating development dependencies from runtime requirements.
# Development stage with all tools
FROM rocker/tidyverse:4.3.1 AS development
WORKDIR /app
# Install development dependencies
RUN install2.r --error \
devtools \
testthat \
roxygen2 \
pkgdown \
&& rm -rf /tmp/downloaded_packages/
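# Restore locked packages into the site library so the production stage can copy them
RUN R -e "install.packages('renv', repos = 'https://cloud.r-project.org')"
COPY renv.lock .
RUN R -e "renv::restore(library = '/usr/local/lib/R/site-library')"
COPY . .
# Run tests (failing the build if any test fails) and build the package
RUN R -e "devtools::test(stop_on_failure = TRUE)"
RUN R -e "devtools::build()"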
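# Production stage - minimal runtime on the same Ubuntu base as the build stage,
# so compiled packages copied across remain binary-compatible
FROM rocker/r-ver:4.3.1 AS production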
# Install only production system dependencies
RUN apt-get update && apt-get install -y \
libcurl4-openssl-dev \
libssl-dev \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
# Copy only necessary R packages from development stage
COPY --from=development /usr/local/lib/R/site-library /usr/local/lib/R/site-library
# Copy application code
COPY --from=development /app/*.R ./
COPY --from=development /app/data ./data
# Create non-root user
RUN useradd -m -s /bin/bash ruser
RUN chown -R ruser:ruser /app
USER ruser
EXPOSE 8000
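# plumber.R only defines endpoints, so start the server explicitly
CMD ["R", "-e", "plumber::pr_run(plumber::pr('plumber.R'), host = '0.0.0.0', port = 8000)"]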
4. Secure Your R Containers
Security is critical when deploying R applications, especially those handling sensitive data.
Run as Non-Root User
FROM rocker/shiny:4.3.1
# Create application user
RUN groupadd -r shinyuser && useradd -r -g shinyuser shinyuser
# Install packages as root
RUN install2.r --error \
shinydashboard \
DT \
plotly \
&& rm -rf /tmp/downloaded_packages/
# Copy application
COPY --chown=shinyuser:shinyuser . /srv/shiny-server/myapp/
# Configure Shiny Server to run as non-root
RUN echo "run_as shinyuser;" > /etc/shiny-server/shiny-server.conf && \
echo "server {" >> /etc/shiny-server/shiny-server.conf && \
echo " listen 3838;" >> /etc/shiny-server/shiny-server.conf && \
echo " location / {" >> /etc/shiny-server/shiny-server.conf && \
echo " site_dir /srv/shiny-server;" >> /etc/shiny-server/shiny-server.conf && \
echo " log_dir /var/log/shiny-server;" >> /etc/shiny-server/shiny-server.conf && \
echo " directory_index on;" >> /etc/shiny-server/shiny-server.conf && \
echo " }" >> /etc/shiny-server/shiny-server.conf && \
echo "}" >> /etc/shiny-server/shiny-server.conf
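# Shiny Server must be able to write to its log and state directories
RUN mkdir -p /var/log/shiny-server /var/lib/shiny-server && \
chown -R shinyuser:shinyuser /var/log/shiny-server /var/lib/shiny-server
# Switch to non-root user
USER shinyuser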
EXPOSE 3838
CMD ["/usr/bin/shiny-server"]
Use .dockerignore for R Projects
.git
.Rproj.user
.Rhistory
.RData
.httr-oauth
.DS_Store
*.Rproj
packrat/lib*/
renv/library/
.env
.env.local
tests/
docs/
vignettes/
man/
README.md
Environment Variable Security
FROM rocker/shiny:4.3.1
# Set secure defaults
ENV SHINY_LOG_STDERR=1
ENV SHINY_LOG_LEVEL=WARN
# Disable R startup messages in production
ENV R_STARTUP_DEBUG=0
WORKDIR /srv/shiny-server
COPY . .
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:3838/ || exit 1
EXPOSE 3838
CMD ["/usr/bin/shiny-server"]
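Environment variables set with ENV are baked into the image and visible to anyone who can inspect it, so credentials are better supplied at run time. The sketch below assumes secrets are mounted as files under /run/secrets (the location Docker uses for Compose and Swarm secrets) and falls back to an environment variable of the same name. The helper name read_secret and the lower-cased file naming are illustrative assumptions, not part of any package.

```r
# Hypothetical helper: prefer a mounted secret file, fall back to an environment variable.
# Assumes the secret file is named after the variable in lower case (e.g. db_password).
read_secret <- function(name, dir = "/run/secrets") {
  path <- file.path(dir, tolower(name))
  if (file.exists(path)) {
    # Strip the trailing newline that most tools add when writing the file
    return(trimws(paste(readLines(path, warn = FALSE), collapse = "\n")))
  }
  value <- Sys.getenv(name, unset = NA_character_)
  if (is.na(value) || !nzchar(value)) {
    stop("Secret '", name, "' is not set", call. = FALSE)
  }
  value
}
```

In an app this could be called as read_secret("DB_PASSWORD") when opening the database connection, so the password never needs to appear in the Dockerfile.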
5. Optimize R Performance in Containers
R does not detect container CPU or memory limits on its own (for example, parallel::detectCores() reports the host's cores), so set them explicitly.
Memory and CPU Configuration
FROM rocker/tidyverse:4.3.1
# Optimize R for container environment
ENV R_MAX_VSIZE=8Gb
# Configure parallel processing
ENV MC_CORES=2
# Set BLAS/LAPACK optimizations
RUN apt-get update && apt-get install -y \
libopenblas-dev \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
RUN R -e "install.packages('renv', repos = 'https://cloud.r-project.org')"
COPY renv.lock .
RUN R -e "renv::restore()"
COPY . .
CMD ["Rscript", "app.R"]
Docker Compose Resource Limits
version: '3.8'
services:
  r-app:
    build: .
    ports:
      - "3838:3838"
    environment:
      - R_MAX_VSIZE=4Gb
      - MC_CORES=2
    deploy:
      resources:
        limits:
          memory: 4G
          cpus: '2.0'
        reservations:
          memory: 2G
          cpus: '1.0'
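A memory limit set in Compose is enforced by the kernel, but R itself is not told about it. One way to make the limit visible to your code is to read it from the cgroup filesystem. This is a minimal sketch that assumes the standard cgroup v2 and v1 file locations; the function name container_memory_limit is hypothetical.

```r
# Hypothetical helper: report the container memory limit in bytes.
# Returns Inf when no limit is set and NA when no cgroup file is found.
container_memory_limit <- function(paths = c(
  "/sys/fs/cgroup/memory.max",                    # cgroup v2
  "/sys/fs/cgroup/memory/memory.limit_in_bytes"   # cgroup v1
)) {
  for (path in paths) {
    if (file.exists(path)) {
      raw <- trimws(readLines(path, n = 1L, warn = FALSE))
      # cgroup v2 writes the literal string "max" when the container is unlimited
      if (identical(raw, "max")) return(Inf)
      return(as.numeric(raw))
    }
  }
  NA_real_
}

limit <- container_memory_limit()
if (is.finite(limit)) {
  message(sprintf("Container memory limit: %.1f GiB", limit / 1024^3))
}
```

Logging this value at startup makes it easy to confirm that R_MAX_VSIZE and the Compose memory limit agree.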
Optimized Plumber API Configuration
FROM rocker/r-base:4.3.1
# Install system dependencies (libsodium-dev is needed to build sodium, a plumber dependency)
RUN apt-get update && apt-get install -y \
libcurl4-openssl-dev \
libssl-dev \
libsodium-dev \
&& rm -rf /var/lib/apt/lists/*
# Install required packages
RUN install2.r --error \
plumber \
future \
promises \
&& rm -rf /tmp/downloaded_packages/
WORKDIR /app
COPY . .
# Custom setting for your own startup code to read; plumber does not act on it by itself
ENV PLUMBER_MAX_REQUESTS=100
EXPOSE 8000
# Use optimized startup script
COPY start.R .
CMD ["Rscript", "start.R"]
# start.R
library(plumber)
library(future)
# Enable parallel processing
plan(multisession, workers = 2)
# Configure plumber
pr <- plumb("plumber.R")
# Start server with optimizations
pr$run(
host = "0.0.0.0",
port = 8000,
debug = FALSE
)
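The worker count in start.R is hard-coded, while the container's CPU allowance is set elsewhere through MC_CORES. A small helper can keep the two in step. The sketch below is an assumption about how you might wire this up: resolve_workers is a hypothetical name, and because parallel::detectCores() reports the host's cores rather than the container's CPU limit, MC_CORES should be set to match the limit in your Compose file.

```r
# Hypothetical helper: take the worker count from MC_CORES, capped by the cores R can see.
resolve_workers <- function(default = 2L) {
  requested <- suppressWarnings(
    as.integer(Sys.getenv("MC_CORES", unset = as.character(default)))
  )
  # Fall back to the default when MC_CORES is empty, malformed, or below 1
  if (is.na(requested) || requested < 1L) requested <- default
  available <- parallel::detectCores(logical = TRUE)
  # detectCores() can return NA on some platforms; trust the request in that case
  if (is.na(available)) available <- requested
  min(requested, available)
}

workers <- resolve_workers()
```

The result can then be passed to plan(multisession, workers = workers) in place of the literal 2.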
6. Master R Package Management with renv
renv provides reproducible package management for R projects in Docker.
Complete renv Dockerfile
FROM rocker/tidyverse:4.3.1
# Install renv
RUN R -e "install.packages('renv', repos = c(CRAN = 'https://cloud.r-project.org'))"
WORKDIR /app
# Copy renv configuration
COPY renv.lock renv.lock
COPY .Rprofile .Rprofile
COPY renv/activate.R renv/activate.R
# Restore packages from lockfile
RUN R -e "renv::restore()"
# Copy application code
COPY . .
# Set up proper permissions
RUN useradd -m ruser && chown -R ruser:ruser /app
USER ruser
EXPOSE 3838
CMD ["R", "-e", "shiny::runApp(host='0.0.0.0', port=3838)"]
renv Configuration Files
# .Rprofile
source("renv/activate.R")
# Set CRAN mirror for reproducibility
options(repos = c(CRAN = "https://packagemanager.rstudio.com/cran/2023-08-01"))
# Configure renv
Sys.setenv(RENV_CONFIG_REPOS_OVERRIDE = getOption("repos"))
Handling Different Package Sources
FROM rocker/tidyverse:4.3.1
WORKDIR /app
# Copy renv files
COPY renv.lock .
COPY .Rprofile .
COPY renv/ renv/
# Install renv and restore packages
RUN R -e "install.packages('renv')"
# Install packages from different sources
RUN R -e "renv::restore()"
COPY . .
# Install devtools only if the copied project code depends on it
# (single quotes stop the shell from expanding $Package)
RUN R -e 'if ("devtools" %in% renv::dependencies()$Package) renv::install("devtools")'
CMD ["Rscript", "app.R"]
7. Configure Health Checks for R Applications
Proper health checks ensure your R applications are running correctly and responding to requests.
Shiny Application Health Check
FROM rocker/shiny:4.3.1
# Install curl for health checks
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
WORKDIR /srv/shiny-server
COPY . .
# Configure health check
HEALTHCHECK --interval=30s --timeout=15s --start-period=10s --retries=3 \
CMD curl -f http://localhost:3838/ || exit 1
EXPOSE 3838
CMD ["/usr/bin/shiny-server"]
Plumber API Health Check
FROM rocker/r-base:4.3.1
# libsodium-dev is needed to build sodium, a plumber dependency
RUN apt-get update && apt-get install -y \
curl \
libcurl4-openssl-dev \
libsodium-dev \
&& rm -rf /var/lib/apt/lists/*
RUN install2.r --error plumber
WORKDIR /app
COPY . .
# Health check for API endpoint
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:8000/health || exit 1
EXPOSE 8000
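# plumber.R only defines endpoints, so start the server explicitly
CMD ["R", "-e", "plumber::pr_run(plumber::pr('plumber.R'), host = '0.0.0.0', port = 8000)"]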
R Health Check Endpoint
# plumber.R
library(plumber)
#* Health check endpoint
#* @get /health
function() {
  list(
    status = "healthy",
    timestamp = Sys.time(),
    r_version = R.version.string,
    # gc() returns a matrix; report total memory in use (Mb) as a single number
    memory_used_mb = sum(gc()[, 2])
  )
}
#* Main API endpoint
#* @param data Input passed as a query parameter
#* @get /predict
function(data) {
  # Your prediction logic here
  result <- your_model_function(data)
  return(result)
}
8. Handle R Environment Variables and Configuration
Proper configuration management is essential for deploying R applications across different environments.
FROM rocker/shiny:4.3.1
# Set R-specific environment variables
ENV R_LIBS_USER=/usr/local/lib/R/site-library
ENV TAR="/bin/tar"
# Shiny-specific configurations
ENV SHINY_PORT=3838
ENV SHINY_HOST=0.0.0.0
ENV SHINY_LOG_LEVEL=WARN
WORKDIR /srv/shiny-server
COPY renv.lock .
RUN R -e "install.packages('renv'); renv::restore()"
COPY . .
# Use environment variables in startup
EXPOSE $SHINY_PORT
CMD ["/usr/bin/shiny-server"]
Configuration with config Package
# config.yml
default:
  database:
    host: localhost
    port: 5432
    name: myapp
  api:
    base_url: "http://localhost:8000"
development:
  database:
    host: db
    name: myapp_dev
production:
  database:
    host: !expr Sys.getenv("DB_HOST")
    port: !expr as.numeric(Sys.getenv("DB_PORT", "5432"))
    name: !expr Sys.getenv("DB_NAME")
  api:
    base_url: !expr Sys.getenv("API_BASE_URL")
# app.R
library(shiny)
# Load the configuration selected by R_CONFIG_ACTIVE (falls back to "default")
config <- config::get()
# Placeholder UI; replace with your own layout
ui <- fluidPage()
# Use configuration in your app
server <- function(input, output, session) {
  # Connect to database using config
  con <- DBI::dbConnect(
    RPostgres::Postgres(),
    host = config$database$host,
    port = config$database$port,
    dbname = config$database$name
  )
  # Release the connection when the session ends
  session$onSessionEnded(function() DBI::dbDisconnect(con))
}
shinyApp(ui, server, options = list(
  host = Sys.getenv("SHINY_HOST", "0.0.0.0"),
  port = as.numeric(Sys.getenv("SHINY_PORT", "3838"))
))
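Converting an environment variable with as.numeric() yields NA when the value is malformed, and the failure then surfaces later as a confusing connection error. A small validating reader fails at startup instead. This is a sketch only; env_int is a hypothetical helper name, and DB_PORT is the variable used in the configuration above.

```r
# Hypothetical helper: read an integer setting from the environment, failing fast on bad input.
env_int <- function(name, default) {
  raw <- Sys.getenv(name, unset = "")
  # Treat unset and empty values the same way and use the default
  if (!nzchar(raw)) return(as.integer(default))
  value <- suppressWarnings(as.integer(raw))
  if (is.na(value)) {
    stop("Environment variable ", name, " must be an integer, got '", raw, "'", call. = FALSE)
  }
  value
}

db_port <- env_int("DB_PORT", 5432L)
```

The same pattern works for SHINY_PORT, so a typo in a Compose file stops the container immediately with a clear message.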
9. Implement Comprehensive Logging for R Applications
Proper logging is crucial for debugging and monitoring R applications in production.
FROM rocker/shiny:4.3.1
# Install logging packages
RUN install2.r --error \
logger \
futile.logger \
&& rm -rf /tmp/downloaded_packages/
# Configure logging directory
RUN mkdir -p /var/log/shiny-server
VOLUME ["/var/log/shiny-server"]
WORKDIR /srv/shiny-server
COPY . .
# Configure Shiny Server logging by prepending to the default config
# (overwriting the file would drop its server block)
RUN sed -i -e '1i preserve_logs true;' -e '1i sanitize_errors false;' /etc/shiny-server/shiny-server.conf
EXPOSE 3838
CMD ["/usr/bin/shiny-server"]
R Logging Setup
# logging_setup.R
library(logger)
# Configure logging
log_threshold(INFO)
# Set log format
log_formatter(formatter_glue_or_sprintf)
# Add file appender for persistent logging
if (dir.exists("/var/log/shiny-server")) {
log_appender(appender_tee("/var/log/shiny-server/app.log"))
}
# Custom logging functions
log_request <- function(session, input_id, value) {
log_info("User {session$user} triggered {input_id} with value: {value}")
}
log_error_safe <- function(error, context = "") {
log_error("Error in {context}: {error$message}")
# Don't expose sensitive error details to users
showNotification("An error occurred. Please try again.", type = "error")
}
Structured Logging for Analysis
# app.R with structured logging
library(shiny)
library(logger)
library(jsonlite)  # required by formatter_json
# Emit each log record as one JSON object per line
log_formatter(formatter_json)
log_layout(layout_json_parser())
server <- function(input, output, session) {
  # Log session start (session is only available inside the server function)
  log_info(event = "session_started", session_id = session$token)
  # Log user interactions
  observeEvent(input$button, {
    log_info(event = "button_clicked", session_id = session$token, value = input$button)
  })
  # Log errors
  tryCatch({
    # Your application logic
  }, error = function(e) {
    log_error(event = "application_error", session_id = session$token, error = conditionMessage(e))
  })
}
10. Deploy R Applications with Docker Compose
Orchestrate complex R applications with multiple services using Docker Compose.
Complete R Application Stack
# docker-compose.yml
version: '3.8'
services:
  shiny-app:
    build:
      context: .
      target: production
    ports:
      - "3838:3838"
    environment:
      - SHINY_LOG_LEVEL=INFO
      - DB_HOST=postgres
      - DB_NAME=analytics
      - DB_USER=ruser
      # Read from the shell or an .env file so the password is not committed
      - DB_PASSWORD=${DB_PASSWORD}
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_started
    volumes:
      - shiny_logs:/var/log/shiny-server
    networks:
      - r-network
    restart: unless-stopped
  plumber-api:
    build:
      context: ./api
      dockerfile: Dockerfile
    # No host port mapping: two replicas cannot bind the same host port,
    # and nginx reaches the API over r-network
    expose:
      - "8000"
    environment:
      - R_MAX_VSIZE=2Gb
      - DB_HOST=postgres
    depends_on:
      - postgres
    networks:
      - r-network
    deploy:
      replicas: 2
      resources:
        limits:
          memory: 2G
          cpus: '1.0'
  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_DB=analytics
      - POSTGRES_USER=ruser
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    networks:
      - r-network
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ruser -d analytics"]
      interval: 10s
      timeout: 5s
      retries: 5
  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
    networks:
      - r-network
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/nginx/ssl
    depends_on:
      - shiny-app
      - plumber-api
    networks:
      - r-network
volumes:
  postgres_data:
  redis_data:
  shiny_logs:
networks:
  r-network:
    driver: bridge
Production-Ready Dockerfile
# Multi-stage build for R Shiny application
FROM rocker/shiny-verse:4.3.1 AS builder
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
libpq-dev \
libcurl4-openssl-dev \
libssl-dev \
libxml2-dev \
libgdal-dev \
&& rm -rf /var/lib/apt/lists/*
# Copy renv files and restore packages
COPY renv.lock renv.lock
COPY .Rprofile .Rprofile
COPY renv/ renv/
RUN R -e "install.packages('renv')"
RUN R -e "renv::restore()"
# Copy application code
COPY . .
# Production stage
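FROM rocker/shiny:4.3.1 AS production
# Install only runtime system dependencies (plus curl for the health check)
# rocker/shiny:4.3.1 is built on Ubuntu 22.04, which ships libssl3 rather than libssl1.1
RUN apt-get update && apt-get install -y \
curl \
libpq5 \
libcurl4 \
libssl3 \
libxml2 \
&& rm -rf /var/lib/apt/lists/*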
# Copy R packages from builder
COPY --from=builder /usr/local/lib/R/site-library /usr/local/lib/R/site-library
# Create app user
RUN groupadd -r shinyuser && useradd -r -g shinyuser shinyuser
# Configure Shiny Server
COPY shiny-server.conf /etc/shiny-server/shiny-server.conf
# Copy application
COPY --from=builder --chown=shinyuser:shinyuser /app /srv/shiny-server/app
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=10s --retries=3 \
CMD curl -f http://localhost:3838/ || exit 1
EXPOSE 3838
# Switch to non-root user
USER shinyuser
CMD ["/usr/bin/shiny-server"]
Advanced R Application Configuration
# global.R - Application-wide configuration
library(shiny)
library(shinydashboard)
library(DT)
library(plotly)
library(pool)
library(RPostgres)
library(config)
library(logger)
# Load configuration
config <- config::get()
# Set up database connection pool
pool <- dbPool(
drv = RPostgres::Postgres(),
host = config$database$host,
port = config$database$port,
dbname = config$database$name,
user = config$database$user,
password = config$database$password,
minSize = 5,
maxSize = 20
)
# Ensure pool is closed when app stops
onStop(function() {
poolClose(pool)
})
# Configure logging
log_threshold(INFO)
if (dir.exists("/var/log/shiny-server")) {
log_appender(appender_tee("/var/log/shiny-server/app.log"))
}
# Custom error handler
options(shiny.error = function() {
log_error("Shiny error occurred")
})
Advanced Tips for R Production Deployments
Container Orchestration with Load Balancing
# docker-compose.prod.yml
version: '3.8'
services:
  shiny-app:
    build: .
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure
    networks:
      - r-network
  load-balancer:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx-lb.conf:/etc/nginx/nginx.conf
    depends_on:
      - shiny-app
    networks:
      - r-network
Monitoring and Observability
# Add monitoring to your R container
FROM rocker/shiny:4.3.1
# Install monitoring packages (openmetrics is a Prometheus client for R on CRAN)
RUN install2.r --error \
openmetrics \
httr \
&& rm -rf /tmp/downloaded_packages/
COPY monitoring.R /opt/monitoring.R
# Start monitoring alongside main app
COPY start-with-monitoring.sh /start.sh
RUN chmod +x /start.sh
CMD ["/start.sh"]
Key Takeaways
Following these Docker best practices for R development will help you:
- Achieve true reproducibility with renv and versioned base images
- Reduce deployment failures caused by environment drift through proper containerization
- Improve application performance through optimized resource allocation
- Enhance security with non-root users and proper secret management
- Simplify scaling with container orchestration and load balancing
- Enable better monitoring with structured logging and health checks
Next Steps
- Implement CI/CD pipelines for automated R application deployment
- Explore Kubernetes for large-scale R application orchestration
- Set up monitoring with Prometheus and Grafana for R applications
- Consider Posit Package Manager (formerly RStudio Package Manager) for enterprise package management
- Implement automated testing for containerized R applications