Nick Strayer | Visual Data Scientist
About
I am a principal software engineer at Posit building tools to help data scientists work better. In addition to being a software engineer, I have been extremely lucky to work in many different realms, including as a journalist at the New York Times, a data scientist at the Johns Hopkins Data Science Lab and Dealer.com in Vermont, and “data artist in residence” at the startup Conduce in California.
I have a PhD in biostatistics from Vanderbilt University and an undergraduate degree from the University of Vermont where I majored in mathematics and statistics and minored in computer science.
When I am not in “work mode” I love to bike places, read science fiction, take photos of birds, and wander around gardens/museums.
Have a fantastic day!
Projects
For a much more up-to-date and topical list of my work, check out the data science/statistics/visualization blog that I run with Lucy D’Agostino McGowan: Live Free or Dichotomize.
DataDrivenCV package
- An R package for building a CV/Resume from a spreadsheet of information
- Built around the pagedown package in R
- The framework supplied by the package is entirely self-sufficient, so users are not dependent on package version changes.
Data Visualization Best Practices in R
- Intermediate level course exploring visualization best practices in R.
- Uses ggplot2 and the tidyverse packages.
- See the story A Multimillion-Dollar Startup Hid A Sexual Harassment Incident By Its CEO for information on why I no longer actively promote this course.
Making Nice Looking Websites Using RMarkdown
- A start-to-finish walkthrough of making a website using RMarkdown and hosting it on GitHub.
- Made in collaboration with Lucy McGowan
- Presented at the statistical computing workshop for Vanderbilt Medical Center
- See sample site here
Conditional Survival Curves on Truncated Survival Data
- A visual exploration of Kaplan-Meier survival curves on left-truncated survival data.
- Drag the conditional slider to see how the survival curve changes depending on the age of entry.
- All logic for the K-M curve is written from scratch in JavaScript and is much more performant than the survival package in R.
- For more information on the algorithm to generate a K-M curve, see the Wikipedia page.
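As a rough illustration of the idea behind this project, here is a minimal Python sketch of a Kaplan-Meier estimate with left truncation (delayed entry). It is not the JavaScript code used in the visualization. The function name `km_left_truncated`, its arguments, and the `condition_on` parameter (standing in for the slider) are all hypothetical, and the risk-set convention (a subject is at risk at time t when entry < t <= exit) is an assumption.

```python
def km_left_truncated(entry, exit_time, event, condition_on=0.0):
    """Kaplan-Meier steps for left-truncated data.

    entry        -- time each subject entered observation (assumed convention)
    exit_time    -- time of event or censoring for each subject
    event        -- 1 if the exit was an event, 0 if censored
    condition_on -- only event times after this value contribute, giving
                    survival conditional on being event-free at that time

    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    subjects = list(zip(entry, exit_time, event))
    # Distinct event times after the conditioning time, in increasing order.
    event_times = sorted({x for _, x, e in subjects if e and x > condition_on})

    survival = 1.0
    curve = []
    for t in event_times:
        # Risk set: entered before t and not yet exited just before t.
        at_risk = sum(1 for a, x, _ in subjects if a < t <= x)
        # Events at t among subjects who are in the risk set.
        deaths = sum(1 for a, x, e in subjects if e and x == t and a < t)
        if at_risk == 0:
            continue  # nobody under observation at t, so no update
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Tiny made-up data set, purely for illustration.
entries = [0, 0, 2, 3]
exits = [5, 3, 6, 4]
events = [1, 1, 0, 1]

unconditional = km_left_truncated(entries, exits, events)
conditional = km_left_truncated(entries, exits, events, condition_on=3)
```

Raising `condition_on` drops the earlier event times from the product, which is why the conditional curve sits above the unconditional one in this toy example.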
Reusable Statistics Plots in D3
- Also see my histogram made in the same way.
- My first attempts at making a d3 library.
- Ultimately will be tied with a companion R app for interactive visualization for statisticians.
- Uses the reusable d3 structure proposed by Elliot Bentley.
What’s In Season?
- An interactive exploration of what produce is in season.
- Data scraped from here using python.
- Allows the user to select different in-season ingredients and search for recipes containing them.
- Notebooks for scraping are in the GitHub repo.
Binomially Distributed Fun!
- Demonstrates how a sequence of independent Bernoulli trials makes up the binomial distribution.
- Allows the user to toggle the parameters of the Bernoulli and generate samples.
- Calculates and displays a 95% confidence interval and Wilson hypothesis test based upon the generated data.
- All statistics functions are written from scratch in vanilla JavaScript.
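For readers curious about the statistics involved, here is a small Python sketch of the same ingredients: simulating Bernoulli trials, summing them into a binomial count, and computing a 95% Wilson score interval for the success probability. This is not the project's JavaScript; the function names and the choice of the Wilson score interval as the interval shown are assumptions made for illustration.

```python
import math
import random


def bernoulli_trials(n, p, rng):
    """Draw n independent Bernoulli(p) outcomes as a list of 0s and 1s."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion.

    z defaults to the standard normal quantile for a two-sided 95% interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    z2 = z * z
    denom = 1.0 + z2 / n
    center = (p_hat + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z2 / (4 * n * n)) / denom
    return center - half_width, center + half_width


rng = random.Random(42)           # fixed seed so the run is repeatable
draws = bernoulli_trials(50, 0.3, rng)
count = sum(draws)                # the sum of the trials is one binomial draw
low, high = wilson_interval(count, len(draws))
```

Unlike the simpler normal-approximation interval, the Wilson interval always stays inside [0, 1], which makes it a reasonable choice for small samples.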
Probability Integral Transformations
- Made in an effort to visualize what happens when you transform a probability distribution with a function.
- Uses the normal distribution transformed by the normal cdf, resulting in a uniform distribution. See here for more info.
- Inspired by my course work in Probability at Vanderbilt.
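The transformation described above can be sketched in a few lines of Python: draw values from a normal distribution, push each through the normal CDF, and check that the results look uniform on [0, 1]. The helper names (`normal_cdf`, `pit_sample`, `histogram_counts`) are hypothetical and this is only an illustration of the idea, not the code behind the visualization.

```python
import math
import random


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a Normal(mu, sigma) distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def pit_sample(n, mu=0.0, sigma=1.0, seed=0):
    """Draw n normal values and transform each by its own CDF."""
    rng = random.Random(seed)
    return [normal_cdf(rng.gauss(mu, sigma), mu, sigma) for _ in range(n)]


def histogram_counts(values, bins=10):
    """Count how many values in [0, 1] fall into each equal-width bin."""
    counts = [0] * bins
    for v in values:
        index = min(int(v * bins), bins - 1)  # put v == 1.0 in the last bin
        counts[index] += 1
    return counts


uniform_like = pit_sample(5000)
counts = histogram_counts(uniform_like)
```

If the transform works as expected, each of the ten bins should hold roughly a tenth of the draws, which is the flat histogram of a uniform distribution.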
Where Are Wildfires Burning?
- Uses open data from NASA satellites on global temperature anomalies.
- Fresh data is downloaded every day and pushed to the static page via shell scripts, avoiding the need for servers.
- Data source.
Interactive Manhattan Plot R Package.
- An R package to generate interactive and embeddable Manhattan plots for genome-wide association studies.
- Binds R and Javascript + D3 using the HTMLWidgets package.
State Farmers Market Profiles.
- Companion visualization to What Do Farmer’s Markets Sell?
- Explore different states’ paths through different metrics relating to farmers markets.
- Uses an equal-sized states map as a menu to reduce the bias associated with normal projections.
- Data courtesy of Data.gov.
What Do Farmer’s Markets Sell?
- Select different good types (e.g. Vegetables, Fruit) and see which markets sell them.
- Assemble different combinations of goods to explore regional trends.
- Dynamic layout adjusts to mobile or desktop views.
- Be patient with it: more than eight thousand points are being drawn to the screen, which will bog down older phones/computers.
- Data courtesy of Data.gov
Interactive Manhattan Plot Viewer.
- Developed as an experiment in exploratory data visualization.
- Select different controls for comparison, e.g. non-dominant arm growth, to see linked SNPs.
- A Manhattan plot is a commonly used tool in assessing genetic roots for traits.
- Uses data from the FAMuSS study (Thompson et al. 2004).
Experimental Leap Motion + D3.js project.
- Wave your hands around and watch D3.js mirror you!
- Requires a Leap Motion device.
- In the future I plan on implementing ways to interact with D3 visualizations by recognizing gestures using machine learning algorithms.
- Video of it in action, in case you don’t have a Leap.
- Note: When using, start by waving your hands around above the Leap Motion device and watch it calibrate!
Polio’s impact on the United States.
- A project for Data Science 2 (Math 295) taught by Professor James Bagrow at the University of Vermont.
- IPython notebook and data files are available on my GitHub.
Marvel Vs. DC in the theater.
- I extracted data from The Verge article ‘Marvel’s movie business is crushing DC’s and it’s not close.’
CV/Resume
Want a longer list of the stuff I’ve done related to my career? I have a CV!
Need a short and to-the-point single page annotation of my data-science career? Try my resume!
Interested in how I made these? Check out the repo: github.com/nstrayer/cv
Contact
I am always interested in getting involved in new projects or just connecting with others. Feel free to get in touch!
email: nick.strayer (at) gmail
twitter: NicholasStrayer
github: nstrayer