Hi, I'm Victoria! I'm a data scientist working on open data. My group focuses on understanding the effect of big data and computation on scientific inference. We are interested in questions like: how effectively does statistical methodology translate to big data settings?
Instead of collecting data to test a particular hypothesis, researchers now often generate hypotheses by direct inspection of the data, then use the same data to test those hypotheses. What counts as a significant finding in this case? Can we estimate how likely that finding is to be replicated in a new sample? What information is needed to verify and replicate data science findings?
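To make the concern concrete, here is a minimal simulation (my own illustration, not from any particular study) of "double dipping": we screen many pure-noise features for the one most correlated with an outcome, then "test" that chosen feature either on the same data or on a fresh sample. The 1.96/sqrt(n) cutoff is a standard large-sample approximation to a 5% two-sided threshold for a single correlation.

```python
import numpy as np

# All features and outcomes are pure noise, so every "finding" is a
# false positive. We compare the apparent significance rate when the
# hypothesis is tested on the data that suggested it vs. on new data.
rng = np.random.default_rng(0)
n, p, trials = 100, 50, 200
cutoff = 1.96 / np.sqrt(n)  # approx. 5% two-sided threshold for one correlation

same_hits = fresh_hits = 0
for _ in range(trials):
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)
    # Sample correlation of each feature with the outcome.
    corrs = (X - X.mean(0)).T @ (y - y.mean()) / (n * X.std(0) * y.std())
    best = np.argmax(np.abs(corrs))          # hypothesis chosen by inspecting the data
    same_hits += abs(corrs[best]) > cutoff   # tested on the SAME data

    X2 = rng.standard_normal((n, p))         # independent replication sample
    y2 = rng.standard_normal(n)
    c2 = (X2[:, best] - X2[:, best].mean()) @ (y2 - y2.mean()) / (
        n * X2[:, best].std() * y2.std())
    fresh_hits += abs(c2) > cutoff           # tested on NEW data

print(f"'significant' on same data: {same_hits / trials:.2f}")
print(f"'significant' on new data:  {fresh_hits / trials:.2f}")
```

Screening 50 noise features and testing the winner on the same data produces a "significant" correlation in the large majority of trials, while the replication sample stays near the nominal 5% rate.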
When computation is used in research, it becomes part of the methods used to derive a result. How should these steps be made openly available to the community for inspection, verification, replication, and re-use? What tools and computational environments are needed for data science?
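One small step in that direction is recording a machine-readable "methods" manifest alongside each result, so others can inspect and re-run the computation. The sketch below is illustrative only: the fields and the idea of hashing the input data are my assumptions, not a standard or an existing tool.

```python
import hashlib
import json
import platform
import sys

# Hypothetical sketch: capture just enough provenance (interpreter,
# platform, and a fingerprint of the input data) to let someone else
# check they are re-running the same computation on the same inputs.
def provenance_manifest(data_bytes: bytes, script_name: str) -> dict:
    return {
        "script": script_name,                                  # assumed name
        "python": sys.version.split()[0],                       # e.g. "3.11.4"
        "platform": platform.platform(),
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),  # input fingerprint
    }

manifest = provenance_manifest(b"toy dataset contents", "analysis.py")
print(json.dumps(manifest, indent=2))
```

A real system would also pin library versions and random seeds, but even this minimal record makes "which data, which code, which environment" an answerable question.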
We have an opportunity to think about data science as a life cycle, from experimental design and databases through to the scientific findings, and design tools and environments that enable reliable scientific inference at scale.