Data

Over the years I have benefited from scholars sharing their data with me; this page pays that kindness forward — alongside the open-source tools I build. If a dataset or tool here could help your work, please get in touch.

Software & Tools

  • AI OS (public)

    Public release of the agentic operating system behind my AI-assisted research and teaching.

    AI · Agentic AI · Research Infrastructure
  • RealResearch

    AI-guided research workflow for VS Code that tutors students through a staged research process while keeping pedagogical content encrypted.

    AI · Pedagogy · VS Code
  • Research Lab Control Room

    Central control layer for research projects, RAs, shared skills, and a dashboard — a concrete instantiation of With Great Powers.

    Forthcoming · Lab Infrastructure · Dashboard · Agentic AI
  • Docs Typing Flow

    Chrome extension for Google Docs that surfaces how student writing was produced — typed vs. pasted — to help instructors judge authenticity, without ever reading the document text.

    Chrome Extension · Pedagogy · Academic Integrity

Datasets

  • Slovakia Replication Package

    Replication data and code, hosted on Harvard Dataverse.

    Replication · Tax · Slovakia
  • Global Government GenAI Observatory (G3O)

    A panel of generative-AI adoption signals across ~700,000 public institutions worldwide. Pilot site live; full release planned for 2026.

    AI · Government · Procurement
  • Italy (1998–2021)

    A nationally representative tax-attitudes survey; itemized budget, contract, and public-official data for 8,000+ municipalities; and 2,800+ local electoral manifestos.

    Survey · Administrative · Electoral Manifestos · Tax · Local Government
  • Italy (1910–1935)

    Digitized historical data: 500K+ WWI soldier fatalities; census and town-budget figures for 1910, 1921, and 1933; and town-level dissidence and repression (1900–1935).

    Archival · WWI Fatalities · Census · Dissidence
  • Australia (1993–2005)

    Municipality-level national electoral results (1993, 1998, 2001, 2004) and tax-collection data.

    Electoral · Administrative · Tax
  • Poland (2010–2021)

    A cross-sectional dataset of 2,400+ gminas, 2000–2020: local electoral results and itemized budget data (revenue, compliance, and spending by policy area).

    Electoral · Administrative · Budget