Path                          Size   Description
gold                          589M   Updated `process_uber_summary` data with slightly better labels. Also pre-split into test/train sets
labels.tar                    1.2G   Detailed labels. More detailed, but harder to work with, than what's attached to `process_uber_summary`
process_uber_summary.parquet  297M   Copy of gold `process_uber_summary`. It's here for convenience, as it's the most commonly used table
raw_sensor.tgz                 14G   Raw sensor data. Very low-level and requires post-processing to be usable
rolling.tgz                    37G   Data processed by day, with each day being self-contained. Note, this also means some data may overlap between days
stdview-20240819-20240923      21G   All data summarized over the data collection period. This is the primary dataset/format used
stdview-20240819-20240923.tar  21G   Same as above, but in a convenient, single tar file
xdr                           188M   Experimental conversion of WinTap data model to Microsoft XDR structure. Note that only process and network events were converted

The simplest place to start exploring is `process_uber_summary.parquet`, which has a single row per process execution and summarizes the more detailed events.

From there, explore the fuller, relationally modelled data in `stdview-20240819-20240923`.
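
One way to get oriented in `stdview-20240819-20240923` is to list what it contains before loading anything. The sketch below uses only the Python standard library and assumes the directory has already been downloaded locally. The helper name `inventory` is hypothetical, and nothing is assumed about the file formats or table names inside.

```python
from pathlib import Path


def inventory(root):
    """Return (relative path, size in bytes) for every file under root, largest first."""
    root = Path(root)
    files = [
        (p.relative_to(root).as_posix(), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    ]
    # Sort by size descending, then by path so ties come out in a stable order.
    return sorted(files, key=lambda item: (-item[1], item[0]))
```

Calling `inventory("stdview-20240819-20240923")` from the download location (an assumed path) returns the files ordered by size, which gives a quick view of where the bulk of the data sits.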

For a complete copy, take:

labels.tar, raw_sensor.tgz, rolling.tgz, stdview-20240819-20240923.tar, xdr
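
After downloading, the archives in this list can be unpacked in one pass. This is a minimal sketch using Python's standard `tarfile` module. The function name `extract_all` and the source and destination paths are hypothetical, and `xdr` is assumed to be a plain directory that needs no extraction.

```python
import tarfile
from pathlib import Path

# Archives named in the list above; `xdr` is assumed to be a directory, so it is not included.
ARCHIVES = ["labels.tar", "raw_sensor.tgz", "rolling.tgz", "stdview-20240819-20240923.tar"]


def extract_all(src_dir, dest_dir, archives=ARCHIVES):
    """Extract each archive found in src_dir into dest_dir; return the names extracted."""
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    done = []
    for name in archives:
        path = src_dir / name
        if not path.is_file():
            continue  # skip anything that was not downloaded
        # Mode "r:*" lets tarfile detect the compression (plain tar or gzip).
        with tarfile.open(path, "r:*") as tar:
            tar.extractall(dest_dir)
        done.append(name)
    return done
```

The return value lists which archives were actually found and unpacked, so a missing download shows up as a missing name rather than an exception.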


