CSV/Parquet Data Structure for ADX
Data can be provided as NDJSON, CSV or Parquet. This document describes the CSV/Parquet data structure.
EVENT_ID
The unique identifier of the event.
CREATE_DT
The date and time the event was first seen by PredictHQ, in UTC. Also called first_seen in the Events API.
UPDATE_DT
The date and time the event was last updated in UTC.
TITLE
The title of the event.
CATEGORY
The category of the event.
LABELS
DESCRIPTION
The description of the event.
EVENT_START
The date and time the event starts, recorded in UTC. If the TIMEZONE field is null, the time represents the same relative time across all time zones.
Additionally, if an event has a start time of midnight in its local timezone, this may indicate that the actual time is unknown. You may wish to omit the time when displaying such events.
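The midnight rule above can be sketched as a small display helper. This is a minimal illustration, not part of the data feed: the function name display_start is hypothetical, and it assumes the local start time (for example the EVENT_START_LOCAL value) arrives as an ISO 8601 string, which the source does not confirm.

```python
from datetime import datetime


def display_start(event_start_local: str) -> str:
    """Format a local start time, omitting the time part when it is exactly midnight.

    Assumes an ISO 8601 string such as "2024-05-01T19:30:00" (an assumption
    about the file's serialization, not a documented guarantee).
    """
    dt = datetime.fromisoformat(event_start_local)
    if (dt.hour, dt.minute, dt.second) == (0, 0, 0):
        # Midnight may mean the actual start time is unknown, so show the date only.
        return dt.strftime("%Y-%m-%d")
    return dt.strftime("%Y-%m-%d %H:%M")
```

An event that genuinely starts at midnight would also be shown without a time, which is usually an acceptable trade-off for display purposes.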
EVENT_END
The date and time the event ends, recorded in UTC. If the TIMEZONE field is null, the time represents the same relative time across all time zones.
PREDICTED_END
The date and time PredictHQ predicts the event will end, recorded in UTC. If the TIMEZONE field is null, the time represents the same relative time across all time zones. This value is present where an actual EVENT_END is unknown.
TIMEZONE
The time zone of the event in TZ Database format. Use it to determine which time zone to convert the dates to, if needed. If the time zone is null, the start and end dates should be regarded as time zone agnostic and already in local time.
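One way to apply the TIMEZONE rule when reading the file is sketched below. The helper name to_local is hypothetical, and the sketch assumes the UTC columns are serialized as ISO 8601 strings and that a null TIMEZONE is read as None or an empty string; neither is confirmed by this document.

```python
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo  # standard library, Python 3.9+


def to_local(utc_value: str, tz_name: Optional[str]) -> datetime:
    """Convert a UTC column value (e.g. EVENT_START) using the TIMEZONE column."""
    dt = datetime.fromisoformat(utc_value)
    if not tz_name:
        # Null TIMEZONE: the value is time zone agnostic and already local,
        # so return it as a naive datetime without shifting it.
        return dt.replace(tzinfo=None)
    if dt.tzinfo is None:
        # The column is documented as UTC; attach that before converting.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(ZoneInfo(tz_name))
```

For example, to_local("2024-07-04T23:00:00", "America/New_York") yields 19:00 local time on the same date, while a null time zone leaves the value unchanged.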
ENTITIES
GEO
IMPACT_PATTERNS
SCOPE
The geographical scope the event applies to. Possible values are:
locality
localadmin
county
region
country
PLACEKEY
COUNTRY_CODE
The country code in ISO 3166-1 alpha-2 format. This value is typically present, but in some cases, such as events occurring outside any country (e.g. an earthquake in the middle of the ocean), it may be empty.
PLACE_HIERARCHIES
An array of place hierarchies for the event. Each hierarchy is an array of place ids. The final place in a hierarchy is a specific place the event applies to. Each place is a sub-place of the place immediately preceding it in the hierarchy. An empty array is possible and valid.
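As a rough illustration of the hierarchy structure, the sketch below extracts the specific place from each hierarchy (its final element). It assumes the PLACE_HIERARCHIES column is serialized as a JSON array of arrays in the CSV, which this document does not state; the function name and the place ids in the usage line are hypothetical.

```python
import json
from typing import List, Optional


def specific_places(place_hierarchies_raw: Optional[str]) -> List[str]:
    """Return the final (most specific) place id of each hierarchy.

    Assumes the column holds JSON such as '[["a", "b"], ["a", "c"]]'.
    """
    hierarchies = json.loads(place_hierarchies_raw) if place_hierarchies_raw else []
    # An empty outer array is valid; empty inner arrays are skipped defensively.
    return [hierarchy[-1] for hierarchy in hierarchies if hierarchy]
```

Calling specific_places('[["1", "2", "3"], ["1", "4"]]') returns ["3", "4"], and an empty array returns an empty list.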
PHQ_ATTENDANCE
A numerical value that reflects the predicted attendance for supported attendance-based categories. Supported categories include concerts, performing arts, sports, expos, conferences, community, and festivals. Some academic and school holiday events may also include a phq_attendance value to indicate student numbers.
For multi-day events, phq_attendance represents total attendance across the entire duration, except for certain categories like conferences, where it reflects daily attendance.
PHQ_RANK
A log scale numerical value between 0 and 100 with a five-level hierarchical impact schema. It is designed to represent the potential impact of an event independent of its geographical location.
LOCAL_RANK
Similar to PHQ Rank, this is a log scale numerical value between 0 and 100 with a five-level hierarchical impact schema. It is designed to represent the potential impact of an event on its local geographical area.
Local Rank is calculated for events in the categories community, concerts, conferences, expos, sports, festivals, and performing-arts. If local_rank is not intended to be available for an event, this field will be null.
AVIATION_RANK
A log scale numerical value between 0 and 100 with a five-level hierarchical impact schema. Aviation Rank indicates how much an event will impact flight bookings by considering both domestic and international travel.
STATUS
The publication state of the event.
Possible values:
active - The event is an active event.
postponed - The event is a postponed event, and is expected to occur at a later date.
cancelled - The event is a cancelled event and is not expected to occur at a later date.
BRAND_SAFE
Whether or not this event is considered brand-safe. Examples of brand-unsafe events include content that promotes hate, violence, or discrimination, coarse language, content that is sexually suggestive or explicit, etc.
PARENT_EVENT_ID
CANCELLED_DT
The date and time the event was marked as cancelled, in UTC. This field will be null if STATUS is not set to "cancelled" or if the cancellation date is unavailable.
POSTPONED_DT
The date and time the event was marked as postponed, in UTC. This field will be null if STATUS is not set to "postponed" or if the postponement date is unavailable. Note that this field does not represent the new date and time of the postponed event.
PREDICTED_EVENT_SPEND_ACCOMMODATION
PREDICTED_EVENT_SPEND_HOSPITALITY
PREDICTED_EVENT_SPEND_TRANSPORTATION
PHQ_LABELS
ALTERNATE_IDS
All alternate IDs for the event. Any event IDs that may have been used for this event in the past will be included here. It does not include the current event ID.
EVENT_START_LOCAL
The date and time when the event begins, expressed in the event's local time zone.
EVENT_END_LOCAL
The date and time when the event ends, expressed in the event's local time zone.
PREDICTED_END_LOCAL
The date and time when the event is predicted to end, expressed in the event's local time zone.
REGION
The region in which the event will occur. This field will be null if the event covers more than a single region.
LOCALITY
The locality in which the event will occur. A locality is most commonly referred to as a city or town. This field will be null if the event covers more than a single locality.
POSTCODE
The postal code or ZIP code in which the event will occur. This field will be null if the event covers more than a single postcode.
FORMATTED_ADDRESS
A full formatted address which can include street addresses, locality, postcode, region, and country.
ROW_INSERTED_DT
The date and time this row was inserted, in UTC. The row may have been deleted and inserted multiple times, so this value does not reflect when the event was first seen; use CREATE_DT for that.
ROW_UPDATED_DT
The date and time this row was last updated in UTC.
CHANGE_ACTION
Indicates whether the record has been updated, deleted or inserted. Use this field when processing the data file to keep your database up to date.
Possible values:
insert - new record, not previously seen.
update - existing record, updated values.
delete - deleted record, remove from your dataset.
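The following sketch shows one way the CHANGE_ACTION values could drive a local copy of the data. It is an illustration under stated assumptions rather than a prescribed loader: the function apply_changes and the in-memory dictionary keyed by EVENT_ID are hypothetical stand-ins for your own database logic.

```python
from typing import Dict, Iterable


def apply_changes(store: Dict[str, dict], rows: Iterable[dict]) -> Dict[str, dict]:
    """Apply insert/update/delete rows to a store keyed by EVENT_ID."""
    for row in rows:
        action = row["CHANGE_ACTION"]
        event_id = row["EVENT_ID"]
        if action == "delete":
            # Remove the record; ignore it if it was never stored.
            store.pop(event_id, None)
        elif action in ("insert", "update"):
            # Both actions carry the full current row, so treat them as an upsert.
            store[event_id] = row
        else:
            raise ValueError(f"Unexpected CHANGE_ACTION: {action!r}")
    return store
```

Rows from a CSV file can be supplied with csv.DictReader from the Python standard library. Treating insert and update as the same upsert keeps the loader tolerant of a file being processed more than once.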