Keeping Data Updated
It’s common for event information to change frequently. For example, a date changes, an event gets cancelled, or a ranking is updated. It’s important to keep your data set up to date.
Other Providers
Many event providers don’t remove expired or cancelled events from their data, requiring you to take a full data export each time, wasting time and effort. We help to make the process of staying updated easier.
Why is this important?
Events often change or get deleted because they’re cancelled, postponed, marked as duplicates or identified as spam. In February 2020, concerns about COVID-19 led to a 500% spike in cancellations and postponements of significant events when compared to February 2019. It’s important to make sure you regularly refresh your store of intelligent event data and keep your local database in sync with ours to stay up-to-date.
What is the updated field?
The updated field is a field in our event data that indicates when the event record was last changed in our system.
For example, an event for a conference that starts on 1 April 2020 may originally be created in our system on 1 Jan 2020. The event’s details then don’t change until 5 Feb 2020, when the event is cancelled. The event’s state field is then changed to deleted and the deleted_reason is changed to cancelled. The event’s updated field would show 2020-02-05T05:00:00Z (for the cancellation change made to the event on 5 Feb 2020).
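As a rough illustration of how a client might read these fields, the sketch below parses the updated timestamp from the example and compares it with a stored "last sync" time. The record is hypothetical: only the state, deleted_reason and updated values come from the example above, and the id value, the helper names and the last-sync time are assumptions made for illustration.

```python
from datetime import datetime, timezone

# Hypothetical record mirroring the cancelled-conference example.
# "id" is a placeholder value, not a real event identifier.
event = {
    "id": "example-event-id",
    "state": "deleted",
    "deleted_reason": "cancelled",
    "updated": "2020-02-05T05:00:00Z",
}


def parse_updated(value: str) -> datetime:
    """Parse a UTC timestamp written like 2020-02-05T05:00:00Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def changed_since(record: dict, last_sync: datetime) -> bool:
    """Return True if the record was updated after our last sync."""
    return parse_updated(record["updated"]) > last_sync


# Assumed last sync time, chosen to fall before the cancellation.
last_sync = datetime(2020, 1, 1, tzinfo=timezone.utc)
print(changed_since(event, last_sync))  # True
```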
The updated field is often used by our customers who have a data lake to keep their copy of the data up to date. They will query our Events API to retrieve all events that have been updated since they last queried the API.
When do we change the updated field?
The updated field is changed whenever a value in any of the event fields in our Events API endpoint changes. All fields are documented in our endpoint documentation.
If a new field is added to the Events endpoint that provides new data that is not already present in the event, then the updated field will be updated once the new field is added.
If a new field is added to an event that is reserved for future use and doesn’t provide any new data, then the updated field will not be changed. For example, a new field is added to events but is empty or duplicates an existing field.
If a field is removed from an event, then the updated field will be changed (this is very unlikely to happen, as we maintain backwards compatibility).
If internal data about the event in the PredictHQ system changes, but that data is not exposed via the Events API in any fields, then the updated field will not be updated.
See also API Changes.
How to do it
Choose whether you want to look at events that are still occurring, events that have been cancelled, or events that have been postponed.
Active Events
The active.* parameter makes it easy to find events that are happening during a date range, including ones that start before the range and/or end after it. If an event overlaps with the range, it’ll be included.
curl -X GET "https://api.predicthq.com/v1/events?active.gte=2020-01-01&active.lte=2020-01-31" \
-H "Authorization: Bearer $ACCESS_TOKEN"
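The same request can be prepared from code. The sketch below builds the query string used in the curl example and adds a small local overlap check to illustrate the "overlaps with the range" rule described above. The function names are hypothetical, and the overlap check assumes inclusive range boundaries; it illustrates the idea only and is not a description of how the API implements the filter.

```python
from datetime import date
from urllib.parse import urlencode

# Endpoint as written in the curl example above.
BASE_URL = "https://api.predicthq.com/v1/events"


def active_events_url(range_start: date, range_end: date) -> str:
    """Build the Events API URL for events active within a date range."""
    params = {
        "active.gte": range_start.isoformat(),
        "active.lte": range_end.isoformat(),
    }
    return f"{BASE_URL}?{urlencode(params)}"


def overlaps(event_start: date, event_end: date,
             range_start: date, range_end: date) -> bool:
    """Local illustration: True if the event overlaps the range.

    Assumes inclusive boundaries on both ends.
    """
    return event_start <= range_end and event_end >= range_start


print(active_events_url(date(2020, 1, 1), date(2020, 1, 31)))
# An event that starts before the range but ends inside it still overlaps.
print(overlaps(date(2019, 12, 20), date(2020, 1, 5),
               date(2020, 1, 1), date(2020, 1, 31)))  # True
```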
How to keep your copy of events updated
When you store events and are interested in retrieving a list of all events that have changed since a specific date and time, the updated.* parameter is useful in conjunction with the state=* parameter.
curl -X GET "https://api.predicthq.com/v1/events/?updated.gte=2020-01-01&updated.lte=2020-02-29&state=active,deleted" \
-H "Authorization: Bearer $ACCESS_TOKEN"
If you are storing an offline copy of PredictHQ data (for example, if you are storing events in a data lake), then you should write code that periodically queries for events updated since the last time you called the API. For example, if you call the API for updated events every 24 hours, then each call to the Events API would request all events updated in the last 24 hours (that is, with updated.gte set to the current time minus 24 hours).
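One possible shape for that periodic check is sketched below. It builds the request URL for the window since the last sync and applies the returned events to a local store. Several things here are assumptions rather than documented behaviour: that the updated.* parameters accept a full timestamp in the same format as the updated field (the curl example above uses dates only), that each event carries an id field usable as a key, and that removing deleted events locally is the desired policy. The function names are hypothetical, and fetching and paging through results is left out.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Endpoint as written in the curl example above.
BASE_URL = "https://api.predicthq.com/v1/events/"
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def updated_since_url(last_sync: datetime, now: datetime) -> str:
    """Build the URL for events updated between last_sync and now.

    Both arguments must be timezone-aware; they are converted to UTC.
    Assumption: updated.* accepts full timestamps, not only dates.
    """
    params = {
        "updated.gte": last_sync.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT),
        "updated.lte": now.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT),
        "state": "active,deleted",
    }
    # Keep commas and colons readable instead of percent-encoding them.
    return f"{BASE_URL}?{urlencode(params, safe=',:')}"


def apply_updates(local_store: dict, events: list) -> None:
    """Upsert active events and drop deleted ones, keyed by event id."""
    for event in events:
        if event["state"] == "deleted":
            local_store.pop(event["id"], None)
        else:
            local_store[event["id"]] = event


# Example: a sync that runs every 24 hours.
now = datetime(2020, 2, 6, 5, 0, 0, tzinfo=timezone.utc)
print(updated_since_url(now - timedelta(hours=24), now))
```

Storing the time of the last successful call and using it as updated.gte, rather than always subtracting a fixed 24 hours, avoids missing changes when a scheduled run is late or fails. If you would rather keep a history of cancelled events, apply_updates could store deleted records with their state instead of removing them.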
Note that you can use the Features API to directly get features and aggregations on the data instead of downloading large volumes of event data and keeping it up to date. The Features API provides pre-built features that also save you from writing a lot of code to aggregate data and create features on your side.
Cancelled and Postponed Events
The most accurate forecasting requires constant updates as event data is so dynamic. It’s important to have real-time updates on events if they have been cancelled or postponed so you can react swiftly. If you want to only look at events that have been cancelled or postponed, use the parameter deleted_reason=cancelled,postponed.
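A minimal sketch of this filter follows: one helper builds the URL with the deleted_reason parameter, and another applies the same filter to events you have already stored. The function names are hypothetical, and the endpoint path is taken from the earlier curl example. Whether other parameters need to accompany deleted_reason is not covered here.

```python
from urllib.parse import urlencode

# Endpoint as written in the earlier curl example.
BASE_URL = "https://api.predicthq.com/v1/events/"
REASONS = ("cancelled", "postponed")


def cancelled_or_postponed_url() -> str:
    """Build the URL that filters on deleted_reason=cancelled,postponed."""
    params = {"deleted_reason": ",".join(REASONS)}
    # Keep the comma readable instead of percent-encoding it.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


def only_cancelled_or_postponed(events: list) -> list:
    """Apply the same filter locally to already-stored event records."""
    return [e for e in events if e.get("deleted_reason") in REASONS]


print(cancelled_or_postponed_url())
```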